
University of Kentucky UKnowledge

Theses and Dissertations--Biology Biology

2020

FROM GENES TO SPECIES: ECOLOGICAL SPECIATION WITH GENE FLOW IN NEODIPRION PINETUM AND N. LECONTEI

Emily E. Bendall
University of Kentucky, [email protected]
Author ORCID Identifier: https://orcid.org/0000-0003-2524-088X
Digital Object Identifier: https://doi.org/10.13023/etd.2020.221


Recommended Citation
Bendall, Emily E., "FROM GENES TO SPECIES: ECOLOGICAL SPECIATION WITH GENE FLOW IN NEODIPRION PINETUM AND N. LECONTEI" (2020). Theses and Dissertations--Biology. 62. https://uknowledge.uky.edu/biology_etds/62

This Doctoral Dissertation is brought to you for free and open access by the Biology at UKnowledge. It has been accepted for inclusion in Theses and Dissertations--Biology by an authorized administrator of UKnowledge. For more information, please contact [email protected].

STUDENT AGREEMENT:

I represent that my thesis or dissertation and abstract are my original work. Proper attribution has been given to all outside sources. I understand that I am solely responsible for obtaining any needed copyright permissions. I have obtained needed written permission statement(s) from the owner(s) of each third-party copyrighted matter to be included in my work, allowing electronic distribution (if such use is not permitted by the fair use doctrine) which will be submitted to UKnowledge as Additional File.

I hereby grant to The University of Kentucky and its agents the irrevocable, non-exclusive, and royalty-free license to archive and make accessible my work in whole or in part in all forms of media, now or hereafter known. I agree that the document mentioned above may be made available immediately for worldwide access unless an embargo applies.

I retain all other ownership rights to the copyright of my work. I also retain the right to use in future works (such as articles or books) all or part of my work. I understand that I am free to register the copyright to my work.

REVIEW, APPROVAL AND ACCEPTANCE

The document mentioned above has been reviewed and accepted by the student’s advisor, on behalf of the advisory committee, and by the Director of Graduate Studies (DGS), on behalf of the program; we verify that this is the final, approved version of the student’s thesis including all changes required by the advisory committee. The undersigned agree to abide by the statements above.

Emily E. Bendall, Student

Dr. Catherine R. Linnen, Major Professor

Dr. David W. Weisrock, Director of Graduate Studies

FROM GENES TO SPECIES: ECOLOGICAL SPECIATION WITH GENE FLOW IN NEODIPRION PINETUM AND N. LECONTEI

DISSERTATION

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the College of Arts and Sciences at the University of Kentucky

By Emily Elizabeth Bendall
Lexington, Kentucky
Director: Dr. Catherine R. Linnen, Associate Professor of Biology
Lexington, Kentucky
2020

Copyright © Emily Elizabeth Bendall 2020
https://orcid.org/0000-0003-2524-088X

ABSTRACT OF DISSERTATION

FROM GENES TO SPECIES: ECOLOGICAL SPECIATION WITH GENE FLOW IN NEODIPRION PINETUM AND N. LECONTEI

My dissertation focuses on how differences accumulate across the genome during ecological speciation with gene flow. To do this, I used two species of Neodiprion pine sawflies, which are plant-feeding hymenopterans with high host specificity. I used experimental crosses to measure both intrinsic and extrinsic postzygotic isolation and to understand the contribution of specific traits to reproductive isolation. Despite substantial genetic divergence and haploid males in which all recessive incompatibilities should be expressed, I found surprisingly little evidence of intrinsic postzygotic isolation. Recombination in males may reconstitute viable genotypes and counteract the effects of haploidy in males. Nevertheless, hybrids have drastically reduced fitness due to intermediate host-use traits, causing strong extrinsic postzygotic isolation. Together, these results suggest that divergent selection on host-use traits is the primary driver of speciation in these, and likely other, plant-feeding insects. Next, I performed a QTL mapping study of the traits under divergent selection that contribute to extrinsic postzygotic isolation to understand how genetic architecture can constrain or promote speciation and adaptation. I found that opposing dominance between host-choice and host-use traits composes the genetic basis of the earlier detected extrinsic postzygotic isolation. This opposing dominance is part of a growing body of work showing that trait mismatch, and not hybrid intermediacy, is typically how extrinsic postzygotic isolation is formed. My fourth chapter focuses on how haplodiploid sex determination shapes how populations accumulate differences across the genome during speciation. Using a combination of demographic analysis of pine sawflies, population genetic simulations, and a meta-analysis, I found that compared to diploids, haplodiploids have predictably higher and more variable differentiation across the genome when they diverge in the presence of gene flow.
Overall, Neodiprion sawflies present a great opportunity to better understand the genetics of adaptation and speciation.

KEYWORDS: Ecological Speciation, Adaptation, Evolutionary Genetics, Neodiprion, Haplodiploidy, Extrinsic Postzygotic Isolation

Emily Elizabeth Bendall

04/20/20
Date

FROM GENES TO SPECIES: ECOLOGICAL SPECIATION WITH GENE FLOW IN NEODIPRION PINETUM AND N. LECONTEI

By Emily Elizabeth Bendall

Catherine R. Linnen Director of Dissertation

David W. Weisrock Director of Graduate Studies

04/20/2020

DEDICATION

To my father, who always inspired me. I will be eternally grateful for all his love and support.

ACKNOWLEDGMENTS

I would not have been able to complete this dissertation without the help of so many people. Catherine Linnen has been an amazing mentor and has provided me with so much support, guidance, and insight. I truly couldn’t have done this dissertation without her. To all of the undergrads who assisted in the research (Melanie Hurst, Nicole Payne, Anna Sosso, Julian Horn, Ethan Hunt, Hallee Mason, Parisa Shamaei Zadeh, Ndeye Thiaw, Brianna Washington, Kayla Mattingly, and Ally Appel) and all of those who helped with rearing and taking care of the countless sawflies: I wouldn’t have been able to do this research without you. Robin Bagley, Claire O’Quin, Kim Vertacnik, Danielle Herrig, Ashleigh Glover, John Terbot, and Megan Thomas, as the Linnen lab, have provided so much support by being there to commiserate when something inevitably goes wrong (something always goes wrong), to celebrate with when things finally start working, and to offer countless pieces of advice. All of you made the Linnen lab a wonderful place to work. The lab environment got even better once we joined with the Weisrock lab a few years ago to form the Super Lab. Thank you to everyone who has made the Super Lab such a success!

The Biology department as a whole has been very supportive and has given me many opportunities. I would especially like to thank Jacquie Burke, who made sure that all of the critical behind-the-scenes work got done and that I never missed a deadline. Jacquie’s door was always open and she was a continual source of positivity. Dave Weisrock and Dave Westneat, as directors of graduate studies, were both instrumental in making the program run smoothly. Additionally, Dave Weisrock was on my committee with Chuck Fox and Jeremy Van Cleve. My committee provided me with important guidance and insights during my dissertation. My collaborators, Vitor Sousa, Amanda Moehring, Andres Bendesky, and Natalie Niepoth, were absolutely delightful to work with and added so much to my dissertation. I also want to thank all of my funding sources, the University of Kentucky, the biology department, USDA, and NSF, for their monetary support.

The support of my family has been invaluable. Thank you for always being there to listen to me and only occasionally telling me to stop talking about my work. I love you all!

TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER 1. INTRODUCTION

1.1 ECOLOGICAL SPECIATION

1.2 GENETICS OF ADAPTATION

1.3 GENOME WIDE PATTERNS OF DIVERGENCE

1.4 SAWFLIES AS A STUDY SYSTEM

1.5 DISSERTATION OVERVIEW
1.5.1 Chapter 2. Oviposition traits generate extrinsic postzygotic isolation between two pine sawfly species
1.5.2 Chapter 3. Genetic architecture of oviposition traits
1.5.3 Chapter 4. Lack of intrinsic postzygotic isolation in haplodiploid male hybrids despite high genetic distance
1.5.4 Chapter 5. Haplodiploidy leads to heterogeneous genomic divergence during speciation with gene flow
1.5.5 Chapter 6. Conclusion

CHAPTER 2. OVIPOSITION TRAITS GENERATE EXTRINSIC POSTZYGOTIC ISOLATION BETWEEN TWO PINE SAWFLY SPECIES

2.1 ABSTRACT

2.2 INTRODUCTION

2.3 METHODS
2.3.1 collection and rearing
2.3.2 Host needle width
2.3.3 Oviposition behavior
2.3.4 Ovipositor morphology
2.3.5 Cross oviposition behavior and success
2.3.6 Impact of oviposition traits on BCL success

2.4 RESULTS
2.4.1 Host needle width
2.4.2 Oviposition behavior
2.4.3 Ovipositor morphology
2.4.4 Cross oviposition behavior and success
2.4.5 Impact of oviposition traits on BCL oviposition success

2.5 DISCUSSION

2.6 CONCLUSIONS

CHAPTER 3. THE GENETIC ARCHITECTURE OF OVIPOSITION TRAITS AND HATCHING SUCCESS IN NEODIPRION PINETUM AND N. LECONTEI

3.1 ABSTRACT

3.2 INTRODUCTION

3.3 METHODS
3.3.1 Crosses
3.3.2 Phenotyping
3.3.3 Data analysis of phenotypes and hatching rate
3.3.4 DNA extraction and sequencing
3.3.5 QTL mapping

3.4 RESULTS
3.4.1 Crosses and sequencing
3.4.2 Phenotypes and hatch rates
3.4.3 QTL mapping

3.5 DISCUSSION

CHAPTER 4. LACK OF INTRINSIC POSTZYGOTIC ISOLATION IN HAPLODIPLOID MALE HYBRIDS DESPITE HIGH GENETIC DISTANCE

4.1 ABSTRACT

4.2 INTRODUCTION

4.3 METHODS
4.3.1 Study System Details and Overall Approach
4.3.2 IPI in females
4.3.3 IPI in males
4.3.4 Comparing observed IPI in haplodiploids to predicted IPI in heteromorphic diploid taxa

4.4 RESULTS
4.4.1 IPI in females
4.4.2 IPI in males
4.4.3 Genetic distance

4.5 DISCUSSION

CHAPTER 5. HAPLODIPLOIDY LEADS TO HETEROGENEOUS GENOMIC DIVERGENCE DURING SPECIATION WITH GENE FLOW

5.1 ABSTRACT

5.2 INTRODUCTION

5.3 GENOMIC DIVERGENCE IN PINE SAWFLIES

5.4 META-ANALYSIS OF SEX CHROMOSOMES

5.5 SIMULATIONS OF HAPLODIPLOIDS

5.6 SIMULATIONS OF AUTOSOMES AND X/Z CHROMOSOMES

5.7 MECHANISMS

5.8 CONCLUSIONS

5.9 METHODS
5.9.1 Population sampling
5.9.2 DNA sequencing
5.9.3 DNA processing and variant calling
5.9.4 Population divergence statistics
5.9.5 Meta-analysis
5.9.6 Simulations of divergent selection

CHAPTER 6. CONCLUSION

APPENDICES

APPENDIX A. OVIPOSITION TRAITS GENERATE EXTRINSIC POSTZYGOTIC ISOLATION BETWEEN TWO PINE SAWFLY SPECIES

APPENDIX B. GENETIC ARCHITECTURE OF OVIPOSITION TRAITS AND HATCHING SUCCESS IN NEODIPRION PINETUM AND N. LECONTEI

APPENDIX C. LACK OF INTRINSIC POSTZYGOTIC ISOLATION IN HAPLODIPLOID HYBRID MALES DESPITE HIGH GENETIC DISTANCE

APPENDIX D. HAPLODIPLOIDY LEADS TO HETEROGENEOUS GENOMIC DIVERGENCE DURING SPECIATION WITH GENE FLOW

REFERENCES

VITA

LIST OF TABLES

Table 2.1 Post hoc tests (Z-tests) for interspecific crosses
Table 4.1 Proposed mechanisms for Haldane’s rule and predictions regarding how haplodiploids should differ from diploids in rate of of IPI

LIST OF FIGURES

Figure 2.1. Needle width is a potential source of selection on oviposition traits.
Figure 2.2. Cross design for assessing postzygotic isolation.
Figure 2.3. N. pinetum females preferred P. strobus more strongly than N. lecontei females.
Figure 2.4. N. pinetum and N. lecontei females differed in their egg-laying pattern.
Figure 2.5. N. pinetum females had smaller, straighter ovipositors than N. lecontei females.
Figure 2.6. Oviposition preference and success depends on cross-type and host.
Figure 2.7. BCL females with pinetum-like oviposition traits have higher oviposition success on P. strobus.
Figure 3.1. Oviposition phenotypes across the different cross types.
Figure 3.2. Oviposition traits that affect hatching.
Figure 3.3. QTL mapping of oviposition behavior.
Figure 3.4. QTL mapping of ovipositor morphology.
Figure 3.5. QTL mapping of ovipositor shape.
Figure 3.6. QTL mapping of hatching success.
Figure 4.1. Viability and fertility for hybrid females and hybrid males.
Figure 4.2. N. pinetum and N. lecontei have lower isolation than taxa with heteromorphic sex chromosomes at the same genetic distance.
Figure 4.3. A two-locus model of IPI in haplodiploids and diploids consistent with the dominance model.
Figure 5.1. Population genetic of N. pinetum and N. lecontei.
Figure 5.2. Meta-analysis of sex chromosomes and autosomes across 28 pairs.
Figure 5.3. Simulations of FST between haplodiploids and diploids.
Figure 5.4. Mechanisms causing differences in levels of divergence between haplodiploids and diploids.

CHAPTER 1. INTRODUCTION

Currently there are 1.8 million described species (Roskov et al. 2019), and an estimated 8.7 million to 1 trillion species on Earth in total (Mora et al. 2011; Locey and Lennon 2016). Understanding the origins of this amazing amount of biodiversity has been a major goal of evolutionary biology since the beginning of the field (Darwin 1859). When Darwin wrote On the Origin of Species by Means of Natural Selection, he was trying to understand the role of natural selection in generating biodiversity.

“On the view that each species has been independently created, I see no explanation of this great fact in the classification of all organic beings; but, to the best of my judgement, it is explained through inheritance and the complex action of natural selection, entailing extinction and divergence of character.” – Darwin 1859, pg 118

Starting with the modern synthesis, the focus of speciation research shifted to the geographic mode of speciation (Mayr 1942). However, it is difficult to classify speciation by geography, because multiple geographic modes can be involved in a single speciation event and modern geographic context does not always reflect historical geography (Butlin et al. 2008). Despite geography not being a good way to classify speciation modes, it still has a large impact on the process (Nosil 2012). It has become clear that speciation with gene flow, either primary or secondary, is common and gene flow is a powerful evolutionary force that strongly impacts the speciation process (Taylor and Larson 2019).

In more modern times, there has been a resurgence in trying to understand the mechanisms underlying speciation (Schluter 2000, 2001; Via 2001). Is speciation driven by drift or selection, and if selection, what type? Additionally, as sequencing has gotten more powerful and cheaper, there has been an increasing interest in understanding the genetic mechanisms of speciation (Seehausen et al. 2014; Wolf and Ellegren 2016). In order to fully understand the mechanisms and genetics of speciation we have to 1) identify the traits under selection, 2) find the genes underlying adaptations, 3) determine how adaptations affect fitness, and 4) understand how changes in fitness create reproductive isolation. This approach connects genotype to phenotype to fitness to species. Additionally, reproductive isolation can affect the genome beyond the selected loci. Identifying the genomic consequences of reproductive isolation is critical to understanding the speciation process.

Researchers have generally taken one of two approaches to investigating speciation (Byers et al. 2017). One is a more mechanistic method that aims to connect genotype to phenotype to fitness (Barrett and Hoekstra 2011), and the other is a genomic approach that searches for signatures of selection (Beaumont and Balding 2004; Storz 2005; Nosil et al. 2009a). Each approach has its advantages and pitfalls, making it valuable to combine both approaches (Stinchcombe and Hoekstra 2008). However, these approaches have merged in few taxa (Byers et al. 2017). There are several criteria that make a system a promising candidate. Genomic resources, such as an annotated genome, facilitate finding loci underlying adaptations and testing the loci for fitness effects. It can be difficult to infer historical selection from current environmental conditions, making it beneficial to use taxa that have not completed the speciation process, especially taxa that can be crossed in the lab (Siepielski et al. 2009). Finally, it is helpful to have good natural history records to identify potential traits under selection and reproductive barriers. In my dissertation I aim to combine approaches to understand speciation, especially speciation with gene flow, from genes to species and from species to genomic patterns.

1.1 Ecological speciation

One of the mechanisms driving speciation that has received a lot of attention recently is divergent natural selection and its role in ecological speciation (Nosil 2012). During ecological speciation, environmentally based divergent selection causes reproductive isolation (Rundle and Nosil 2005; Schluter 2009). For ecological speciation to proceed, three things are needed: a source of divergent selection, a form of reproductive isolation, and a genetic mechanism linking divergent selection to reproductive isolation (Nosil 2012).

There are multiple lines of evidence for divergent selection being involved in speciation. First, divergent selection has been measured directly through reciprocal transplants, with each species showing reduced fitness in the other's environment (Nosil et al. 2005; Lowry et al. 2009). Additional evidence comes from parallel speciation (Rice and Hostert 1993; Schluter and Nagel 1995). During parallel speciation, independent populations evolve the same phenotype in similar environments. The repeated evolution of traits is unlikely to occur through drift alone. The one caveat to demonstrating parallel speciation is confirming that the phenotype has independent origins instead of a single origin followed by gene flow to the other populations (Coyne and Orr 2004; Butlin et al. 2008).

Any form of reproductive isolation can occur during ecological speciation as long as it arises as a consequence of ecologically based divergent natural selection. However, there are forms of reproductive isolation that are unique to ecological speciation, including extrinsic postzygotic isolation (Nosil 2012). Extrinsic postzygotic isolation occurs when hybrids have reduced fitness because they are not suited to the environment, typically because hybrids have intermediate traits that don’t match the parental environments (Rice and Hostert 1993; Rundle and Whitlock 2001; Rundle and Nosil 2005). As long as hybrids have intermediate phenotypes and there is no intermediate habitat, extrinsic postzygotic isolation is a direct consequence of divergent natural selection (Nosil 2012). Extrinsic postzygotic isolation has been commonly overlooked in favor of premating isolation or intrinsic postzygotic isolation.

Postzygotic isolation is more effective at stopping gene flow than prezygotic isolation (Irwin 2019). Under prezygotic isolation, assortative mating causes interspecific mating to be rare, but some hybrids are made. F1 hybrids tend to mate with each other, but also mate with the parental species at a higher rate than interspecific matings. The resulting backcrosses and hybrids link the two species, causing a broad hybrid zone. When there is postzygotic isolation, hybridization frequently occurs, but low F1 fitness causes intermediates to be rare, leading to a narrow hybrid zone. Prezygotic isolation has traditionally been considered more important during the early stages of speciation (Mayr 1963; West-Eberhard 1983; Coyne and Orr 1989; Grant and Grant 1997; Price and Bouvier 2002; Rieseberg and Willis 2007; Schumer et al. 2017) because prezygotic isolation seems to evolve more rapidly than postzygotic isolation. However, most studies comparing prezygotic and postzygotic isolation only look at intrinsic postzygotic isolation (Coyne and Orr 2004). Unlike intrinsic postzygotic isolation, extrinsic postzygotic isolation does not require the evolution of genetic incompatibilities and should evolve at a rate closer to prezygotic isolation (Cahenzli et al. 2018; Christie and Strauss 2018). Extrinsic postzygotic isolation is likely an important but currently understudied form of reproductive isolation.

Finally, there needs to be a genetic link between divergent selection and reproductive isolation. Tight physical linkage or pleiotropy prevents gene flow from breaking apart the loci for reproductive isolation and divergent selection (Rice and Hostert 1993). Pleiotropic loci can affect both reproductive isolation and divergently selected traits when reproductive isolation is a direct consequence of divergent selection (Schemske and Bradshaw 1999; Hawthorne and Via 2001; Boughman 2002; Via and Hawthorne 2002; Lowry et al. 2008). For example, habitat preference will automatically cause habitat isolation (Rice and Salt 1990). Pleiotropy should result in a stronger link than tight linkage because it is completely immune to the effects of gene flow and recombination (Rice and Hostert 1993; Kirkpatrick and Ravigné 2002; Gavrilets 2004). However, most studies are not fine scale enough to distinguish between tight linkage and pleiotropy (Macnair and Christie 1983; Wright et al. 2013). Another genetic mechanism is one-allele assortative mating, where one allele fixes in both species, resulting in reduced interspecific mating (Felsenstein 1981; Servedio and Noor 2003). One example is an allele that causes a female to mate with males of her size. If the species differ in body size, then the females will preferentially mate with conspecifics. One-allele assortative mating is an effective genetic mechanism since gene flow can actually promote the formation of reproductive isolation by spreading the allele between populations (Ortíz-Barrientos and Noor 2005). Divergent selection can also be linked to reproductive isolation through strong selection (Feder et al. 2012). As the strength of divergent selection increases, the reduction in effective migration increases, causing larger genomic regions to diverge. Because speciation involves reducing gene flow between species, any mechanism that reduces effective migration will cause species to proceed along the speciation continuum.


An additional mechanism is epistatic interactions through Bateson-Dobzhansky-Muller incompatibilities (BDMIs) (Bateson 1909; Dobzhansky 1937; Muller 1942). In a simple BDMI there are two different loci in the ancestral population, where the alleles at each locus are compatible. After divergence, an alternative allele for one of the loci fixes in one population and an alternative allele for the other locus fixes in the second population. Because the alternative alleles have never been tested together, they interact negatively, forming an incompatibility in the hybrids. BDMIs allow for the evolution of incompatibilities without any population crossing a fitness valley. Although BDMIs are not necessarily formed through divergent selection, they can be (Turelli et al. 2001). One way in which ecologically based BDMIs may occur is through opposing dominance of preference and performance traits (Ohshima 2008; Matsubayashi et al. 2010; Ohshima and Yoshizawa 2010). Hybrids will have a preference for one habitat but will have traits ill-suited for that habitat, resulting in low hybrid fitness and reproductive isolation. Although these genetic mechanisms differ in efficacy, they probably all play a role in ecological speciation.
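As an illustration, the simple two-locus BDMI described above can be sketched in a few lines of Python. The allele labels, the fitness cost of 0.5, and the assumption that the incompatibility is expressed in heterozygous F1 hybrids are all hypothetical choices made for this sketch; they are not values taken from any study cited here.

```python
# Hypothetical two-locus BDMI sketch; allele names and fitness values are illustrative.

def is_incompatible(genotype):
    """A genotype is incompatible if it carries derived 'A' and derived 'B' together."""
    locus1, locus2 = genotype
    return "A" in locus1 and "B" in locus2


def fitness(genotype, cost=0.5):
    """Return 1.0 for compatible genotypes and 1 - cost for incompatible ones."""
    return 1.0 - cost if is_incompatible(genotype) else 1.0


# The ancestor is aa/bb. Population 1 fixes A at locus 1; population 2 fixes B at locus 2.
path_pop1 = [("aa", "bb"), ("Aa", "bb"), ("AA", "bb")]
path_pop2 = [("aa", "bb"), ("aa", "Bb"), ("aa", "BB")]

# The F1 hybrid inherits one allele per locus from each parental population.
f1_hybrid = ("Aa", "Bb")

path_fitnesses = [fitness(g) for g in path_pop1 + path_pop2]
hybrid_fitness = fitness(f1_hybrid)

print("lowest fitness along either divergence path:", min(path_fitnesses))
print("F1 hybrid fitness:", hybrid_fitness)
```

Every genotype along either divergence path keeps full fitness, so neither population crosses a fitness valley, while the hybrid is the first genotype in which the two derived alleles meet.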

Ecological speciation is not the only possible speciation mechanism. Speciation through drift and mutation order speciation, where different alleles fix in different populations in response to the same selection pressure, are additional possible mechanisms (Mani and Clarke 1990). However, these mechanisms are difficult when there is gene flow. In mutation order speciation, gene flow will spread universally favored alleles between populations, eroding differences between these populations (Barton and Bengtsson 1986; Kondrashov 2003). Without any selection, gene flow will easily erode differences between populations that evolved through drift (Turelli et al. 2001). Ecological speciation also proceeds more easily in the absence of gene flow, but it is still effective at maintaining species divergence when there is gene flow (Schluter 2009; Nosil 2012). With speciation with gene flow being common, ecological speciation is an important mechanism for generating biodiversity. Evidence from field and laboratory studies clearly shows that ecological speciation occurs and is a taxonomically general process (Rundle and Nosil 2005; Funk et al. 2006; Schluter 2009; Nosil 2012; Van der Niet et al. 2014). Through rigorous testing and mounting studies, many details of ecological speciation are well understood. However, major questions still remain unanswered, such as what traits are typically under divergent selection and what is the relative importance of different forms of reproductive isolation.

1.2 Genetics of adaptation

One factor that can promote or hinder ecological speciation is the genetic architecture of adaptive traits. Genetic architecture describes the genetic basis of a trait, and includes but is not limited to the number and distribution of mutational effects, dominance, pleiotropy, and epistasis (Mackay 2001; Hansen 2006). The genetic architecture of traits can have very important evolutionary consequences, and it is therefore critical to understand both the causes and consequences of genetic architecture.

The genetic architecture of adaptation has been debated since the time of Darwin. Initially there were two schools of thought that fell on opposite extremes of the spectrum. Darwin thought that natural selection acted on infinitesimally small variations (micro-mutationalism) (Mayr 1982). Adaptation due to small variation would allow for the precise matching of phenotype to environment that Darwin observed (Darwin 1859). This idea was adopted and supported by the biometricians (e.g. Karl Pearson and Walter Weldon) (Provine 2001). At the other end of the spectrum, the Mendelians (e.g. William Bateson) thought that adaptation was driven by large effect mutations (Provine 2001). These large effect mutations would allow for evolutionary leaps.

One of the earliest mathematical models of the genetic architecture of adaptation was Fisher’s geometric model (Fisher 1930). In this model a population has been knocked off of its phenotypic optimum by an environmental change. Through adaptation the population moves toward the optimum. This adaptation occurs through new mutations that are subject to natural selection alone. To understand the typical effect size of adaptation, Fisher calculated the probability that a random mutation of a given phenotypic effect size would be beneficial. Infinitesimally small effect size mutations have a 50% chance of being beneficial. As the phenotypic effect size increases, the probability of the mutation being beneficial sharply declines, because it is more likely to have deleterious, pleiotropic effects. Given that small mutations were most likely to be beneficial, Fisher concluded that mutations of small effect size are the genetic basis of adaptation. Kimura then built upon Fisher’s work by including the effect of drift (Kimura 1983), and Orr (1998) added in a temporal component leading to new expectations for genetic architecture of adaptive traits. As theory has progressed it has become more complex with additional factors being taken into account.
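The relationship between mutation size and the chance of being beneficial can be explored with a small Monte Carlo sketch of the geometric model. This is an illustrative simulation under stated assumptions, not Fisher's analytical derivation: the phenotype is treated as a point in a space of `n` traits, the optimum sits at the origin, the population starts at distance `z` from it, and a mutation is a step of length `r` in a uniformly random direction. The function name, parameter names, and default values are all hypothetical.

```python
import math
import random


def prob_beneficial(r, n=10, z=1.0, trials=20000, seed=1):
    """Estimate the chance that a random mutation of size r moves the phenotype
    closer to the optimum, for n traits and a starting distance z."""
    rng = random.Random(seed)
    start = [z] + [0.0] * (n - 1)   # the optimum sits at the origin
    hits = 0
    for _ in range(trials):
        # A vector of independent normal draws points in a uniformly random direction.
        step = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in step))
        mutant = [p + r * x / norm for p, x in zip(start, step)]
        if math.sqrt(sum(x * x for x in mutant)) < z:
            hits += 1
    return hits / trials


# Mutation sizes are expressed relative to the starting distance z = 1.
for size in (0.001, 0.5, 1.0, 2.5):
    print(size, prob_beneficial(size))
```

Under these assumptions, very small steps are beneficial about half the time, the proportion falls as the step grows, and any step longer than twice the distance to the optimum overshoots it in every direction.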

Based on current empirical work and theory to date, it seems that the genetic architecture of adaptation is variable (Dittmar et al. 2016). Although Kimura and Orr added complexity to Fisher’s initial model, the models are overly simple and do not capture the complexity of adaptation and individual populations. Evolutionary processes and the developmental basis of traits are likely to affect the genomics of adaptation. However, each of these factors will have different contributions in different populations. This may cause there to be more than one genetic architecture of adaptation (Dittmar et al. 2016). Although there may not be a single genetic architecture, we can understand the effect of various factors to create a more comprehensive framework for testing hypotheses about the genetic architecture of adaptation.

Characteristics of the fitness surface, origin of adaptive alleles, and drift all impact genetic architecture, but the impact of gene flow has received a lot of attention (Haldane 1927; Coyne and Orr 1998; Orr 1998; Barrett and Schluter 2008; Hedrick 2013; Matuszewski et al. 2014, 2015; Dittmar et al. 2016). During speciation or even local adaptation that occurs in the face of gene flow, adaptive alleles must remain in the right genetic background, causing gene flow to potentially have large effects on genetic architecture. Gene flow will cause divergently adaptive alleles to be homogenized across the genome (Griswold 2006). Small effect alleles are more susceptible to the homogenizing effect of gene flow than large effect mutations under the assumption that phenotypic effect size is reflective of selection coefficient (Yeaman and Otto 2011; Yeaman and Whitlock 2011). Therefore, it is expected that large effect mutations will persist and be the basis for adaptation. However, simulations show that closely linked small effect loci can act in the same manner as a large effect locus because it will be difficult for recombination to act on very tightly linked loci (Yeaman and Whitlock 2011). Those linked loci will effectively be one large effect locus. Even if adaptation initially happens in allopatry, large or clustered loci are expected to be present after a period of gene flow. Large effect mutations should replace the initial small effect adaptive mutations as the diverging populations experience gene flow (Yeaman and Whitlock 2011).
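The susceptibility of small effect alleles to gene flow can be illustrated with a minimal deterministic sketch. It assumes a haploid continent-island model, which is a deliberate simplification and not the model used in the studies cited above: each generation the locally adapted allele gains a fitness advantage `s` on the island, and then a fraction `m` of the island is replaced by immigrants that all carry the alternative allele. The function name and parameter values are hypothetical.

```python
def island_equilibrium(s, m, generations=5000, p0=1.0):
    """Iterate selection then migration on an island and return the final
    frequency of the locally adapted allele."""
    p = p0
    for _ in range(generations):
        p = p * (1.0 + s) / (1.0 + s * p)   # haploid selection favoring the local allele
        p = (1.0 - m) * p                   # immigrants all carry the other allele
    return p


m = 0.01  # illustrative migration rate
for s in (0.1, 0.02, 0.005):
    print(f"s={s}: local allele frequency ~ {island_equilibrium(s, m):.3f}")
```

In this simplified model the locally adapted allele persists only when its selective advantage outweighs migration (here, when s exceeds m / (1 - m)); alleles with a smaller advantage are swamped even though they are locally beneficial.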

Another factor that may have a large impact on effect sizes of adaptive alleles is pleiotropy. A large effect mutation in a pleiotropic gene is likely to be deleterious in at least one of the traits it controls, leading to a reduced net fitness (Fisher 1930). Additionally, large effect mutations have been shown to be more pleiotropic (Wagner et al. 2008; Wang et al. 2010). If a highly pleiotropic gene is implicated in adaptive evolution, then it should occur through many small effect mutations in the pleiotropic genes (Fisher 1930; Tenaillon 2014). Alternatively, if there are large-effect mutations, they should be biased towards the cis-regulatory region of the pleiotropic gene (Sucena and Stern 2000; Shapiro et al. 2004, 2006; Gompel et al. 2005; Prud’homme et al. 2006), since regulatory elements are able to change the expression of the gene in a specific stage or tissue without affecting the function in other areas (Carroll 2008). The specificity of regulatory elements is therefore able to escape the effects of deleterious pleiotropy. In empirical systems, multiple factors will be occurring simultaneously. To make additional progress in the field of adaptation genetics, theory that incorporates multiple factors and empirical systems where the whole evolutionary context is known are needed.

1.3 Genome wide patterns of divergence

Divergent adaptation is not equivalent to speciation. After adaptation occurs, these loci must act to reduce effective migration. Initially the effect is limited to closely linked loci, but as the number of adaptive loci increases the effective migration rate is reduced along the whole genome (Seehausen et al. 2014). The loci that contribute to reproductive isolation are resistant to introgression, causing peaks of differentiation at these loci and surrounding linked loci, but other loci are able to introgress freely and have low levels of divergence. This process creates heterogeneity in divergence across the genome, which is frequently used to detect loci that are contributing to reproductive isolation, with barrier loci being at or near peaks of differentiation (Wu 2001; Turner et al. 2005; Nosil et al. 2009a). However, this is an oversimplified model and many factors can influence patterns of heterogeneity, and it is critical that we understand how different patterns of heterogeneity can arise in order to properly interpret genome scans (Ravinet et al. 2017). Beyond interpreting genome scans, these factors can also impact the speciation process itself and may either promote or constrain speciation.
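The logic of a simple genome scan can be sketched as follows. This is only an illustration under stated assumptions: it uses Wright's FST computed from two allele frequencies at a biallelic locus with equal population weights and no sample-size correction, and it flags loci with a fixed threshold. The function names, frequencies, and threshold are hypothetical, and this is not the procedure used in any of the studies cited above.

```python
def fst_two_populations(p1, p2):
    """Wright's FST at one biallelic locus from two allele frequencies.

    Assumes equal weighting of the two populations and ignores
    sample-size corrections (an illustrative simplification).
    """
    # Mean expected heterozygosity within populations.
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic across both populations
    return (h_t - h_s) / h_t


def candidate_barrier_loci(freqs_pop1, freqs_pop2, threshold):
    """Return indices of loci whose FST meets or exceeds the threshold."""
    return [
        i
        for i, (p1, p2) in enumerate(zip(freqs_pop1, freqs_pop2))
        if fst_two_populations(p1, p2) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical allele frequencies at five loci in two populations.
    pop1 = [0.50, 0.55, 0.90, 0.45, 0.60]
    pop2 = [0.50, 0.50, 0.10, 0.50, 0.55]
    print(candidate_barrier_loci(pop1, pop2, threshold=0.25))
```

A fixed threshold of this kind is exactly the oversimplification described above: it treats every peak as a barrier locus and ignores other sources of heterogeneity in divergence.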

Various evolutionary forces will affect patterns of heterogeneity in divergence, including the effect size of the divergent loci. When a few large effect mutations are the first to diverge, divergence gradually increases across the genome in an additive manner (Flaxman et al. 2014). This is because these initial mutations are strong enough to constrain gene flow. Each additional mutation works to reduce levels of gene flow until the genome is highly differentiated, giving rise to the canonical pattern of peaks at divergent loci and valleys at other loci. On the other hand, if many small effect mutations are the initial starting point for divergence, then the process is very different (Flaxman et al. 2014; Nosil et al. 2017). These small mutations do not contribute to population divergence, because the strength of selection is too weak to overcome migration. They are not strong enough to individually reduce the rate of gene flow. Instead, these mutations continually build until there is a critical mass (tipping point), which results in the genome quickly diverging, raising the baseline of divergence and making it difficult to detect outliers.

Most mutations will be deleterious instead of advantageous (Eyre-Walker and Keightley 2007). Background selection will purge these deleterious alleles reducing variation at linked loci leading to reduced divergence (Charlesworth et al. 1993; Charlesworth 2012; Cutter and Payseur 2013). The effects of background selection will be stronger in regions that have many potential deleterious alleles, such as in coding regions (Lohmueller et al. 2011; Cutter and Payseur 2013; Enard et al. 2014). Additionally, the strength of background selection will change along the speciation continuum (Cutter and Payseur 2013). When a population first enters a new environment, the population is typically far from the fitness optimum and more mutations are likely to be beneficial (Orr 1998). As populations adapt and diverge, they move closer to the optimum and more mutations are likely to move them away from the optimum.

These evolutionary forces will be modulated by demographic history. When the effective population size is small, drift will be stronger and new beneficial mutations will be more likely to be lost. In order for new mutations to be maintained in the population stronger selection is necessary (Yeaman and Otto 2011). If the population size changes it will affect the efficacy of selection and drift causing the baseline level of divergence to shift (Ferchaud and Hansen 2016). Without taking demographic history into account it is difficult to detect outliers. Additionally, there will be different patterns if there was primary divergence with gene flow or secondary contact. Upon secondary contact, gene flow does not immediately homogenize the genome so the effects of diverging in allopatry are still visible (Bierne et al. 2013; Feder et al. 2013). In allopatry universally beneficial alleles can arise in only one population. If there has not been sufficient gene flow for the allele to spread to the other population it will appear as if there is divergent selection at that locus. Furthermore, when there is divergent selection, local sweeps will extend farther away from the selected loci in allopatry because recombination and gene flow cannot break up linked loci (Ravinet et al. 2017). Barrier loci that diverged in allopatry will be easier to detect.

Genome features, such as mutation rate, recombination rate, and gene density, vary across the genome and affect patterns of heterogeneity. Areas with low mutation rate will have low levels of divergence since fewer potentially adaptive alleles enter the population (Francioli et al. 2015). In areas with high mutation rates, many new mutations are occurring at loci linked to the barrier loci, causing some measures of differentiation to be downward biased (Foll and Gaggiotti 2008). This is especially problematic when using reduced representation sequencing because the barrier locus may not be sequenced, and we use linked loci to determine where barrier loci are. Recombination rate variation also has multiple effects. When the recombination rate is low, signatures of selection extend farther away from the barrier locus (Stephan 2010; Nachman and Payseur 2012; Cutter and Payseur 2013). This leads to more efficient reduction of the effective migration rate and causes barrier loci to be detected more easily (Ravinet et al. 2017). Additionally, clusters of barrier loci can evolve when there is low recombination, allowing small effect loci to escape the homogenizing force of gene flow (Yeaman 2013). Finally, gene dense regions are more likely to experience both positive and background selection because mutations in these regions have more functional effects (Stephan 2010; Lohmueller et al. 2011; Cutter and Payseur 2013; Enard et al. 2014). Although positive selection is more likely to occur, it may need to be stronger to overcome the reduced effective population size caused by background selection (Ravinet et al. 2017). Additionally, gene dense regions can facilitate clustering of coadapted alleles because genes tend to cluster by function (Hurst et al. 2002, 2004; Al-Shahrour et al. 2010).

Evolutionary forces, demographic history, and genome features are not independent, and interact with each other to shape patterns of heterogeneity. To understand the process of speciation and to accurately detect loci contributing to reproductive isolation it is critical to understand how these forces interact with each other. The speciation process is not identical across taxa. If any of these factors systematically vary between taxa, they may explain why the speciation process differs between taxa.

1.4 Sawflies as a study system

Neodiprion pine sawflies are a Holarctic genus (Coppel and Benjamin 1965). They are host specialists that have an intimate association with their hosts throughout their life cycle: adults mate on the host, females use their saw-like ovipositors to cut slits for their eggs in the needles, and larvae eat the pine needles and then spin cocoons on or below the host (Coppel and Benjamin 1965). Pine sawflies are a good system for studying these questions for several reasons. First, their natural history and life history are well studied (Coppel and Benjamin 1965). Second, they have a number of ecologically variable traits (number of hosts used, specific hosts used, color, larval behavior) that are good for studying the effects of the environment on the speciation process. Third, they can be easily collected in the field and reared in the lab. Finally, genetic resources are available to facilitate connecting genotype and genetic patterns to other levels of the speciation process (Vertacnik and Linnen 2015; Linnen et al. 2018).

Specifically, I am studying a pair of sister species, Neodiprion pinetum and N. lecontei, that differ in host use (Linnen and Farrell 2007, 2008a). N. pinetum is a specialist on white pine (Pinus strobus), while N. lecontei uses a wide range of pines but avoids white pine. White pine has thin, non-resinous needles, whereas all of the hosts N. lecontei uses have thicker, more resinous needles. Adapting to new hosts has been shown to drive speciation repeatedly in phytophagous insects (Matsubayashi et al. 2010). There are a couple of lines of evidence that suggest that adapting to a host might drive speciation between N. pinetum and N. lecontei (Linnen and Farrell 2010; Bagley et al. 2017).

There is also gene flow still occurring between N. pinetum and N. lecontei. We have found morphological hybrids in the wild, and there is mitochondrial introgression. Despite this gene flow, N. pinetum and N. lecontei are able to remain morphologically, behaviorally, and genetically distinct. This makes them an optimal species pair to study the speciation process in. Once speciation has been completed, additional barriers can arise making it difficult to understand what barriers and factors drove speciation. On the other hand, it is important that significant reproductive isolation has evolved to determine if taxa are undergoing speciation or just local adaptation (that may or may not lead to speciation).

1.5 Dissertation overview

The goal of my dissertation is to connect genotype, phenotype, fitness, and species together to fully understand the speciation process between N. pinetum and N. lecontei. Additionally, I aim to test major outstanding evolutionary hypotheses within this framework. The first half of my dissertation focuses on the role of host use in speciation, and the second half focuses on the impact of haplodiploidy.

1.5.1 Chapter 2. Oviposition traits generate extrinsic postzygotic isolation between two pine sawfly species

In my second chapter I aim to connect phenotype to fitness to species and understand the role of host use in this process. To do this I test the hypothesis that divergent selection on oviposition traits leads to extrinsic postzygotic isolation.

Ovipositing correctly for a specific host is critical for eggs to survive until hatching. N. lecontei and N. pinetum have hosts that dramatically differ in needle characteristics. The specificity needed for proper oviposition combined with striking differences in needles likely results in divergent selection on oviposition traits. If hybrids have intermediate or mismatched oviposition traits this can lead to reduced hatching of their eggs and extrinsic postzygotic isolation.

To test this hypothesis, I first characterized multiple behavioral and morphological oviposition traits for both N. pinetum and N. lecontei. If differences in host are leading to extrinsic postzygotic isolation, then hybrids should have reduced fitness and this reduced fitness should be host dependent. To test this, I generated hybrid and backcross females and measured their hatching success on white pine and jack pine (one of N. lecontei’s hosts). Finally, if there is selection on oviposition traits, backcrosses on white pine will have higher fitness if they have N. pinetum like traits. I looked at the effect of ovipositor morphology and egg pattern on hatching for N. lecontei backcrosses on white pine.

1.5.2 Chapter 3. Genetic architecture of oviposition traits

In my third chapter I connect genotype to phenotype and genotype to fitness by examining the genetic architecture of oviposition traits. Many factors can influence the genetic architecture of adaptive traits, but there are two that are particularly relevant here. The first is the impact of gene flow. When there is gene flow recombination can break apart coadapted alleles, so loci should be either clustered together or there should be fewer large effect loci. This way the adaptive alleles are more likely to be inherited together. The second factor I consider is the difference in genetic architecture for behavioral and morphological traits. Many small effect loci are thought to be most likely for morphological traits. The genes involved in these traits are typically in highly conserved developmental pathways, where large changes are likely to cause negative effects in other traits. Conversely, behavioral traits are less likely to be controlled by these conserved pathways and can tolerate large effect alleles.

To test the effect of gene flow and the type of trait on the genetic architecture of oviposition traits, I conducted a QTL mapping study. I made backcrosses in both directions and measured multiple oviposition traits. I also measured hatching success, so I could directly map fitness to see if the loci affecting oviposition traits are also affecting fitness. I completed whole genome resequencing on the mapping population and completed the analysis in R/qtl2.

1.5.3 Chapter 4. Lack of intrinsic postzygotic isolation in haplodiploid male hybrids despite high genetic distance

During speciation it is common for multiple reproductive barriers to arise, and one barrier that is of particular interest is intrinsic postzygotic isolation, because it is thought to be the most permanent and impenetrable barrier. One of the most widely observed patterns in biology, Haldane’s rule, concerns intrinsic postzygotic isolation: if one sex has greater inviability or sterility, it is the heterogametic sex. Evolutionary biologists have long been interested in understanding the mechanisms underlying Haldane’s rule. Dominance theory and faster-X, which are based on recessive mutations being visible in the heterogametic sex, have been proposed as common mechanisms. These mechanisms predict that greater hemizygosity leads to faster evolution of intrinsic postzygotic isolation. Under these mechanisms haplodiploids should evolve intrinsic postzygotic isolation faster than diploids because the entire genome is analogous to a sex chromosome.

In this chapter I test the hypothesis that haplodiploids should have greater levels of intrinsic postzygotic isolation for a given genetic distance compared to diploids. I measured sterility and inviability in hybrid females and males and calculated the genetic distance between N. pinetum and N. lecontei. I then compared the amount of isolation to previously published expectations of isolation in diploids for the same level of divergence.

1.5.4 Chapter 5. Haplodiploidy leads to heterogeneous genomic divergence during speciation with gene flow

For my fifth chapter, in an effort to connect speciation to genetic patterns, I studied how genetic changes accumulate across the genome and the influence of haplodiploidy on divergence. During speciation with gene flow, loci that are causing reproductive isolation are resistant to introgression, causing peaks of differentiation at these loci. The exact pattern of divergence is influenced by many features including the strength of divergent selection. In haplodiploids, the presence of haploid males may have a significant impact on patterns of divergence, because unlike diploids any recessive mutation is immediately visible to selection. I hypothesize that haplodiploids will have higher and more variable divergence.

To test this hypothesis, I examined patterns of divergence between N. pinetum and N. lecontei. I then completed a meta-analysis between autosomes and Z/W chromosomes which should mirror haplodiploids because they are haploid in one sex and diploid in the other. Finally, I used simulations to test the generality and potential mechanisms underlying our hypothesis.

1.5.5 Chapter 6. Conclusion

In my conclusion I discuss the progress and the areas that need additional work to connect all the levels of speciation together.

CHAPTER 2. OVIPOSITION TRAITS GENERATE EXTRINSIC POSTZYGOTIC ISOLATION BETWEEN TWO PINE SAWFLY SPECIES

This chapter has been previously published as:

Bendall, E. E., K. L. Vertacnik, and C. R. Linnen. 2017. Oviposition traits generate extrinsic postzygotic isolation between two pine sawfly species. BMC Evol. Biol. 17

2.1 Abstract

Background: Although empirical data indicate that ecological speciation is prevalent in nature, the relative importance of different forms of reproductive isolation and the traits generating reproductive isolation remain unclear. To address these questions, we examined a pair of ecologically divergent pine-sawfly species: while Neodiprion pinetum specializes on a thin-needled pine (Pinus strobus), N. lecontei utilizes thicker-needled pines. We hypothesized that extrinsic postzygotic isolation is generated by oviposition traits. To test this hypothesis, we assayed ovipositor morphology, oviposition behavior, and host-dependent oviposition success in both species and in F1 and backcross females.

Results: Compared to N. lecontei, N. pinetum females preferred P. strobus more strongly, had smaller ovipositors, and laid fewer eggs per needle. Additionally, we observed host- and trait-dependent reductions in oviposition success in F1 and backcross females. Hybrid females that had pinetum-like host preference (P. strobus) and lecontei-like oviposition traits (morphology and egg pattern) fared especially poorly.

Conclusions: Together, these data indicate that maladaptive combinations of oviposition traits contribute to extrinsic postzygotic isolation between N. lecontei and N. pinetum, suggesting that oviposition traits may be an important driver of divergence in phytophagous insects.

2.2 Introduction

Evolutionary biologists have long recognized that natural selection plays an important role in the formation of new species (Darwin 1859; Dobzhansky 1937; Mayr 1942, 1947). However, it is only within the last two decades that ecological speciation— the process by which environmentally based divergent selection gives rise to reproductive isolation (Schluter 2000, 2001)—has become the focus of sustained research effort. During this time, laboratory and field studies in a wide range of organisms have demonstrated unequivocally that ecological speciation occurs in nature (Rundle and Nosil 2005; Schluter 2009; Nosil 2012; Van der Niet et al. 2014). Moreover, comparative data suggest that ecological divergence plays a fundamental and taxonomically general role in driving speciation (Funk et al. 2006). Nevertheless, while some aspects of ecological speciation are now fairly well understood, many major questions—including the relative importance of different forms of reproductive isolation (RI), and the types of traits that generate RI—remain unresolved (Nosil 2012).

Any form of RI can, in theory, contribute to ecological speciation so long as it arises as a consequence of divergent natural selection. However, one form of RI that may be especially important is extrinsic postzygotic isolation (hereafter, EPI), in which intermediacy or maladaptive combinations of traits in hybrids cause low fitness in both parental environments (Rice and Hostert 1993; Rundle and Whitlock 2001; Coyne and Orr 2004; Rundle and Nosil 2005). EPI is thought to be a particularly common form of RI in ecological speciation because, so long as hybrids are intermediate and intermediate environments are lacking, it is a direct and automatic result of divergent selection (Nosil 2012). As such, EPI should be among the earliest barriers to arise during speciation. Additionally, when there is gene flow between diverging populations, EPI will lead to direct selection for assortative mating via reinforcement (Dobzhansky 1937, 1940). However, although EPI is one of only two forms of RI that are unique to ecological speciation (the other being immigrant inviability; Nosil et al. 2005), it is understudied relative to other forms of RI. One possible reason for the dearth of EPI studies is that it is challenging to distinguish between extrinsic (ecologically dependent) and intrinsic (due to genetic incompatibilities) sources of reduced hybrid fitness (Coyne and Orr 2004).

To date, three techniques have been proposed to distinguish between extrinsic and intrinsic sources of postzygotic isolation. The simplest of these is to compare the fitness of F1 hybrids in the wild to their fitness in a benign environment, in which the source of ecologically based selection has presumably been removed. If reduced hybrid fitness disappears in the “benign” habitat, this implies that the reduction was environmentally dependent (Hatfield and Schluter 1999). The main limitation of this approach is that it does not control for stress-related expression of intrinsic hybrid incompatibilities (Hatfield and Schluter 1999). A second, more rigorous approach is to rear backcrosses of F1s to both parental forms in both parental environments. EPI predicts that each backcross type will perform best in the parental habitat to which it is most genetically similar (Rundle and Whitlock 2001). A final technique is to examine how specific hybrid traits impact fitness in parental environments. This approach requires knowledge of the traits contributing to EPI and can be accomplished in one of two ways, the first of which is to experimentally manipulate parental individuals to resemble hybrids (Rundle and Nosil 2005; Nosil 2012). For many traits and organisms, however, these phenotypic modifications would be impractical, if not impossible. An alternative to direct modification is to take advantage of trait variation in F1 hybrids, F2, or backcross individuals and track how different trait values and combinations impact fitness in parental environments (e.g., Mcbride and Singer 2010; Martin and Wainwright 2013).

One group of organisms that has featured prominently in empirical and theoretical studies of ecological speciation and EPI is plant-feeding insects. Several lines of evidence support the hypothesis that changes in host use are an important driver of ecological speciation in insects, including: (1) phylogenetic studies that show elevated rates of diversification among lineages of phytophagous insects compared to non-phytophagous insects (Mitter et al. 1988; Farrell 1998; Wiens et al. 2015), (2) comparative studies that demonstrate an association between changes in host use and speciation (Funk et al. 2006; Winkler and Mitter 2008; Linnen and Farrell 2010; but see Nyman et al. 2010), and (3) a growing list of empirical case studies that have confirmed key predictions of ecological speciation (reviewed in Berlocher and Feder 2002; Funk et al. 2002; Matsubayashi et al. 2010; Nosil 2012). However, while evidence supporting ecological speciation in insects is strong, the contribution of EPI remains unknown. In particular, although indirect evidence for host-related EPI exists for many taxa (reviewed in Matsubayashi et al. 2010), few direct tests exist (but see Rundle 2002; Egan and Funk 2009; Kuwajima and Kobayashi 2010; Mcbride and Singer 2010; Soudi et al. 2016). Moreover, in most cases, the specific traits contributing to EPI have not been identified (but see Mcbride and Singer 2010). Understanding the mechanistic basis of EPI is critical if we are to understand whether biases exist in the types of traits (e.g., morphological, physiological, behavioral) that contribute to reduced hybrid performance.

To investigate the prevalence of EPI and the traits that produce it, we focus here on pine sawflies in the genus Neodiprion (Hymenoptera, Diprionidae), a Holarctic group of pine specialists that develop in intimate association with their host plants: adults mate on the host, females embed their eggs in the host tissue, and larvae complete their development on the host, spinning their cocoons on or beneath the host (Coppel and Benjamin 1965). Population genomic data from a single species, N. lecontei, indicate that divergence in host use contributes to population differentiation (Bagley et al. 2017), and comparative data from multiple species indicate that host-associated population differentiation occasionally progresses to speciation (Linnen and Farrell 2010). However, the mechanisms linking divergent host use to population differentiation and RI have not been identified.

To explore mechanistic links between host-use divergence and speciation in Neodiprion, we examined a pair of sister species that differ in host use, N. pinetum and N. lecontei (Linnen and Farrell 2007, 2008a). N. pinetum is a specialist that feeds on Pinus strobus, while N. lecontei feeds on a wider range of Pinus hosts, but generally avoids P. strobus. These species are interfertile in the lab (personal observation) and two lines of evidence indicate that they hybridize in the wild: (1) we have collected hybrids, which are identifiable via their intermediate larval coloration, at multiple field sites (personal observation), and (2) mitochondrial introgression has occurred between these two species (Linnen and Farrell 2007). Nevertheless, despite widespread and occasional hybridization, these two species remain morphologically, behaviorally, and genetically distinct. These observations suggest that there are postzygotic barriers to gene flow. Given that lab-reared hybrids are viable and fertile (personal observation), we hypothesize that postzygotic barriers between N. lecontei and N. pinetum are largely extrinsic in nature, stemming from their specialization on different Pinus hosts.

Figure 2.1. Needle width is a potential source of selection on oviposition traits. A. Mean mature needle width (+/- SEM) of different pine species preferred by N. pinetum (white) and N. lecontei (grey). Letters indicate hosts that differ significantly at P < 0.05 (Table A4). N. pinetum’s preferred host species has significantly thinner needles than those of N. lecontei’s hosts. B. A N. lecontei female uses her saw-like ovipositor to carve an egg pocket into a pine needle (Photo by R.K. Bagley).

Additionally, we hypothesize that EPI has arisen as a consequence of divergence in oviposition traits. The most striking difference between the hosts of N. pinetum and N. lecontei is that P. strobus needles are far thinner and less resinous than other Pinus hosts (Fig. 2.1A). This difference is important because Neodiprion females use a saw-like ovipositor to carve egg pockets into the pine needle (Fig. 2.1B); the eggs must survive within these pockets for anywhere from a week to eight months (Wilkinson et al. 1966; Knerer 1984). During this period, two major sources of egg mortality across the genus are desiccation and drowning in pine resin. For example, if an ovipositing female cuts her egg pockets too deeply, she can damage the host needle, causing the needle to dry out and the eggs to die (Wilkinson 1971; Knerer and Atwood 1973; McCullough and Wagner 1993; Codella and Raffa 2002). Alternatively, for resinous host needles, failure to sufficiently drain host resins can result in egg drowning (Wilkinson 1971; McCullough and Wagner 1993). Given the substantial fitness costs of improper oviposition, selection is expected to favor a close match between oviposition traits (morphology and behavior) and host plant needle characteristics (needle width and resin content). When two species with divergent oviposition phenotypes hybridize, hybrid females may have reduced fitness stemming from trait intermediacy or maladaptive combinations of oviposition traits.

To test the hypothesis that divergence in oviposition traits produces EPI between N. pinetum and N. lecontei, we evaluated a series of predictions. First, we predicted that N. pinetum and N. lecontei would have behavioral and morphological traits that are suited to the needle characteristics of their respective hosts. Second, we predicted that hybrids and backcrosses would have reduced oviposition success (i.e., egg hatching) compared to each species, and that this reduction in success would be host dependent. Finally, if EPI is generated by oviposition traits, we predicted that oviposition success of backcrosses would be dependent on their oviposition traits. Together, our results provide compelling evidence that maladaptive combinations of oviposition traits contribute to extrinsic postzygotic isolation between Neodiprion lecontei and Neodiprion pinetum.

2.3 Methods

2.3.1 Insect collection and rearing

We collected N. pinetum and N. lecontei larvae throughout the eastern United States (Table A1). We brought larvae back to the lab, transferred them to plastic boxes (32.4 cm x 17.8 cm x 15.2 cm) with mesh lids, and fed them pine foliage from their natal host species ad libitum. We collected cocoons as they were spun and stored them in individual gelatin capsules until adult emergence. We maintained all larvae and cocoons at 22°C, 70% relative humidity, and an 18-6 h light-dark cycle (Knerer 1984; Harper et al. 2016). Upon emergence, live adults were stored at 4°C until needed for crosses, morphological measurements, or behavioral assays. To propagate additional generations, we placed adult females and males into a mesh cage (35.6 cm x 35.6 cm x 61 cm) with seedlings of the pine species they were collected on. We allowed the adults to mate and the females to oviposit. After the eggs hatched, we reared larvae as described above.

2.3.2 Host needle width

N. pinetum uses Pinus strobus (white pine) exclusively, while N. lecontei has 8 primary pine hosts (P. banksiana, P. resinosa, P. echinata, P. palustris, P. elliottii, P. rigida, P. taeda, and P. virginiana) (Benjamin 1955; Coppel and Benjamin 1965; Wilson et al. 1992). To characterize the oviposition environment, we measured the widths of needles collected from 10 trees from each of 6 Pinus species, including Pinus strobus and 5 of N. lecontei’s primary hosts (P. banksiana, P. resinosa, P. echinata, P. virginiana, P. rigida). Host collection locations are indicated in Table B2. For each pine tree, we measured the width of 10 needles using digital calipers (Mitutoyo CD-6”PMX), then averaged these values to produce an average needle width per tree. We used P. strobus and P. banksiana (jack pine) seedlings as hosts for oviposition in all experiments (purchased from Itasca Greenhouse in Cohasset, MN and North Central Reforestation, Inc. in Evansville, MN). To assess how seedling needles (experimental hosts) compare to needles from mature hosts (typical hosts in nature), we measured needle widths for P. strobus and P. banksiana seedlings using the same approach described above. To analyze the differences in host needle width among mature pine species, and between mature hosts and seedlings, we performed two ANOVAs with Tukey’s post hoc tests.
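The needle-width summary and the first step of the comparison among species can be sketched as follows. This is a minimal illustration rather than the analysis code used here: the function names and the example widths are hypothetical, only the F statistic of a one-way ANOVA is computed, and the p-value and Tukey's post hoc comparisons would in practice come from a statistics package.

```python
def tree_mean_widths(needle_widths_by_tree):
    """Average the needle widths measured on each tree (one mean per tree)."""
    return [sum(widths) / len(widths) for widths in needle_widths_by_tree]


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` holds one list of per-tree mean widths for each pine species.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation among species means, weighted by the number of trees.
    ss_between = sum(
        len(g) * (gm - grand_mean) ** 2 for g, gm in zip(groups, group_means)
    )
    # Variation among trees within each species.
    ss_within = sum(
        (v - gm) ** 2 for g, gm in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical needle widths (mm): two species, three trees, two needles each.
    thin_host = tree_mean_widths([[0.50, 0.54], [0.48, 0.52], [0.55, 0.57]])
    thick_host = tree_mean_widths([[1.10, 1.14], [1.20, 1.18], [1.05, 1.09]])
    print(one_way_anova_f([thin_host, thick_host]))
```

Averaging within trees first makes the tree, not the needle, the unit of replication, which matches the per-tree means described above.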

2.3.3 Oviposition behavior

Our hypothesis that divergent selection has shaped oviposition traits in N. pinetum and N. lecontei predicted that N. pinetum (thin-needle specialist) would have a stronger preference for P. strobus and would lay fewer, more widely spaced eggs per needle than N. lecontei (thick-needle specialist). We also predicted that, compared to N. lecontei, N. pinetum would cut fewer “preslits,” which is an oviposition behavior in which Neodiprion females cut a non-egg-bearing slit near the base of the pine needle, presumably to defuse the resin canal defense (McCullough and Wagner 1993). We evaluated these predictions via a choice experiment. We first placed females in a clear 3.25-ounce deli cup with a single male until mating occurred. We then placed each mated female in a mesh cage (35.6 cm x 35.6 cm x 61 cm) with two P. banksiana seedlings and two P. strobus seedlings. We checked the cage daily until the female either oviposited or died. For each female, we scored whether or not oviposition occurred. When oviposition occurred, we recorded host choice. Because N. pinetum and N. lecontei tend to cluster all of their eggs on a single branch terminus (Rauf and Benjamin 1980; Wilson et al. 1992), host choice is best described as a categorical trait with two possible outcomes: P. banksiana or P. strobus. We excluded 3 females that laid eggs on both hosts (representing 3.15% of the total sample).

To describe oviposition pattern, we counted the number of eggs, the number of egg-bearing needles (EBN), and the number of EBN with preslits. We then used these data to calculate, for each female, the average number of eggs per needle (number of eggs/number of EBN) and the proportion of EBN with preslits (number of EBN with preslits/total number of EBN). We then placed the egg-bearing seedling into an individual mesh sleeve cage (25.4 cm x 50.8 cm), and watered as needed until egg hatching occurred. After hatching, we removed all EBN and imaged 10 randomly selected needles per seedling with a Canon EOS Rebel T3i camera equipped with an Achromat S 1.0X FWD 63 mm lens. Using these images, we measured the space between eggs in ImageJ (Schneider et al. 2012) and, for each female, averaged egg spacing data across each needle. Sample sizes for each oviposition trait we scored are given in Table A1.

To determine whether the two species differ in willingness to oviposit, we analyzed the proportion of N. pinetum or N. lecontei females that oviposited on any host with a generalized linear mixed effects model using a logit link function, species as a fixed effect, and population (where each collecting location/host species combination was considered a separate population) as a random effect nested within species. To determine whether the two species differ in host preference, we used the same generalized linear mixed effects model to analyze the proportion of ovipositing females that chose P. strobus. To determine whether the two species differ in the average number of eggs per needle, we used an ANOVA with species, natal host, and population as factors. To determine whether the two species differ in average spacing between the eggs, we used an ANOVA. To determine whether the two species differ in the proportion of needles with preslits, we arcsine-transformed the data and performed an ANOVA on the transformed data with species and population as factors.
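The per-female summaries and the transformation described in this section can be written out as a short sketch. The function names are hypothetical. The text above does not give the exact form of the arcsine transformation, so the common arcsine square-root form is assumed here.

```python
import math


def eggs_per_needle(n_eggs, n_ebn):
    """Average number of eggs per egg-bearing needle (EBN) for one female."""
    return n_eggs / n_ebn


def preslit_proportion(n_ebn_with_preslits, n_ebn):
    """Proportion of a female's egg-bearing needles that carry a preslit."""
    return n_ebn_with_preslits / n_ebn


def arcsine_transform(proportion):
    """Arcsine square-root transform of a proportion in [0, 1].

    ASSUMPTION: the arcsine square-root form is used; the source says only
    that the data were arcsine transformed.
    """
    return math.asin(math.sqrt(proportion))


# Hypothetical female: 34 eggs on 20 EBN, 5 of which have preslits.
mean_eggs = eggs_per_needle(34, 20)
transformed = arcsine_transform(preslit_proportion(5, 20))
```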

2.3.4 Ovipositor morphology

In addition to behavior, we examined ovipositor morphology, with the prediction that, compared to N. lecontei, N. pinetum would have smaller ovipositors. To characterize ovipositor morphology, we used five females from each of five populations from each species (N = 25 females per species; Table A1). We used females preserved at -80°C that either came from the parental phenotyping experiments or had been frozen upon emergence. To control for body size, we measured the length of the forewing. We then removed the ovipositor and mounted a single lancet (inner saw) using an 80:20 Permount:toluene solution. We photographed each mounted lancet at 5x magnification using a Zeiss DiscoveryV8 stereomicroscope with an Axiocam 105 color camera and ZEN lite 2012 software (Carl Zeiss Microscopy, LLC, Thornwood, NY). Using this software, we measured the length from the top of the second annulus to the top of the penultimate annulus, and measured width at the second annulus. We then performed morphometric analysis, which allows us to test for shape differences while controlling for size of the ovipositor. For this analysis, we placed 9 landmarks and 21 sliding landmarks on each ovipositor (see “Results”). We then examined ovipositor shape using Geomorph (Adams and Otárola-Castillo 2013). We applied a generalized Procrustes alignment by minimizing bending energy. To determine whether the two species differed in ovipositor shape, we performed a Procrustes ANOVA with forewing (as an allometric measurement), species, and population as factors. To determine whether the two species differed in ovipositor length or width, we used ANOVAs that included species, population nested within species, and forewing length. We completed all measurements and landmark placements in ImageJ version 1.49V (Schneider et al. 2012).
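To illustrate what a Procrustes alignment does, the sketch below superimposes one two-dimensional landmark configuration onto another by removing differences in position, size, and rotation. It is a simplified pairwise (ordinary) Procrustes fit written for illustration only. It is not the Geomorph procedure used here, which aligns all specimens jointly and also slides the semilandmarks. The function names and example coordinates are hypothetical.

```python
import math


def center_and_scale(points):
    """Move the centroid to the origin and scale to unit centroid size.

    Each (x, y) landmark is stored as a complex number x + iy.
    """
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    centroid_size = math.sqrt(sum(abs(p) ** 2 for p in z))
    return [p / centroid_size for p in z]


def procrustes_align(reference, target):
    """Rotate the standardized target onto the standardized reference.

    Returns the aligned target landmarks and the residual distance
    (square root of the summed squared landmark differences).
    """
    ref = center_and_scale(reference)
    tgt = center_and_scale(target)
    # For 2D shapes, the least-squares rotation is the unit complex number
    # conj(S) / |S|, where S = sum(target_k * conj(reference_k)).
    s = sum(w * r.conjugate() for w, r in zip(tgt, ref))
    rotation = s.conjugate() / abs(s)
    aligned = [w * rotation for w in tgt]
    distance = math.sqrt(sum(abs(a - r) ** 2 for a, r in zip(aligned, ref)))
    return aligned, distance


# Hypothetical landmarks: the second shape is the first one rotated,
# enlarged, and shifted, so the residual distance should be close to zero.
shape_a = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
angle = 0.7
shape_b = [
    (3 * (x * math.cos(angle) - y * math.sin(angle)) + 5.0,
     3 * (x * math.sin(angle) + y * math.cos(angle)) - 2.0)
    for x, y in shape_a
]
aligned_b, residual = procrustes_align(shape_a, shape_b)
```

Any residual distance that remains after this fit reflects a difference in shape rather than in size, position, or orientation.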

2.3.5 Cross oviposition behavior and success

If postzygotic isolation contributes to reproductive isolation between N. pinetum and N. lecontei, hybrids should have reduced fitness relative to pure parental species; if this isolation is ecologically dependent, this reduction in fitness should be host-dependent. To test these predictions, we used the cross design outlined in Fig. 2.2 to generate F1 and backcross individuals between a N. pinetum population collected on P. strobus in Crossville, TN and a N. lecontei population collected on P. echinata (shortleaf pine) in Lexington, KY (Table A1). It is important to note here that Neodiprion, like most hymenopterans, have arrhenotokous haplodiploidy, in which unfertilized eggs develop into haploid males (Heimpel and Boer 2008; Harper et al. 2016). Thus, males resulting from an interspecific cross carry maternal chromosomes only (Fig. 2.2). Our crosses involved 6 types of female, which we compared to make inferences regarding postzygotic isolation: parental lecontei (L), parental pinetum (P), lecontei female-pinetum male F1 (F1LP), pinetum female-lecontei male F1 (F1PL), lecontei backcross (BCL), and pinetum backcross (BCP). Larvae were reared on the oviposition host that their mother chose. F1PL females used in the cross were reared on P. strobus and F1LP females were reared on P. banksiana. Backcross females were reared on a mixture of P. strobus and P. banksiana.

Figure 2.2. Cross design for assessing postzygotic isolation. Because Neodiprion have haplodiploid sex determination, unfertilized eggs from an interspecific mating will produce male offspring of the mother’s genotype. Backcross females were unmated, while parental species and F1 hybrid females were mated.
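The expected genomic composition of each female cross-type follows from haplodiploid inheritance and can be worked through in a short sketch. The calculation assumes that BCL females are daughters of an F1 female and an N. lecontei male, and that BCP females are daughters of an F1 female and an N. pinetum male. That pairing is an assumption made for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def daughter_ancestry(mother, father):
    """Expected N. lecontei ancestry of a diploid daughter.

    The diploid mother transmits half of her genome (with expected ancestry
    equal to her own), and the haploid father transmits his entire genome.
    """
    return (mother + father) / 2


def son_ancestry(mother):
    """Haploid sons develop from unfertilized eggs: maternal ancestry only."""
    return mother


L = Fraction(1)  # pure N. lecontei
P = Fraction(0)  # pure N. pinetum

expected = {"L": L, "P": P}
expected["F1"] = daughter_ancestry(L, P)                 # same for F1LP and F1PL
expected["BCL"] = daughter_ancestry(expected["F1"], L)   # ASSUMED: F1 female x L male
expected["BCP"] = daughter_ancestry(expected["F1"], P)   # ASSUMED: F1 female x P male
```

Under these assumptions, the expected proportion of N. lecontei alleles is 1/2 in F1 females, 3/4 in BCL females, and 1/4 in BCP females, whereas a son of a pure-species mother carries only her species' alleles regardless of her mate.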

We placed individual females of each cross-type in a choice cage as described above (“Oviposition behavior”) and recorded whether or not oviposition occurred and, when it did occur, the preferred host. As we are specifically interested in reduced fitness due to oviposition traits, we used oviposition success as our measure of female performance. A female was considered to have “successful” oviposition if at least one of her eggs hatched and “unsuccessful” oviposition if no eggs hatched within 4 weeks. We chose 4 weeks as a cut-off because this is well beyond the typical egg development time for both species under our rearing conditions (generally <16 days, personal observation). Finally, we attempted to recover every female as soon as possible after death or oviposition occurred. Recovered females were preserved in 100% EtOH and stored at -20°C for future use. Sample sizes are located in Table A3.

To determine whether the direction of F1 cross (i.e., F1LP vs. F1PL) differed in oviposition willingness, preference, or success, we used Z-tests. Because we did not observe any significant differences (see “Results”), we combined both cross-types into a single F1 category for the remaining analyses. To determine whether female cross-type (L, P, F1, BCL, BCP) differed in willingness to oviposit or in host preference, we used GLMs with a logit link function and cross-type as a fixed effect, followed by post hoc Z-tests.
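A Z-test comparing two proportions, such as the proportion of F1LP versus F1PL females that oviposited, can be sketched as below. The pooled-variance, two-sided form shown here is an assumption, because the text does not state which variant was used. The function names and counts are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (Z, two-sided P) for the difference between two proportions.

    ASSUMPTION: the standard error uses the pooled proportion.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    return z, two_sided_p(z)


# Hypothetical counts: 30 of 50 females oviposited in one cross direction
# and 20 of 50 in the other.
z_stat, p_value = two_proportion_z_test(30, 50, 20, 50)
```

The two-sided conversion is consistent with the values reported in the Results, where Z = 1.20 is paired with P = 0.23.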

When there is postzygotic isolation, hybrids have reduced fitness compared to parental forms. To determine whether hybrids had reduced oviposition success compared to the parental species, we analyzed our hatch success data with a GLM using a logit link function with cross-type as a fixed factor, followed by post hoc Z-tests. Additionally, if postzygotic isolation is “extrinsic” (due to the host plant), then oviposition success should be host-dependent. More specifically, each backcross type is expected to have the highest fitness (oviposition success) in the environment corresponding to the parent to which it is most similar genetically (i.e., there should be a cross-type-by-host interaction, Fig. A1) (Rundle and Whitlock 2001). To test these predictions, we used the same GLM as for total postzygotic isolation, but added the interaction between cross-type and chosen host. To control for a possible rearing effect on hatch success of backcrosses, rearing host was added as a fixed effect to a GLM that included backcross type, oviposition host, and their interaction. We used Z-tests for all post hoc tests.
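For a binomial GLM with a logit link and a single categorical factor, the fitted probabilities equal the observed group proportions, so a likelihood-ratio test of the factor can be computed directly from counts. The sketch below shows that calculation. It assumes a likelihood-ratio test, whereas the text does not state how the chi-square statistics were obtained. The function names and counts are hypothetical, and interaction terms are not covered.

```python
import math


def binomial_loglik(successes, trials, p):
    """Binomial log-likelihood, omitting the constant combinatorial term."""
    loglik = 0.0
    if successes > 0:
        loglik += successes * math.log(p)
    if trials - successes > 0:
        loglik += (trials - successes) * math.log(1 - p)
    return loglik


def likelihood_ratio_test(counts):
    """Return (G, df) for a dict of group -> (successes, trials).

    The null model fits one pooled probability; the full model fits one
    probability per group. G is compared to a chi-square distribution
    with df = number of groups - 1.
    """
    total_successes = sum(s for s, _ in counts.values())
    total_trials = sum(n for _, n in counts.values())
    loglik_null = binomial_loglik(
        total_successes, total_trials, total_successes / total_trials
    )
    loglik_full = sum(binomial_loglik(s, n, s / n) for s, n in counts.values())
    return 2 * (loglik_full - loglik_null), len(counts) - 1


# Hypothetical hatch-success counts per cross-type: (females with hatch, females).
hatch_counts = {
    "L": (18, 20), "P": (17, 20), "F1": (9, 20), "BCL": (15, 20), "BCP": (16, 20),
}
g_stat, df = likelihood_ratio_test(hatch_counts)
```

With five cross-types the test has 4 degrees of freedom.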

2.3.6 Impact of oviposition traits on BCL success

If oviposition traits are under selection and contribute to reduced hybrid fitness, the oviposition success of hybrid females should be dependent on these traits. To test these predictions, we focused on the BCL females because they were the only cross-type for which we had an appreciable sample size for all relevant traits (ovipositors, oviposition pattern, and hatching success). Additionally, because there was very little variation in hatch success on P. banksiana (see “Results”), we focused our analyses on P. strobus, with the prediction that the BCL females with pinetum-like traits (ovipositor morphology and oviposition behavior) would have the highest oviposition success on P. strobus. For these analyses, we scored oviposition success as a binary trait (hatch or no hatch) as described above. To describe oviposition pattern, we assigned each female to one of two categories: “pinetum,” if she laid 3 or fewer widely spaced eggs per needle and “non-pinetum,” if she laid more than 3 eggs per needle and/or eggs were spaced close together. To describe ovipositor morphology, we dissected and mounted female ovipositors as described above, with the addition of a rehydration step for EtOH-preserved females. The rehydration step consisted of six 10-minute incubations of the female abdomen (at room temperature) in decreasing EtOH concentrations (100%, 95%, 80%, 65%, 50%, and 25% EtOH), followed by overnight incubation in water.

To determine whether having a pinetum-like oviposition pattern increased the proportion of BCL females whose eggs hatched on P. strobus, we used a GLM with a logit link function and oviposition pattern as a factor. Next, to determine whether “successful” females had more pinetum-like ovipositors than “unsuccessful” females, we used Geomorph (Procrustes ANOVA accounting for forewing length) to compare ovipositor shape between females with and without egg hatching on P. strobus (Adams and Otárola-Castillo 2013). To determine whether ovipositor size affected the hatching rate on P. strobus, we performed separate GLMs with length and width data, both with a logit link function and size as a continuous factor. To determine whether oviposition pattern was correlated with ovipositor morphology, we performed separate ANOVAs for ovipositor length and width, and a Procrustes ANOVA for ovipositor shape.

Finally, we also used the BCL data to determine whether there was any relationship between host choice and oviposition traits, which may occur if females exhibit behavioral plasticity (e.g., alter oviposition behavior based on chosen host or alter host preference based on having a particular ovipositor morphology). To determine whether host choice correlates with oviposition pattern, we used a GLM with a logit link function and chosen host as a factor. To determine whether host choice correlates with ovipositor morphology, we performed a Procrustes ANOVA. To determine whether ovipositor length and width correlated with oviposition host, we performed two ANOVAs. All statistical analyses were performed using R version 3.2.3 (R Core Team 2015).

2.4 Results

2.4.1 Host needle width

Mature P. strobus had significantly thinner needles than all of the N. lecontei hosts (F5,54 = 72.42, P < 0.001, Table B4, Fig. 2.1A). Likewise, P. strobus seedlings had significantly thinner needles than P. banksiana seedlings (P < 0.001, Fig. A2, Table B5). However, because needles from P. banksiana seedlings were thinner than mature foliage (P < 0.001) and needles from P. strobus seedlings did not differ significantly from mature foliage (P = 0.15), the differences between our experimental hosts (F3,36 = 188.47, P < 0.001) are likely to be less extreme than differences typically experienced by ovipositing females in nature. In the discussion, we consider possible implications for the difference between seedling needles (experimental hosts) and mature needles (typical hosts).

2.4.2 Oviposition behavior

N. pinetum and N. lecontei did not differ significantly in the proportion of females that oviposited (χ²₁ = 0.14, P = 0.28, Fig. 2.3A). However, the two species did differ significantly in host preference, with N. pinetum exhibiting much stronger preference for P. strobus than N. lecontei (χ²₁ = 6.47, P = 0.0011, Fig. 2.3B). N. pinetum also laid fewer eggs per needle (F1,25 = 21.50, P < 0.0001, Fig. 2.4A): whereas N. pinetum laid an average of 1.7 eggs per needle, N. lecontei averaged 7.2 eggs per needle. N. pinetum females also spaced their eggs farther apart than N. lecontei (F1,22 = 62.86, P < 0.001, Fig. 2.4B). Images representative of N. pinetum and N. lecontei oviposition pattern are shown in Fig. 2.4D, E. Finally, N. pinetum females cut fewer preslits than N. lecontei females (F1,17 = 46.12, P < 0.001, Fig. 2.4C, F): whereas none of the N. pinetum females we tested cut a preslit, all N. lecontei females cut at least one.

Figure 2.3. N. pinetum females preferred P. strobus more strongly than N. lecontei females. A. N. pinetum and N. lecontei did not differ in the proportion of females that laid eggs when placed in a host choice arena (P > 0.05). B. Of the females that oviposited, the proportion that chose P. strobus was higher for N. pinetum than N. lecontei (P < 0.05).

2.4.3 Ovipositor morphology

The 30 landmarks chosen for morphometric analysis are illustrated in Fig. 2.5A. N. lecontei and N. pinetum females differed in ovipositor morphology: compared to N. lecontei ovipositors, N. pinetum ovipositors were shorter (χ²₁ = 139.18, P < 0.001, Fig. 2.5C), narrower (χ²₁ = 186.71, P < 0.001, Fig. 2.5D), and had a distinctly straighter shape (F1,39 = 138.31, P < 0.001, Fig. 2.5B).

Figure 2.4. N. pinetum and N. lecontei females differed in their egg-laying pattern. A. On average, N. pinetum females laid fewer eggs per needle than N. lecontei females. B. On average, N. pinetum females spaced eggs farther apart than N. lecontei females. C. Across all egg-bearing needles (EBN), N. pinetum females cut preslits less often than N. lecontei females. All comparisons were significant at P < 0.05. D. Representative oviposition pattern of N. lecontei females: many, closely spaced eggs per needle. E. Representative oviposition pattern of N. pinetum females: few, widely spaced eggs per needle. F. A preslit (indicated by an arrow) cut by a N. lecontei female on a P. banksiana seedling (photos by R.K. Bagley).

Figure 2.5. N. pinetum females had smaller, straighter ovipositors than N. lecontei females. A. A representative N. lecontei ovipositor with landmarks (black circles) and sliding landmarks (white circles) used in morphometrics analyses. B. Principal components analysis of ovipositor shape of N. pinetum females (white circles) and N. lecontei females (grey circles). The warp grids represent the change in ovipositor shape along principal component axis 1. C. N. pinetum has shorter ovipositors than N. lecontei. D. N. pinetum has narrower ovipositors than N. lecontei. Shape (B), length (C), and width (D) differences were all significant at P < 0.05.

2.4.4 Cross oviposition behavior and success

The direction of the F1 hybrid cross (i.e., F1LP vs. F1PL) had no effect on the female’s willingness to oviposit (Z = 1.20, P = 0.23), preference (Z = 1.20, P = 0.23), or oviposition success (Z = 1.01, P = 0.31). Given these findings, we combined F1 cross directions in subsequent analyses.

Females from the different cross-types differed significantly in their willingness to oviposit (χ²₄ = 46.37, P < 0.001, Fig. 2.6A). In particular, BCL females oviposited significantly more often than all other types of females (Table 2.1). The cross-types also differed in their preference for P. strobus, with preference for this host declining as the individuals became more genetically different from N. pinetum (χ²₄ = 82.40, P < 0.001, Fig. 2.6B). None of the N. lecontei in our cross oviposited on P. strobus. The only cross-types that did not differ significantly in their P. strobus preference were P vs. BCP and P vs. F1 (Table 2.1).

Table 2.1 Post hoc tests (Z-tests) for interspecific crosses

The cross-types also differed in their oviposition success (χ²₄ = 13.03, P = 0.011, Fig. 2.6C), and the F1 females had significantly lower hatching success than any of the other cross-types (Table 2.1). When oviposition host and an interaction between host and cross-type were added, cross-type remained significant (χ²₄ = 14.92, P = 0.0049, Fig. 2.6D). Additionally, there was a significant effect of host on oviposition success (χ²₁ = 44.43, P < 0.001): across all cross-types, females that chose P. strobus had lower hatching success than females that chose P. banksiana (Fig. 2.6D). Also, although none of the N. lecontei females involved in the cross chose P. strobus, four of the N. lecontei females from our multi-population preference experiment did choose P. strobus (Fig. 2.3B). Notably, all four of these females experienced complete hatching failure (Fig. A3). Although both cross-type and oviposition host significantly impacted hatching success, the interaction between them was not significant (χ²₃ = 2.37, P = 0.50). We also found that rearing host (BC larvae were reared on whatever host species their F1 mother had chosen) did not affect backcross oviposition success (χ²₁ = 0.22, P = 0.64).

Figure 2.6. Oviposition preference and success depend on cross-type and host. A. Proportion of females from each cross-type that laid eggs when placed within a host choice arena. Compared to other cross-types, BCL females were more willing to oviposit when placed in a host choice arena. B. Proportion of egg-laying females that chose P. strobus. Preference for P. strobus declined as the proportion of N. lecontei alleles increased. C. Oviposition success (proportion of females with at least one hatching egg) was significantly lower for F1 females, indicating that there is postzygotic isolation. D. Oviposition success was lower on P. strobus (white bars) than on P. banksiana (gray bars) (P < 0.05); this host-dependent reduction in fitness is consistent with extrinsic postzygotic isolation. Compared to P and BCP females, F1 and BCL females had lower oviposition success on P. strobus. However, the host-by-cross-type interaction was not significant (P > 0.05). Oviposition success data are not available for “L” females on P. strobus because no L females chose P. strobus in this experiment (“NA”). In all panels, statistical significance at P < 0.05 is indicated by differing letters (see Table 2.1). In (D), letters refer to oviposition success on P. strobus only (no differences were observed on P. banksiana). Cross-type abbreviations are as indicated in Fig. 2.2.

2.4.5 Impact of oviposition traits on BCL oviposition success

BCL females that had a pinetum-like oviposition pattern were significantly more likely to have eggs that hatched on P. strobus than females that deviated from this pattern (χ²₁ = 3.85, P = 0.0498, Fig. 2.7A). Also, BCL females that successfully oviposited on P. strobus had significantly shorter ovipositors than unsuccessful females (χ²₁ = 9.50, P = 0.0021, Fig. 2.7B). In contrast, successful and unsuccessful females did not differ in ovipositor width (χ²₁ = 0.019, P = 0.89) or ovipositor shape (F1,17 = 1.16, P = 0.24).

In BCL females, host choice (P. strobus vs. P. banksiana) did not correlate with oviposition pattern (χ²₁ = 0.14, P = 0.70), ovipositor length (F1,38 = 1.81, P = 0.19), ovipositor width (F1,38 = 0.0056, P = 0.94), or ovipositor shape (F1,38 = 1.86, P = 0.22).

Finally, oviposition pattern was unrelated to ovipositor length (F2,17 = 0.20, P = 0.82), ovipositor width (F2,17 = 0.024, P = 0.98), or ovipositor shape (F2,17 = 1.10, P = 0.35).

Together, these results imply that host preference, oviposition pattern, and ovipositor morphology are genetically independent traits.

Figure 2.7. BCL females with pinetum-like oviposition traits have higher oviposition success on P. strobus. A. Oviposition success (proportion of females with at least one hatching egg) was higher for BCL females that had a pinetum-like oviposition pattern (≤3 eggs per needle) compared to females that lacked this pattern (>3 eggs per needle) (P < 0.05). B. Females that laid successfully had shorter ovipositors than those that did not (P < 0.05).

2.5 Discussion

Empirical data from diverse taxa indicate that ecological speciation is common in nature (Rundle and Nosil 2005; Schluter 2009; Nosil 2012; Van der Niet et al. 2014), and that changes in host use frequently initiate ecological speciation in plant-feeding insects (Berlocher and Feder 2002; Matsubayashi et al. 2010). However, the contributions of specific divergent traits to EPI are unknown in most systems. In this study, we evaluated evidence of oviposition traits generating extrinsic postzygotic isolation between a pair of Neodiprion sawfly species that specialize on different pines. We found compelling evidence of EPI stemming from maladaptive combinations of oviposition traits. Here, we discuss the limitations, as well as broader implications of our work for ecological specialization and speciation in plant-feeding insects and future research directions in this promising empirical system.

Although all sawflies in the genus Neodiprion feed on host plants in the family Pinaceae (mostly in the genus Pinus), different sawfly species tend to specialize on different pine hosts (Ross 1955; Coppel and Benjamin 1965). Previous analyses at both the inter- and intraspecific levels indicate that changes in host use are associated with population differentiation and speciation in this genus (Linnen and Farrell 2010; Bagley et al. 2017). In this study, we investigated a potential causal relationship between adaptation to different hosts and reproductive isolation. In particular, we hypothesized that maladaptive combinations of divergent oviposition traits give rise to extrinsic postzygotic isolation between Neodiprion species. Consistent with this hypothesis, we found that sister species N. pinetum (a thin-needled specialist) and N. lecontei (occurs on thicker-needled hosts) differed in multiple behavioral and morphological traits related to oviposition (Figs. 2.3-2.5). In terms of behavior, N. pinetum females preferred P. strobus (white pine), laid a small number of widely spaced eggs on each needle, and never cut resin-draining preslits. In contrast, N. lecontei females generally avoided P. strobus, laid many closely spaced eggs per needle, and almost always cut preslits. In terms of morphology, N. pinetum females had smaller, straighter ovipositors than N. lecontei females. Together, N. pinetum traits likely enable females to insert eggs into P. strobus without damaging the thin needles to the point that they dry out and the eggs die, while N. lecontei traits should better equip females to circumvent host defenses and prevent eggs from being overwhelmed by resin.

Although N. lecontei and N. pinetum appear to be specialized to oviposit on different hosts, they do hybridize in nature (personal observation, [35]), indicating that premating barriers are incomplete. Nevertheless, the strong genetic, behavioral, and morphological differentiation between these two sympatric species ([35]; Figs. 2.3-2.5) suggests that there are postzygotic barriers to gene exchange. Consistent with this prediction, we found that F1 females had reduced oviposition success relative to the two parental species (Fig. 2.6C). For these females, there were two potential sources of oviposition failure: botched oviposition (which would be host-dependent and therefore extrinsic in nature) and egg inviability (which could stem from intrinsic genetic incompatibilities or from extrinsic egg-host interactions). Our observation that hybrid females had reduced oviposition success only when they chose P. strobus suggests that postzygotic isolation between N. lecontei and N. pinetum is largely attributable to extrinsic, rather than intrinsic, factors. By contrast, oviposition success of hybrid females on the more “benign” P. banksiana seedlings was indistinguishable from oviposition success of pure N. lecontei and N. pinetum females. Although this finding is consistent with EPI, an alternative explanation for these results is that intrinsic genetic incompatibilities between the species are more pronounced in the P. strobus environment (Hatfield and Schluter 1999; Rundle and Whitlock 2001; Rundle and Nosil 2005). One way to control for intrinsic genetic incompatibilities is to compare the fitness of both backcross types in both parental environments (Rundle and Whitlock 2001). Using this method, we found that BCP females had high oviposition success on both hosts, while BCL females had high oviposition success on P. banksiana only (Fig. 2.6D).

While seemingly at odds with predictions under EPI, our observation that BCP females had high oviposition success on both hosts could be attributable to our experimental design. There are two main sources of experimental error that could have precluded us from detecting reduced hatch success on P. banksiana. First, by scoring oviposition success as a binary trait (hatch or no hatch), we lumped together females with a wide range of hatching success (from <10% to 100%). Failure to account for variation in non-zero hatch success would have reduced our power to detect all but the most extreme differences in oviposition success. In other words, while lecontei-like oviposition traits led to a complete failure on P. strobus consistently enough that we could detect it with our crude measure of success, we had little power to detect subtler reductions in hatching success.

The second potential source of experimental error in our assessment of EPI is that we used pine seedlings in lieu of larger trees, which we could not accommodate in our growth rooms. However, as our needle width data indicate (Fig. A2), the pine seedlings we used did not fully recapitulate differences in the host age classes that are typically selected by ovipositing N. lecontei and N. pinetum females in the wild (Rauf and Benjamin 1980; Averill et al. 1982). In particular, the needles of our P. banksiana seedlings were considerably thinner than needles from older trees. Moreover, resin content tends to increase as pine trees age (Lin et al. 2001). Thus, while the P. strobus seedlings we used replicated the challenge of laying eggs on a thin-needled host, the P. banksiana seedlings did not replicate the challenge of laying on a thick, resinous needle.

Despite these possible experimental artifacts, we do have an additional line of direct evidence supporting the existence of EPI due to oviposition traits: on P. strobus, BCL females with lecontei-like oviposition traits (ovipositor morphology and egg-laying behavior) had reduced oviposition success compared to BCL females with pinetum-like oviposition traits (Fig. 2.7). Because all BCL females share the same genetic makeup (i.e., same proportion of N. lecontei and N. pinetum alleles), these differences cannot be explained by intrinsic genetic incompatibilities. Taken together, our cross data indicate that maladaptive combinations of oviposition preference and oviposition traits in hybrids generate EPI between N. lecontei and N. pinetum. Intriguingly, maladaptive combinations of preference and performance traits have been reported in several other insect taxa (Forister 2005; Ohshima 2008; Matsubayashi et al. 2010; McBride and Singer 2010), suggesting that this might be a widespread cause of reduced hybrid fitness.

Our analysis of traits in BC females also demonstrates how examination of specific traits in hybrid individuals can be used as an alternative to the “modify-parental-phenotype” test of EPI that has been proposed, but never utilized (Rundle and Nosil 2005). In our case, modifying parental phenotypes was not an option because our focal phenotypes were either behavioral (host preference, oviposition pattern) or involved a delicate morphological structure (ovipositor) that we could not alter readily; we suspect that the same is probably true of many organisms in which one might want to investigate EPI. However, as we have shown here, genetic crosses can serve a similar function as parental modification. In particular, by generating recombination among loci underlying ecologically relevant traits and assessing fitness in recombinant individuals, we could begin to tease apart how individual traits and interactions between them contribute to reduced fitness of hybrids in parental environments. To date, we know of only one other study that has taken advantage of trait variation in hybrids to make inferences regarding EPI in plant-feeding insects: McBride and Singer’s (2010) study of EPI in Euphydryas butterflies (see also Martin and Wainwright 2013 for an example in Caribbean pupfishes). In their study, McBride and Singer reared F1 hybrids between allopatric, host-specialized populations on both parental hosts and, for four behavioral traits, found that trait intermediacy in the hybrids reduced their fitness on both hosts.

To date, numerous studies, many of which focused on plant-feeding insects, have reported evidence of EPI (see Funk et al. 2002; Matsubayashi et al. 2010; Nosil 2012). While only a handful of these have employed a more rigorous approach (e.g., reciprocal backcross or trait-focused studies) that controls for genetic incompatibilities (Rundle 2002; Egan and Funk 2009; Kuwajima and Kobayashi 2010; McBride and Singer 2010; Soudi et al. 2016), the emerging picture from this body of work is that EPI frequently accompanies ecological speciation. However, in only a handful of cases have the traits underlying EPI been identified (Hatfield and Schluter 1999; McBride and Singer 2010). Importantly, although EPI is a direct consequence of adaptive divergence, adaptive divergence does not always produce EPI. For example, if intermediate trait values do not impact fitness in parental environments or if an intermediate environment is available in nature, hybrids will not experience ecologically based reductions in fitness (Seehausen et al. 2014). As more traits are explicitly tested for their role in EPI, we can begin to ask more specific questions about its mechanistic basis, such as: which traits (behavior, physiology, morphology) and which aspects of (reproduction, food acquisition and processing, ) are most likely to produce EPI?

Based on their findings in Euphydryas butterflies, McBride and Singer (2010) proposed that behavioral traits—especially niche preferences—might be especially important drivers of EPI. In support of this argument they provided two additional examples. First, two European blackcap populations that migrate in opposite directions to their wintering grounds produce hybrids with a tendency to migrate in an intermediate and maladaptive direction (Helbig 1991). Second, hybrids between apple- and hawthorn host races of Rhagoletis pomonella have a tendency to avoid both parental hosts, making it difficult for them to locate suitable oviposition sites (Linn et al. 2004; Forbes et al. 2005). By contrast, our hybrids did not exhibit a reduction in willingness to oviposit (in fact, for reasons that are currently unclear to us, BCL seemed more willing to oviposit than other cross types; Fig. 2.6), indicating that “host confusion” is not contributing to EPI in this system. Nevertheless, our data are consistent with the overall importance of behavioral traits (in our case, host preference and oviposition pattern) in driving EPI.

Additionally, similar to our finding that oviposition traits contribute to EPI in Neodiprion, three of the four traits implicated in reduced hybrid fitness in Euphydryas butterflies were related to oviposition. Experimental and natural history work in additional Neodiprion species suggests that this phenomenon might be widespread in the genus (Wilkinson 1971; Knerer and Atwood 1973; McCullough and Wagner 1993). For example, on the resinous host slash pine (P. elliottii), failure to cut preslits by ovipositing N. excitans females invariably resulted in eggs being engulfed by resin and failing to hatch (Wilkinson 1971). Additionally, needle desiccation correlated positively with the number of eggs per needle laid by N. lecontei females in a population infesting P. resinosa (Codella and Raffa 2002). More generally, there are numerous anecdotal reports of egg mortality due to either needle desiccation or drowning in resin, and these outcomes seem to vary with needle thickness, needle resin content, and female oviposition pattern (e.g., number of eggs per needle, presence of preslit, depth of egg slits and preslits) (Warren and Coyne 1958; Martineau 1959; Kapler and Benjamin 1960; Wilkinson 1961, 1971; Knerer and Atwood 1973; McCullough and Wagner 1993; Codella and Raffa 2002). Together, these observations suggest that oviposition traits are under strong selection both within and between Neodiprion species. Intriguingly, host preference, ovipositor morphology, and oviposition pattern are also among the most variable traits in the genus and are often useful in species identification (Ross 1955; Ghent 1959; Linnen and Smith 2012). If host-related selection has shaped this variation, inter- and intraspecific variation in host preference should correlate with variation in other oviposition traits; this prediction could be tested using a comparative approach.

Beyond Neodiprion, oviposition-related traits (which include traits related to finding and choosing a host, selecting a site within the host for egg deposition, depositing eggs in specific patterns on or within the host tissue, defusing host defenses, ovipositor morphology, and egg morphology) could profoundly impact the fitness of any egg-laying phytophagous insect female and are therefore likely to be frequent targets of natural selection (Janz 2003). In support of this argument, numerous studies have reported host-associated differentiation in oviposition traits, including: clutch size in seed beetles (Messina and Karren 2003), ovipositor morphology in moths (Groman and Pellmyr 2000), ovipositor length in gall-inducing Asphondylia flies (Joy and Crespi 2007), ovipositor length in fig wasps (Weiblen and Bush 2002), ovipositor size in Plateumaris leaf beetles (Sota et al. 2007), clutch size and oviposition site in butterflies (Singer and McBride 2010), and multiple morphological and behavioral traits in pine sawflies (this study). However, in the context of traits driving ecological specialization and speciation in plant-feeding insects, research has focused almost exclusively on female host preference and larval performance (i.e., growth and survival rates when feeding on a particular host plant). To understand the role of host specialization in phytophagous insect speciation, it is critical that we examine additional host-related traits.

2.6 Conclusions

In this study, we have demonstrated that oviposition traits contribute to EPI between N. lecontei and N. pinetum. While these observations are consistent with ecological speciation, the evidence is not yet iron-clad and many important questions remain. First, while we focused here on oviposition traits, other traits, such as larval performance, could also contribute to EPI. In future work, we hope to quantify the impact of individual traits, and the interactions between them, on host-dependent reductions in hybrid fitness. Second, while we have focused here on EPI, there are other sources of reproductive isolation between these species (personal observation). To evaluate the contribution of EPI to total isolation, we must quantify the strength of EPI relative to other reproductive barriers (Nosil 2007; Sobel and Chen 2014). Finally, understanding how divergent traits give rise to RI requires that we identify the genetic mechanisms (i.e., linkage or pleiotropy) linking them (Rundle and Nosil 2005; Nosil 2012). As we have demonstrated here, these species are interfertile in the lab; thus, a QTL mapping approach is feasible in this system. Additionally, identification of causal loci, which is required if we are to distinguish between pleiotropy and linkage, will be facilitated by the availability of annotated genome assemblies for N. lecontei (Vertacnik and Linnen 2015) and N. pinetum (in progress).

While a long-term goal is to identify all host-related traits under selection, all reproductive barriers, and their underlying genes in N. lecontei and N. pinetum, these efforts will provide only a single snapshot at one time point in speciation. Because these species have been diverging for up to several million years (Linnen and Farrell 2008b, 2010), they have had time to accumulate many differences and barriers to reproduction, which will make it difficult to determine which reproductive barriers arose first. To address this question, we can examine other Neodiprion species and populations at different stages along the “speciation continuum” (Nosil et al. 2009b). For example, there is evidence of host-associated differentiation in at least two Neodiprion species (Neodiprion lecontei: Bagley et al. 2017; Neodiprion abietis: Knerer and Atwood 1972, 1973), and possibly other Neodiprion species as well. Although much work remains, extensive natural history data, experimental tractability, and growing genomic resources make Neodiprion an exceptionally rich system for addressing many long-standing questions regarding the evolution of host specialization and its role in generating the staggering diversity of phytophagous insects.


CHAPTER 3. THE GENETIC ARCHITECTURE OF OVIPOSITION TRAITS AND HATCHING SUCCESS IN NEODIPRION PINETUM AND N. LECONTEI

3.1 Abstract

One area of evolutionary biology that has recently gained momentum is the genetic architecture of adaptation and speciation. Here we aim to connect genotype to phenotype to fitness in Neodiprion pine sawflies. N. pinetum and N. lecontei use different hosts and have oviposition traits that match their respective hosts, resulting in extrinsic postzygotic isolation. We use QTL mapping to find the loci underlying oviposition traits and hatching success, which we use as a measure of fitness. We found opposing dominance between host preference and other oviposition traits, leading to trait mismatch and reduced fitness in hybrids. Additionally, all traits that we successfully mapped were multigenic, which was unexpected because there is ongoing gene flow between N. pinetum and N. lecontei. Under gene flow, divergence is expected to involve a few large-effect changes, which can better oppose the homogenizing force of gene flow. Ovipositor shape, ovipositor length, and the presence of a preslit all affect the likelihood of hatching, although the presence of preslits is the only trait that interacted with host choice. Hatching success mapped to the same regions as some of the oviposition traits, including ovipositor morphology and the presence of preslits, indicating that these regions may include speciation genes. Although finer-scale analyses are needed, the results make Neodiprion pine sawflies a promising system for studying the genetics of adaptation and speciation.

3.2 Introduction

It has long been debated whether speciation can proceed when accompanied by gene flow (Mayr 1942), but mounting evidence suggests that divergence with gene flow may be common (Taylor and Larson 2019). The speciation mechanism most likely to operate when there is gene flow is ecological speciation. During ecological speciation, ecologically based divergent selection leads to reproductive isolation (Rundle and Nosil 2005; Schluter 2009). This divergent selection can counteract the effects of gene flow. Thus, ecological speciation can proceed when there is gene flow, even if it is less efficient than in the absence of gene flow (Schluter 2009; Nosil 2012). Many details of ecological speciation are well understood, but there are still outstanding questions, especially regarding the genetic basis of ecological speciation (Schluter and Conte 2009; Arnegard et al. 2014).

One of the outstanding questions is about the typical genetic architecture of adaptations, such as the number of loci, the effect size, the level of pleiotropy, and their arrangement in the genome (Mackay 2001; Hansen 2006; Dittmar et al. 2016). One factor that affects genetic architecture is the distance from the fitness optimum (Orr 1998). Ecological speciation commonly involves the colonization of a new environment that is followed by divergent selection (Schluter and Conte 2009). When a new environment is colonized, the population is generally far from the fitness optimum. When a population is far from the optimum, large effect loci are favored. However, as a population gets closer to the optimum, large effect loci are likely to overshoot the optimum, causing smaller effect loci to be favored. This creates an exponential distribution with few large effect loci and many small effect loci (Orr 1998).

However, additional factors can influence the genetic architecture of adaptation, such as gene flow. Gene flow homogenizes the genomes of the two populations (Griswold 2006). Small effect loci are more susceptible to this homogenization, causing coadapted alleles to be broken apart by recombination (Yeaman and Otto 2011; Yeaman and Whitlock 2011). This results in large effect loci being favored when there is gene flow. Alternatively, tightly linked small effect loci can act as a single large effect locus when there is gene flow (Yeaman and Whitlock 2011). Pleiotropy can also affect the genetic architecture of adaptations. In highly pleiotropic loci, large adaptive effects on one trait are likely to have deleterious effects on other traits, making small effect loci more likely when there is pleiotropy (Fisher 1930; Wagner et al. 2008; Wang et al. 2010). Additionally, there have been debates about whether some types of traits are more likely than others to be controlled by pleiotropic loci. It has been argued that morphological traits are more likely to be controlled by highly conserved pleiotropic developmental pathways than physiological or behavioral traits (Carroll 2006, 2008). This should lead to morphological adaptations being controlled by a larger number of smaller effect changes than behavioral or physiological adaptations. However, this idea is highly contentious (Hoekstra and Coyne 2007; Craig 2009).

The genetic architecture of traits under divergent selection can also affect the speed and likelihood of speciation. During speciation, when large effect loci are under divergent selection, the selected loci and the loci linked to them increase in divergence through divergence hitchhiking and experience a reduced effective migration rate (Flaxman et al. 2014). Weaker divergent selection and other evolutionary forces can then act at these linked loci to further increase divergence (Via 2012). Eventually these islands of speciation spread enough to reduce the genome-wide effective migration rate (Seehausen et al. 2014). Speciation proceeds differently if the loci under divergent selection are under only weak selection (Flaxman et al. 2014; Nosil et al. 2017). In this case, individual loci are not enough to reduce the effective migration rate. Instead, there is a tipping point at which genome-wide divergence occurs after enough weakly selected loci accumulate. Large effect loci are more likely to remain in the population and resist gene flow, but when many weakly selected loci are involved, speciation happens quickly once the tipping point is reached, and other factors, such as areas of reduced recombination and intrinsic postzygotic isolation, are not needed to complete speciation.

However, finding the loci under divergent selection and the genes that are responsible for speciation is not as straightforward as it may initially seem. For a gene to be labelled as adaptive, it is not enough for it to affect a trait under selection. Instead, for an allele to be adaptive, its effects on the selected trait have to increase the organism’s fitness (Barrett and Hoekstra 2011). To designate an allele as adaptive, genotype has to be connected to phenotype, phenotype to fitness, and fitness to genotype. There are several ways to make these connections, but one promising method is to perform QTL mapping on traits known to be under selection and to additionally map fitness from reciprocal crosses of hybrids (Hall et al. 2010; Anderson et al. 2011). For a gene to be considered a speciation gene, it has to contribute to reproductive isolation and to have arisen before speciation is complete (Nosil and Schluter 2011). Determining whether a gene contributed to reproductive isolation before speciation was complete is difficult unless one studies incipient species, and even then it is difficult to know how large an effect the allele had at the time that it began to affect reproductive isolation. Despite the difficulty of confidently identifying adaptive alleles and speciation genes, useful evolutionary insights result when we do (Blackman 2016).

One set of taxa that has been particularly well studied in the context of ecological speciation is phytophagous insects (Matsubayashi et al. 2010). Phytophagous insects tend to have a close association with their host plants, and shifting to a new host provides ample opportunity for divergent selection to act. We examine the genetic architecture of adaptations during ecological speciation with gene flow in Neodiprion pine sawflies (Hymenoptera, Diprionidae). Pine sawflies have an intimate association with their host throughout their life cycle, including the female using her saw-like ovipositor to carve a pocket in the pine needle for each of her eggs (Coppel and Benjamin 1965). Ovipositing in a way that matches the host needle traits is vital for the survival of the eggs. If oviposition is performed incorrectly, the eggs risk drying out or drowning in host resin, both of which result in embryonic mortality (Wilkinson 1971; Knerer and Atwood 1973; McCullough and Wagner 1993; Codella and Raffa 2002).

We specifically examined a pair of sister species, N. pinetum and N. lecontei, that differ in host use (Linnen and Farrell 2007, 2008a). N. pinetum is a specialist on white pine (Pinus strobus), whereas N. lecontei is more of a generalist that uses a wide range of pine hosts but avoids white pine. White pine has thinner and less resinous needles than N. lecontei’s hosts (Wu and Hu 1997; Gernandt et al. 2005; Bendall et al. 2017). Each species has a suite of morphological and behavioral oviposition traits that allow it to oviposit correctly in its hosts (Bendall et al. 2017). N. pinetum lays fewer, widely spaced eggs per needle and does not cut preslits, which are small non-egg-bearing cuts that allow resin to drain from the needle. N. lecontei lays many closely spaced eggs per needle and routinely cuts preslits. Additionally, N. pinetum has a thinner, shorter, and straighter ovipositor than N. lecontei. Overall, N. pinetum’s oviposition traits are well suited to thin, low-resin hosts and N. lecontei’s traits are well suited to thicker, more resinous hosts. Hybrids prefer white pine but have traits that are ill-suited for it, resulting in high hatching failure and extrinsic postzygotic isolation. Both morphological and behavioral oviposition traits have been directly linked to reduced hybrid fitness.


Here we use QTL mapping of both oviposition traits and hatching success as a measure of fitness to understand the genetic architecture of adaptation and speciation in pine sawflies. There is ongoing gene flow between N. lecontei and N. pinetum, allowing us to test the effects of gene flow on genetic architecture. Additionally, we have both morphological and behavioral traits, which allows us to directly test whether different trait types have different genetic architectures. Furthermore, since extrinsic postzygotic isolation is a direct consequence of divergent selection and speciation is not fully complete (i.e., there is still gene flow), any locus that differentially affects hatching on the different hosts is a potential speciation gene.

3.3 Methods

3.3.1 Crosses

We collected N. lecontei and N. pinetum in the field as larvae. N. pinetum was collected on P. strobus in Kentucky, and N. lecontei was collected on P. virginiana at the University of Kentucky Arboretum and in Crossville, TN. The larvae were reared to adulthood in the lab following standard rearing protocols (Harper et al. 2016; Bendall et al. 2017). Wild caught individuals were reared for up to 3 generations before being used in the QTL crosses.

When we started the crosses, it was unknown whether hybrid males were fertile. Given the genetic distance between N. pinetum and N. lecontei, it seemed likely that hybrid males would be sterile, which would make intercrossing impossible. Additionally, white pine preference is partially dominant, but the dominance of the other traits was unknown (Bendall et al. 2017). Therefore, we made backcrosses in both directions. We mated a single N. lecontei female with an N. pinetum male. We placed the mated female in a choice cage with two Pinus banksiana (jack pine) and two Pinus strobus (white pine) seedlings and allowed her to oviposit. The adults (grandparents of the cross) were preserved in 100% EtOH. We reared the resulting offspring according to standard protocol. We mated resulting F1 females with an N. pinetum male to make pinetum backcross females and mated F1 females with an N. lecontei male to make lecontei backcross females. We placed the mated F1 females on a P. banksiana seedling to maximize the success of the cross. F1 females prefer P. strobus when given a choice but have very low hatching success on P. strobus due to a mismatch between oviposition traits and needle traits. F1 females do well on P. banksiana and will readily oviposit on it in a no-choice lab setting. These crosses produce hybrid males and backcross females.

To increase our sample size, we mated a resulting hybrid male to a N. lecontei female. This cross generates lecontei backcross females. Males are haploid and do not undergo meiosis, causing all of the daughters produced from a hybrid male to have identical recombinant chromosome composition. Individual females from these crosses were used, but whole families were not.

3.3.2 Phenotyping

Once backcross females were produced, we placed each unmated female into a choice cage as described above. After a female oviposited or died, we placed her in 100% EtOH. We then recorded whether the female oviposited, the host the female oviposited on, the number of eggs laid, the average number of eggs per needle, and the proportion of egg-bearing needles with preslits. We then placed the egg-bearing seedling in a mesh sleeve (25.4 cm × 50.8 cm) for the remainder of the development period and watered it as needed. To measure hatching success, we counted the number of larvae 48 hrs after hatching. We scored a female as having no hatching if no larvae had emerged by 30 days post oviposition. We calculated hatching success in two ways: the proportion of eggs that hatched and whether or not any eggs hatched. After counting hatching, we removed all egg-bearing needles and preserved them at -80 °C. We imaged 10 randomly selected egg-bearing needles per seedling with a Canon EOS Rebel T3i camera equipped with an Achromat S 1.0X FWD 63 mm lens. Using these images, we measured the space between eggs in ImageJ (Schneider et al. 2012) and, for each female, averaged egg spacing across needles.
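The two hatching-success measures described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study; the `Clutch` record, its field names, and the example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Clutch:
    """One female's clutch (hypothetical record layout)."""
    eggs_laid: int       # number of eggs laid on the seedling
    larvae_counted: int  # larvae counted 48 hrs after hatching


def proportion_hatched(clutch: Clutch) -> float:
    """Continuous measure: fraction of eggs that produced larvae."""
    if clutch.eggs_laid <= 0:
        raise ValueError("proportion is undefined when no eggs were laid")
    return clutch.larvae_counted / clutch.eggs_laid


def any_hatched(clutch: Clutch) -> bool:
    """Binary measure: did at least one egg hatch?"""
    return clutch.larvae_counted > 0


# Made-up counts for illustration only.
example = Clutch(eggs_laid=40, larvae_counted=30)
print(proportion_hatched(example), any_hatched(example))
```

Keeping both measures is useful because the binary score remains informative when clutch sizes are small and the proportion is noisy.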

Additionally, we examined ovipositor morphology. To control for body size, we removed the right hind leg and mounted it in a well slide using Permount. We then removed the abdomen and rehydrated the ovipositor for 24 hrs. After rehydration, we removed the ovipositor and mounted a single lancet (inner saw) using an 80:20 Permount:toluene solution. We photographed each mounted lancet at 5x magnification using a Zeiss DiscoveryV8 stereomicroscope with an Axiocam 105 color camera and ZEN lite 2012 software (Carl Zeiss Microscopy, LLC Thornwood, NY). We measured the length of the ovipositor from the top of the second annulus to the top of the penultimate annulus and the width at the second annulus using ImageJ. We also counted the number of annuli because N. lecontei has one more annulus than N. pinetum (Ross 1955). To examine ovipositor shape, we placed 9 landmarks and 21 sliding landmarks. We then applied a generalized Procrustes alignment by minimizing bending energy in geomorph (Adams and Otárola-Castillo 2013).
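The core idea of Procrustes superimposition (removing position, size, and orientation so that only shape differences remain) can be illustrated with a simplified two-dimensional sketch. This is not the geomorph procedure used here: it aligns a single configuration to a reference (ordinary rather than generalized Procrustes), ignores sliding landmarks and bending energy, and the function names and coordinates are hypothetical. Landmarks are stored as complex numbers (x + yj).

```python
import cmath
import math
from typing import List


def center_and_scale(shape: List[complex]) -> List[complex]:
    """Move the centroid to the origin and scale to unit centroid size."""
    centroid = sum(shape) / len(shape)
    centered = [z - centroid for z in shape]
    size = math.sqrt(sum(abs(z) ** 2 for z in centered))
    return [z / size for z in centered]


def align(reference: List[complex], target: List[complex]) -> List[complex]:
    """Rotate the standardized target onto the standardized reference.

    The least-squares rotation is the unit complex number conj(S)/|S|,
    where S = sum(conj(ref_i) * tgt_i). Reflection is not allowed.
    """
    ref = center_and_scale(reference)
    tgt = center_and_scale(target)
    s = sum(r.conjugate() * t for r, t in zip(ref, tgt))
    rotation = s.conjugate() / abs(s)
    return [rotation * t for t in tgt]


# Made-up landmark configuration and a rotated, enlarged, shifted copy of it.
reference = [0 + 0j, 4 + 0j, 4 + 3j, 0 + 3j, 2 + 5j]
target = [2 * cmath.exp(0.7j) * z + (3 + 1j) for z in reference]
aligned = align(reference, target)
```

Because `target` differs from `reference` only in position, size, and orientation, `aligned` coincides with the standardized reference; any residual differences between real specimens after this step are differences in shape.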

3.3.3 Data analysis of phenotypes and hatching rate

In addition to phenotyping the backcrosses, we phenotyped pure N. lecontei, N. pinetum, and F1 hybrids (see Table S1 for sample sizes). For all phenotypes, we used F1 females that were generated while making the backcrosses. For the number of eggs per needle, the proportion of needles with preslits, and host choice, we used N. pinetum and N. lecontei females that were collected at the same times and from the same populations as the individuals used to make the backcrosses. For ovipositor morphology and egg spacing, we used previously published data for N. pinetum and N. lecontei. To test whether the different cross types differed in phenotype, we used separate ANOVAs for average egg spacing, average eggs per needle, ovipositor length, and ovipositor width. For comparisons that were significant, we performed Tukey post hoc tests. For the presence of an extra annulus and for host choice, we used chi-square tests followed by Fisher's exact tests. For the proportion of needles with a preslit, we used a logistic regression followed by Tukey post hoc tests. For ovipositor morphology, we repeated the generalized Procrustes alignment with all samples and performed a Procrustes ANOVA.

To test whether oviposition traits affect hatching success, we performed a logistic regression on the proportion of eggs that hatched with the average eggs per needle, average egg spacing, the presence of an extra annulus, ovipositor length, ovipositor width, arcsine square-root transformed proportion of needles with preslits, white pine choice, the first four principal components of ovipositor shape, family, and leg length as covariates. We also tested for an interaction between white pine choice and all other oviposition traits.
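The arcsine square-root transformation applied to the proportion of needles with preslits is a standard variance-stabilizing transformation for proportions. A minimal sketch is shown below; the function name is hypothetical, and the result is in radians.

```python
import math


def arcsine_sqrt(proportion: float) -> float:
    """Return asin(sqrt(p)) in radians for a proportion p in [0, 1]."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))


# Proportions of 0, 0.5, and 1 map to 0, pi/4, and pi/2, respectively.
transformed = [arcsine_sqrt(p) for p in (0.0, 0.5, 1.0)]
```

The transformation stretches values near 0 and 1, where the variance of a raw proportion is smallest, so that the transformed covariate behaves more evenly across its range.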

3.3.4 DNA extraction and sequencing

We extracted DNA from heads and thoraxes of the backcross females and grandparents using Qiagen DNeasy Blood & Tissue kits, and quantified the DNA using Quant-iT DNA high sensitivity kits. We used Tn5 tagmentation for library preparation, following Picelli et al. (2014) with a few modifications. After precharging Tn5 with annealed adapters, we used 10 ng (1 ng/7 µl) of DNA with 1 µl of precharged Tn5 and 2 µl TAPS-DMF for the tagmentation reaction. After tagmentation was complete, we inactivated the Tn5 with 2.5 µl of 2.5% SDS. For PCR, we combined 1 µl of the tagmentation reaction, 5 µl OneTaq® 2X Master Mix, 2 µl H2O, 1 µl 10 µM Illumina i7 primer, and 1 µl 10 µM Illumina i5 primer. The primers for each sample are listed in Table S2. The PCR program was as follows: 3 min at 72°C, 30 sec at 94°C, and then 14 cycles of 10 sec at 94°C, 15 sec at 62°C, and 30 sec at 68°C, followed by 5 min at 68°C. The samples were pooled into 5 libraries of backcrosses of up to 96 samples each and 1 library of grandparents, and then cleaned using AMPure XP beads. We added 0.6 volumes of beads and then added 0.2 volumes of beads to the resulting supernatant. We eluted the DNA in 11 µl of 10 mM Tris. After the libraries were quality checked on a Bioanalyzer at the UK HealthCare genomics core, we sent them to Admera Health for sequencing. The libraries were sequenced using 150-bp paired-end sequencing on 2 lanes of an Illumina HiSeq X.

3.3.5 QTL mapping

We used Trimmomatic to remove adapters from demultiplexed reads. We aligned the reads to the N. lecontei genome (Nlec1.1; GenBank assembly accession number GCA_001263575.2) using Bowtie 2 with the very sensitive setting. We used samtools to remove reads that had a Q score below 30 and reads that mapped to more than one location in the genome. We called SNPs using bcftools. For the grandparents, we generated a list of SNPs that were differentially fixed in N. pinetum and N. lecontei to use as markers for the QTL mapping. For the backcrosses, we kept sites that were fixed differences in the grandparents. To impute genotype likelihoods and infer ancestry for all SNPs, we then ran Ancestry HMM (Corbett-Detig and Nielsen 2017) separately for the pinetum and lecontei backcrosses because introgression history differed between the two cross types. Because many markers likely have no recombination between them, we then thinned the markers, retaining only informative markers whose genotype likelihood differed by more than 0.3 from that of the neighboring SNP. Genotype likelihoods were converted to a hard genotype call if one of the genotypes had a probability greater than 0.55.
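The thinning and hard-calling steps can be sketched as follows. This is an illustrative reading of the text rather than the pipeline used in the study: it assumes a single ordered vector of genotype probabilities along one chromosome, compares each marker with the last retained marker, and uses hypothetical function names and genotype labels. Only the 0.3 and 0.55 thresholds come from the text.

```python
from typing import Dict, List, Optional


def thin_markers(probs: List[float], min_diff: float = 0.3) -> List[int]:
    """Return indices of retained markers.

    The first marker is kept; each later marker is kept only if its
    probability differs from the last retained marker by more than min_diff.
    """
    kept: List[int] = []
    for i, p in enumerate(probs):
        if not kept or abs(p - probs[kept[-1]]) > min_diff:
            kept.append(i)
    return kept


def hard_call(genotype_probs: Dict[str, float],
              threshold: float = 0.55) -> Optional[str]:
    """Return the most probable genotype if it exceeds threshold, else None."""
    best = max(genotype_probs, key=genotype_probs.get)
    return best if genotype_probs[best] > threshold else None


# Made-up probabilities of one ancestry state along a chromosome.
kept = thin_markers([0.05, 0.06, 0.10, 0.50, 0.55, 0.95])
call = hard_call({"homozygous": 0.90, "heterozygous": 0.10})
```

Markers that return `None` from `hard_call` would be treated as missing genotypes in downstream mapping.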

We ran the QTL mapping using Haley-Knott regression in R/qtl2 (Broman et al. 2019). To run the mapping on all individuals simultaneously, we used the F2 cross design setting with every chromosome designated as an X chromosome. The backcross direction was designated by the maternal ancestry of the hybrid male (i.e., a hybrid male with an N. lecontei mother was the same as the lecontei backcross direction). In haplodiploids this is genetically identical to the backcross design that was used to create our mapping population. We mapped average egg spacing, average number of eggs per needle, the proportion of needles with preslits, the proportion of eggs that hatched, the proportion of eggs that hatched on white pine, the proportion of eggs that hatched on jack pine, ovipositor length, ovipositor width, and the first four PCA axes for ovipositor shape. The first four axes cumulatively explained 79% of the variation, and every axis after that explained less than 1% of the variation (Figure B1). Using the binomial setting, we also mapped whether or not a female oviposited, whether a female chose white pine, the presence of an extra annulus, and whether any eggs hatched (on white pine, on jack pine, or on either host). For all traits we used family as a phenotypic covariate, and for continuous traits we used leg length as a covariate to control for body size. To determine the number and location of peaks, we used a LOD cut-off score of 4 and visually examined the QTL plots to determine the boundaries of a peak.
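Applying a LOD cut-off can be sketched as below. In this study peak boundaries were determined by visual inspection; the function here is only a hypothetical automated stand-in that reports each contiguous run of markers exceeding the cut-off on one chromosome, together with the position of the highest score in that run. The LOD values are made up.

```python
from typing import List, Tuple


def lod_peaks(lod: List[float],
              threshold: float = 4.0) -> List[Tuple[int, int, int]]:
    """Return (start, end, index_of_max) for each run of scores > threshold."""
    peaks: List[Tuple[int, int, int]] = []
    start = None
    for i, score in enumerate(lod):
        if score > threshold and start is None:
            start = i  # a new run begins
        elif score <= threshold and start is not None:
            best = max(range(start, i), key=lambda j: lod[j])
            peaks.append((start, i - 1, best))
            start = None
    if start is not None:  # run extends to the end of the chromosome
        best = max(range(start, len(lod)), key=lambda j: lod[j])
        peaks.append((start, len(lod) - 1, best))
    return peaks


# Made-up LOD scores at seven ordered markers on one chromosome.
peaks = lod_peaks([1.0, 4.5, 6.2, 3.9, 2.0, 5.1, 4.2])
```

A simple run-based rule like this can split one broad peak into two if the curve dips briefly below the cut-off, which is one reason visual inspection of the QTL plots remains useful.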

3.4 Results

3.4.1 Crosses and sequencing

We had 10 pinetum backcross families that resulted from females of a single F1 family mated to males from a single family (Table B1). This resulted in 185 pinetum backcross females that were phenotyped and sequenced. We had 14 lecontei backcross families that resulted from 4 F1 families. Males from a total of 6 N. lecontei families were mated to F1 females to make the backcrosses. This resulted in 195 lecontei backcross females that were phenotyped and sequenced. For the backcrosses made from F2 males, we had 6 backcross families and sequenced 16 females. Each backcross female had an average read count of approximately 3 million reads.

We sequenced 6 N. lecontei grandparents, all of which were female, and 8 N. pinetum grandparents (5 males and 3 females). Each grandparent had an average read count of approximately 20 million reads. There were 683,677 fixed differences between the N. lecontei and N. pinetum grandparents that mapped to a chromosome. About 80% of the genome is anchored to chromosomes; the rest is on unplaced scaffolds. We retained 505,876 markers after thinning.

3.4.2 Phenotypes and hatch rates

For all phenotypes, the cross types differed, suggesting that these traits have a genetic basis (p<0.01, Figure 3.1, Table B2, Table B3). Some traits, such as the average eggs per needle, ovipositor shape, and ovipositor width, appear to be additive. However, some traits have complete or partial dominance, and the dominance is not always in the same direction. White pine preference is dominant in the N. pinetum direction. The presence of an extra annulus, ovipositor length, presence of a preslit, and average egg spacing are all dominant in the N. lecontei direction. Thus, among the traits that show dominance, the preference trait is dominant in the N. pinetum direction and performance traits are dominant in the N. lecontei direction.

Several traits significantly impacted fitness in the backcrosses (Table B4). There was higher hatching success when females laid on jack pine (Figure 3.2A). There was also higher hatching success when females had longer ovipositors (Figure 3.2B), and PCA 3 for ovipositor shape affected hatching (Figure 3.2C, Figure B1). There was also an interaction between choosing white pine and having a preslit (Figure 3.2D). On jack pine, a greater proportion of females with eggs that hatched cut preslits compared to females that did not have any hatching success. On white pine, a smaller proportion of females with eggs that hatched cut preslits compared to females that did not have any hatching success.


Figure 3.1. Oviposition phenotypes across the different cross types. All phenotypes differed between cross types. A. The proportion of females that oviposited on white pine. B. The average number of eggs per needle. C. The average space between eggs in mm. D. The proportion of egg-bearing needles with preslits. E. Ovipositor length from the top of the second annulus to the top of the penultimate annulus. F. Ovipositor width at the second annulus. G. The proportion of females that have an extra annulus. H. A principal components analysis of ovipositor morphology with warp grids showing how ovipositor shape changes along the PC axis. The color represents the cross type: purple is N. pinetum, dark blue is pinetum backcross, teal is F1, green is lecontei backcross, and yellow is N. lecontei. All cross types significantly differed from each other except for F1 hybrids and lecontei backcrosses. Different letters denote pairwise comparisons that are significantly different in post hoc tests.

Figure 3.2. Oviposition traits that affect hatching. A. On jack pine there is a greater proportion of females that have hatching. B. Females that have eggs that hatched tend to have longer ovipositors. C. Principal component axis 3 affects hatching. D. There is an interaction between preslit presence and host choice. On jack pine, all of the females that had hatching cut preslits. On white pine, a smaller proportion of the females that had hatching cut preslits compared to females that did not have hatching.


3.4.3 QTL mapping

All of the traits had at least one QTL peak, with the exception of white pine choice. For the rest of the traits, the number of peaks ranged from 2 to 12 (Table B5). Morphological and behavioral traits did not systematically vary in the number of peaks. Morphological traits had between 2 and 12 peaks, while behavioral traits had between 2 and 10 peaks.

The maximum LOD score differed among traits. In general, continuous traits had higher maximum LOD scores than binary traits. Additionally, morphological traits had higher maximum LOD scores than behavioral traits. Morphological traits also had larger sample sizes than behavioral traits because, with the exception of oviposition willingness, all behavioral traits could be measured only if the female oviposited (Table B5).

The QTL peaks are spread throughout the genome (Figures 3.3-3.5), but there are regions where multiple traits have peaks. For ovipositor morphology, multiple traits have peaks with high LOD scores on chromosomes 1 and 3. On chromosome 1, ovipositor length, ovipositor width, ovipositor shape PCA 1, PCA 2, and PCA 4 all have large peaks. On the far end of chromosome 3, ovipositor length, the presence of an extra annulus, ovipositor shape PCA 2, and PCA 3 all have large peaks. In the middle of chromosome 3, ovipositor width, ovipositor shape PCA 3, and PCA 4 all have large peaks. Not all of the peaks for ovipositor morphology cluster. For example, the presence of an extra annulus uniquely maps to chromosome 4. Additionally, ovipositor morphology has some regions that overlap with oviposition behavior. For the presence of a preslit, there is a peak on chromosome 1 that is in the same region as the ovipositor morphology peaks. For the proportion of needles with preslits, there is a peak in the middle of chromosome 3.

In addition to oviposition traits, hatching success also had QTL peaks (Figure 3.6). The proportion of eggs that hatched mapped to chromosomes 1 and 7 (Figure 3.6A). There was a peak on chromosome 6 for whether or not there was hatching (Figure 3.6B). These fitness peaks overlap with oviposition traits. The peak for the proportion of eggs that hatched on chromosome 1 overlapped with ovipositor width, ovipositor length, ovipositor shape PCA 1, PCA 2, and PCA 4, the presence of preslits, and the average eggs per needle. The peak for the proportion of eggs that hatched on chromosome 7 overlapped with whether or not a female oviposited and a smaller peak for ovipositor shape PCA 3. The peak on chromosome 6 for whether or not there was hatching overlapped with peaks for the proportion of egg-bearing needles with preslits, the presence of a preslit, and a smaller peak for ovipositor shape PCA 2.

[Figure 3.3 image not reproduced. Panels A-F plot LOD score (y-axis) against chromosomes 1-7 (x-axis). Panel titles: A. Oviposition willingness; B. White Choice; C. Avg. Eggs per Needle; D. Avg. Egg Spacing; E. Presence of Preslit; F. Prop. of Needles w/ Preslit.]

Figure 3.3. QTL mapping of oviposition behavior. A. Whether or not a female oviposited. B. Ovipositing on white pine. C. The average eggs per needle. D. Average spacing between eggs. E. Whether a female cut preslits. F. The proportion of egg-bearing needles with preslits. The blue stars represent peaks that exceed the significance cutoff (LOD score of 4).

[Figure 3.4 image not reproduced. Panels A-C plot LOD score (y-axis) against chromosomes 1-7 (x-axis). Panel titles: A. Ovipositor Length; B. Ovipositor Width; C. Extra annuli.]

Figure 3.4. QTL mapping of ovipositor morphology. A. Ovipositor length from the top of the second annulus to the top of the penultimate annulus. B. Ovipositor width at the second annulus. C. The presence of an extra annulus. The blue stars represent peaks that exceed the significance cutoff (LOD score of 4).

[Figure 3.5 image not reproduced. Panels A-D plot LOD score (y-axis) against chromosomes 1-7 (x-axis). Panel titles: A. PCA 1; B. PCA 2; C. PCA 3; D. PCA 4.]

Figure 3.5. QTL mapping of ovipositor shape. A. Principal component axis 1. B. Principal component axis 2. C. Principal component axis 3. D. Principal component axis 4. The blue stars represent peaks that exceed the significance cutoff (LOD score of 4).

[Figure 3.6 image not reproduced. Panels A-F plot LOD score (y-axis) against chromosomes 1-7 (x-axis). Panel titles: A. Prop. Hatched; B. Any Hatching; C. Prop. Hatched Jack; D. Any Hatching Jack; E. Prop. Hatched White; F. Any Hatching White.]

Figure 3.6. QTL mapping of hatching success. A. The proportion of eggs that hatched across all hosts. B. Whether a female had any hatching. C. The proportion of eggs that hatched on jack pine. D. Whether a female had any hatching on jack pine. E. The proportion of eggs that hatched on white pine. F. Whether a female had any hatching on white pine. The blue stars represent peaks that exceed the significance cutoff (LOD score of 4).

When hatching success on a specific host is mapped, the patterns are somewhat different. The proportion of eggs that hatched on white or jack pine has a high number of peaks (Figure 3.6C, E). On jack pine, the proportion of eggs that hatched mapped to chromosomes 1, 2, 3, and 5. These peaks tend to overlap with peaks for ovipositor morphology as well as the number of eggs per needle. On white pine, the proportion of eggs that hatched had 10 peaks, with the two largest on chromosomes 3 and 4 (Figure 3.6E). The peak on chromosome 3 corresponds to the peaks for ovipositor shape (PCA 3 and PCA 4).

3.5 Discussion

Here we aimed to connect genotype to phenotype to fitness in order to understand the genetic architecture of adaptation and speciation. We were able to connect genotype to phenotype by comparing oviposition traits across cross types and by mapping oviposition traits. We were also able to connect phenotype to fitness by demonstrating that oviposition traits affected hatching success in the backcrosses and that at least one trait, presence of a preslit, affected hatching success on white pine and jack pine differently. Finally, we connected fitness to genotype by mapping hatching success. We found QTL peaks for both the proportion of eggs that hatched and the presence of hatching, both across hosts and for each host separately. Below we discuss some of the insights into the genetic architecture of adaptation and speciation that we gained through this study, as well as the limitations of this study.

The first insight into the genetic architecture of adaptation was that many of the oviposition traits showed some dominance, and that dominance for host preference and host performance traits was in opposite directions. Host preference was dominant in the N. pinetum direction and host performance traits were dominant in the N. lecontei direction. This opposing dominance may be important in the formation of reproductive isolation, especially extrinsic postzygotic isolation. Extrinsic postzygotic isolation results from an ecological mismatch between hybrid phenotypes and the environment the hybrid occupies. Traditionally, it has been conceptualized as hybrids having intermediate phenotypes compared to the parental species and being unable to successfully survive and reproduce in either parental habitat (Nosil et al. 2005). However, in many cases traits under divergent selection are not exactly intermediate in the hybrids (Thompson et al. 2019). Instead, the traits tend to show evidence of dominance and resemble one of the parents. If dominance is in different directions for different traits, this can lead to a mismatch between traits and reduced hybrid fitness (Matsubayashi et al. 2010). Opposing dominance between divergently selected traits challenges the notion that extrinsic postzygotic isolation is due to intermediacy of hybrids falling between parental niches. Alternatively, it may be more common that hybrids possess novel multivariate traits that mostly resemble one of the parents but contain moderately mismatched traits.

Opposing dominance may be an important source of reproductive isolation in phytophagous insects if the opposing dominance is between host preference and performance traits. Reproductive isolation will likely be stronger when there is opposing dominance compared to trait intermediacy. If hybrids have traits that are close to one parent but are in the wrong habitat, then the hybrid is even farther from the phenotypic and fitness optimum than if the hybrid had intermediate traits, resulting in an even greater reduction in hybrid fitness. Opposing dominance that involves host preference also bypasses one of the greatest limitations of extrinsic postzygotic isolation: hybrid intermediacy only results in reduced fitness insofar as there is no intermediate environment (Nosil 2012). If intermediate environments exist, then hybrids can successfully use those instead. However, if there is dominance for host preference, other hosts can be available, but the hybrid will still choose a host to which it is poorly adapted.

Opposing dominance may also affect the genetic architecture of adaptation and speciation. For example, in order for there to be opposing dominance, the different traits have to be controlled by different loci, resulting in a multi-locus genetic architecture (Matsubayashi et al. 2010). Opposing dominance may also decrease the necessity for tight linkage or pleiotropy between divergently selected traits by minimizing the homogenizing force of gene flow in a similar manner as a Bateson-Dobzhansky-Muller incompatibility (Bateson 1909; Dobzhansky 1937; Muller 1942). Despite ongoing gene flow between N. pinetum and N. lecontei, there are a relatively large number of QTL peaks for the different oviposition traits, which was the opposite of what we expected based on theoretical predictions. However, our data are consistent with other empirical systems. Benthic and limnetic threespine stickleback in Paxton Lake exhibit many of the same characteristics as our system: significant sexual isolation, differences in multiple traits that allow them to adapt to their respective environments, the presence of extrinsic postzygotic isolation, and a lack of intrinsic postzygotic isolation (Hatfield and Schluter 1999; Rundle et al. 2000; Rundle 2002; McGee et al. 2013). Hybrids with the greatest reduction in fitness had mismatches in jaw morphology, and multiple unlinked loci controlled jaw morphology (Arnegard et al. 2014). Much of the theory surrounding the genetic architecture of adaptation has focused on the evolution of a single trait with additivity. In reality, adapting to a new environment typically involves multiple traits with strong multifarious selection (Rice and Hostert 1993). Nonadherence to strict additivity and multifarious selection may have facilitated the larger number of QTL. The opposing dominance between host preference and oviposition traits in sawflies is part of a larger pattern in the way that extrinsic postzygotic isolation manifests and may influence the expectations for genetic architecture.

Additionally, there was no evidence for a difference in genetic architecture between ovipositor morphology and oviposition behavior traits. However, our inferences about the genetic architecture of the different traits are limited by our use of a single LOD cutoff for all traits. Different traits have different information content because of differences in sample size and in whether they are continuous or binary. To more accurately determine the number of peaks, simulations need to be conducted for each trait to obtain an individual LOD cutoff for each trait. If our results remain after individualizing the LOD cutoff, there are two possible explanations. The first is that the level of pleiotropy does not differ between morphological and behavioral traits (Hoekstra and Coyne 2007). The second is that the level of pleiotropy differs but that differences in pleiotropy do not translate into differences in effect size. In order to determine the effects of pleiotropy, we need to narrow the QTL to single genes.
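One common way to obtain a trait-specific LOD cutoff is a permutation test, in which phenotypes are shuffled relative to genotypes and the maximum LOD across markers is recorded for each shuffle. The sketch below is a minimal illustration of that idea and not the analysis used in this chapter. It assumes a continuous trait, two genotype classes per marker (coded 0/1, as in a backcross), and simple single-marker regression; the function names `lod_single_marker` and `permutation_lod_threshold` are hypothetical.

```python
import math
import random


def lod_single_marker(genotypes, phenotypes):
    """LOD score for one marker with two genotype classes (coded 0/1).

    Uses the marker-regression form LOD = (n / 2) * log10(RSS_null / RSS_marker)
    for a continuous trait.
    """
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for genotype_class in (0, 1):
        group = [y for g, y in zip(genotypes, phenotypes) if g == genotype_class]
        if group:
            group_mean = sum(group) / len(group)
            rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_null == 0.0:
        return 0.0  # no phenotypic variation, so no evidence for a QTL
    if rss_marker == 0.0:
        return math.inf  # genotype class explains the trait perfectly
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_lod_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide LOD cutoff for one trait, estimated by permuting phenotypes.

    markers: list of per-marker genotype lists (one 0/1 entry per individual).
    Returns the (1 - alpha) quantile of the per-permutation maximum LOD.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        max_lods.append(max(lod_single_marker(m, shuffled) for m in markers))
    max_lods.sort()
    index = max(0, min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1))
    return max_lods[index]
```

Because the cutoff is computed from each trait's own phenotype distribution and sample size, traits measured on fewer females or scored as binary would receive their own thresholds; a binary trait would need a different LOD model than the regression form assumed here.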

Connecting phenotype to fitness was more straightforward. Although not many traits significantly impacted fitness, both behavioral and morphological traits affected fitness. Ovipositor length and ovipositor shape impacted whether or not a female had eggs that hatched. The only trait that appeared to be under divergent selection was the presence of a preslit, due to the interaction between host choice and the presence of a preslit. As expected for adaptive loci, these selected traits mapped to the same regions as hatching (fitness). For example, all traits that affected fitness (ovipositor length, PCA 3 of ovipositor shape, and presence of a preslit) mapped to the same area on chromosome 1 as hatching success on both white and jack pine. These overlapping regions represent putative adaptive and speciation genes. The QTL are currently quite broad and need to be significantly narrowed to determine if fitness peaks truly overlap with peaks for selected traits.

Overall, we were able to successfully connect genotype to phenotype to fitness on a very broad scale and to begin answering questions about the typical genetic architecture of adaptation. Further work needs to be done to achieve a better resolution of genetic architecture, including fine mapping, calculating effect sizes, and performing simulations to inform LOD cutoff scores. Studies identifying adaptive alleles and speciation genes are difficult because they require a deep understanding of natural history, species pairs that have not completed speciation, crossing in the laboratory, and genetic resources, but they have the potential to give great insight into the process of adaptation and speciation.

CHAPTER 4. LACK OF INTRINSIC POSTZYGOTIC ISOLATION IN HAPLODIPLOID MALE HYBRIDS DESPITE HIGH GENETIC DISTANCE

This chapter has been previously published as:

Bendall, E. E., K. M. Mattingly, A. J. Moehring, and C. R. Linnen. 2020. Lack of intrinsic postzygotic isolation in haplodiploid male hybrids despite high genetic distance. bioRxiv, doi: 10.1101/2020.01.08.898957

4.1 Abstract

Evolutionary biologists have long been interested in understanding the mechanisms underlying Haldane’s rule. The explanatory theories of dominance and faster-X, which are based on recessive alleles being expressed in the heterogametic sex, have been proposed as common mechanisms. These mechanisms predict that greater hemizygosity leads to both faster evolution and greater expression of intrinsic postzygotic isolation. Under these mechanisms, haplodiploids should evolve and express intrinsic postzygotic isolation faster than diploids because the entire genome is analogous to a sex chromosome. Here, we measure sterility and inviability in hybrids between Neodiprion pinetum and N. lecontei, a pair of haplodiploids that differ morphologically, behaviorally, and genetically. We compare the observed isolation to that expected from published estimates of isolation in diploids at comparable levels of genetic divergence. We find that both male and female hybrids are viable and fertile, which is less isolation than expected. We then discuss several potential explanations for this surprising lack of isolation, including alternative mechanisms for Haldane’s rule and a frequently overlooked quirk of haplodiploid genetics that may slow the emergence of complete intrinsic postzygotic isolation in hybrid males. Finally, we describe how haplodiploids, an underutilized resource, can be used to differentiate between mechanisms of Haldane’s rule.

4.2 Introduction

Barriers to gene flow enable species to diverge along independent evolutionary trajectories. For this reason, the evolution of reproductive isolation is a central focus of speciation research. Although there are many different types of reproductive barriers (Dobzhansky 1951; Coyne and Orr 2004), the most impermeable and permanent of these is intrinsic postzygotic isolation (IPI), which is the inability to produce viable, fertile hybrids. At a genetic level, hybrid inviability and sterility are often caused by the accumulation of Bateson-Dobzhansky-Muller incompatibilities (BDMIs) in diverging populations (Bateson 1909; Dobzhansky 1937; Muller 1942). While neutral or beneficial in the parental genomes, negative epistasis among BDMIs in hybrid genomes results in IPI.

In early stages of speciation, sterility or inviability is often restricted to one sex of the hybrid offspring (Coyne and Orr 1989, 1997). When this occurs, it is almost always the heterogametic sex (XY, ZW) that is sterile or inviable, a pattern known as Haldane’s rule (Haldane 1922; Schilthuizen et al. 2011). To date, multiple non-mutually exclusive mechanisms have been proposed to explain Haldane’s rule. Two explanations that have gained considerable empirical support are dominance theory and faster-X theory (Schilthuizen et al. 2011; Delph and Demuth 2016). Both of these assume that BDMIs are, on average, at least partially recessive in the hybrids.

First, under dominance theory, heterogametic hybrid malfunction is explained by BDMIs involved in autosomal-sex chromosome interactions (Turelli and Orr 1995). Whereas hybrids of the homogametic sex will express only those X (or Z)-linked BDMIs that are at least partially dominant, hybrids of the heterogametic sex will express all X (or Z)-linked BDMIs, regardless of dominance. There is empirical evidence for the dominance theory, particularly for inviability loci. Most of these loci have been identified in Drosophila (Heikkinen and Lumme 1998; Coyne et al. 2004; Masly and Presgraves 2007), but also in many other plant and animal taxa (Salazar et al. 2005; Carling and Brumfield 2008; Brothers and Delph 2010; Demuth et al. 2013).

The faster-X explanation for Haldane’s rule stems from the observation that the X (or Z) chromosome often has a disproportionate impact on hybrid fitness compared to autosomes, a pattern known as the large X-effect (Charlesworth et al. 1986). One explanation for the large X-effect is that new beneficial mutations that are partially recessive will have a faster substitution rate on the X chromosome compared to the autosomes (Charlesworth et al. 1986). This is because on the X chromosome, new recessive alleles are immediately visible to selection in heterogametic individuals. This faster accumulation of substitutions on the X provides more opportunities for BDMIs to arise. Faster-X evolution can lead to Haldane’s rule either via exacerbating the effect of dominance described above or via the fixation of alleles that act in the heterogametic sex only (Coyne and Orr 2004).

A shared feature of dominance and faster-X theories is that the expression of recessive alleles on sex chromosomes in the heterogametic sex results in stronger postzygotic isolation compared to the homogametic sex. All else being equal, both mechanisms predict that the rate of evolution of IPI should correlate positively with the extent of hemizygosity. In support of this prediction, Drosophila species that have a larger proportion of their genome on the X chromosome evolve IPI more rapidly than species with smaller X chromosomes (Turelli and Begun 1997). Additionally, taxa with heteromorphic sex chromosomes evolve IPI at lower levels of genetic divergence than taxa with homomorphic or no sex chromosomes (Lima 2014).

Although Haldane’s rule has predominantly been studied in diploid taxa with sex chromosomes, it is also applicable to haplodiploids (Haldane 1922; Koevoets and Beukeboom 2009). In haplodiploids, males develop from unfertilized eggs and are haploid, and females develop from fertilized eggs and are diploid (Normark 2003). Thus, in haplodiploid systems the entire genome is analogous to a sex chromosome. Because hemizygosity is maximized in haplodiploids, dominance and faster-X theory predict that evolution of IPI should be maximized in haplodiploid taxa (Koevoets and Beukeboom 2009).

Although there is some empirical evidence of Haldane’s rule in haplodiploids (Koevoets et al. 2012), there are currently no direct comparisons between the rate of IPI evolution in diploids and haplodiploids. Here, we take advantage of a recent study that surveyed the literature and used linear regression to estimate the relationship between genetic divergence and the strength of IPI for diploid taxa with heteromorphic sex chromosomes (Lima 2014). Using this regression line, we asked whether the observed level of IPI in a haplodiploid species pair exceeds the expected IPI for diploid taxa at a comparable level of genetic divergence, as predicted under both dominance and faster-X theories.

To estimate IPI in a haplodiploid species pair, we focused on a pair of sister species in the pine-sawfly genus Neodiprion: N. pinetum and N. lecontei (Order: Hymenoptera; Family: Diprionidae) (Linnen and Farrell 2008a). These species have substantial extrinsic postzygotic isolation stemming from oviposition traits (Bendall et al. 2017). Specifically, whereas N. pinetum females embed their eggs within the needles of a thin-needled pine species (Pinus strobus), N. lecontei females deposit their eggs in thicker, more resinous needles in other pine species. While females of each species have oviposition traits well-suited to their respective hosts, hybrid females have maladaptive combinations of oviposition traits that lead to hatching failure: they prefer the thin-needled host, but have traits better suited to thicker, more resinous needles. More generally, this species pair has many morphological and behavioral differences and many fixed genetic differences (genome-wide FST = 0.6, unpublished data). Overall, given the substantial genetic and phenotypic divergence between this species pair and the complete hemizygosity of haploid males, we expected hybrid males to be sterile or inviable. Shockingly, we found no evidence of IPI. In the discussion, we consider possible explanations for this surprising result, including a frequently overlooked quirk of haplodiploid genetics that may drastically slow the emergence of complete IPI in hybrid haploid males.

4.3 Methods

4.3.1 Study System Details and Overall Approach

The N. pinetum and N. lecontei lab lines that were used in this study were derived from larvae collected in the field (Table C1) and propagated in the lab for 1-4 generations following standard lab protocols (Harper et al. 2016; Bendall et al. 2017). We evaluated hybrid female and hybrid male viability and fertility relative to their purebred counterparts according to the crossing scheme illustrated in Figure C1. As is the case in most hymenopterans, unfertilized Neodiprion eggs give rise to haploid males, while fertilized eggs give rise to diploid females. Thus, interspecific crosses create hybrid females (“F1”) and pure-species males. To obtain hybrid males (“F2”), we allowed hybrid females to reproduce. For these experiments we used a combination of mated females (produce both males and females if fertilization is successful) and virgin females (produce all-male colonies). Because females of both species lay their entire egg complement on a single branch terminus and colonies are gregarious throughout development (Coppel and Benjamin 1965), all viability measures were colony-level measurements. Additionally, for our viability estimates, we only used colonies that produced live adults. This enabled us to rule out non-IPI related sources of colony failure, such as lab , diapause, or extrinsic postzygotic isolation (Coppel and Benjamin 1965; Bendall et al. 2017). Fertility measures were based on the reproductive success of individual adults.

With these data, we evaluated presence/absence of hybrid inviability in each sex and in both directions of the cross. We also evaluated presence/absence of hybrid sterility in both directions of the cross for females and in one direction of the cross for males (due to sample limitations). If we observed any evidence of viability or fertility for a particular cross/sex combination, we considered that combination to lack IPI. These qualitative measures of IPI were comparable to published IPI measures from diploid taxa (Coyne and Orr 1989; Lima 2014).

4.3.2 IPI in females

To evaluate hybrid female viability, we crossed N. lecontei females with N. pinetum males (LxP) and vice versa (PxL). We then released each mated female into a mesh cage with two P. strobus and two P. banksiana seedlings (preferred hosts for N. pinetum and N. lecontei, respectively). When the female oviposited, we reared the resulting colonies on the host chosen for oviposition. To evaluate hybrid female viability, we calculated the proportion of colonies that had adult females emerge from crosses that had any adult emergence. We did this for both directions of the hybrid cross (PxL N = 15, LxP N = 23), as well as purebred N. pinetum (N = 15) and N. lecontei (N = 20). To determine if female viability differed among cross types, we performed a logistic regression followed by a Tukey’s HSD post-hoc test.
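The colony-level viability measure described above is a simple proportion. As a minimal sketch, the hypothetical helper below computes that proportion together with a binomial standard error; whether the standard errors in our figures were computed in exactly this way is an assumption, and the counts in the usage note are illustrative rather than observed data.

```python
import math


def colony_viability(colonies_with_females, colonies_with_adults):
    """Proportion of colonies producing adult females, with a binomial standard error.

    colonies_with_adults counts only colonies that had any adult emergence,
    which is the denominator used for the viability measure.
    """
    if colonies_with_adults <= 0:
        raise ValueError("need at least one colony with adult emergence")
    if not 0 <= colonies_with_females <= colonies_with_adults:
        raise ValueError("colonies_with_females must lie between 0 and the total")
    proportion = colonies_with_females / colonies_with_adults
    standard_error = math.sqrt(proportion * (1 - proportion) / colonies_with_adults)
    return proportion, standard_error
```

For example, `colony_viability(12, 15)` (hypothetical counts) gives a proportion of 0.8 for a cross type with 15 colonies that produced adults.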

To evaluate hybrid female sterility, we recorded oviposition success (i.e., whether or not a female laid eggs) and, if the female oviposited, the number of eggs laid for four cross types: F1(PxL) hybrid female mated to a N. pinetum male (N = 41), F1(LxP) hybrid female mated to a N. lecontei male (N = 32), N. lecontei female mated to a N. lecontei male (N = 124), and N. pinetum female mated to a N. pinetum male (N = 108). All females were placed into choice cages as described above. To remove possible effects of mating, we also evaluated oviposition success and egg number for three types of virgin females (we did not have F1(PxL) available for this experiment): F1(LxP) females (N = 35), N. lecontei females (N = 58), and N. pinetum females (N = 86). We performed a logistic regression to test if oviposition willingness differed among female types, and an ANOVA to test if egg number differed among female types. For both analyses we used Tukey’s post-hoc tests. We performed separate analyses for virgin and mated females. All statistical analyses were performed in R (3.6.0).

4.3.3 IPI in males

To evaluate hybrid male viability, we placed mated females of different types into oviposition cages (a combination of “choice” and “no-choice” cages were used) and reared the resulting offspring to adulthood. We estimated male viability for each cross type as the proportion of colonies that had adult male emergence. To generate F2(LxP) hybrid males, we crossed F1(LxP) hybrid females with either N. lecontei or N. pinetum males (N = 32). To generate F2(PxL) hybrid males, we backcrossed F1(PxL) hybrid females to N. pinetum males (N = 9). These crosses result in backcross females and F2 males. For comparison, we also examined male emergence in pure N. pinetum (N = 34) and N. lecontei (N = 18) crosses. We performed a logistic regression and a Tukey’s HSD post-hoc test to determine if hybrids had lower rates of male emergence compared to the pure species.

We evaluated hybrid male sterility in one direction of the cross (LxP; due to availability of males). First, we examined sperm motility in N. pinetum (N = 20), N. lecontei (N = 47), and F2(LxP) males (N = 39). Upon eclosion from cocoons, adult males were stored at 4 °C until use to prolong life. In some cases, males were used in mating assays prior to testes dissection, then returned to 4 °C for a minimum of 24 hours until further use. Males were warmed to room temperature for a minimum of one hour prior to dissection. From each male, we removed both testes and placed each testis on a siliconized slide in 50 µl of testes buffer (183 mM KCl, 47 mM NaCl, 10 mM Tris-HCl, pH 6.8). After piercing a testis, we imaged the sperm at 40x with a Nikon E800 DIC. Neodiprion males have sperm that form bundles. We recorded sperm motility for each male (both testes combined) as no motility (no moving bundles), low motility (0-35% moving bundles), or normal motility (>35% moving bundles). Because mating status did not impact motility (Chisq = 2.66, P = 0.103), we combined data from unmated and mated males. To determine whether hybrid males had reduced sperm motility, we performed a Kruskal-Wallis test, followed by Tukey’s post-hoc tests.
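The three motility categories can be written as a small scoring rule. The sketch below is illustrative only: it assumes motility is summarized as a count of moving bundles out of all bundles observed for a male, that a male with no moving bundles is scored as "none" rather than "low", and that exactly 35% falls in the "low" category. The function name `classify_motility` and the ordinal codes are hypothetical.

```python
# Ordinal codes for rank-based tests; the numeric values are an assumption.
MOTILITY_RANK = {"none": 0, "low": 1, "normal": 2}


def classify_motility(moving_bundles, total_bundles):
    """Assign a sperm motility category from counts of sperm bundles.

    none:   no moving bundles
    low:    more than 0% and up to 35% of bundles moving
    normal: more than 35% of bundles moving
    """
    if total_bundles <= 0:
        raise ValueError("need at least one observed bundle")
    if not 0 <= moving_bundles <= total_bundles:
        raise ValueError("moving_bundles must lie between 0 and total_bundles")
    if moving_bundles == 0:
        return "none"
    percent_moving = 100 * moving_bundles / total_bundles
    return "low" if percent_moving <= 35 else "normal"
```

Mapping each male's category through `MOTILITY_RANK` yields an ordinal score suitable for a rank-based comparison among male types.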

To test whether hybrid males could mate successfully, we used no-choice mating assays. We placed a single N. lecontei female in a clear 3.25-oz container with either a N. lecontei (N = 36) or F2(LxP) hybrid male (N = 37) (N. pinetum males and females were not available). We observed each pair for 2 hours and recorded whether they mated during that time. To test if mating success differed between N. lecontei and hybrid males, we performed a logistic regression. Mating alone does not indicate that hybrid males produce viable sperm. To evaluate hybrid male fertility, we placed each mated female in a cage with a P. banksiana seedling and reared resulting colonies as described above. For all colonies with an F2 father that produced adults, we evaluated whether there was successful fertilization by recording the proportion that produced adult females (diploid females indicate successful fertilization).

4.3.4 Comparing observed IPI in haplodiploids to predicted IPI in heteromorphic diploid taxa

Lima (2014) conducted a meta-analysis of published IPI estimates for taxa with heteromorphic, homomorphic, and no sex chromosomes. Using logistic regression, he calculated the expected level of IPI (with a 95% confidence interval) for a given genetic distance (Nei’s D) for these three categories. If haplodiploidy is analogous to extreme sex chromosome heteromorphy, we predict that sawflies will have higher levels of IPI than diploids with heteromorphic sex chromosomes at the same genetic distance. To test this prediction, we calculated Nei’s D for N. pinetum and N. lecontei and compared the observed level of IPI for this species pair to expectations derived from the relationship between Nei’s D and IPI in diploid taxa with heteromorphic sex chromosomes (Lima 2014).

To calculate Nei’s D, we used adegenet (Nei 1978; Jombart 2008) with SNP data derived from ddRAD sequencing of 44 N. lecontei and 23 N. pinetum individuals (data from Bendall et al. in prep, ch 5). The individuals in the genetic dataset were from the same populations that established the lab lines we used to measure IPI. To calculate overall IPI between this species pair, we used the scale from Coyne and Orr (1989), which ranges from 0 (no IPI) to 1 (complete IPI). In brief, each sex that is either completely inviable or infertile in each direction of the cross adds 0.25 to the IPI score. We also calculated sex-specific IPI as in Lima (2014). IPI for each sex could take on three possible values: 0, if the sex was viable and fertile in both directions of the cross; 0.5, if the sex was inviable or infertile in one direction only; and 1, if the sex was inviable or infertile in both directions. With the estimates of IPI and genetic distance, we asked whether our observed IPI fell outside of Lima’s (2014) 95% confidence interval for IPI in heteromorphic taxa for our observed genetic distance. We compared observed to expected IPI for both overall and sex-specific measures of IPI.
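For readers unfamiliar with Nei’s D, the sketch below shows the basic calculation for biallelic SNPs from per-population allele frequencies. It is a simplified illustration, not the adegenet implementation: it uses the standard form D = -ln(J_xy / sqrt(J_x J_y)) without any sample-size correction, and the function name `nei_standard_distance` and its inputs are hypothetical.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance between two populations for biallelic loci.

    freqs_x, freqs_y: frequency of one (arbitrary but consistent) allele at each
    SNP in populations X and Y. No sample-size correction is applied.
    """
    if not freqs_x or len(freqs_x) != len(freqs_y):
        raise ValueError("need the same, non-zero number of loci in both populations")
    # Summed probability of identity within each population and between them.
    j_x = sum(p * p + (1 - p) * (1 - p) for p in freqs_x)
    j_y = sum(p * p + (1 - p) * (1 - p) for p in freqs_y)
    j_xy = sum(px * py + (1 - px) * (1 - py) for px, py in zip(freqs_x, freqs_y))
    if j_xy == 0.0:
        return math.inf  # every locus is fixed for different alleles
    identity = j_xy / math.sqrt(j_x * j_y)
    return -math.log(identity)
```

With identical allele frequencies in both populations the distance is 0, and it grows as more loci approach fixation for different alleles.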

4.4 Results

4.4.1 IPI in females

Interspecific crosses produce just as many colonies with viable female adults as intraspecific crosses (Figure 4.1A; Chisq = 2.20, P = 0.53). These data indicate that hybrid females are viable in both directions of the cross. Hybrid females are also fertile in both directions of the cross. Whether mated or virgin, hybrid females are no less willing to oviposit than non-hybrid females (Figure 4.1B, Figure C2A, Table C2, Table C3). When mated hybrid females oviposited, they also laid just as many eggs as N. lecontei and more eggs than N. pinetum (Figure 4.1C, Figure C2B, Table C2, Table C3). Egg number was similar for all virgin females (P = 0.053, Table C2). Overall, hybrid females are viable and fertile in both directions of the cross.

Figure 4.1. Viability and fertility for hybrid females and hybrid males. A. The proportion of colonies with adult females out of all colonies with adult emergence. B. The proportion of females that oviposited. C. The average number of eggs laid for pure and hybrid females. D. The proportion of colonies with adult males out of all colonies with adult emergence. E. Proportion of males that had normal sperm motility. F. Compared to N. lecontei males, hybrid males mate less frequently with N. lecontei females. Error bars represent standard error. Different letters denote pairwise comparisons that are significantly different in post-hoc tests; lack of letters indicates that there were no significant differences.

4.4.2 IPI in males

Hybrid males are viable in both directions of the cross. Although the proportion of colonies that produced adult males varied among cross types (Figure 4.1D, Chisq = 14.5, P = 0.002), hybrid males did not have reduced male emergence compared to the pure species (Table C4).

Compared to N. lecontei, hybrid F2(LxP) males did not have reduced sperm motility (Figure 4.1E; Chisq = 1.03, P = 0.60). However, N. lecontei females were less willing to mate with hybrid males than they were with N. lecontei males (Figure 4.1F; Chisq = 3.93, P = 0.045). This constitutes a form of extrinsic postzygotic isolation (behavioral isolation) in at least one direction of the cross. Nevertheless, hybrid males did mate successfully with some N. lecontei females. Of the 10 hybrid-male-fathered colonies that produced adults, 70% produced adult females, indicating that hybrid males are fertile. Overall, hybrid males are viable in both directions of the cross and fertile in one direction (N. lecontei female x N. pinetum male). The fertility of the reciprocal cross is unknown.

4.4.3 Genetic distance

Using 21,590 SNPs genotyped in sympatric N. lecontei and N. pinetum populations, our Nei’s D estimate was 0.36. If we assume that the untested hybrid male type (which differs from the tested hybrid male type only in the mitochondrial genome) was fertile, N. pinetum and N. lecontei have an IPI score of 0. For a Nei’s D of 0.36, this IPI score is outside of the 95% confidence interval for expected IPI in heteromorphic taxa (Figure 4.2). However, IPI deviated in the opposite direction of what we predicted: N. pinetum and N. lecontei have lower IPI than expected given their genetic distance. The individual sexes also had IPI scores of 0. While the male-specific IPI was lower than the 95% confidence interval, the female-specific IPI fell within the 95% confidence interval (which included 0).
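For readers unfamiliar with the distance measure, the following is a minimal sketch of Nei's standard genetic distance for biallelic SNPs, D = -ln(J_xy / sqrt(J_x J_y)). The function name and the toy frequencies are illustrative assumptions; the sketch omits sample-size corrections and is not necessarily the exact procedure used to obtain the estimate of 0.36.

```python
import math


def nei_standard_distance(freqs_x, freqs_y):
    """Nei's standard genetic distance for biallelic loci.

    freqs_x, freqs_y: per-locus frequency of one allele in populations X and Y.
    Assumption: no correction for finite sample size is applied.
    """
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("need two equal-length, non-empty frequency lists")
    n = len(freqs_x)
    # Average within-population and between-population gene identities.
    j_x = sum(p * p + (1 - p) ** 2 for p in freqs_x) / n
    j_y = sum(q * q + (1 - q) ** 2 for q in freqs_y) / n
    j_xy = sum(p * q + (1 - p) * (1 - q) for p, q in zip(freqs_x, freqs_y)) / n
    if j_xy == 0:
        return math.inf  # fixed differences at every locus
    return -math.log(j_xy / math.sqrt(j_x * j_y))


# Toy (made-up) allele frequencies for three SNPs in two populations.
example_d = nei_standard_distance([0.9, 0.1, 0.5], [0.2, 0.8, 0.5])
```

Identical frequency vectors give a distance of zero, and the distance grows as allele frequencies diverge.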

If we assume instead that the untested hybrid male type was infertile, this would give an overall IPI of 0.25, a male-specific IPI of 0.5, and a female-specific IPI of 0. These IPI estimates, which are the maximum possible IPI for this species pair, did fall within the 95% confidence interval of the heteromorphic taxa. However, these adjusted IPI scores still fell below the heteromorphic regression lines.
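The IPI values quoted above are consistent with scoring IPI as the fraction of hybrid classes (sex by cross direction) that are inviable or sterile. The sketch below works through that arithmetic under this assumption; the cross labels "LxP" and "PxL" and the function names are hypothetical and used only for illustration.

```python
def ipi_score(isolated):
    """Fraction of hybrid classes scored as inviable or sterile.

    isolated: dict mapping (cross_direction, sex) -> True if that class is
    inviable or sterile. Assumption: all classes are weighted equally.
    """
    if not isolated:
        raise ValueError("no hybrid classes supplied")
    return sum(1 for flag in isolated.values() if flag) / len(isolated)


def ipi_for_sex(isolated, sex):
    """IPI restricted to one sex (both cross directions)."""
    return ipi_score({k: v for k, v in isolated.items() if k[1] == sex})


# Maximum-isolation scenario from the text: only the untested hybrid male
# type (labeled "PxL" here as an assumption) is treated as infertile.
max_case = {
    ("LxP", "female"): False,
    ("LxP", "male"): False,
    ("PxL", "female"): False,
    ("PxL", "male"): True,
}

overall = ipi_score(max_case)             # one of four classes isolated
males = ipi_for_sex(max_case, "male")     # one of two male classes isolated
females = ipi_for_sex(max_case, "female")
```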

Figure 4.2. N. pinetum and N. lecontei have lower isolation than taxa with heteromorphic sex chromosomes at the same genetic distance. The thick line is the expected amount of IPI for taxa with heteromorphic sex chromosomes at Nei’s D of 0.36 (from Lima 2014). The gray shaded area represents the 95% confidence interval. The circles represent the observed isolation between N. pinetum and N. lecontei. The blue circle is the maximum potential isolation and the pink circle is the observed isolation. Isolation for both sexes combined, females only, and males only are shown.

4.5 Discussion

Haplodiploids should have higher levels of IPI than diploids for a given genetic distance because all recessive mutations are expressed in the haploid males. We found that N. lecontei and N. pinetum hybrids are fertile and viable for both sexes, making the IPI score lower than expected given the genetic distance between N. pinetum and N. lecontei. Here, we consider several possible reasons for the lack of intrinsic isolation in hybrid haploid males.

First, not all genetic mechanisms that have been hypothesized to explain Haldane’s rule rely on the hemizygous nature of sex chromosomes. Although there is significant support for dominance theory and faster-X, mechanisms that rely on chromosomal segregation or the sex-determining properties of the X chromosome may also contribute to Haldane’s rule. These mechanisms will not cause IPI to evolve more rapidly in haplodiploids, and may even slow its emergence. For example, under meiotic drive, the sex ratio becomes distorted from 50:50 when drive elements evolve on the X chromosome (McDermott and Noor 2010). Strong negative selection against distorted sex ratios favors suppressors on other chromosomes (autosomal or Y) that restore a balanced sex ratio. This cycle of antagonistic coevolution of drivers and suppressors results in rapid evolution of the X chromosome and increased divergence between species. With increased divergence comes an increased number of incompatibilities. In several Drosophila groups, meiotic drive has been implicated in hybrid sterility (Hauschteck-Jungen 1990; Tao et al. 2001; Orr and Irving 2005).

In theory, meiotic drive could also produce Haldane’s rule in many haplodiploids. The genetic mechanism underlying haplodiploidy in Neodiprion sawflies and many other Hymenoptera is complementary sex determination, in which sex is determined by heterozygosity at one or more sex-determining loci (Cook 1993; Harper et al. 2016). If an individual is hemizygous (haploid) or homozygous (diploid) at all sex-determining loci, then it is male. Individuals that are heterozygous (diploid) at one or more sex-determining loci are female. Meiotic drive elements can be linked to these sex-determining loci. As drive causes the frequency of an allele at the sex-determining locus to increase, the number of homozygous individuals increases. Linked meiotic drive elements would create an unbalanced sex ratio that increases the proportion of diploid males in the population. Diploid males tend to be inviable or sterile, and diploid male production should be strongly selected against (Van Wilgenburg et al. 2006). However, sex-determining loci and linked sites make up a small proportion of the genome. Thus, if meiotic drive is an important source of IPI, haplodiploids should not have faster evolution of IPI, and may evolve IPI more slowly depending on the proportion of the genome that is sex linked.

Although dominance theory has wide support for causing male inviability, there is a lack of support when it comes to sterility (Presgraves 2010b). An alternative mechanism to explain the evolution of sterility in the heterogametic sex is incorrect pairing of sex chromosomes during meiosis. Unlike homomorphic chromosomes, which pair by overall homology, heteromorphic sex chromosomes usually match by small stretches of shared sequence, which are often rapidly evolving repeat sequences. These repeat sequences can differ between species, causing hybrid X and Y (or Z and W) chromosomes to be unable to pair or separate properly during meiosis, making the hybrid sterile. Since only heteromorphic sex chromosomes require this form of meiotic pairing, this mechanism could explain Haldane’s rule. Chromosome separation failures during meiosis, leading to sterility in hybrids, have recently been reported for mice (Schwahn et al. 2018), mosquitoes (Lang and Sharakhov 2019), yeast (Rogers et al. 2018), and Drosophila (Kanippayoor et al. 2020). Because haplodiploids lack sex chromosomes and males do not undergo meiosis to produce gametes, improper pairing of sex chromosomes cannot cause hybrid male sterility in haplodiploids, and hybrid males are therefore fertile.

Even if there is a common underlying genetic mechanism that gives rise to Haldane’s rule across taxa in nature that should give rise to IPI in our system, there could be system-specific reasons why IPI was not observed. For example, the lack of IPI may be due to the divergence history of the species pair we examined, which has ongoing gene flow. If there is sufficient gene flow, deleterious genetic combinations are quickly produced and purged, preventing the evolution of hybrid inviability and sterility (Agrawal et al. 2011). In haplodiploids these deleterious alleles should be purged more quickly (Avery 1984). Whether there is sufficient gene flow in this system to cause selection against IPI remains to be tested.

Alternatively, the low levels of male IPI may be a consequence of haplodiploidy itself. Haplodiploidy may influence the evolution of IPI in more complex ways that depend on the subtleties of the genetic basis of the BDMIs. For example, BDMIs may be formed when there is a mildly deleterious mutation in one species followed by a compensatory mutation in the same species (Kondrashov et al. 2002), as observed in some mitonuclear interactions (Barreto and Burton 2012). When the deleterious mutation is placed into the other genetic background without the compensatory mutation, the hybrid suffers low fitness. Since most mildly deleterious segregating variants in a population are recessive (Simmons and Crow 1977), and all recessive variants are expressed in haplodiploid males, selection will be more efficient at removing these variants from the population, resulting in fewer mildly deleterious mutations in haplodiploids (Avery 1984). If BDMIs formed through compensatory mutations are common, haplodiploids would be expected to evolve IPI more slowly than diploids. This reduced IPI in haplodiploids is only applicable if dominance theory underlies sterility, since faster-X theory specifically deals with positively selected recessive alleles. To rigorously test this idea, the specific mutations involved in BDMIs and their fitness effects in the original population must be known.

Finally, the haplodiploid inheritance mechanism may account for the absence of complete inviability and sterility in hybrid males. To illustrate why, we propose a verbal model describing the effects of haplodiploid inheritance. We consider a simple two-locus BDMI in which there is one locus on the autosome and a second locus on the X chromosome that interact to cause the incompatibility (Figure 4.3A). In one species, a derived co-dominant autosomal mutation fixes, and in the other species a recessive mutation fixes on the X chromosome. When these species hybridize, hybrids are sterile or inviable when one copy of the derived autosomal allele and only the derived X allele is present in the hybrid, as would be observed in the heterogametic sex. Only one direction of the cross will experience IPI under this simple model.

For the direction of the cross with IPI, all diploid hybrid males have one set of autosomal chromosomes from each species and the X from the maternal species, causing all males to have the incompatibility (Figure 4.3B). The females will be viable and fertile, since the incompatible X locus is heterozygous. In haplodiploids, the incompatibilities would be on two different autosomes since they do not have sex chromosomes (Figure 4.3C). An F1 female has the same genotype as the diploid female and is viable and fertile. However, there is no true F1 hybrid male. Instead, the first generation of hybrid males (F2) are the offspring of hybrid females, allowing recombination to occur before hybrid males are formed. Unlike diploids, not all males will have all incompatibilities. Instead, only 25% of the males will have the derived allele at both loci and will be inviable or infertile in this two-locus model. Although this model is more simplistic than many BDMIs in nature, it shows how recombination in F1 females allows viable allelic combinations to be formed in hybrid males. The exact effect of haplodiploid inheritance on the evolution of IPI will depend on the genetic architecture of the BDMIs (e.g. effect size, dominance, genomic locations), and further modeling is necessary.
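The 25% figure in the verbal model can be checked by enumerating genotypes. The sketch below assumes the two loci are unlinked (free recombination in the F1 female), that the derived allele a acts with at least one copy, and that the derived allele b acts only when no ancestral B is present; the function names are hypothetical.

```python
from itertools import product


def is_incompatible(locus1, locus2):
    """True if a genotype carries the two-locus incompatibility.

    locus1, locus2: iterables of alleles at each locus (one allele if haploid
    or hemizygous, two if diploid). Derived 'a' is codominant, derived 'b'
    is recessive, as assumed in the verbal model.
    """
    return "a" in locus1 and "B" not in locus2


def haplodiploid_f2_male_fraction():
    """Fraction of haploid F2 males with the incompatibility.

    The F1 mother is A/a at locus 1 and B/b at locus 2. With unlinked loci
    (an assumption), her four haploid egg types are equally frequent.
    """
    eggs = list(product("Aa", "Bb"))
    return sum(is_incompatible(l1, l2) for l1, l2 in eggs) / len(eggs)


def diploid_f1_male_fraction():
    """Fraction of diploid F1 males with the incompatibility.

    In the affected cross direction every son is A/a on the autosome and
    hemizygous b on the maternal X, so there is a single genotype.
    """
    sons = [(("A", "a"), ("b",))]
    return sum(is_incompatible(l1, l2) for l1, l2 in sons) / len(sons)


haplodiploid_fraction = haplodiploid_f2_male_fraction()
diploid_fraction = diploid_f1_male_fraction()
```

Tighter linkage between the two loci would change the egg frequencies and therefore the fraction of affected males, which is one reason the text calls for further modeling.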


[Figure 4.3: genotype diagrams for panels A, B, and C, with hybrid offspring labeled Viable or Inviable.]

Figure 4.3. A two-locus model of IPI in haplodiploids and diploids consistent with the dominance model. A. The evolution of a two-locus BDMI in diploids with incompatible loci on an autosome (long rectangle) and X chromosome (short rectangle). The red uppercase alleles are ancestral. The X-linked derived allele (yellow b) is recessive and the autosomal derived allele (blue a) is at least codominant. In hybrids, an individual with at least one derived a allele that is homozygous or hemizygous for the b allele, such as the male shown at the right, will be inviable or sterile. Only one direction of the cross will experience IPI under this model. B. In diploids, hybridization leads to viable females and inviable males. All males have the same genotype, which includes the incompatibility. C. In haplodiploids, hybrid females are viable. Hybrid males are not formed until the second generation. Only 25% of the males will have the incompatibility and be inviable.


We have proposed several explanations for why we detected lower IPI than expected under dominance theory and faster-X. These explanations fall into three categories: 1) system-specific effects, 2) the inheritance mechanism of haplodiploids, and 3) alternative mechanisms for Haldane’s rule. Meiotic drive, pairing of sex chromosomes during meiosis, and dominance theory formed through compensatory mutations are non-mutually exclusive mechanisms that may all be important drivers of the evolution of BDMIs and Haldane’s rule. Importantly, haplodiploids can be used to distinguish between these different mechanisms because they have different predictions for the level of IPI in haplodiploids compared to diploids (Table 4.1). To tease apart these different possibilities, quantitative measures of sterility and inviability from many haplodiploid and diploid taxa are needed. The influence of system-specific effects such as interspecific gene flow and species population sizes should also be examined. Biologists have been trying to understand the mechanism underlying Haldane’s rule for almost a century. Haplodiploids have been an underutilized resource in this search and have the potential to provide novel insight into the underlying basis of this phenomenon.

Table 4.1. Proposed mechanisms for Haldane’s rule and predictions regarding how haplodiploids should differ from diploids in the rate of evolution of IPI.

| Mechanism | Inviability | Sterility | Explanation |
| --- | --- | --- | --- |
| Dominance theory | increased | increased | All recessive incompatibilities are exposed in haplodiploids, but only recessive incompatibilities on the X chromosome are exposed in diploid males. |
| Meiotic drive | depends | depends | Drive elements linked to the sex-determining loci will result in IPI. Sex-linked loci in haplodiploids make up a small proportion of the genome. Haplodiploids should have a similar rate of evolution of IPI as taxa with homomorphic sex chromosomes, where most of the sex chromosomes are pseudo-autosomal. Haplodiploids will have lower amounts of IPI compared to diploid taxa where a large proportion of the genome is sex-linked. |
| Improper segregation during meiosis | unbiased | decreased | There are no sex chromosomes in haplodiploid males that separate during meiosis, and so missegregation of nonhomologous chromosomes cannot cause sterility in haplodiploid males. If this is the main mechanism of Haldane’s rule, then haplodiploids should have reduced sterility but not inviability compared to diploids. |
| Dominance theory caused by compensatory mutations | decreased | decreased | Deleterious mutations are removed from the population faster in haplodiploids than diploids. These deleterious mutations are more likely to be removed from the population before compensatory mutations evolve, leading to lower levels of IPI in haplodiploids. |


CHAPTER 5. HAPLODIPLOIDY LEADS TO HETEROGENEOUS GENOMIC DIVERGENCE DURING SPECIATION WITH GENE FLOW

5.1 Abstract

Genome-wide patterns of heterogeneity during divergence have been well documented and studied, but there has been little research into how taxon specific factors influence these patterns. Here we investigate how haplodiploidy may affect genomic patterns during divergence with gene flow using an empirical case study of Neodiprion pine sawflies, a meta-analysis of sex chromosomes and autosomes, and simulations of haplodiploids and diploids. We hypothesize that haplodiploids will have higher levels and greater variability in divergence because all recessive mutations are immediately visible to selection in the haploid males. We find that Neodiprion pine sawflies have ongoing migration, high levels of divergence, and high levels of heterogeneity. X and Z chromosomes are similar to haplodiploids because they are haploid in one sex and diploid in the other, and this similarity is reflected in their divergence. Sex chromosomes have higher and more variable divergence than autosomes. In the simulations we find that the reduced probability of losing the beneficial allele, a faster time to equilibrium, a greater equilibrium allele frequency, and greater migration-selection threshold in haplodiploids all contribute to haplodiploids having greater and more variable divergence than diploids. This study uses the unique sex determination system of haplodiploids to bridge the gap between the fields of faster-X and understanding how differences accumulate during divergence with gene flow.

5.2 Introduction

Divergence enables populations and species to accumulate differences via mutation, drift, and selection, but the rate at which these differences accumulate is highly heterogeneous across the genome (Nosil et al. 2009a). Recently, a growing body of theoretical and empirical research has demonstrated that demography, evolutionary history, and genome features interact to create these complex heterogeneous patterns (Seehausen et al. 2014; Wolf and Ellegren 2016; Ravinet et al. 2017). With the increasing availability of genome-wide divergence data we are also seeing highly heterogeneous patterns across taxa (see Ravinet et al. 2017 for examples). If any of the factors that influence genome-wide patterns of heterogeneity vary systematically across taxa, this may result in predictable patterns in how differences accumulate for specific taxa. Despite the increasing amount of research into patterns of heterogeneity, taxon-specific factors are understudied. Additionally, one increasingly common goal in evolutionary genomics is to use genomic patterns of heterogeneity to locate loci that are involved in reproductive isolation and to infer evolutionary processes involved in divergence (Ravinet et al. 2017). Properly interpreting genome scans is critical for achieving this goal.

Due to shared evolutionary history, genome features are likely to be conserved within taxonomic groups and represent a promising starting point for understanding between taxon differences in heterogeneity (Tamames 2001; Feng et al. 2010; Tsai et al. 2010; Smukowski and Noor 2011). Haplodiploids, due to their sex determination system, share a unique set of genome features that are likely to influence patterns of divergence. Haplodiploidy has evolved multiple times and about 15% of all are haplodiploid, making the effects of haplodiploidy essential to understand (Normark 2003; De La Filia et al. 2015). In haplodiploids males are haploid and develop from unfertilized eggs, while females are diploid and develop from fertilized eggs. The existence of these haploid males may have predictable effects on patterns of genomic divergence.

In haploid males, selection will be more efficient, especially on recessive mutations. This will have multiple consequences for divergently selected alleles during divergence. First, in diploids new or low-frequency recessive mutations will evolve primarily via drift, and even highly beneficial mutations can be lost (Haldane 1924, 1927; Turner 1981; Charlesworth 1992). In haplodiploids, recessive mutations will be immediately visible to selection, and beneficial recessive mutations will be much more likely to remain in the population (Avery 1984). Additionally, when beneficial mutations are retained in the population, selective sweeps are faster in haplodiploids than in diploids, potentially leading to greater amounts of linked variation (Hartl 1972; Avery 1984). Both of these will cause higher levels of divergence in haplodiploids compared to diploids, in a manner analogous to faster-X, although faster-X can also be caused by factors other than ploidy differences (Meisel and Connallon 2013).


The consequences of more effective selection will be even more pronounced when there is divergence with gene flow. We predict that haplodiploids should be able to maintain higher levels of differentiation given a level of gene flow and be able to begin diverging at higher levels of gene flow due to more effective selection. In addition to higher levels of divergence, divergence with gene flow will also result in higher variation in divergence across the genome. During selective sweeps, hitchhiking will cause large blocks of loci linked to the selected site to increase in frequency (Via 2012). In the absence of gene flow these linked loci will remain at high frequency until there are new mutations at the linked sites (Nosil and Feder 2012). When there is migration these blocks of linked loci will be quickly eroded, and only selected loci will remain at high frequency. This will lead to more peaks and valleys resulting in a large variance in divergence across the genome. Additionally, without gene flow populations can diverge through drift, diminishing the effects of more effective selection in the haplodiploids.

Divergence with gene flow has become increasingly recognized as an important speciation process (Taylor and Larson 2019). Although several studies have pointed out higher levels of divergence in sex chromosomes, mostly on Z chromosomes (Martin et al. 2013; Penalba et al. 2017), in taxa where there is divergence with gene flow, there has been no concerted effort to study the role that faster-X has in the context of divergence with gene flow and heterogeneity in divergence. This study bridges the gap between these fields. We hypothesize that haplodiploids should have greater and more heterogeneous divergence than diploids when there is divergent selection with gene flow. We tested this hypothesis in three different ways: an empirical case study using Neodiprion pine sawflies, a meta-analysis of sex chromosomes, and simulations of divergence in haplodiploids and diploids.

5.3 Genomic divergence in pine sawflies

Neodiprion pine sawflies are Hymenoptera and, like all Hymenoptera, are haplodiploid (Cook 1993). Pine sawflies have an intimate association with their host throughout their life span: the adults mate on the host, the females embed their eggs into the needles, and the larvae eat the needles and spin their cocoons on or below the host (Coppel and Benjamin 1965). We focused on a pair of sister species, N. pinetum and N. lecontei, that differ in host use (Linnen and Farrell 2007, 2008a). The differences in host use directly result in reproductive isolation between these species (Bendall et al. 2017). Many traits are involved in host use (behavioral, physiological, and morphological) (Coppel and Benjamin 1965; Codella and Raffa 2002; Lindstedt et al. 2011; Bendall et al. 2017). Since so many traits are involved, many regions of the genome are likely under divergent selection.

N. pinetum and N. lecontei are found in the Eastern United States and have largely overlapping ranges (Benjamin 1955; Rauf and Benjamin 1980). They are also interfertile and produce viable and fertile hybrids that are capable of backcrossing in either direction (Bendall et al. 2017). We have previously collected morphological hybrids that have intermediate coloration (Figure 5.1A). Given these findings, gene flow between these species is likely. The combination of multifarious divergent selection and likely introgression makes this an ideal system for understanding the genomic consequences of haplodiploidy. We collected individuals of N. pinetum and N. lecontei from a sympatric population in Kentucky (Table 1). We performed ddRAD sequencing on N. pinetum (N = 23), N. lecontei (N = 44), three putative hybrids, and one lab-reared F1 hybrid. We also sequenced N. lecontei from Michigan (N = 18).

First, we wanted to confirm that there is gene flow between N. pinetum and N. lecontei. To confirm that the individuals with intermediate coloration were hybrids, we ran Admixture on the sympatric N. pinetum, N. lecontei, and four hybrids (3 wild, 1 lab-reared) (Alexander et al. 2009). We tested K = 1-5 and ran 100 permutations per K. K = 2 had the lowest cross-validation score (Table D2). We found that the putative hybrids were genetically admixed and had approximately 50% of their ancestry from each species (Figure 5.1A). The wild hybrids were indistinguishable from the lab-reared hybrid. For N. pinetum and N. lecontei there were no admixed individuals beyond the F1, except for a single admixed N. pinetum.


Figure 5.1. Population genetics of N. pinetum and N. lecontei. A. Representative pictures of N. pinetum, N. lecontei, and putative morphological hybrids above an Admixture plot of these populations. N. pinetum is in white and N. lecontei is in grey. Putative morphological hybrids are genetically admixed with approximately half of their ancestry coming from each species. B. N. pinetum and N. lecontei have diverged with continuous but asymmetric gene flow. Width of boxes is representative of population size and width of arrows is proportional to migration rate. C. A Manhattan plot of FST across the genome and a density plot of FST between N. pinetum and N. lecontei. FST is highly variable across the genome.

To test if there was introgression beyond the F1 generation we performed an ABBA-BABA test. N. lecontei has three distinct population clusters: North, Central, and South (Bagley et al. 2017). The Kentucky population of N. lecontei is in the Central cluster. We performed the ABBA-BABA test with N. lecontei from Michigan (North), N. lecontei from Kentucky, N. pinetum, and N. virginianus as the outgroup. We found significant introgression between sympatric N. pinetum and N. lecontei (P = 2.12 x 10^-15). Using demographic modeling in fastsimcoal2, we found that there was asymmetric migration throughout divergence, with N. lecontei to N. pinetum having the higher migration rate (Figure 5.1B, Table 3) (Excoffier et al. 2013). N. pinetum and N. lecontei diverged 1,500,000 generations ago. With 1-3 generations per year, the divergence time is between 500,000 and 1,500,000 years ago (Benjamin 1955; Rauf and Benjamin 1980).
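As an illustration of what the ABBA-BABA test measures, the sketch below computes Patterson's D from derived-allele frequencies in the arrangement described above (P1 = N. lecontei from Michigan, P2 = N. lecontei from Kentucky, P3 = N. pinetum, with the outgroup defining the ancestral state). The function name and toy frequencies are assumptions, and the sketch omits the significance test (typically a block jackknife), so it should not be read as the exact pipeline used here.

```python
def patterson_d(p1, p2, p3, p_out=None):
    """Patterson's D from per-site derived-allele frequencies.

    p1, p2, p3: derived-allele frequencies in populations P1, P2, and P3.
    p_out: derived-allele frequency in the outgroup; if omitted, the outgroup
    is assumed fixed for the ancestral allele at every site.
    """
    abba = 0.0
    baba = 0.0
    for i in range(len(p1)):
        anc_out = 1.0 if p_out is None else 1.0 - p_out[i]
        # ABBA: derived allele shared by P2 and P3 but not P1.
        abba += (1 - p1[i]) * p2[i] * p3[i] * anc_out
        # BABA: derived allele shared by P1 and P3 but not P2.
        baba += p1[i] * (1 - p2[i]) * p3[i] * anc_out
    if abba + baba == 0:
        raise ValueError("no informative sites")
    return (abba - baba) / (abba + baba)


# Toy (made-up) frequencies: P2 shares more derived alleles with P3 than P1.
d_example = patterson_d([0.1, 0.0, 0.2], [0.4, 0.3, 0.2], [0.9, 1.0, 0.8])
```

A positive D in this arrangement indicates excess allele sharing between P2 and P3, which is the signal expected if the sympatric populations exchange genes.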

Next, we examined genomic patterns of differentiation between N. lecontei and N. pinetum. The genome-wide mean FST was 0.63. There is a large amount of heterogeneity in divergence across the genome, which results in a bimodal density distribution (Figure 5.1C). Divergent loci were spread throughout the genome without any clear clustering. Overall, there is gene flow between N. pinetum and N. lecontei in sympatry and there is a high level of heterogeneity in genomic divergence. However, this is only a single taxon pair, and there could be factors beyond haplodiploidy that explain the patterns of heterogeneity shown.
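To make the two summary quantities (mean FST and variance in FST across loci) concrete, the sketch below uses the simple heterozygosity-based definition FST = (HT - HS) / HT for two equally weighted populations. The text does not state which FST estimator was used, so this choice, the function names, and the handling of monomorphic sites are all assumptions.

```python
from statistics import mean, variance


def fst_two_pops(p1, p2):
    """Per-locus FST for two populations from the frequency of one allele.

    Uses FST = (HT - HS) / HT with equal population weights (an assumption).
    Returns None for sites that are monomorphic across both populations.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return None
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def summarize_fst(freqs_pop1, freqs_pop2):
    """Mean and sample variance of per-locus FST across polymorphic loci."""
    values = [
        f
        for f in (fst_two_pops(a, b) for a, b in zip(freqs_pop1, freqs_pop2))
        if f is not None
    ]
    return mean(values), variance(values)


# Toy (made-up) frequencies: some loci nearly fixed, others undifferentiated,
# which produces a high variance in FST across loci.
mean_fst, var_fst = summarize_fst([0.0, 0.05, 0.5, 0.4], [1.0, 0.95, 0.5, 0.45])
```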

5.4 Meta-analysis of sex chromosomes

Divergence in X and Z chromosomes should mirror divergence in haplodiploids since these chromosomes are haploid in one sex and diploid in the other (Avery 1984). Comparing divergence on X and Z chromosomes to the autosomes allows us to evaluate the effects of ploidy while controlling for the effects of demographic history. To evaluate if the patterns observed in sawflies remain when examining multiple taxa, we performed a meta-analysis of sex chromosomes using 28 taxon pairs (Table D4). A qualitative meta-analysis previously reported that most studies comparing autosomes to sex chromosomes found that X/Z chromosomes had higher levels of divergence (Irwin 2018), but here we quantitatively examine levels of and variance in divergence. After accounting for differences in effective population size, we found that sex chromosomes had significantly higher (p = 0.047) but not significantly more variable (p = 0.079) divergence than autosomes (Figure 5.2). The difference between autosomes and sex chromosomes was greatest when there was higher overall divergence. So far, our empirical evidence supports our hypothesis that haplodiploids have higher and more variable divergence, but we still do not know under exactly which conditions this difference will arise or the mechanisms producing the difference.

Figure 5.2. Meta-analysis of sex chromosomes and autosomes across 28 taxon pairs. Sex chromosomes have a higher mean (A, p = 0.047) but not a higher variance (B, p = 0.079) in divergence than autosomes.

5.5 Simulations of haplodiploids

To test if haplodiploidy leads to predictable differences in divergence, we simulated diploid and haplodiploid populations experiencing divergent selection in the face of gene flow (Figure D1). In the simulations, a single population at equilibrium split into two populations and a single divergently selected recessive mutation entered the population at the time of divergence. These two diverging populations were allowed to evolve for another 2000 generations. We simulated a 500-kb region to include effects on linked variation. To evaluate how different scenarios shape patterns, we simulated a range of selection coefficients and migration rates. We scaled effective population sizes and recombination rates to isolate the effects of ploidy. To mimic genomic data sets, we simulated 1000 500-kb regions for each parameter combination and calculated the mean FST and variance in FST across all simulations. Haplodiploids had greater mean and variance in divergence when there was both migration and selection (Figure 5.3A, B). The difference in FST increased as the migration rate and selection coefficient increased.


Figure 5.3. Simulations of FST between haplodiploids and diploids. A. The ratio of mean FST between haplodiploids and diploids. When there is migration and selection, haplodiploids have greater mean FST than diploids. B. The ratio of variance in FST between haplodiploids and diploids. When there is migration and selection, haplodiploids have greater variance in FST than diploids. C. The ratio of mean FST between sex chromosomes and autosomes. When there is migration and selection, sex chromosomes have greater mean FST than autosomes. The difference between sex chromosomes and autosomes is greater than the difference between haplodiploids and diploids. D. The ratio of mean FST between sex chromosomes and autosomes for simulations where the divergently selected mutation was retained. E. The ratio of mean FST between haplodiploids and diploids at selected sites only. FST differences between haplodiploids and diploids are greatest at selected sites at intermediate selection coefficients and migration rates. F. The ratio of mean FST between sex chromosomes and autosomes at selected sites only. The difference between sex chromosomes and autosomes is less than the difference between haplodiploids and diploids at the selected sites. G. The ratio of mean FST between haplodiploids and diploids for simulations where the divergently selected mutation was retained. There is still a difference in mean FST, although the difference is smaller. H. The ratio of mean FST between haplodiploids and diploids at the selected site in a deterministic model. The white lines denote where the ratio between haplodiploids and diploids is 1.

5.6 Simulations of autosomes and X/Z chromosomes

We completed a meta-analysis of sex chromosomes and autosomes since they should recapitulate the differences between haplodiploids and diploids. However, differences in effective population sizes can have effects on patterns of divergence. To test if sex chromosomes are comparable to haplodiploids, we performed simulations as above but without correcting for differences in population sizes. As expected, X and Z chromosomes had higher mean divergence than autosomes during simulations, and the difference was greatest when there was migration and selection (Figure 5.3C). The difference in FST across the genome between autosomes and sex chromosomes was more extreme than that observed between haplodiploids and diploids with equivalent effective population sizes. However, at the selected sites the difference between autosomes and sex chromosomes was less extreme than that observed between haplodiploids and diploids (Figure 5.3E, F). X/Z chromosomes have ¾ the effective population size of autosomes, increasing the effect of drift. At neutral sites this leads to reduced diversity and increased FST, because FST is a relative measure of divergence (Cruickshank and Hahn 2014). At selected sites, the smaller population size decreases the efficacy of selection, causing the new divergently selected mutation to be lost more frequently than in the haplodiploids (Vicoso and Charlesworth 2009). In sex chromosomes the effect of drift quantitatively affects patterns of divergence, but the difference between sex chromosomes and autosomes qualitatively recapitulates the differences between haplodiploids and diploids.

5.7 Mechanisms

Multiple mechanisms could give rise to the observed differences between diploids and haplodiploids. To further explore their individual contributions, we examined the fate of a new or low-frequency divergently selected mutation in the population where it was beneficial. The first mechanism we proposed was that divergently selected recessive alleles in the haplodiploids were escaping loss via drift at a higher frequency than in the diploids. To test this mechanism, we compared the number of simulations in which the divergently selected mutation was retained. As expected, the beneficial mutation was retained in a larger proportion of simulations in the haplodiploids (Figure 5.4A). As the selection coefficient increases, the difference in the proportion of simulations where the beneficial allele is retained becomes greater between haplodiploids and diploids when the new mutation has a very low starting frequency (f = 0.01). At slightly higher starting frequencies (f = 0.1), the difference between haplodiploids and diploids in the probability of retaining the beneficial allele begins to increase as the selection coefficient increases. As the probability of allele loss approaches zero in the haplodiploids, the difference between haplodiploids and diploids decreases. The areas of parameter space where there are large differences in the probability of retaining the allele match the areas where there are large differences in mean and variance in FST between haplodiploids and diploids. These results support the increased retention of beneficial alleles in haplodiploids as one of the mechanisms leading to greater differentiation under divergence with gene flow.
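The retention mechanism can be illustrated with a small Wright-Fisher sketch of a recessive beneficial allele. This is a deliberately simplified stand-in for the simulations described above, not a reimplementation of them: it tracks a single locus in one population, matches the number of gene copies rather than scaling effective population size, treats migrants as carrying only the ancestral allele, and uses hypothetical function names and parameter values.

```python
import random


def binomial(rng, n, p):
    """Number of successes in n Bernoulli(p) draws (plain-Python sampler)."""
    if p <= 0.0:
        return 0
    if p >= 1.0:
        return n
    return sum(1 for _ in range(n) if rng.random() < p)


def diploid_retained(rng, n_ind, s, f0, gens, m=0.0):
    """True if the recessive beneficial allele is still present after gens."""
    q = f0
    for _ in range(gens):
        q = q * (1 + s * q) / (1 + s * q * q)  # only homozygotes gain fitness
        q *= 1 - m                             # migrants carry the other allele
        q = binomial(rng, 2 * n_ind, q) / (2 * n_ind)
        if q == 0.0:
            return False
    return True


def haplodiploid_retained(rng, n_f, n_m, s, f0, gens, m=0.0):
    """Same question for haplodiploids, where haploid males express the allele."""
    mat = pat = male = f0  # maternal and paternal copies in females; male copies
    for _ in range(gens):
        hom = mat * pat
        het = mat * (1 - pat) + pat * (1 - mat)
        x = (hom * (1 + s) + 0.5 * het) / (1 + s * hom)  # eggs after selection
        y = male * (1 + s) / (1 + s * male)              # sperm after selection
        x *= 1 - m
        y *= 1 - m
        mat = binomial(rng, n_f, x) / n_f
        pat = binomial(rng, n_f, y) / n_f
        male = binomial(rng, n_m, x) / n_m  # sons come from unfertilized eggs
        if mat == 0.0 and pat == 0.0 and male == 0.0:
            return False
    return True


def retention_probability(model, reps, seed, **params):
    """Proportion of replicate runs in which the allele is retained."""
    rng = random.Random(seed)
    return sum(model(rng, **params) for _ in range(reps)) / reps
```

For example, `retention_probability(diploid_retained, reps=200, seed=1, n_ind=150, s=0.5, f0=0.01, gens=100)` can be compared with the same call for `haplodiploid_retained` with `n_f=100, n_m=100`; both populations then hold 300 gene copies.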

When we only consider simulations where the divergently selected mutation is retained, there is still a difference in FST between haplodiploids and diploids, albeit a smaller one, suggesting that additional mechanisms are contributing (Figure 5.3G). To examine these additional mechanisms, we first looked at simulations where the divergently selected mutation was retained. In these simulations, haplodiploids had a higher migration threshold and were able to begin diverging at higher levels of gene flow (Figure 5.4B).

Figure 5.4. Mechanisms causing differences in levels of divergence between haplodiploids and diploids. A. Diploids have a higher probability of losing the divergently selected mutation. B. When the divergently selected mutation is retained, haplodiploids are able to begin diverging at lower selection coefficients and higher migration rates. The lines represent the migration-selection threshold where FST is greater than the neutral expectation. C. The allele trajectories for haplodiploids and diploids at different selection coefficients and m=0.0038 in a deterministic model. The lines represent the difference in allele frequency of the divergently selected mutation between the two populations. Haplodiploids have a greater difference in allele frequency at equilibrium. D. The ratio of time to equilibrium between haplodiploids and diploids in a deterministic model. Haplodiploids reach equilibrium faster than diploids at high selection coefficients.

Next we examined the impact of selection in the absence of drift. To do so, we used a deterministic model of the selected mutation alone and tracked the allele trajectory (Figure 5.3F). The first property of the allele trajectory that we looked at was the equilibrium allele frequency. As long as there is gene flow, it is unlikely that divergently selected alleles will fix because there will be constant input of the deleterious allele from the other population. After a migration-selection equilibrium was reached, haplodiploids had a greater allele frequency difference of the divergently selected mutation between the two populations (Figure 5.4C). More efficient selection in the haplodiploids shifted the selection-migration equilibrium to higher levels of divergence. Finally, we examined the speed at which the divergently selected mutation reached equilibrium. Haplodiploids reached equilibrium more quickly, which should cause a larger number of linked loci to hitchhike to higher frequency with the beneficial mutation (Figure 5.4D). The more loci are affected, the greater the average FST will be. This mechanism may only result in a temporary difference in FST. Eventually migration, recombination, and mutation will erode the elevated FST at the distantly linked loci (Przeworski 2002). The final mechanism we considered was the migration threshold for divergence to occur. Haplodiploids were able to diverge (higher FST than neutral expectations) at higher migration rates than diploids.

5.8 Conclusions

Overall, three independent lines of evidence strongly support our hypothesis that haplodiploids have greater and more variable divergence. Additionally, we found that more efficient selection in the haploid males acts in multiple ways to create the observed pattern. The patterns observed between haplodiploids and diploids are largely consistent with faster-X, where there is typically greater divergence on the X (and Z) chromosome. Historically, theoretical work has treated haplodiploids and diploids identically to sex chromosomes and autosomes (Hartl 1972; Avery 1984; Hedrick and Parker 1997).

Here, we expand on faster-X in several ways. First, we isolate the effect of ploidy from all additional factors involved in faster-X, such as gene density, higher mutation rate on the X chromosome, and chromosomal location of sex-biased genes (Charlesworth et al. 2018). Second, we examine both levels of divergence and variability in divergence across the genome, instead of substitution rate alone (Charlesworth et al. 1986). Third, we examine patterns of divergence during divergence with gene flow. The interactions between drift, migration, and selection can cause complicated patterns of divergence across the genome. During divergence with gene flow, selected alleles can remain in a population without ever fixing, because there is a constant input of deleterious alleles from the other population until gene flow ceases (Whitlock and Gomulkiewicz 2005). Classically, faster-X has only focused on selection and drift, where the divergently selected mutation is either fixed or lost (Charlesworth et al. 1986; Caballero 1995). Here, we found that when there is both selection and migration, haplodiploids have higher and more variable divergence than diploids. Finally, we examine the effects of specific mechanisms on patterns of divergence. We found that four different mechanisms (loss of the beneficial allele, time to equilibrium, equilibrium allele frequency, and migration threshold) contribute to haplodiploids having greater divergence than diploids.

Beyond expanding on faster-X, we show how taxon-specific factors, such as haplodiploidy, can have profound consequences for how differences accumulate during divergence. As sequencing costs have dropped dramatically and genomic resources for non-model systems have increased, a growing body of work has attempted to use genomic patterns to identify loci that contribute to reproductive isolation and to infer the evolutionary forces involved. Understanding taxon-specific factors is critical for properly interpreting empirical data in the quest to connect genome patterns to evolutionary processes.

5.9 Methods

5.9.1 Population sampling

N. pinetum and N. lecontei have largely overlapping ranges in the eastern United States, and all sampling occurred in the region of overlap. We sampled 23 individuals from one population of N. pinetum in Kentucky. There are three population clusters of N. lecontei: North, Central, and South (Bagley et al. 2017). We sampled 44 N. lecontei individuals from Kentucky. This population is sympatric with the sampled N. pinetum and is part of the Central N. lecontei cluster. We also sampled 18 individuals from Michigan, which are part of the North cluster. Additionally, we sampled 3 females from Kentucky that were identified as hybrids based on their intermediate coloration, and 1 lab-reared F1 female as a control (Table D1).

5.9.2 DNA sequencing

We extracted DNA using a CTAB/phenol-chloroform-isoamyl alcohol method (Chen et al. 2012). We visualized the DNA on a 0.8% agarose gel to confirm quality. To quantify the DNA we used a Quant-iT High-Sensitivity DNA Assay Kit (Invitrogen – Molecular Probes, Eugene, OR, USA).

For N. pinetum, N. lecontei, and hybrids we used a ddRAD sequencing protocol modified from Bagley et al. (2017) and Peterson et al. (2012). We fragmented the DNA using NlaIII and EcoRI. We assigned each individual, along with additional samples from other projects, to one of eight libraries. During adapter ligation, we assigned each sample one of 48 unique in-line barcodes (Table D1). We used the 5-10 bp variable-length barcodes from Burford Reiskind et al. (2016). We then pooled each group of samples and size selected for 379-bp fragments (+/- 76 bp) on a PippinPrep (Sage Science, Beverly, MA). We performed 12 rounds of high-fidelity PCR amplification (Phusion High-Fidelity DNA Polymerase, NEB, Ipswich, MA) using PCR primers that included one of 12 unique Illumina multiplex read indices (Table D1). To allow for the detection of PCR duplicates, we included a string of 4 degenerate bases next to the Illumina read index (Schweyen et al. 2014). We used a Bioanalyzer 2100 (Agilent, Santa Clara, CA) to check library quality. The libraries were sequenced at the University of Illinois’ Roy J. Carver Biotechnology Center, using two lanes of Illumina HiSeq 4000 150 bp single-end reads.

5.9.3 DNA processing and variant calling

We aligned the demultiplexed reads to the N. lecontei reference genome (Nlec1.1; GenBank assembly accession GCA_001263575.2) using the very sensitive setting in bowtie2 (Langmead and Salzberg 2012; Vertacnik and Linnen 2015; Linnen et al. 2018). We only retained reads that aligned to one locus in the reference genome and had a Phred score greater than 30. We removed PCR duplicates using a custom script.

We called SNPs in samtools (Li et al. 2009). Male and female larvae are morphologically identical. To confirm that all of the individuals were female, we filtered on heterozygosity: because males are haploid, they should have very low heterozygosity, and any apparent heterozygosity in a male is due to sequencing error. We required all sites to have a minimum of 7x coverage and 50% missing data or less. For SNPs, we filtered out sites with heterozygote excess relative to Hardy-Weinberg equilibrium. We retained sites with homozygote excess, because it can be caused by population structure. We used a minor allele frequency (MAF) filter of 0.01. We removed any individual that was missing more than 70% of the data. We performed all filtering in VCFtools (Danecek et al. 2011). After filtering we had 35,649 sites.
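The heterozygosity check used to separate haploid males from diploid females can be sketched as follows. This is only an illustration and not the filtering that was run in VCFtools: the genotype encoding (0, 1, 2 for the number of alternate alleles, None for missing calls), the function names, and the 1% cutoff are assumptions made for the example, since the cutoff applied to these samples is not stated here.

```python
def heterozygosity(genotypes):
    """Fraction of called sites that are heterozygous for one individual."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None  # no genotype calls: the individual cannot be classified
    return sum(1 for g in called if g == 1) / len(called)


def flag_putative_males(samples, max_het=0.01):
    """Names of individuals whose heterozygosity is at or below max_het.

    max_het is a hypothetical cutoff chosen for illustration.
    """
    flagged = []
    for name, genotypes in samples.items():
        het = heterozygosity(genotypes)
        if het is not None and het <= max_het:
            flagged.append(name)
    return flagged


# Toy genotypes (made up): 0/2 = homozygous, 1 = heterozygous, None = missing
samples = {
    "female_like": [0, 1, 2, 1, None, 0, 1, 0],
    "male_like": [0, 0, 2, 2, None, 0, 2, 0],
}
print(flag_putative_males(samples))
```

In practice a small nonzero cutoff is more forgiving than requiring exactly zero heterozygosity, because sequencing error produces a few spurious heterozygous calls even in haploid males.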

5.9.4 Population divergence statistics

To confirm that our morphological hybrids were genetically admixed we performed Admixture with K = 1-5 for Kentucky N. lecontei and N. pinetum and for the hybrids (wild caught and lab reared). We ran 100 replicates per K and chose the K with the lowest cross-validation (CV) score. To test for introgression, we performed an ABBA-BABA test. We used N. virginianus as the outgroup. We did whole genome sequencing on a single N. virginianus female, using 150-bp paired-end reads on a NextSeq at the Georgia Genomics Facility. Library preparation and sequencing were both completed at the sequencing facility. We removed the adapters using cutadapt 1.16 and contaminants using the standard and pine databases in Kraken (Martin 2011; Wood and Salzberg 2014). We followed the above filtering strategy, except for filtering on PCR duplicates, as the library preparation differed from the other samples. For the ABBA-BABA analysis, we only kept sites that were present in N. virginianus. For the ingroup we used Kentucky N. lecontei, Michigan N. lecontei, and Kentucky N. pinetum. We used a custom script to perform the ABBA-BABA test in 50-kb windows, using a jackknife approach to determine significance.
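The custom ABBA-BABA script is not reproduced here, so the following is only a minimal sketch of how a windowed D statistic with a leave-one-window-out jackknife can be computed. It assumes each site is summarised by derived-allele frequencies (p1, p2, p3) in the three ingroup populations, with the outgroup fixed for the ancestral allele, and it uses an unweighted jackknife; the function names and the toy windows are hypothetical, and which population plays P1, P2, or P3 is left to the analyst.

```python
import math


def d_statistic(sites):
    """Patterson's D from per-site derived-allele frequencies (p1, p2, p3)."""
    abba = sum((1 - p1) * p2 * p3 for p1, p2, p3 in sites)
    baba = sum(p1 * (1 - p2) * p3 for p1, p2, p3 in sites)
    total = abba + baba
    return (abba - baba) / total if total > 0 else 0.0


def block_jackknife(windows):
    """Return (D, standard error, Z) using a leave-one-window-out jackknife.

    windows is a list of windows; each window is a list of (p1, p2, p3) sites.
    All windows get equal weight, which is a simplifying assumption.
    """
    all_sites = [s for w in windows for s in w]
    d_full = d_statistic(all_sites)
    n = len(windows)
    leave_one_out = [
        d_statistic([s for j, w in enumerate(windows) if j != i for s in w])
        for i in range(n)
    ]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean_loo) ** 2 for d in leave_one_out)
    se = math.sqrt(variance)
    z = d_full / se if se > 0 else float("nan")
    return d_full, se, z


# Toy example (made-up frequencies): three ABBA-like windows, one BABA-like window
toy_windows = [[(0.0, 1.0, 1.0)], [(1.0, 0.0, 1.0)], [(0.0, 1.0, 1.0)], [(0.0, 1.0, 1.0)]]
d, se, z = block_jackknife(toy_windows)
print(d, se, z)
```

A positive D indicates an excess of ABBA sites (sharing between P2 and P3), and |Z| is compared against a normal threshold to judge significance.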

To further test for gene flow we performed demographic modeling in Fastsimcoal2 using N. pinetum and N. lecontei from Kentucky (Excoffier et al. 2013). We used SNPs that were at least 1 kb away from the start or end of a gene so that we only used putatively neutral sites. We removed MAF filtering, subsampled to four individuals per species for each site, and removed any sites with missing data. We created an unfolded 2D-SFS using N. virginianus to infer the derived allele. We tested 5 demographic models: 1) divergence without gene flow, 2) divergence with continuous bidirectional migration, 3) divergence in isolation followed by a single bout of secondary contact (unidirectional gene flow), 4) divergence with bidirectional migration that stops before divergence is complete, and 5) divergence in isolation followed by continuous secondary contact (bidirectional). All models except the model of continuous gene flow had the same number of parameters, so we could directly compare likelihood scores (Table D2). We ran each model 100 times starting from different parameter combinations and selected the run with the highest likelihood in order to estimate parameter values. We computed FST for Kentucky N. lecontei and N. pinetum in VCFtools both per site and genome-wide to look at genome-wide patterns of divergence (Danecek et al. 2011).

5.9.5 Meta-analysis

We performed a meta-analysis comparing mean and variance in FST between autosomes and X/Z chromosomes. We collected data from 11 published studies for a total of 28 taxon pairs (Table D3). In order for a data set to be included we had to be able to access a vcf file for genome scale data (e.g. RAD, GBS, whole genome resequencing). There had to be no known inversions in the sex chromosomes, and the sex chromosomes had to be heteromorphic. We excluded any study that had pooled sequencing.

We used SNPs that had been previously called in the original studies. If the original study accounted for the difference in sex chromosome ploidy between males and females, we retained their methodology. Otherwise we only kept the homogametic sex. We accomplished this by removing low heterozygosity individuals (<1%) if sex was not stated. We calculated Weir and Cockerham's FST using a combination of VCFtools, diveRsity, and custom R scripts (Weir and Cockerham 1984; Danecek et al. 2011; Keenan et al. 2013). We calculated the mean and variance in per-site FST for autosomes and sex chromosomes. To test if sex chromosomes differed from autosomes in mean and variance in FST we used a non-parametric permutation approach, permuting loci between autosomes and sex chromosomes. Assuming a sex ratio of 1:1, sex chromosomes are expected to have an effective size Ne of 3/4 of the autosome Ne, resulting in lower FST values for autosomes even if divergence is purely neutral. To account for that difference in Ne we re-scaled the resampled average FST values of autosomes by 4/3. We performed 1,000,000 permutations to compute the distribution of the difference between means and variances. We then compared our observed difference in mean and variance to the expected.
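The permutation approach can be illustrated with a short sketch. It is not the custom R code used for the meta-analysis, and it makes several assumptions that should be read as such: the 4/3 correction is applied to the autosomal values before loci are pooled (one possible reading of the rescaling described above), the variance comparison uses unscaled values, the test is one-sided (sex chromosomes greater than autosomes), and the FST values and function names are invented for the example.

```python
import random
from statistics import mean, pvariance


def permutation_p(auto, sex, stat, n_perm=10000, seed=1):
    """One-sided p-value for stat(sex) - stat(auto) under random relabelling of loci."""
    rng = random.Random(seed)
    observed = stat(sex) - stat(auto)
    pooled = list(auto) + list(sex)
    n_sex = len(sex)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # The first n_sex shuffled loci play the role of the sex chromosome
        if stat(pooled[:n_sex]) - stat(pooled[n_sex:]) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero
    return observed, (hits + 1) / (n_perm + 1)


# Toy per-site FST values (made up for illustration only)
fst_auto = [0.02, 0.05, 0.01, 0.08, 0.03, 0.04, 0.06, 0.02, 0.05, 0.03]
fst_sex = [0.30, 0.55, 0.20, 0.80, 0.45]

# Assumed neutral correction: autosomal FST rescaled by 4/3 before comparing means
scaled_auto = [4 / 3 * f for f in fst_auto]
diff_mean, p_mean = permutation_p(scaled_auto, fst_sex, mean, n_perm=2000)
diff_var, p_var = permutation_p(fst_auto, fst_sex, pvariance, n_perm=2000)
print(diff_mean, p_mean)
print(diff_var, p_var)
```

Passing the statistic as a function keeps the same routine usable for both the mean and the variance; the number of permutations here is far smaller than the 1,000,000 used in the analysis and only keeps the example fast.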

5.9.6 Simulations of divergent selection

We used SLiM 3 to simulate the effect of haplodiploidy on patterns of divergence. We simulated a 500 kb region with an effective population size of 1000 individuals, a mutation rate of 2.5e-7 per site per generation, and a recombination rate of 2.5e-7 between consecutive sites per generation in the haplodiploids. For the diploids we scaled the effective population size, recombination rate, and mutation rate by ¾ to make them equal to the haplodiploids. The ancestral population was allowed to evolve for 8,000 generations to attain mutation-drift equilibrium. To mimic divergence due to divergent selection acting at a single locus, we assumed that the ancestral population diverged into two populations corresponding to two different environments. We simulated a single divergently selected recessive mutation at position 250 kb (Figure D1) with a starting allele frequency of either 0.01 or 0.1 in each population. Our simulations reflect a soft sweep from standing neutral genetic diversity, because the mutations were placed on random but different genomic backgrounds. This assumes that a neutral mutation came under divergent selection at the time of the split. The two populations were then allowed to evolve for another 2,000 generations. We simulated a range of migration rates (-0.5 to 1 log 2Nm) and selection coefficients (0 to 3 log 2Ns). For each set of parameter combinations we performed 1000 simulations. Variance and mean FST across 20-kb windows were then calculated across simulations. We repeated the same simulations without rescaling to simulate the difference between sex chromosomes and autosomes.

These simulations were also used to determine the mechanism creating differences between haplodiploids and diploids. To see if loss of the divergently selected allele due to drift was the mechanism, we calculated the number of simulations where the divergently selected allele was lost in the population where it was beneficial. To test if there were additional mechanisms acting, we recalculated the variance and mean FST with only simulations where the divergently selected allele was retained.
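The idea of counting replicates in which a recessive beneficial allele survives drift can be sketched with a much simpler model than the SLiM simulations. The code below is an assumption-laden toy: it follows a single population with no migration and no linked sites, uses Wright-Fisher binomial sampling, treats female genotypes as random unions of egg and sperm, and uses small illustrative population sizes without the ¾ rescaling. All function names and parameter values are hypothetical.

```python
import random


def binom(rng, n, p):
    """Binomial draw by summing Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def retained_diploid(rng, n_ind, s, p0, generations):
    """True if the recessive allele is still present after the given generations."""
    p = p0
    for _ in range(generations):
        if p == 0.0:
            return False
        # Recessive allele: only homozygotes (frequency p^2) have fitness 1 + s
        p_sel = (p * p * (1 + s) + p * (1 - p)) / (1 + s * p * p)
        p = binom(rng, 2 * n_ind, p_sel) / (2 * n_ind)
    return p > 0.0


def retained_haplodiploid(rng, n_fem, n_mal, s, p0, generations):
    """Same question with diploid females and haploid males."""
    # x: maternal copies in females, y: paternal copies in females, z: males
    x = y = z = p0
    for _ in range(generations):
        if x == 0.0 and y == 0.0 and z == 0.0:
            return False
        fem = (x * y * (1 + s) + 0.5 * (x * (1 - y) + y * (1 - x))) / (1 + s * x * y)
        mal = z * (1 + s) / (1 + s * z)  # haploid males express the allele directly
        x = binom(rng, n_fem, fem) / n_fem  # eggs that become daughters
        y = binom(rng, n_fem, mal) / n_fem  # sperm, which only daughters receive
        z = binom(rng, n_mal, fem) / n_mal  # unfertilised eggs become sons
    return (x, y, z) != (0.0, 0.0, 0.0)


def proportion_retained(sim, n_reps, seed=1, **kwargs):
    """Fraction of replicate simulations in which the allele is retained."""
    rng = random.Random(seed)
    return sum(sim(rng, **kwargs) for _ in range(n_reps)) / n_reps


dip = proportion_retained(retained_diploid, 100, n_ind=100, s=0.1, p0=0.01, generations=200)
hd = proportion_retained(retained_haplodiploid, 100, n_fem=50, n_mal=50, s=0.1, p0=0.01, generations=200)
print(dip, hd)
```

Comparing the two proportions across a grid of selection coefficients and starting frequencies mirrors, in a simplified way, the comparison summarised in Figure 5.4A.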

We then examined three additional mechanisms because there was still a difference in FST when using the subset of simulations where the selected allele was not lost. For all further analyses we used simulations where the starting frequency of the divergently selected mutation was 0.1, because with a starting allele frequency of 0.01 the divergently selected mutation was lost in almost all simulations for the diploids. The next mechanism we tested was the migration threshold. We defined the migration threshold as the point at which FST exceeded the neutral expected FST given the divergence time, migration rates, and effective population sizes. The next two mechanisms deal with the allele trajectories at the selected locus, so to test these mechanisms we performed deterministic simulations of the divergently selected mutation alone. These simulations assumed that two populations split at time zero and that the allele frequencies in each population are determined solely by selection and migration in each generation. We followed the allele trajectories for 100,000 generations. We tested several combinations of migration rates (m) and selection coefficients (s), with m ranging from 0 (no migration) to 0.1, and s ranging from 0 (neutral) to 0.3. Based on the allele trajectories we determined when the populations had reached a selection-migration equilibrium. We defined the equilibrium allele frequencies in each population as the frequencies at generation 100,000, assuming this was sufficient to attain equilibrium. The time to equilibrium was defined as the time when the allele frequency changed by less than ε*peq over 1000 generations (and also changed by less than ε*peq in the last generation), where peq is the equilibrium allele frequency and ε is an error term set to 10^-12.
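A deterministic two-population recursion of this kind can be sketched as below. The exact equations used for the dissertation are not given here, so the details are assumptions: the allele is recessive with fitness 1 + s in the population where it is favoured and 1 - s in the other, selection precedes symmetric migration, both sexes migrate, the sex ratio is 1:1 (so haplodiploid frequencies are weighted 2/3 female and 1/3 male), and the equilibrium criterion is applied to the between-population difference. Function names are hypothetical.

```python
def select_diploid(p, s):
    """Allele frequency after selection on a recessive allele in diploids."""
    return (p * p * (1 + s) + p * (1 - p)) / (1 + s * p * p)


def select_haplodiploid(fem, mal, s):
    """fem, mal: allele frequencies in the parents' eggs and sperm."""
    new_fem = (fem * mal * (1 + s) + 0.5 * (fem * (1 - mal) + mal * (1 - fem))) / (1 + s * fem * mal)
    new_mal = fem * (1 + s) / (1 + s * fem)  # sons carry only a maternal copy
    return new_fem, new_mal


def mean_freq(system, pop):
    fem, mal = pop
    return (2 * fem + mal) / 3 if system == "haplodiploid" else fem


def trajectory(system, s, m, p0=0.1, generations=100_000):
    """Allele-frequency difference (population 1 minus population 2) per generation."""
    pops = [(p0, p0), (p0, p0)]  # (female, male); diploids use only the first entry
    coeffs = (s, -s)  # assumption: favoured in population 1, disfavoured in population 2
    diffs = []
    for _ in range(generations):
        selected = []
        for (fem, mal), coef in zip(pops, coeffs):
            if system == "haplodiploid":
                selected.append(select_haplodiploid(fem, mal, coef))
            else:
                p = select_diploid(fem, coef)
                selected.append((p, p))
        (f1, m1), (f2, m2) = selected
        # Symmetric migration after selection
        pops = [((1 - m) * f1 + m * f2, (1 - m) * m1 + m * m2),
                ((1 - m) * f2 + m * f1, (1 - m) * m2 + m * m1)]
        diffs.append(mean_freq(system, pops[0]) - mean_freq(system, pops[1]))
    return diffs


def time_to_equilibrium(traj, eps=1e-12, window=1000):
    """First generation at which the change over `window` generations and over
    the last generation are both below eps times the final value."""
    tol = eps * abs(traj[-1])
    for t in range(window, len(traj)):
        if abs(traj[t] - traj[t - window]) < tol and abs(traj[t] - traj[t - 1]) < tol:
            return t
    return None


# m matches Figure 5.4C; s and the shortened run length are illustrative choices
hd = trajectory("haplodiploid", s=0.1, m=0.0038, generations=20_000)
dip = trajectory("diploid", s=0.1, m=0.0038, generations=20_000)
print(hd[-1], dip[-1])
print(time_to_equilibrium(hd, eps=1e-6), time_to_equilibrium(dip, eps=1e-6))
```

Because males express the recessive allele directly, selection in this sketch acts on it even when it is rare, which is the intuition behind the larger equilibrium difference reported for haplodiploids.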

CHAPTER 6. CONCLUSION

My dissertation has started to connect genotype to phenotype to fitness during speciation between N. pinetum and N. lecontei. We found that opposing dominance between host choice and oviposition traits causes extrinsic postzygotic isolation between N. pinetum and N. lecontei. In short, hybrids prefer white pine, N. pinetum’s host, but have oviposition traits that are more similar to N. lecontei. This results in hybrid females butchering the pine needles and causing the eggs to die of desiccation. In contrast, we find no evidence of intrinsic postzygotic isolation, despite high levels of genetic difference. If Haldane’s rule is caused by dominance or faster-X, haplodiploids should evolve intrinsic postzygotic isolation quickly since all recessive mutations are exposed in the haploid males. The lack of intrinsic postzygotic isolation is potentially due to different mechanisms causing Haldane’s rule or due to an underappreciated facet of haplodiploid sex determination, where recombination in hybrid males can reconstitute viable phenotypes. With the addition of studies on more haplodiploid taxa, haplodiploids represent a promising set of taxa to untangle the mechanisms underlying Haldane’s rule.

Beyond understanding the phenotypic and genetic basis of specific forms of reproductive isolation, I explored genome-wide patterns of divergence between N. pinetum and N. lecontei. I found that there is a bimodal distribution of divergence, and that the fixed differences are spread throughout the genome. I hypothesized that the high level of and variability in divergence is due to more efficient selection in haplodiploids, and this hypothesis was supported by a meta-analysis of sex chromosomes and by simulations. Furthermore, simulations showed that the difference between haplodiploids and diploids was due to a lower probability of losing new recessive mutations to drift, differences in allele trajectory, and a higher migration-selection threshold.

Despite the progress made, there are still several areas in which a better understanding of the relationship between genotype, phenotype, and fitness is needed. Although we know the traits causing extrinsic postzygotic isolation, the genotypic basis still needs fleshing out. For example, there is evidence that some loci for ovipositor morphology and oviposition behavior overlap with fitness loci, but the QTLs are very wide. In order to know if these are in fact the same loci, there needs to be fine mapping of these traits. Ideally, this will be followed with functional testing to further confirm that these loci are adaptive and contribute to the speciation process (Barrett and Hoekstra 2011; Vertacnik and Linnen 2017). Our understanding of the genetic architecture of these oviposition traits is also still limited. The current evidence suggests that the genetic architecture of oviposition traits is multigenic, which is inconsistent with theoretical predictions. When there is divergence with gene flow, it is expected that there will be few large effect loci to overcome the homogenizing force of gene flow. However, I do not have effect size estimates, so it is possible that few large effect loci drove the initial evolution of reproductive isolation, followed by small effect loci once the migration rate was reduced, which would fit with theoretical predictions.

Additionally, the understanding of the evolution of Bateson-Dobzhansky-Muller incompatibilities and intrinsic isolation would be greatly enhanced by more theoretical work. Dominance theory and faster-X should lead to more incompatible interactions in haplodiploids compared to diploids, even if individual incompatibilities do not translate into complete inviability and sterility. Additionally, it is common for multiple incompatibilities to form the basis of intrinsic postzygotic isolation (Cabot et al. 1994; Presgraves 2010a; Guerrero et al. 2017). Therefore, it is likely that how haplodiploids and diploids differ in levels of intrinsic isolation varies across the speciation spectrum. When there are few incompatibilities, diploids are likely to have greater levels of isolation, but as the speciation process continues haplodiploids might have higher levels because they evolve multiple incompatibilities more rapidly. Modeling different amounts and types of epistatic interactions is essential for predicting when and how haplodiploids and diploids will differ in levels of intrinsic postzygotic isolation.

Here I have focused only on postzygotic isolation, but there is also evidence of prezygotic barriers, such as habitat isolation and sexual isolation. To fully understand the speciation process in this system it is necessary to quantify all of the barriers and determine their phenotypic and genetic basis. I also used individuals from one sympatric population of N. pinetum and N. lecontei for the majority of the experiments. N. lecontei uses a different set of pine hosts in other parts of its range, and both species are likely exposed to different selective pressures in different locations (Bagley 2017; Bagley et al. 2017). The spatial dynamics of reproductive isolation and genomic patterns of divergence are worth examining, especially because many of my hypotheses are related to the impacts of gene flow. Populations that are more spatially distant or are experiencing no or reduced gene flow would make an important control for the experiments reported here.

My examination of patterns of divergence in haplodiploids is also oversimplified. I only looked at the patterns of divergence when there was a single site under selection per linkage group. However, multiple sites under positive selection or background selection are likely to occur and affect patterns of divergence. Although both strongly and weakly selected loci tend to cluster when there is gene flow, the timing and patterns are highly dependent on the strength of selection, providing an additional opportunity for patterns of divergence to differ between haplodiploids and diploids (Rafajlović et al. 2016). Background selection has similar effects to positive selection by reducing neutral variation at linked loci, and purifying selection is also more efficient in haplodiploids (Avery 1984; Charlesworth et al. 1993; Charlesworth 2012). However, the effects of background selection on patterns of divergence are complicated by Hill-Robertson effects when deleterious mutations are linked to other loci under selection (Felsenstein 1974). The interaction between background selection and divergent selection is likely to have widespread consequences for patterns of divergence because regions of the genome that are likely to be under strong divergent selection (i.e. gene dense regions) also have a greater probability of experiencing background selection (Stephan 2010; Lohmueller et al. 2011; Cutter and Payseur 2013; Enard et al. 2014). Theoretical work on these more nuanced scenarios will aid in interpreting empirical patterns.

Overall, the natural history knowledge, the genomic resources, the ability to rear and cross individuals in the lab, their place on the speciation continuum, and current research make Neodiprion a promising emerging model system for speciation genomics. In order for the field to move forward, new model systems that range in geographic context, speciation progress, and genomic features are needed to make more general conclusions about the speciation process. The research in my dissertation also highlights how theory and empirical work can inform each other to efficiently push the field of evolutionary genetics forward.

APPENDICES

APPENDIX A. OVIPOSITION TRAITS GENERATE EXTRINSIC POSTZYGOTIC ISOLATION BETWEEN TWO PINE SAWFLY SPECIES

Table A1. Collection locations and number of females from each population used in parental phenotyping experiments.

                                                  Collection     Oviposition  Oviposition  Eggs Per  Egg               Ovipositor
Species      City, State     Latitude  Longitude  Host           Willingness  Preference   Needle    Spacing  Preslit  Morphology
N. pinetum   Crossville, TN  35.980    -85.015    P. strobus     43           23           6         0        5        5
N. pinetum   Lexington, KY   38.042    -84.442    P. strobus     5            5            0         0        0        5
N. pinetum   Lexington, KY   38.003    -84.525    P. strobus     2            1            1         1        1        0
N. pinetum   Lexington, KY   38.032    -84.565    P. strobus     5            4            4         4        3        5
N. pinetum   Lexington, KY   37.971    -84.498    P. strobus     1            1            1         1        1        5
N. pinetum   Lexington, KY   37.973    -84.500    P. strobus     4            2            1         1        0        5
N. pinetum   Georgetown, KY  38.249    -84.549    P. strobus     2            2            0         0        0        0
N. pinetum   Florence, KY    39.008    -84.65     P. strobus     2            2            0         0        0        0
Total                                                            64           40           13        7        10       25
N. lecontei  Lexington, KY   38.014    -84.504    P. virginiana  25           16           16        14       10       5
N. lecontei  Crossville, TN  35.980    -85.015    P. virginiana  6            3            3         3        2        5
N. lecontei  Spooner, WI     44.600    -84.713    P. banksiana   6            2            2         2        2        0
N. lecontei  Lexington, KY   38.014    -84.504    P. echinata    44           31           0         0        0        5
N. lecontei  Lexington, KY   38.014    -84.504    P. rigida      0            0            0         0        0        5
N. lecontei  Goshen, KY      38.402    -85.586    P. echinata    0            0            0         0        0        5
Total                                                            81           52           21        19       14       25

Bold indicates populations used in the interspecific crosses

Table A2. Collection locations for mature pine needles

Pine Species   City, State    Latitude  Longitude
P. banksiana   Necedah, WI    44.115    -90.118
P. echinata    Lexington, KY  37.973    -84.500
               London, KY     37.071    -84.211
P. resinosa    Necedah, WI    44.115    -90.118
P. rigida      Lexington, KY  37.973    -84.500
               Liberty, KY    37.221    -84.956
P. strobus     Lexington, KY  38.034    -84.506
P. virginiana  Lexington, KY  37.973    -84.500
               London, KY     37.071    -84.211

Table A3. Sample sizes for interspecific crosses

             Oviposition  Oviposition  Oviposition  Success on  Success on
Cross-type   Willingness  Preference   Success      P. strobus  P. banksiana
N. pinetum   54           23           18           15          3
BCP          53           26           25           22          3
F1 Hybrid    72           39           37           25          12
BCL          97           85           80           35          45
N. lecontei  19           10           9            0           9

Table A4. Tukey's HSD for mature needle widths

Comparison                      Difference  Lower 95% CI  Upper 95% CI  P-value
P. strobus vs. P. echinata      -0.34       -0.51         -0.18         1.8x10^-6
P. strobus vs. P. resinosa      -0.51       -0.68         -0.34         <1x10^-7
P. strobus vs. P. virginiana    -0.45       -0.62         -0.28         <1x10^-7
P. strobus vs. P. banksiana     -0.84       -1.00         -0.67         <1x10^-7
P. strobus vs. P. rigida        -0.93       -1.10         -0.76         <1x10^-7
P. echinata vs. P. resinosa     -0.17       -0.33         -0.00         0.047
P. echinata vs. P. virginiana   0.11        -0.06         0.27          0.40
P. echinata vs. P. banksiana    -0.49       -0.66         -0.33         <1x10^-7
P. echinata vs. P. rigida       -.06        -0.75         -0.42         <1x10^-7
P. resinosa vs. P. virginiana   -0.06       -0.23         0.11          0.89
P. resinosa vs. P. banksiana    -0.33       -0.49         -0.16         5.3x10^-6
P. resinosa vs. P. rigida       -0.42       -0.58         -0.25         <1x10^-7
P. virginiana vs. P. banksiana  -0.39       -0.55         -0.22         1x10^-7
P. virginiana vs. P. rigida     -0.48       -0.64         -0.31         <1x10^-7
P. banksiana vs. P. rigida      -0.09       -0.07         0.26          0.58

Table A5. Tukey's HSD post hoc test for P. strobus and P. banksiana needle widths

Comparison                                     Difference  Lower 95% CI  Upper 95% CI  P-value
Mature P. strobus vs. Mature P. banksiana      -0.84       -0.94         -0.73         <1x10^-7
P. banksiana Seedling vs. Mature P. banksiana  -0.39       -0.50         -0.29         <1x10^-7
P. strobus Seedling vs. Mature P. banksiana    -0.75       -0.86         -0.64         <1x10^-7
P. banksiana Seedling vs. Mature P. strobus    0.44        0.34          0.55          <1x10^-7
P. strobus Seedling vs. Mature P. strobus      0.09        -0.02         0.19          0.15
P. strobus Seedling vs. P. banksiana Seedling  -0.36       -0.46         -0.25         <1x10^-7


Figure A1. Host-dependent fitness ranking predictions under extrinsic postzygotic isolation. Cross-type is indicated as in Fig. 2 (P = N. pinetum, BCP = N. pinetum backcross, F1 = F1 hybrids, BCL = N. lecontei backcross, L = N. lecontei).

Figure A2. Seedling needles partially recapitulate differences between mature P. banksiana and P. strobus. Mean widths (+/- SEM) for needles taken from seedlings and mature pines (P. banksiana and P. strobus). Although P. banksiana needles are always thicker than P. strobus needles, P. banksiana seedlings have thinner needles than mature P. banksiana trees. Statistical significance at P


Figure A3. Oviposition success of N. lecontei and N. pinetum on P. banksiana and P. strobus. On P. strobus, N. lecontei has complete failure to hatch.

Eggs Collection Oviposition Oviposition Egg Ovipositor Species City, State Latitude Longitude Per Preslit Host Willingness Preference Spacing Morphology Needle

N. pinetum  Crossville, TN  35.980  -85.015  P. strobus  43  23  6  0  5  5
N. pinetum  Lexington, KY  38.042  -84.442  P. strobus  5  5  0  0  0  5
N. pinetum  Lexington, KY  38.003  -84.525  P. strobus  2  1  1  1  1  0
N. pinetum  Lexington, KY  38.032  -84.565  P. strobus  5  4  4  4  3  5
N. pinetum  Lexington, KY  37.971  -84.498  P. strobus  1  1  1  1  1  5
N. pinetum  Lexington, KY  37.973  -84.500  P. strobus  4  2  1  1  0  5
N. pinetum  Georgetown, KY  38.249  -84.549  P. strobus  2  2  0  0  0  0
N. pinetum  Florence, KY  39.008  -84.65  P. strobus  2  2  0  0  0  0
Total  64  40  13  7  10  25
N. lecontei  Lexington, KY  38.014  -84.504  P. virginiana  25  16  16  14  10  5
N. lecontei  Crossville, TN  35.980  -85.015  P. virginiana  6  3  3  3  2  5
N. lecontei  Spooner, WI  44.600  -84.713  P. banksiana  6  2  2  2  2  0
N. lecontei  Lexington, KY  38.014  -84.504  P. echinata  44  31  0  0  0  5
N. lecontei  Lexington, KY  38.014  -84.504  P. rigida  0  0  0  0  0  5
N. lecontei  Goshen, KY  38.402  -85.586  P. echinata  0  0  0  0  0  5

APPENDIX B. GENETIC ARCHITECTURE OF OVIPOSITION TRAITS AND HATCHING SUCCESS IN NEODIPRION PINETUM AND N. LECONTEI

Table B1 Cross history of the backcross families used in the QTL mapping

Backcross type  F1 family  Male family that F1 female was backcrossed to  Backcross family  # of backcross females from each family
lecontei  RX011  LX022v3  LBX001  28
lecontei  RX011  LX022v3  LBX002  4
lecontei  RX011  LX022v3  LBX005  16
lecontei  RX011  LX022v3  LBX007  9
lecontei  RX011  LX022v3  LBX008  14
lecontei  RX011  LX022v3  LBX009  24
lecontei  RX011  LX022v3  LBX010  28
lecontei  RX011  LX022v3  LBX012  1
lecontei  RX011  LX022v5  LBX011  7
lecontei  RX011  LL287v1  LBX004  1
lecontei  RX012  LX022v3  LBX013  19
lecontei  RX012  LX022v3  LBX014  3
lecontei  RX012  LX022v3  LBX016  11
lecontei  RX013  INV7  RX013xINV7-1  26
lecontei  LX024xNP066v1  LX029v1  LBX023  1
lecontei  LX024xNP066v1  LX043v1  LBX027  2
lecontei  LX024xNP066v1  LX024xNP066v1  LBX020  1
pinetum  RX004  PX014v2  PBX002  37
pinetum  RX004  PX014v2  PBX004  5
pinetum  RX004  PX014v2  PBX005  10
pinetum  RX004  PX014v2  PBX006  26
pinetum  RX004  PX014v2  PBX008  1
pinetum  RX004  PX014v2  PBX009  35
pinetum  RX004  PX014v2  PBX010  3
pinetum  RX004  PX014v2  PBX011  18
pinetum  RX004  PX014v2  PBX012  26
pinetum  RX004  PX014v2  PBX013  24

Backcross type  F2 male family  Female family that F2 male was backcrossed to  Backcross family  # of backcross females from each family
lecontei  PBX011  FEB1  FEB1xPBX011-1  3
lecontei  PBX012  FEB1  FEB1xPBX012-1  2
lecontei  PBX012  FEB1  FEB1xPBX012-3  3
lecontei  PBX013  FEB1  FEB1xPBX013-1  2
lecontei  RX008V1  LL276-17  KMM5  3
lecontei  INV7  LL276-02  KMM20*  3

* treated as a backcross in all analyses, but is actually pure N. lecontei


Table B2 Statistical results for oviposition traits across different cross types

Comparison  White choice P value  Eggs per needle Z score  Eggs per needle P value  Average egg spacing P value  Proportion of needles with preslits Z score  Proportion of needles with preslits P value  Ovipositor length P value  Ovipositor width P value  Extra annuli P value  Ovipositor shape Z score  Ovipositor shape P value
P-PBX  0.302  -4.238  <0.001  <0.001  0.01  1  <0.001  <0.001  <0.001  6.173  <0.001
P-F1  0.094  -5.516  <0.001  <0.001  -0.011  1  <0.001  <0.001  <0.001  5.8108  <0.001
P-LBX  <0.001  -10.624  <0.001  <0.001  -6.781  1  <0.001  <0.001  <0.001  15.682  <0.001
P-L  <0.001  -14.006  <0.001  <0.001  -0.012  1  <0.001  <0.001  <0.001  15.418  <0.001
PBX-F1  0.206  -3.011  0.0197  0.002  -2.97  0.018  <0.001  0.357  <0.001  4.2126  <0.001
PBX-LBX  <0.001  -11.532  <0.001  <0.001  -0.012  <0.001  <0.001  <0.001  <0.001  21.834  <0.001
PBX-L  <0.001  -14.783  <0.001  <0.001  -6.208  <0.001  <0.001  <0.001  <0.001  16.249  <0.001
F1-LBX  0.023  -2.867  0.031  0.331  1.100  0.765  0.269  <0.001  <0.001  1.2897  0.107
F1-L  <0.001  -7.161  <0.001  0.011  2.651  0.045  0.066  <0.001  1  4.8174  <0.001
LBX-L  <0.001  -7.15  <0.001  0.107  -2.393  0.089  0.52  <0.001  1  6.551  <0.001

Table B3 Pairwise post hoc tests for oviposition traits across different cross types

Trait  Sample size  Test statistic  Test statistic value  df  P value
White choice  300  Chi square  99.383  4  < 1e-16
Eggs per needle  300  Chi square  346.93  4  < 1e-16
Average egg spacing  263  F value  61.306  4  < 1e-16
Proportion of needles with preslits  294  Chi square  100.89  4  < 1e-16
Ovipositor length  445  F value  106.67  4  < 1e-16
Ovipositor width  445  F value  141.84  4  < 1e-16
Extra annuli  447  Chi square  238.46  4  < 1e-16
Ovipositor shape  444  Z score  10.457  4  < 1e-4


Table B4. Statistical results for the effects of oviposition traits on the presence of hatching

Comparison  Chi square  df  P
White host chosen  14.39  1  <0.001
Eggs per needle  0.298  1  0.585
Average egg spacing  1.504  1  0.22
Presence of preslit  0.559  1  0.455
Ovipositor length  7.363  1  0.007
Ovipositor width  2.955  1  0.086
Extra annuli  2.15  1  0.143
PCA 1  0.083  1  0.773
PCA 2  0.042  1  0.838
PCA 3  4.303  1  0.038
PCA 4  0.412  1  0.521
Eggs per needle: white host choice  0.032  1  0.857
Average egg spacing: white host choice  1.161  1  0.281
Presence of preslit: white host choice  8.989  1  0.003
Ovipositor length: white host choice  0.169  1  0.681
Ovipositor width: white host choice  0.019  1  0.891
Extra annuli: white host choice  3.002  1  0.083
PCA 1: white host choice  0.52  1  0.471
PCA 2: white host choice  1.623  1  0.203
PCA 3: white host choice  0.412  1  0.521
PCA 4: white host choice  1.802  1  0.179
Family  38.558  28  0.088
Leg length  1.658  1  0.198


Table B5. Summary of QTL mapping results for oviposition traits and hatching success

Trait  # of loci  Trait type  Sample size
Oviposition willingness  2  Behavior  390
White chosen  0  Behavior  229
Average egg spacing  3  Behavior  222
Average eggs per needle  8  Behavior  229
Proportion of needles with preslits  10  Behavior  227
Presence of preslit  3  Behavior  228
Ovipositor length  9  Morphology  383
Ovipositor width  12  Morphology  383
Ovipositor shape - PCA 1  2  Morphology  383
Ovipositor shape - PCA 2  6  Morphology  383
Ovipositor shape - PCA 3  8  Morphology  383
Ovipositor shape - PCA 4  3  Morphology  383
Extra annuli  2  Morphology  384
Proportion of eggs that hatched  2  Fitness  230
Hatching presence  1  Fitness  222
Proportion of eggs that hatched on Jack  4  Fitness  95
Hatching presence on Jack  0  Fitness  95
Proportion of eggs that hatched on white  10  Fitness  127
Hatching presence on white  1  Fitness  127


A. Horizontal axis: PC 1: 40.53%. Vertical axis: PC 2.

B. Horizontal axis: PC 3: 9.65%. Vertical axis: PC 4.

Figure B1. Principal components analysis of ovipositor shape in the backcross mapping population. The warp grids show the shape of the ovipositor along each PC axis. A. PC axes 1 and 2. B. PC axes 3 and 4.


APPENDIX C. LACK OF INTRINSIC POSTZYGOTIC ISOLATION IN HAPLODIPLOID HYBRID MALES DESPITE HIGH GENETIC DISTANCE

Table C1. Sampling locations for N. pinetum and N. lecontei used in fertility and viability estimates.

Population  Data sets collected (one X per data set, among: Female viability, Female fertility, Male viability, Male fertility - sperm motility, Male mating success, Male female production)  State  Latitude  Longitude

N. lecontei
Arboretum  X X X X X X  KY  38.014  -84.504
Crossville, TN  X X X  TN  35.980  -85.015
Spooner  X  WI  45.822  -91.888
Grayling, MI  X X X  MI  44.657  -84.696
High St.  X X  KY  38.044  -84.497
Tates Creek  X  KY  37.970  -84.511
Clay's Mill  X  KY  37.984  -84.559

N. pinetum
Ecton Park  X  KY  38.016  -84.490
Regency Rd  X X X  KY  38.003  -84.525
Man 'O War  X  KY  37.971  -84.498
Georgetown South  X X X X X X  KY  38.249  -84.549
Georgetown North  X X X X  KY  38.248  -84.545
Cardinal Run  X X X X  KY  38.032  -84.565
Belleau Woods  X X X  KY  37.973  -84.500
Crossville, TN  X X X  TN  35.980  -85.015
McConnel Springs  X X X X X X  KY  38.056  -84.528
Starshoot Pkwy  X X X X X X  KY  38.025  -84.423
Walton  X  KY  38.857  -84.618
Florence  X  KY  39.008  -84.650
Waverly  X  KY  37.989  -84.574

Table C2. Statistics for mated and virgin female fertility. Single step adjusted P-values are reported.

Fertility measurement  Test statistic  Test statistic value  df  P
Mated oviposition willingness  Chisq  11.415  3  9.68E-03
Mated egg number  F  8.981  3  1.51E-05
Virgin oviposition willingness  Chisq  14.316  2  7.78E-04
Virgin egg number  F  3.113  2  0.0537

Table C3. Tukey’s HSD post-hoc tests for mated and virgin female fertility. Single step adjusted P-values are reported.

Comparison  Mated oviposition willingness Z value  P  Mated egg number t value  P  Virgin oviposition willingness Z value  P
N. pinetum vs. PxL  -1.10  0.685  4.04  <0.001  NA  NA
N. pinetum vs. LxP  -0.402  0.977  3.78  1.27E-03  1.52  0.281
N. pinetum vs. N. lecontei  -2.98  0.0145  -4.29  <0.001  -3.68  <0.001
PxL vs. LxP  1.20  0.621  -0.355  0.984  NA  NA
PxL vs. N. lecontei  -1.09  0.690  0.765  0.96  NA  NA
LxP vs. N. lecontei  -2.42  0.0699  1.19  0.622  -1.60  0.244

Table C4. Tukey’s HSD post-hoc tests for male viability (colonies that had adult emergence). Single step adjusted P-values are reported.

Comparison  Z  P
N. pinetum vs. PxL  -1.43  0.416
N. pinetum vs. LxP  -0.868  0.784
N. pinetum vs. N. lecontei  -0.012  1
PxL vs. LxP  1.01  0.696
PxL vs. N. lecontei  -0.011  1
LxP vs. N. lecontei  -0.011  1


N. lecontei X N. pinetum

F1 female viability (survival to adulthood)

F1 female fertility (oviposition willingness and egg number), unmated or backcrossed

F2 male viability (survival to adulthood)

F2 male fertility (sperm motility)

Backcrossed to N. lecontei female: F2 male fertility (mating willingness)

F2 male fertility (production of daughters)

Figure C1. Diagram of one direction of the crosses illustrating when each set of reproductive isolation data was collected. The reciprocal cross was also performed for F1 female viability and fertility and F2 male viability. The rectangles represent the genotype of the individual. Light grey is N. lecontei and dark grey is N. pinetum genetic material. The haploids (single rectangle) are males and the diploids (2 rectangles) are female.


Figure C2. Virgin female fertility. A. The proportion of females that oviposited. B. The average number of eggs laid for pure and hybrid females. Error bars represent SE. Different letters denote pairwise comparisons that are significantly different in post-hoc tests; a lack of letters indicates that there were no significant differences.


APPENDIX D. HAPLODIPLOIDY LEADS TO HETEROGENEOUS GENOMIC DIVERGENCE DURING SPECIATION WITH GENE FLOW

Table D1. Sampling locations, population identification, and barcodes for Neodiprion sawflies Individual Species Population Latitude Longitude Index Barcode NP002L_01 N. pinetum KY 38.026 -84.495 TGACCAAT GCAAGCCAT NP005L_01 N. pinetum KY 38.042 -84.442 CCGTCCCG CGTCGCCACT NP006_08 N. pinetum KY 38.042 -84.442 GTGAAACG GCGTCCT NP015_01 N. pinetum KY 38.016 -84.490 TTAGGCAT TGGCACAGA NP018_V1 N. pinetum KY 38.003 -84.525 GTGAAACG GCAAGCCAT NP019_01 N. pinetum KY 38.003 -84.525 CCGTCCCG ATATCGCCA NP021L_01 N. pinetum KY 38.857 -84.618 TGACCAAT ATAGAT NP022L_01 N. pinetum KY 38.857 -84.618 CGATGTAT CGTCGCCACT NP023L_01 N. pinetum KY 38.857 -84.618 TGACCAAT AACTGG NP024L_01 N. pinetum KY 38.857 -84.618 CCGTCCCG ACAACCAACT NP025L_01 N. pinetum KY 38.857 -84.618 CCGTCCCG TATTCGCAT NP026L_01 N. pinetum KY 38.857 -84.618 GTGAAACG GGAACGA NP027L_01 N. pinetum KY 38.857 -84.618 TTAGGCAT TAGCCAA NP028L_01 N. pinetum KY 38.857 -84.618 GTGAAACG ACGGTACT NP029L_01 N. pinetum KY 38.857 -84.618 ATCACGAT TATGT NP030L_01 N. pinetum KY 38.857 -84.618 GTCCGCAC CCTCG NP034_02 N. pinetum KY 38.032 -84.565 ATCACGAT CGTGGACAGT NP036_04 N. pinetum KY 38.032 -84.565 TTAGGCAT TGACGCCA NP037L_01b N. pinetum KY 38.032 -84.565 GTCCGCAC GGTGT NP038_01 N. pinetum KY 38.003 -84.525 TGACCAAT ACAACCAACT NP040_02 N. pinetum KY 37.973 -84.500 CGATGTAT TCACGGAAG NP041_01 N. pinetum KY 37.973 -84.500 TTAGGCAT AAGACGCT NP047_01 N. pinetum KY 39.008 -84.650 GTCCGCAC TATTCGCAT LL002_01 N. lecontei KY 38.014 -84.504 GTGAAACG GAGCGACAT LL006_02b N. lecontei KY 38.014 -84.504 GTGAAACG TCAGAGAT LL010 N. lecontei KY 36.928 -84.619 GTGAAACG AACTGG LL011 N. lecontei KY 37.071 -84.211 AGTCAACA CTAAGCA LL013 N. lecontei KY 37.071 -84.211 TGACCAAT ACAACT LL014 N. lecontei KY 37.071 -84.211 CCGTCCCG ACTGCGAT LL015_01 N. lecontei KY 37.071 -84.211 ATCACGAT TCAGAGAT LL053 N. lecontei KY 38.014 -84.504 TGACCAAT TCTTGG LL064 N. lecontei KY 38.014 -84.504 TTAGGCAT ACGGTACT LL074_1R N. 
lecontei KY 38.014 -84.504 AGTCAACA TCAGAGAT LL097 N. lecontei KY 38.044 -84.497 GTGAAACG GGCTTA LL098_02 N. lecontei KY 38.044 -84.497 ATCACGAT AACTGG LL099_02 N. lecontei KY 38.044 -84.497 TGACCAAT ATTAT LL100 N. lecontei KY 38.023 -84.494 GTCCGCAC TGGCACAGA


LL110_02 N. lecontei KY 38.024 -84.532 ATCACGAT TCTTGG LL111_02 N. lecontei KY 38.024 -84.532 TTAGGCAT ATTAT LL112 N. lecontei KY 38.024 -84.532 CCGTCCCG TAGCCAA LL113 N. lecontei KY 38.024 -84.532 GTCCGCAC TGGCAACAGA LL117 N. lecontei KY 37.984 -84.418 CCGTCCCG CACCA LL123 N. lecontei KY 38.033 -84.507 TGACCAAT CCTCG LL124 N. lecontei KY 38.033 -84.507 ATCACGAT ATATCGCCA LL129 N. lecontei KY 38.044 -84.497 TTAGGCAT CCGAACA LL130 N. lecontei KY 38.044 -84.497 AGTCAACA AACGTGCCT LL133 N. lecontei KY 38.024 -84.532 CGATGTAT CTAAGCA LL160 N. lecontei KY 38.014 -84.504 TTAGGCAT ATAGAT LL181 N. lecontei KY 38.402 -85.586 GTGAAACG CACCA LL194_02 N. lecontei KY 37.806 -83.678 TGACCAAT TGCTT LL195_02 N. lecontei KY 37.806 -83.678 GTCCGCAC CTTGA RB017.05 N. lecontei KY 37.984 -84.511 AGTCAACA TAGCCAA RB020Db N. lecontei KY 37.066 -84.159 GTCCGCAC ATTAT RB022C N. lecontei KY 38.024 -84.494 TTAGGCAT TCACTG RB040_02 N. lecontei KY 38.209 -84.390 GTGAAACG ATGAGCAA RB076_01 N. lecontei KY 38.014 -84.504 GTGAAACG TGACGCCA RB129_01 N. lecontei KY 38.014 -84.504 TTAGGCAT TCAGAGAT RB144 N. lecontei KY 38.209 -84.390 AGTCAACA TCACTG RB148 N. lecontei KY 38.209 -84.390 TTAGGCAT GAAGTG RB149_02 N. lecontei KY 38.209 -84.390 ATCACGAT TAGCCAA RB151 N. lecontei KY 38.209 -84.390 GTGAAACG GCCTACCT RB161_02 N. lecontei KY 37.984 -84.418 GTCCGCAC ACTGCGAT RB162 N. lecontei KY 38.024 -84.494 GTGAAACG AAGACGCT RB344 N. lecontei KY 38.014 -84.504 AGTCAACA CCTTGCCATT RB358 N. lecontei KY 38.014 -84.504 TTAGGCAT CTTGA RB361 N. lecontei KY 37.984 -84.418 AGTCAACA AACTGG RB370_02 N. lecontei KY 38.033 -84.507 GTCCGCAC AACTGG RB099_01 N. lecontei UPMI 45.924 -86.303 ATCACGAT CAGATA RB405 N. lecontei UPMI 45.949 -86.261 GTGAAACG ACTGCGAT RB406 N. lecontei UPMI 45.949 -86.261 TTAGGCAT ACAACCAACT RB095_04 N. lecontei UPMI 46.094 -85.339 TGACCAAT TCAGAGAT RB096B N. lecontei UPMI 46.096 -85.394 GTGAAACG TAGCCAA RB098_01 N. lecontei UPMI 46.096 -85.394 CCGTCCCG AACGTGCCT RB480_02 N. 
lecontei UPMI 46.096 -85.394 GTCCGCAC ACAACCAACT RB480_1 N. lecontei UPMI 46.096 -85.394 TGACCAAT CGTGGACAGT RB481 N. lecontei UPMI 46.096 -85.394 AGTCAACA CGTCGCCACT RB482 N. lecontei UPMI 46.096 -85.394 GTGAAACG TCTTGG RB483 N. lecontei UPMI 46.096 -85.394 TGACCAAT CGTCGCCACT RB407 N. lecontei UPMI 45.918 -86.313 TTAGGCAT CCACTCA RB408 N. lecontei UPMI 45.918 -86.313 CGATGTAT CGTGGACAGT


RB409 N. lecontei UPMI 45.926 -86.294 TTAGGCAT ACCAGGA RB410 N. lecontei UPMI 45.926 -86.294 TGACCAAT GGCTTA RB411 N. lecontei UPMI 45.926 -86.294 GTCCGCAC TAGCCAA RB412 N. lecontei UPMI 45.926 -86.294 AGTCAACA CGTGGACAGT RB413 N. lecontei UPMI 45.926 -86.294 CCGTCCCG GGTGCACATT LL198 Wild Hybrid KY 38.857 -84.618 TTAGGCAT CGTCGCCACT LL244 Wild Hybrid KY 38.014 -84.504 ATCACGAT GGTGCACATT NP046 Wild Hybrid KY 38.893 -84.557 GTGAAACG CTCTCGCAT S23 F1 Lab Hybrid NA NA NA TGACCAAT CTAAGCA

Table D2. CV error scores for admixture models of K=1-5

K  CV error
1  0.4853375
2  0.3911959
3  0.4113284
4  0.4259785
5  0.440888
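Cross-validation error is commonly used to choose the number of ancestral clusters, with the lowest value preferred. As a minimal sketch of that selection rule (the dictionary name is illustrative, and the rule itself is assumed here since the table does not state how K was chosen), the values in Table D2 can be compared directly:

```python
# CV error scores transcribed from Table D2, keyed by K.
cv_error = {
    1: 0.4853375,
    2: 0.3911959,
    3: 0.4113284,
    4: 0.4259785,
    5: 0.440888,
}

# Pick the K with the smallest cross-validation error.
best_k = min(cv_error, key=cv_error.get)
print(best_k, cv_error[best_k])
```

With these values the minimum falls at K=2, and the error rises steadily for K=3 through K=5.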


Table D3. Demographic model of N. pinetum and N. lecontei divergence

Parameter  No migration  Secondary contact - one burst of admixture  Continuous migration that starts after divergence  Continuous migration that ends before present day  Continuous migration
# of parameters estimated  7  7  7  7  6
Ancestral population size  2075281  1962052  1971660  2006404  1982187
N. pinetum population size  645779  369032  322897  337579  328311
N. lecontei population size  1057278  1052921  1104528  1075336  1093739
Time since divergence (generations)  1010864  1255392  1542789  1553120  1548690
N. pinetum bottleneck size  717  NA  NA  NA  NA
N. lecontei bottleneck size  470  NA  NA  NA  NA
Time since bottleneck  1010854  NA  NA  NA  NA
Admixture proportion from N. pinetum to N. lecontei  NA  0.0060941  NA  NA  NA
Admixture proportion from N. lecontei to N. pinetum  NA  0.118292  NA  NA  NA
Time since admixture  NA  115046  NA  NA  NA
Migration rate from N. lecontei to N. pinetum  NA  NA  3.94E-07  3.63E-07  3.65E-07
Migration rate from N. pinetum to N. lecontei  NA  NA  1.64E-08  1.83E-08  1.71E-08
Number of N. pinetum migrants  NA  NA  0.1273259  0.1226254  0.1196984
Number of N. lecontei migrants  NA  NA  0.0181423  0.0196645  0.018665
Time since migration started  NA  NA  901891  NA  NA
Time since migration ended  NA  NA  NA  1906  NA
Estimated maximum likelihood  -33034.2  -32805.496  -32794.575  -32795.01  -32794.283
Maximum observed likelihood  -32727.859  -32727.859  -32727.859  -32727.859  -32727.859
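The migrant numbers in Table D3 appear to equal the population size of the receiving species multiplied by the migration rate into it. That relationship is inferred from the arithmetic, not stated in the table, so the sketch below treats it as an assumption and checks it against the three continuous-migration models. The dictionary keys and field names are illustrative.

```python
# Values transcribed from Table D3 for the three continuous-migration models.
models = {
    "starts after divergence": {
        "n_pinetum": 322897, "n_lecontei": 1104528,
        "m_into_pinetum": 3.94e-07, "m_into_lecontei": 1.64e-08,
        "reported_pinetum": 0.1273259, "reported_lecontei": 0.0181423,
    },
    "ends before present day": {
        "n_pinetum": 337579, "n_lecontei": 1075336,
        "m_into_pinetum": 3.63e-07, "m_into_lecontei": 1.83e-08,
        "reported_pinetum": 0.1226254, "reported_lecontei": 0.0196645,
    },
    "continuous": {
        "n_pinetum": 328311, "n_lecontei": 1093739,
        "m_into_pinetum": 3.65e-07, "m_into_lecontei": 1.71e-08,
        "reported_pinetum": 0.1196984, "reported_lecontei": 0.018665,
    },
}


def relative_error(computed, reported):
    return abs(computed - reported) / reported


# Assumed relationship: migrants = recipient population size * migration rate.
errors = []
for name, p in models.items():
    pinetum = p["n_pinetum"] * p["m_into_pinetum"]
    lecontei = p["n_lecontei"] * p["m_into_lecontei"]
    errors.append(relative_error(pinetum, p["reported_pinetum"]))
    errors.append(relative_error(lecontei, p["reported_lecontei"]))
    print(f"{name}: {pinetum:.4f} into N. pinetum, {lecontei:.4f} into N. lecontei")

print(max(errors))
```

The products match the reported values to within the rounding of the tabulated migration rates. In every model fewer than one migrant per generation is estimated in either direction, with roughly six to seven times more migrants entering N. pinetum than N. lecontei.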


Table D4. Samples and values for meta-analysis

Comparison  Mean Fst Autosome  Variance Fst Autosome  Mean Fst Sex  Variance Fst Sex  Sex Chrom.  Sequencing  Study
Teleogryllus commodus vs. T. oceanicus  0.213  0.054  0.237  0.083  X  RAD  (Moran et al. 2018)
Canis lupus vs. C. l. familiaris  0.152  0.040  0.220  0.067  X  WGS  (Cagan and Blass 2016)
Phylloscopus trochiloides plumbeitarsus vs. P. t. trochiloides  0.064  0.018  0.099  0.036  Z  GBS  (Irwin et al. 2016)
Phylloscopus trochiloides viridanus vs. P. t. plumbeitarsus  0.107  0.042  0.169  0.083  Z  GBS  (Irwin et al. 2016)
Phylloscopus trochiloides viridanus vs. P. t. trochiloides  0.109  0.040  0.169  0.083  Z  GBS  (Irwin et al. 2016)
Heliconius m. amaryllis vs. H. m. aglaope  0.008  0.002  0.040  0.006  Z  WGS  (Martin et al. 2013)
H. m. amaryllis vs. H. m. melpomene (FG)  0.070  0.035  0.169  0.063  Z  WGS  (Martin et al. 2013)
H. m. rosina vs. H. melpomene (FG)  0.116  0.055  0.232  0.091  Z  WGS  (Martin et al. 2013)
H. m. rosina vs. H. m. amaryllis  0.080  0.041  0.173  0.065  Z  WGS  (Martin et al. 2013)
H. t. thelxinoe (Per) vs. H. m. amaryllis  0.097  0.054  0.339  0.143  Z  WGS  (Martin et al. 2013)
H. c. chioneus vs. H. m. rosina  0.089  0.047  0.229  0.103  Z  WGS  (Martin et al. 2013)
H. t. thelxinoe (Per) vs. H. melpomene (FG)  0.162  0.082  0.429  0.162  Z  WGS  (Martin et al. 2013)
H. t. thelxinoe (Per) vs. H. m. rosina  0.137  0.069  0.413  0.161  Z  WGS  (Martin et al. 2013)
H. c. chioneus vs. H. melpomene (FG)  0.137  0.074  0.249  0.112  Z  WGS  (Martin et al. 2013)
H. c. chioneus vs. H. m. amaryllis  0.100  0.056  0.197  0.088  Z  WGS  (Martin et al. 2013)
H. c. chioneus vs. H. t. thelxinoe (Per)  0.107  0.053  0.208  0.088  Z  WGS  (Martin et al. 2013)
Centrocercus urophasianus vs. C. minimus  0.119  0.015  0.123  0.009  Z  RAD  (Oyler-McCance et al. 2015)
Catharus ustulatus ustulatus vs. C. u. swainsoni  0.063  0.013  0.140  0.069  Z  RAD  (Ruegg et al. 2014)
Phylloscopus collybita abietinus vs. P. tristis Sympatric  0.040  0.011  0.050  0.013  Z  WGS  (Talla et al. 2017)
Phylloscopus collybita abietinus vs. P. tristis Allopatric  0.074  0.022  0.040  0.011  Z  WGS  (Talla et al. 2017)
Heliconius melpomene Ecuador  0.097  0.040  0.090  0.032  Z  RAD  (Nadeau et al. 2014)
Heliconius erato Ecuador  0.084  0.041  0.089  0.044  Z  RAD  (Nadeau et al. 2014)
Heliconius erato Peru  0.085  0.043  0.082  0.039  Z  RAD  (Nadeau et al. 2014)
Bos taurus taurus vs. B. t. indicus  0.294  0.076  0.640  0.101  X  SNP chip  (Porto-Neto et al. 2013)
Corvus corone vs. C. cornix  0.030  0.004  0.048  0.008  Z  WGS  (Vijay et al. 2016)
Corvus corone vs. C. orientalis  0.052  0.006  0.109  0.024  Z  WGS  (Vijay et al. 2016)
Corvus cornix vs. C. orientalis  0.051  0.008  0.104  0.026  Z  WGS  (Vijay et al. 2016)
Anopheles gambiae vs. A. coluzzii  0.016  0.003  0.015  0.006  X  WGS  (Consortium 2017)

* Mean Fst is the average of the per-site Fst.
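A simple way to summarize Table D4 is the ratio of mean Fst on the sex chromosome to mean Fst on the autosomes for each comparison. The sketch below does this for five rows transcribed from the table; the ratio is offered only as an illustrative summary and is not claimed to be the statistic used in the meta-analysis itself.

```python
# (comparison, mean Fst autosome, mean Fst sex chromosome), from Table D4.
rows = [
    ("Teleogryllus commodus vs. T. oceanicus", 0.213, 0.237),
    ("Canis lupus vs. C. l. familiaris", 0.152, 0.220),
    ("Phylloscopus collybita abietinus vs. P. tristis Allopatric", 0.074, 0.040),
    ("Bos taurus taurus vs. B. t. indicus", 0.294, 0.640),
    ("Anopheles gambiae vs. A. coluzzii", 0.016, 0.015),
]

# Ratio above 1 means the sex chromosome is more differentiated
# than the autosomes for that comparison.
ratios = {name: sex / autosome for name, autosome, sex in rows}
elevated = [name for name, ratio in ratios.items() if ratio > 1]

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
print(len(elevated), "of", len(rows), "comparisons have higher Fst on the sex chromosome")
```

In this subset, three of the five comparisons show higher differentiation on the sex chromosome, with the Bos taurus comparison showing the largest ratio.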

A. Labels: Pop 1, Pop 2, 500 kb, Ne 1000, Time (8,000 gen; 2,000 gen).

B. Vertical axis: Fitness (1, 1+s). Horizontal axis: Genotype (AA, Aa, aa).

Figure D1 A. Schematic of the simulations. Grey bar represents 500kb chromosome that was simulated, and the red arrow is the position of the divergently selected mutation that occurred at the time of divergence. B. Fitness model of the recessive divergently selected mutation. The heterozygote has the same fitness as the ancestral homozygote.
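The fitness model in panel B can be written as a small function. This is a sketch under stated assumptions: the caption does not say which allele is derived, so `a` is assumed to be the divergently selected allele here, and the selection coefficient passed in is a hypothetical value chosen only for illustration.

```python
def fitness(genotype, s, derived="a"):
    """Relative fitness under a fully recessive, divergently selected allele.

    Only the homozygote for the derived allele gains the advantage s;
    the heterozygote has the same fitness as the ancestral homozygote.
    The choice of "a" as the derived allele is an assumption.
    """
    copies = genotype.count(derived)
    return 1.0 + s if copies == 2 else 1.0


s = 0.1  # hypothetical selection coefficient, for illustration only
for genotype in ("AA", "Aa", "aa"):
    print(genotype, fitness(genotype, s))
```

Because the heterozygote is not favored, a new recessive mutation of this kind is exposed to selection in diploids only once homozygotes appear.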


REFERENCES

Adams, D. C., and E. Otárola-Castillo. 2013. Geomorph: An r package for the collection and analysis of geometric morphometric shape data. Methods Ecol. Evol. 4:393– 399. Agrawal, A., J. Feder, and P. Nosil. 2011. Ecological divergence and the origins of intrinsic postmating isolation with gene flow. Int. J. Ecol. 435357. Al-Shahrour, F., P. Minguez, T. Marqués-Bonet, E. Gazave, A. Navarro, and J. Dopazo. 2010. Selection upon Genome Architecture: Conservation of Functional Neighborhoods with Changing Genes. PLoS Comput. Biol. 6:e1000953. Alexander, D. H., J. Novembre, and K. Lange. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19:1655–1664. Anderson, J. T., C.-R. Lee, and T. Mitchell-Olds. 2011. Life-history QTLs and natural selection on flowering time in Boechere stricta, a perennial relative of Arabadopsis. Evolution 65:771–787. Arnegard, M. E., M. D. Mcgee, B. Matthews, K. B. Marchinko, G. L. Conte, S. Kabir, N. Bedford, S. Bergek, Y. F. Chan, F. C. Jones, D. M. Kingsley, C. L. Peichel, and D. Schluter. 2014. Genetics of ecological divergence during speciation. Nature 511:307–311. Averill, R. D., L. F. Wilson, and R. F. Fowler. 1982. Impact of the Redheaded Pine Sawfly (Hymenoptera: Diprionidae) on Young Red Pine Plantations. Gt. Lakes Entomol. 15:65–96. Avery, P. J. 1984. The population genetics of haplo-diploids and X-linked genes. Genet. Res., Camb 44:321–341. Bagley, R. 2017. Examining the role of host use on divergence in the redheaded pine sawfly, Neodiprion lecontei, across multiple spatial scales. Theses Diss., doi: https://doi.org/10.13023/ETD.2017.285. Bagley, R. K., V. C. Sousa, M. L. Niemiller, and C. R. Linnen. 2017. History, geography, and host use shape genome-wide patterns of genetic differentiation in the redheaded pine sawfly (Neodiprion lecontei). Mol. Ecol. 26:1022–1044. Barreto, F. S., and R. S. Burton. 2012. 
Evidence for compensatory evolution of ribosomal proteins in response to rapid divergence of mitochondrial rRNA. Mol. Biol. Evol 30:310–314. Barrett, R. D. H., and H. E. Hoekstra. 2011. Molecular spandrels: tests of adaptation at the genetic level. Nat. Rev. Genet. 12:767–780. Barrett, R. D. H., and D. Schluter. 2008. Adaptation from standing genetic variation. Trends Ecol. Evol. 23:38–44. Barton, N., and B. O. Bengtssont. 1986. The barrier to genetic exchange between hybridising populations. Heredity 56:357–376. Bateson, W. 1909. Heredity and variation in modern lights. Pp. 85–101 in A. C. Seward,

121

ed. Danvin and Modern Science. Cambridge University Press, Cambridge. Beaumont, M. a., and D. J. Balding. 2004. Identifying adaptive genetic divergence among populations from genome scans. Mol. Ecol. 13:969–980. Bendall, E. E., R. K. Bagley, V. C. Sousa, and C. R. Linnen. n.d. Haplodiploidy leads to greater and more variable genomice divergence. Bendall, E. E., K. L. Vertacnik, and C. R. Linnen. 2017. Oviposition traits generate extrinsic postzygotic isolation between two pine sawfly species. BMC Evol. Biol. 17:26. BioMed Central. Benjamin, D. M. 1955. The Biology and Ecology of the Red-headed Pine Sawfly. U.S. Department of Agriculture. Berlocher, S. H., and J. L. Feder. 2002. in phytophagous insects : Moving Beyond Controversy ? Annu. Rev. Entomol. 47:773–815. Bierne, N., P.-A. Gagnaire, and P. David. 2013. The geography of introgression in a patchy environment and the thorn in the side of ecological speciation. Curr. Zool. 59:72–86. Blackman, B. K. 2016. Speciation Genes. Encycl. Evol. Biol. 4:166–175. Boughman, J. W. 2002. How sensory drive can promote speciation. TRENDS Ecol. Evol. 17:571–577. Broman, K. W., D. M. Gatti, P. Simecek, N. A. Furlotte, P. Prins, Ś. Sen, B. S. Yandell, and G. A. Churchill. 2019. R/qtl2: Software for mapping quantitative trait loci with high-dimensional data and multiparent populations. Genetics 211:495–502. Brothers, A. N., and L. F. Delph. 2010. Haldane’s rule is extended to plants with sex chromosomes. Evolution 64:3643–3648. Butlin, R. K., J. Galindo, and J. W. Grahame. 2008. Sympatric, parapatric or allopatric: the most important way to classify speciation? Phil. Trans. R. Soc. B 363:2997– 3007. Byers, K. J. R. P., S. Xu, and P. M. Schlüter. 2017. Molecular mechanisms of adaptation and speciation: why do we need an integrative approach? Mol. Ecol. 26:277–290. Caballero, A. 1995. On the effective size of populations with separate sexes, with particular reference to sex-linked genes. Genetics 139:1007–11. Cabot, E. L., A. W. 
Davis, N. A. Johnson, and C. I. Wu. 1994. Genetics of reproductive isolation in the Drosophila simulans : Complex epistasis underlying hybrid male sterility. Genetics 137:175–189. Cagan, A., and T. Blass. 2016. Identification of genomic variants putatively targeted by selection during dog domestication. BMC Evol. Biol. 16:10. l. Cahenzli, F., C. Bonetti, and A. Erhardt. 2018. Divergent strategies in pre- and postzygotic reproductive isolation between two closely related Dianthus species. Evolution 72:1851–1862. Carling, M. D., and R. T. Brumfield. 2008. Haldane’s rule in an avian system using

122

theory and divergence population genetics to test for differential introgression of mitochondrial, autosomal, and sex-linked loci across the Passerina bunting hybrid Zone. Evolution 62:2600–2615. Carroll, S. B. 2008. Evo-Devo and an Expanding Evolutionary Synthesis: A Genetic Theory of Morphological Evolution. Cell 134:25–36. Carroll, S. B. 2006. The Making of the Fittest: DNA and the Ultimate Forensic Record of Evolution - Sean B. Carroll - Google Books. W.W Norton & Company, New York. Charlesworth, B. 1992. Evolutionary rates in partially self-fertilizing species. Am. Nat. 140:126–148. Charlesworth, B. 2012. The Role of Background Selection in Shaping Patterns of and Variation: Evidence from Variability on the Drosophila X Chromosome. Genetics 191:233–246. Charlesworth, B., J. L. Campos, and B. C. Jackson. 2018. Faster-X evolution: Theory and evidence from Drosophila. Mol. Ecol. 27:3753–3771. Charlesworth, B., J. A. Coyne, and N. H. Barton. 1986. The relative rates of evolution of sex chromosomes and autosomes. Am. Nat. 130:113–136. Charlesworth, B., M. T. Morgan, and D. Charlesworth. 1993. The Effect of Deleterious Mutations on Neutral Molecular Variation. Genetics 134:1289–1303. Christie, K., and S. Y. Strauss. 2018. Along the speciation continuum: Quantifying intrinsic and extrinsic isolating barriers across five million years of evolutionary divergence in jewelflowers. Evolution 72:1063–1079. Codella, S. G., and K. F. Raffa. 2002. Desiccation of Pinus foliage induced by sawfly oviposition: effect on egg viability. Ecol. Entomol. 27:618–621. Consortium, T. A. gambiae 1000 G. 2017. Genetic diversity of the African malaria vector anopheles gambiae. Nature 552:96–100. Cook, J. M. 1993. Sex determination in the Hymenoptera: a review of models and evidence. Heredity 71:421–435. Coppel, H., and D. Benjamin. 1965. Bionomics of the- nearctic pine-feeding diprionids. Annu. Rev. Entomol. 10:69–96. Corbett-Detig, R., and R. Nielsen. 2017. 
A Hidden Markov Model Approach for Simultaneously Estimating Local Ancestry and Admixture Time Using Next Generation Sequence Data in Samples of Arbitrary Ploidy. PLoS Genet. 13:e1006529. Coyne, J. A., and H. Orr. 1997. “Patterns of speciation in Drosophila” revisited. Evolution 51:295–303. Coyne, J. A., and H. Orr. 1998. The evolutionary genetics of speciation. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 353:287–305. Coyne, J., S. Elwyn, S. Kim, and A. Llopart. 2004. Genetic studies of two sister species in the Drosophila melanogaster subgroup, D. yakuba and D. santomea. Genet. Res. Camb. 84:11–26.

Coyne, J., and H. Orr. 1989. Patterns of Speciation in Drosophila. Evolution 43:362–381. Coyne, J., and H. Orr. 2004. Speciation. Sinauer Associates, Sunderland, MA. Craig, L. R. 2009. Defending Evo-Devo: A response to Hoekstra and Coyne. Philos. Sci. 76:335–344. Cruickshank, T. E., and M. W. Hahn. 2014. Reanalysis suggests that genomic islands of speciation are due to reduced diversity, not reduced gene flow. Mol. Ecol. 23:3133–3157. Cutter, A. D., and B. A. Payseur. 2013. Genomic signatures of selection at linked sites: Unifying the disparity among species. Nat. Rev. Genet. 14:262–274. Danecek, P., A. Auton, G. Abecasis, C. A. Albers, E. Banks, M. A. DePristo, R. E. Handsaker, G. Lunter, G. T. Marth, S. T. Sherry, G. McVean, R. Durbin, and 1000 Genomes Project Analysis Group. 2011. The variant call format and VCFtools. Bioinformatics 27:2156–2158. Darwin, C. 1859. On the Origin of Species by Means of Natural Selection. John Murray, London. De La Filia, A. G., S. A. Bain, and L. Ross. 2015. Haplodiploidy and the reproductive ecology of Arthropods. Curr. Opin. Insect Sci. 9:36–43. Delph, L. F., and J. P. Demuth. 2016. Haldane’s rule: Genetic bases and their empirical support. J. Hered. 107:383–391. Demuth, J. P., R. J. Flanagan, and L. F. Delph. 2013. Genetic architecture of isolation between two species of Silene with sex chromosomes and Haldane’s rule. Evolution 68:332–342. Dittmar, E. L., C. G. Oakley, J. K. Conner, B. A. Gould, and D. W. Schemske. 2016. Factors influencing the effect size distribution of adaptive substitutions. Proc. R. Soc. B 283:20153065. Dobzhansky, T. 1951. Genetics and the Origin of Species. 3rd ed. Columbia University Press, New York. Dobzhansky, T. 1937. Genetics and the Origin of Species. Columbia University Press, New York. Dobzhansky, T. 1940. Speciation as a Stage in Evolutionary Divergence. Am. Nat. 74:312–321. Egan, S. P., and D. J. Funk. 2009.
Ecologically dependent postmating isolation between sympatric host forms of Neochlamisus bebbianae leaf beetles. Proc. Natl. Acad. Sci. U. S. A. 106:19426–31. Enard, D., P. W. Messer, and D. A. Petrov. 2014. Genome-wide signals of positive selection in evolution. Genome Res. 24:885–895. Excoffier, L., I. Dupanloup, E. Huerta-Sánchez, V. C. Sousa, and M. Foll. 2013. Robust Demographic Inference from Genomic and SNP Data. PLoS Genet. 9:e1003905. Eyre-Walker, A., and P. D. Keightley. 2007. The distribution of fitness effects of new mutations. Nat. Rev. Genet. 8:610–618.


Farrell, B. D. 1998. “Inordinate Fondness” Explained: Why Are There So Many Beetles? Science 281:555–559. Feder, J. L., S. M. Flaxman, S. P. Egan, A. A. Comeault, and P. Nosil. 2013. Geographic Mode of Speciation and Genomic Divergence. Annu. Rev. Ecol. Evol. Syst. 44:73–97. Feder, J. L., R. Gejji, S. Yeaman, and P. Nosil. 2012. Establishment of new mutations under divergence and genome hitchhiking. Phil. Trans. R. Soc. B 367:461–474. Felsenstein, J. 1981. Skepticism Towards Santa Rosalia, or Why are There so Few Kinds of ? Evolution 35:124. Felsenstein, J. 1974. The evolutionary advantage of recombination. Genetics 78:737–756. Feng, S., S. J. Cokus, X. Zhang, P. Y. Chen, M. Bostick, M. G. Goll, J. Hetzel, J. Jain, S. H. Strauss, M. E. Halpern, C. Ukomadu, K. C. Sadler, S. Pradhan, M. Pellegrini, and S. E. Jacobsen. 2010. Conservation and divergence of methylation patterning in plants and animals. Proc. Natl. Acad. Sci. U. S. A. 107:8689–8694. Ferchaud, A.-L., and M. M. Hansen. 2016. The impact of selection, gene flow and demographic history on heterogeneous genomic divergence: three-spine sticklebacks in divergent environments. Mol. Ecol. 25:238–259. Fisher, R. 1930. The genetical theory of natural selection: a complete variorum edition. Oxford University Press, Oxford. Flaxman, S. M., A. C. Wacholder, J. L. Feder, and P. Nosil. 2014. Theoretical models of the influence of genomic architecture on the dynamics of speciation. Mol. Ecol. 4074–4088. Foll, M., and O. Gaggiotti. 2008. A genome-scan method to identify selected loci appropriate for both dominant and codominant markers: A Bayesian perspective. Genetics 180:977–993. Forbes, A. A., J. Fisher, and J. L. Feder. 2005. Habitat avoidance: overlooking an important aspect of host-specific mating and sympatric speciation? Evolution 59:1552–9. Forister, M. L. 2005. Independent inheritance of preference and performance in hybrids between host races of Mitoura butterflies (: Lycaenidae). Evolution 59:1149–1155.
Francioli, L. C., P. P. Polak, A. Koren, A. Menelaou, S. Chun, I. Renkens, C. M. Van Duijn, M. Swertz, C. Wijmenga, G. Van Ommen, P. E. Slagboom, D. I. Boomsma, K. Ye, V. Guryev, P. F. Arndt, W. P. Kloosterman, P. I. W. De Bakker, and S. R. Sunyaev. 2015. Genome-wide patterns and properties of de novo mutations in . Nat. Genet. 47:822–826. Funk, D. J., K. E. Filchak, and J. L. Feder. 2002. Herbivorous insects: model systems for the comparative study of speciation ecology. Genetica 116:251–267. Funk, D. J., P. Nosil, and W. J. Etges. 2006. Ecological divergence exhibits consistently positive associations with reproductive isolation across disparate taxa. Proc. Natl.


Acad. Sci. U. S. A. 103:3209–13. Gavrilets, S. 2004. Fitness Landscapes and the Origin of Species. Princeton University Press, Princeton, New Jersey. Gernandt, D. S., G. G. Lopez, S. O. Garcia, and A. Liston. 2005. Phylogeny and classification of Pinus. Taxon 54:29–42. Ghent, A. W. 1959. Row type oviposition in Neodiprion sawflies as exemplified by the European pine sawfly, N. sertifer (Geoff.). Can. J. Zool. 37:267–281. Gompel, N., B. Prud’homme, P. J. Wittkopp, V. A. Kassner, and S. B. Carroll. 2005. Chance caught on the wing: cis-regulatory evolution and the origin of pigment patterns in Drosophila. Nature 433:481–487. Grant, P. R., and B. R. Grant. 1997. Mating patterns of Darwin’s Finch hybrids determined by song and morphology. Biol. J. Linn. Soc. 94:7768–7775. Griswold, C. K. 2006. Gene flow’s effect on the genetic architecture of a local adaptation and its consequences for QTL analyses. Heredity 96:445–453. Groman, J. D., and O. Pellmyr. 2000. Rapid evolution and specialization following host colonization in a yucca moth. J. Evol. Biol. 13:223–236. Guerrero, R. F., C. D. Muir, S. Josway, and L. C. Moyle. 2017. Pervasive antagonistic interactions among hybrid incompatibility loci. PLOS Genet. 13:e1006817. Haldane, J. 1924. A mathematical theory of natural and artificial selection, part I. Trans. Camb. Phil. Soc. 23:19–41. Haldane, J. 1927. A mathematical theory of natural and artificial selection, part V: selection and mutation. Math. Proc. Cambridge. Haldane, J. B. S. 1922. Sex ratio and unisexual sterility in hybrid animals. J. Genet. 12:101–109. Hall, M. C., D. B. Lowry, and J. H. Willis. 2010. Is local adaptation in Mimulus guttatus caused by trade-offs at individual loci? Mol. Ecol. 19:2739–2753. Hansen, T. F. 2006. The evolution of genetic architecture. Annu. Rev. Ecol. Evol. Syst. 37:123–57. Harper, K., R. Bagley, K. Thompson, and C. Linnen. 2016. Complementary sex determination, and inbreeding avoidance in a gregarious sawfly. Heredity 117:326–335.
Hartl, D. L. 1972. A fundamental theorem of natural selection for sex linkage or arrhenotoky. Am. Nat. 106:516–524. Hatfield, T., and D. Schluter. 1999. Ecological Speciation in Sticklebacks: Environment-Dependent Hybrid Fitness. Evolution 53:866–873. Hauschteck-Jungen, E. 1990. Postmating reproductive isolation and modification of the ‘sex ratio’ trait in Drosophila subobscura induced by the sex chromosome gene arrangement. Genetica 83:31–44. Hawthorne, D. J., and S. Via. 2001. Genetic linkage of ecological specialization and


reproductive isolation in pea aphids. Nature 412:904–7. Hedrick, P. 2013. Adaptive introgression in animals: examples and comparison to new mutation and standing variation as sources of adaptive variation. Mol. Ecol. 22:4606–18. Hedrick, P. W., and J. D. Parker. 1997. Evolutionary genetics and genetic variation of haplodiploids and X-linked genes. Annu. Rev. Ecol. Syst. 28:55–83. Heikkinen, E., and J. Lumme. 1998. The Y chromosomes of Drosophila lummei and D. novamexicana differ in fertility factors. Heredity 81:505–513. Heimpel, G., and J. de Boer. 2008. Sex determination in the Hymenoptera. Annu. Rev. Entomol. 53:209–30. Helbig, A. 1991. Inheritance of migratory direction in a bird species: a cross-breeding experiment with SE- and SW-migrating blackcaps (Sylvia atricapilla). Behav. Ecol. Sociobiol. 28:9–12. Hoekstra, H. E., and J. A. Coyne. 2007. The locus of evolution: evo devo and the genetics of adaptation. Evolution 61:995–1016. Hurst, L. D., C. Pál, and M. J. Lercher. 2004. The evolutionary dynamics of eukaryotic gene order. Nat. Rev. Genet. 5:299–310. Hurst, L. D., E. J. B. Williams, and C. Pál. 2002. Natural selection promotes the conservation of linkage of co-expressed genes. Trends Genet. 18:604–606. Irwin, D. E. 2018. Sex chromosomes and speciation in birds and other ZW systems. Mol. Ecol. 27:3831–3851. Irwin, D. E., M. Alcaide, K. E. Delmore, J. H. Irwin, and G. L. Owens. 2016. Recurrent selection explains of genomic regions of high relative but low absolute differentiation in a . Mol. Ecol. 25:4488–4507. Janz, N. 2003. Evolutionary ecology of oviposition strategies. Pp. 349–376 in M. Hilker and T. Meiners, eds. Chemoecology of Insect Eggs and Egg Deposition. Blackwell Publishing Ltd, Berlin. Jombart, T. 2008. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24:1403–1405. Joy, J. B., and B. J. Crespi. 2007. of gall-inducing insects within a single host-plant species. Evolution 61:784–95. Kanippayoor, R. L., J. H.
M. Alpern, and A. J. Moehring. 2020. A common suite of cellular abnormalities and spermatogenetic errors in sterile hybrid males in Drosophila. Proc. R. Soc. B In press. Kapler, J. E., and D. M. Benjamin. 1960. The biology and ecology of the red-pine sawfly in Wisconsin. For. Sci. 6:253–268. Keenan, K., P. McGinnity, T. F. Cross, W. W. Crozier, and P. A. Prodöhl. 2013. diveRsity: An R package for the estimation and exploration of population genetics parameters and their associated errors. Methods Ecol. Evol. 4:782–788.


Kimura, M. 1983. The neutral theory of molecular evolution. Cambridge University Press, Cambridge. Kirkpatrick, M., and V. Ravigné. 2002. Speciation by natural and : models and experiments. Am. Nat. S22-35. Knerer, G. 1984. Morphological and physiological clines in Neodiprion pratti (Dyar) (Symphyta, Diprionidae) in eastern North America. J. Appl. Entomol. 97:9–21. Knerer, G., and C. E. Atwood. 1973. Diprionid Sawflies: and Speciation. Science 179:1090–1099. Knerer, G., and C. E. Atwood. 1972. Evolutionary trends in the subsocial sawflies belonging to the Neodiprion abietis complex (Hymenoptera: Tenthredinoidea). Amer. Zool. 12:407–418. Koevoets, T., and L. W. Beukeboom. 2009. Genetics of postzygotic isolation and Haldane’s rule in haplodiploids. Heredity 102:16–23. Koevoets, T., O. Niehuis, L. van de Zande, and L. W. Beukeboom. 2012. Hybrid incompatibilities in the parasitic wasp genus Nasonia: negative effects of hemizygosity and the identification of transmission ratio distortion loci. Heredity 108:302–311. Kondrashov, A. S. 2003. Accumulation of Dobzhansky-Muller incompatibilities within a spatially structured population. Evolution 57:151–153. Kondrashov, A. S., S. Sunyaev, and F. A. Kondrashov. 2002. Dobzhansky-Muller incompatibilities in protein evolution. Proc. Natl. Acad. Sci. U. S. A. 99:14878–83. Kuwajima, M., and N. Kobayashi. 2010. Detection of ecological hybrid inviability in a pair of sympatric phytophagous ladybird beetles (Henosepilachna spp.). Entomologia 134:280–286. Lang, J., and I. Sharakhov. 2019. Premeiotic and meiotic failures lead to hybrid male sterility in the Anopheles gambiae complex. Proc. R. Soc. B 286:20191080. Langmead, B., and S. L. Salzberg. 2012. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9:357–359. Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.
Lima, T. G. 2014. Higher levels of sex chromosome heteromorphism are associated with markedly stronger reproductive isolation. Nat. Commun. 5:4743. Lin, J., D. A. Sampson, and R. Ceulemans. 2001. The effect of crown position and tree age on resin-canal density in Scots pine (Pinus sylvestris L.) needles. Can. J. Bot. 79:1257–1261. Lindstedt, C., H. Huttunen, M. Kakko, and J. Mappes. 2011. Disentangling the evolution of weak warning signals: High detection risk and low production costs of chemical defences in gregarious pine sawfly larvae. Evol. Ecol. 25:1029–1046. Linn, C., H. Dambroski, and J. Feder. 2004. Postzygotic isolating factor in sympatric

speciation in Rhagoletis flies: reduced response of hybrids to parental host-fruit odors. Proc. Natl. Acad. Sci. 101:17753–17758. Linnen, C. R., and B. D. Farrell. 2010. A test of the sympatric host race formation hypothesis in Neodiprion (Hymenoptera: Diprionidae). Proc. Biol. Sci. 277:3131–8. Linnen, C. R., and B. D. Farrell. 2008a. Comparison of methods for species-tree inference in the sawfly genus Neodiprion (Hymenoptera: Diprionidae). Syst. Biol. 57:876–90. Linnen, C. R., and B. D. Farrell. 2007. Mitonuclear discordance is caused by rampant mitochondrial introgression in Neodiprion (Hymenoptera: Diprionidae) sawflies. Evolution 61:1417–38. Linnen, C. R., and B. D. Farrell. 2008b. Phylogenetic analysis of nuclear and mitochondrial genes reveals evolutionary relationships and mitochondrial introgression in the sertifer species group of the genus Neodiprion (Hymenoptera: Diprionidae). Mol. Phylogenet. Evol. 48:240–57. Linnen, C. R., C. T. O’Quin, T. Shackleford, C. R. Sears, and C. Lindstedt. 2018. Genetic basis of body color and spotting pattern in Redheaded pine sawfly larvae (Neodiprion lecontei). Genetics, doi: 10.1534/genetics.118.300793. Linnen, C. R., and D. Smith. 2012. Recognition of Two Additional Pine-Feeding Neodiprion Species (Hymenoptera: Diprionidae) in the Eastern United States. Proc. Entomol. Soc. Washingt. 114:492–500. Locey, K. J., and J. T. Lennon. 2016. Scaling laws predict global microbial diversity. Proc. Natl. Acad. Sci. 113:5970–5. Lohmueller, K. E., A. Albrechtsen, Y. Li, S. Y. Kim, T. Korneliussen, N. Vinckenbosch, G. Tian, E. Huerta-Sanchez, A. F. Feder, N. Grarup, T. Jørgensen, T. Jiang, D. R. Witte, A. Sandbæk, I. Hellmann, T. Lauritzen, T. Hansen, O. Pedersen, J. Wang, and R. Nielsen. 2011. Natural Selection Affects Multiple Aspects of Genetic Variation at Putatively Neutral Sites across the Human Genome. PLoS Genet. 7:e1002326. Lowry, D. B., M. C. Hall, D. E. Salt, and J. H. Willis. 2009.
Genetic and physiological basis of adaptive salt tolerance divergence between coastal and inland Mimulus guttatus. New Phytol. 183:776–788. Lowry, D. B., R. C. Rockwood, and J. H. Willis. 2008. Ecological reproductive isolation of coast and inland races of Mimulus guttatus. Evolution 62:2196–2214. Mackay, T. F. C. 2001. The Genetic Architecture of Quantitative Traits. Annu. Rev. Genet. 35:303–339. Macnair, M. R., and P. Christie. 1983. Reproductive isolation as a pleiotropic effect of copper tolerance in Mimulus guttatus. Heredity 50:295–302. Mani, G. S., and B. C. Clarke. 1990. Mutational order: a major stochastic process in evolution. Proc. R. Soc. Lond. B 240:29–37. Martin, C., and P. Wainwright. 2013. Multiple fitness peaks on the adaptive landscape drive the evolution of novel ecological niches within a recent sympatric adaptive


radiation of Cyprinodon. Science 339:208–211. Martin, M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 1:10–12. Martin, S. H., K. K. Dasmahapatra, N. J. Nadeau, C. Salazar, J. R. Walters, F. Simpson, M. Blaxter, A. Manica, J. Mallet, and C. D. Jiggins. 2013. Genome-wide evidence for speciation with gene flow in butterflies. Genome Res. 23:1817–1828. Martineau, R. 1959. On an infestation of the red-headed jack pine sawfly, Neodiprion virginianus complex in Quebec. Can. Dep. Agric. For. Biol. Div. Bimon. Prog. Rep. 15. Masly, J. P., and D. C. Presgraves. 2007. High-resolution genome-wide dissection of the two rules of speciation in Drosophila. PLoS Biol. 5:e243. Matsubayashi, K. W., I. Ohshima, and P. Nosil. 2010. Ecological speciation in phytophagous insects. Entomol. Exp. Appl. 134:1–27. Matuszewski, S., J. Hermisson, and M. Kopp. 2015. Catch Me if You Can: Adaptation from Standing Genetic Variation to a Moving Phenotypic Optimum. Genetics 200:1255–74. Matuszewski, S., J. Hermisson, and M. Kopp. 2014. Fisher’s geometric model with a moving optimum. Evolution 68:2571–2588. Mayr, E. 1963. Animal species and evolution. Harvard University Press, Cambridge. Mayr, E. 1947. Ecological factors in speciation. Evolution 1:263–288. Mayr, E. 1942. and the origin of species, from the viewpoint of a zoologist. Columbia University Press, New York. Mayr, E. 1982. The growth of biological thought: Diversity, evolution, and inheritance. Harvard University Press, Cambridge, MA. McBride, C. S., and M. C. Singer. 2010. Field Studies Reveal Strong Postmating Isolation between Ecologically Divergent Butterfly Populations. PLoS Biol. 8. McCullough, D. G., and M. R. Wagner. 1993. Defusing host defenses: ovipositional adaptations of sawflies to plant resins. Pp. 157–172 in Sawfly Life History Adaptations to Woody Plants. McDermott, S. R., and M. A. F. Noor. 2010. The role of meiotic drive in hybrid male sterility.
Phil. Trans. R. Soc. B 365:1265–1272. McGee, M. D., D. Schluter, and P. C. Wainwright. 2013. Functional basis of ecological divergence in sympatric stickleback. BMC Evol. Biol. 13:277. Meisel, R. P., and T. Connallon. 2013. The faster-X effect: integrating theory and data. Trends Genet. 29:537–544. Messina, F. J., and M. E. Karren. 2003. Adaptation to a novel host modifies host discrimination by the seed beetle Callosobruchus maculatus. Anim. Behav. 65:501–507. Mitter, C., B. Farrell, and B. Wiegmann. 1988. The phylogenetic study of adaptive zones:


has phytophagy promoted insect diversification? Am. Nat. 132:107–128. Moehring, A., A. Llopart, S. Elwyn, … J. C.-, and U. 2006. 2006. The genetic basis of postzygotic reproductive isolation between Drosophila santomea and D. yakuba due to hybrid male sterility. Genetics 173:225–233. Mora, C., D. P. Tittensor, S. Adl, A. G. B. Simpson, and B. Worm. 2011. How Many Species Are There on Earth and in the Ocean? PLoS Biol. 9:e1001127. Moran, P. A., S. Pascoal, T. Cezard, J. E. Risse, M. G. Ritchie, and N. W. Bailey. 2018. Opposing patterns of intraspecific and interspecific differentiation in sex chromosomes and autosomes. Mol. Ecol. 27:3905–3924. Muller, H. J. 1942. Isolating mechanisms, evolution and temperature. Biol. Symp. 6:71–125. Nachman, M. W., and B. A. Payseur. 2012. Recombination rate variation and speciation: Theoretical predictions and empirical results from rabbits and mice. Philos. Trans. R. Soc. B Biol. Sci. 367:409–421. Nadeau, N. J., P. Salazar, B. Counterman, J. A. Medina, H. Ortiz-Zuazaga, A. Morrison, W. O. McMillan, C. D. Jiggins, and R. Papa. 2014. Population genomics of parallel hybrid zones in the mimetic butterflies, H. melpomene and H. erato. Genome Res. 24:1316–1333. Nei, M. 1978. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics 89:583–590. Normark, B. B. 2003. The evolution of alternative genetic systems in insects. Annu. Rev. Entomol. 48:397–423. Nosil, P. 2007. Divergent host plant adaptation and reproductive isolation between ecotypes of walking sticks. Am. Nat. 169:151–162. Nosil, P. 2012. Ecological Speciation. Oxford University Press. Nosil, P., and J. L. Feder. 2012. Genomic divergence during speciation: causes and consequences. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 367:332–42. Nosil, P., J. L. Feder, S. M. Flaxman, Z. Gompert, and T. Mitchell-Olds. 2017. Tipping points in the dynamics of speciation. Nat. Ecol. Evol. 1:0001. Nosil, P., D. J. Funk, and D. Ortiz-Barrientos.
2009a. Divergent selection and heterogeneous genomic divergence. Mol. Ecol. 18:375–402. Nosil, P., L. J. Harmon, and O. Seehausen. 2009b. Ecological explanations for (incomplete) speciation. Trends Ecol. Evol. 24:145–56. Nosil, P., and D. Schluter. 2011. The genes underlying the process of speciation. Trends Ecol. Evol. 26:160–7. Nosil, P., T. H. Vines, and D. J. Funk. 2005. Reproductive isolation caused by natural selection against immigrants from divergent habitats. Evolution 59:705. Nyman, T., V. Vikberg, and D. Smith. 2010. How common is ecological speciation in plant-feeding insects? A ‘Higher’ Nematinae perspective. BMC Evol. Biol. 10:266.


Ohshima, I. 2008. Host race formation in the leaf‐mining moth Acrocercops transecta (Lepidoptera: Gracillariidae). Biol. J. Linn. Soc. 93:135–145. Ohshima, I., and K. Yoshizawa. 2010. Differential introgression causes genealogical discordance in host races of Acrocercops transecta (Insecta: Lepidoptera). Mol. Ecol. 19:2106–2119. Orr, H. 1998. The population genetics of adaptation: the distribution of factors fixed during adaptive evolution. Evolution 935–949. Orr, H. A., and S. Irving. 2005. Segregation distortion in hybrids between the Bogota and USA subspecies of Drosophila pseudoobscura. Genetics 169:671–682. Ortíz-Barrientos, D., and M. A. F. Noor. 2005. Evidence for a one-allele assortative mating locus. Science 310:1467. Oyler-McCance, S. J., R. S. Cornman, K. L. Jones, and J. A. Fike. 2015. Z chromosome divergence, polymorphism and relative effective population size in a genus of lekking birds. Heredity 115:452–459. Penalba, J. V., L. Joseph, and C. Moritz. 2017. Current geography masks dynamic history of gene flow during speciation in northern Australian birds. bioRxiv 178475. Picelli, S., Å. K. Björklund, B. Reinius, S. Sagasser, G. Winberg, and R. Sandberg. 2014. Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res. 24:2033–2040. Porto-Neto, L. R., T. S. Sonstegard, G. E. Liu, D. M. Bickhart, M. V. B. Da Silva, M. A. Machado, Y. T. Utsunomiya, J. F. Garcia, C. Gondro, and C. P. Van Tassell. 2013. Genomic divergence of zebu and taurine cattle identified through high-density SNP genotyping. BMC Genomics 14. Presgraves, D. C. 2010a. Speciation genetics: Search for the missing snowball. Curr. Biol. 20:R1073–R1074. Presgraves, D. C. 2010b. The molecular evolutionary basis of species formation. Nat. Rev. Genet. 11:175–80.

Price, T. D., and M. M. Bouvier. 2002. The evolution of F1 postzygotic incompatibilities in birds. Evolution 56:2083–2089. Provine, W. 2001. The origins of theoretical population genetics: With a new afterword. University of Chicago Press, Chicago. Prud’homme, B., N. Gompel, A. Rokas, V. A. Kassner, T. M. Williams, S.-D. Yeh, J. R. True, and S. B. Carroll. 2006. Repeated morphological evolution through cis-regulatory changes in a pleiotropic gene. Nature 440:1050–1053. Przeworski, M. 2002. The Signature of Positive Selection at Randomly Chosen Loci. Genetics 160:1179–1189. R Core Team. 2015. R: A Language and Environment for Statistical Computing. Vienna, Austria. Rafajlović, M., A. Emanuelsson, K. Johannesson, R. K. Butlin, and B. Mehlig. 2016. A universal mechanism generating clusters of differentiated loci during divergence-

with-migration. Evolution 70:1609–1621. Rauf, A., and D. M. Benjamin. 1980. The biology of the white pine sawfly, Neodiprion pinetum (Hymenoptera: Diprionidae) in Wisconsin. Gt. Lakes Entomol. 13:219–224. Ravinet, M., R. Faria, R. K. Butlin, J. Galindo, N. Bierne, M. Rafajlović, M. A. F. Noor, B. Mehlig, and A. M. Westram. 2017. Interpreting the genomic landscape of speciation: a road map for finding barriers to gene flow. J. Evol. Biol. 30:1450–1477. Rice, W., and E. Hostert. 1993. Laboratory Experiments on Speciation - What Have We Learned in 40 Years. Evolution 47:1637–1653. Rice, W. R., and G. W. Salt. 1990. The evolution of reproductive isolation as a correlated character under sympatric conditions: Experimental evidence. Evolution. 44:1140–1152. Rieseberg, L. H., and J. H. Willis. 2007. Plant speciation. Science. 317:910–914. Rogers, D., E. McConnell, J. Ono, and D. Greig. 2018. Spore-autonomous fluorescent protein expression identifies meiotic chromosome mis-segregation as the principal cause of hybrid sterility in yeast. PLoS Biol. 16:e2005066. Roskov, Y., G. Ower, T. Orrell, D. Nicolson, N. Bailly, P. M. Kirk, T. Bourgoin, R. E. DeWalt, W. Decock, E. van Nieukerken, J. Zarucchi, and L. Penev. 2019. Species 2000 & ITIS Catalogue of Life, 2019 Annual Checklist. Ross, H. 1955. The and Evolution of the Sawfly Genus Neodiprion. For. Sci. 1:196–209. Ruegg, K., E. C. Anderson, J. Boone, J. Pouls, and T. B. Smith. 2014. A role for migration-linked genes and genomic islands in divergence of a songbird. Mol. Ecol. 23:4757–4769. Rundle, H. D. 2002. A test of ecologically dependent postmating isolation between sympatric sticklebacks. Evolution. 56:322–329. Rundle, H. D., L. Nagel, J. W. Boughman, and D. Schluter. 2000. Natural selection and parallel speciation in sympatric sticklebacks. Science. 287:306–308. Rundle, H. D., and P. Nosil. 2005. Ecological speciation. Ecol. Lett. 8:336–352. Rundle, H. D., and M. C. Whitlock. 2001.
A genetic interpretation of ecologically dependent isolation. Evolution 55:198–201. Salazar, C. A., C. D. Jiggins, C. F. Arias, A. Tobler, E. Bermingham, and M. Linares. 2005. Hybrid incompatibility is consistent with a hybrid origin of Heliconius heurippa Hewitson from its close relatives, Doubleday and Linnaeus. J. Evol. Biol. 18:247–256. Schemske, D. W., and H. D. Bradshaw. 1999. preference and the evolution of floral traits in monkeyflowers (Mimulus). Proc. Natl. Acad. Sci. 96:11910–5. Schilthuizen, M., M. C. W. G. Giesbers, and L. W. Beukeboom. 2011. Haldane’s rule in the 21st century. Heredity. 107:95–102.

Schluter, D. 2001. Ecology and the origin of species. Trends Ecol. Evol. 16:372–380. Schluter, D. 2009. Evidence for Ecological Speciation and Its Alternative. Science. 323:737–741. Schluter, D. 2000. The ecology of adaptive radiation. Oxford University Press, Oxford. Schluter, D., and G. L. Conte. 2009. Genetics and ecological speciation. Proc. Natl. Acad. Sci. U. S. A. 106:9955–62. Schluter, D., and L. M. Nagel. 1995. Parallel speciation by natural selection. Am. Nat. 146:292–301. Schneider, C. A., W. S. Rasband, and K. W. Eliceiri. 2012. NIH Image to ImageJ: 25 years of image analysis. Nat. Methods 9:671–675. Schumer, M., M. Squire, G. G. R. Designed Research, ; M Schumer, and R. C. Research. 2017. Assortative mating and persistent reproductive isolation in hybrids. Proc Natl. Acad. Sci. 114:10936–10941. Schwahn, D., R. Wang, M. White, and B. Payseur. 2018. Genetic dissection of hybrid male sterility across stages of spermatogenesis. Genetics 210:1453–1465. Seehausen, O., R. K. Butlin, I. Keller, C. E. Wagner, J. W. Boughman, P. A. Hohenlohe, C. L. Peichel, G.-P. Saetre, C. Bank, Å. Brännström, A. Brelsford, C. S. Clarkson, F. Eroukhmanoff, J. L. Feder, M. C. Fischer, A. D. Foote, P. Franchini, C. D. Jiggins, F. C. Jones, A. K. Lindholm, K. Lucek, M. E. Maan, D. A. Marques, S. H. Martin, B. Matthews, J. I. Meier, M. Möst, M. W. Nachman, E. Nonaka, D. J. Rennison, J. Schwarzer, E. T. Watson, A. M. Westram, and A. Widmer. 2014. Genomics and the origin of species. Nat. Rev. Genet. 15:176–192. Servedio, M. R., and M. A. F. Noor. 2003. The Role of Reinforcement in Speciation: Theory and Data. Annu. Rev. Ecol. Evol. Syst. 34:339–364. Shapiro, M. D., M. A. Bell, and D. M. Kingsley. 2006. Parallel genetic origins of pelvic reduction in vertebrates. Proc. Natl. Acad. Sci. U. S. A. 103:13753–8. Shapiro, M. D., M. E. Marks, C. L. Peichel, B. K. Blackman, K. S. Nereng, B. Jo, D. Schluter, and D. M. Kingsley. 2004. 
Genetic and developmental basis of evolutionary pelvic reduction in threespine sticklebacks. Nature 428:717–724. Siepielski, A. M., J. D. DiBattista, and S. M. Carlson. 2009. It’s about time: the temporal dynamics of phenotypic selection in the wild. Ecol. Lett. 12:1261–1276. Simmons, M. J., and J. F. Crow. 1977. Mutations affecting fitness in Drosophila populations. Ann. Rev. Genet. 11:49–78. Singer, M. C., and C. S. McBride. 2010. Multitrait, host-associated divergence among sets of butterfly populations: implications for reproductive isolation and ecological speciation. Evolution. 64:921–933. Smukowski, C. S., and M. A. F. Noor. 2011. Recombination rate variation in closely related species. Heredity (Edinb). 107:496–508. Sobel, J. M., and G. F. Chen. 2014. Unification of methods for estimating the strength of reproductive isolation. Evolution 68:1511–22.

Sota, T., M. Hayashi, and T. Yagi. 2007. Geographic variation in body and ovipositor sizes in the leaf beetle Plateumaris constricticollis (Coleoptera: Chrysomelidae) and its association with climatic conditions and host plants. Eur. J. Entomol. 104:165–172. Soudi, S., K. Reinhold, and L. Engqvist. 2016. Ecologically dependent and intrinsic genetic signatures of postzygotic isolation between sympatric host races of the leaf beetle Lochmaea capreae. Evolution. 70:471–479. Stephan, W. 2010. Genetic hitchhiking versus background selection: The controversy and its implications. Philos. Trans. R. Soc. B Biol. Sci. 365:1245–1253. Stinchcombe, J. R., and H. E. Hoekstra. 2008. Combining population genomics and quantitative genetics: finding the genes underlying ecologically important traits. Heredity. 100:158–70. Storz, J. F. 2005. Using genome scans of DNA polymorphism to infer adaptive population divergence. Mol. Ecol. 14:671–88. Sucena, É., and D. Stern. 2000. Divergence of larval morphology between Drosophila sechellia and its sibling species caused by cis-regulatory evolution of ovo/shaven-baby. Proc. Natl. Acad. Sci. 97:4530–4534. Talla, V., F. Kalsoom, D. Shipilina, I. Marova, and N. Backström. 2017. Heterogeneous Patterns of Genetic Diversity and Differentiation in European and Siberian Chiffchaff (Phylloscopus collybita abietinus/P. tristis). G3 Genes, Genomes, Genet. 7:3983–3998. Tamames, J. 2001. Evolution of gene order conservation in prokaryotes. Genome Biol. 2:research0020.1-0020.11. Tao, Y., D. L. Hartl, and C. C. Laurie. 2001. Sex-ratio segregation distortion associated with reproductive isolation in Drosophila. Proc. Natl. Acad. Sci. U.S.A 98:13183–13188. Taylor, S. A., and E. L. Larson. 2019. Insights from genomes into the evolutionary importance and prevalence of hybridization in nature. Nat. Ecol. Evol. 3:170–177. Tenaillon, O. 2014.
The Utility of Fisher’s Geometric Model in Evolutionary Genetics. Annu. Rev. Ecol. Evol. Syst. 45:179–201. Thompson, K. A., M. Urquhart-Cronish, K. D. Whitney, L. H. Rieseberg, and D. Schluter. 2019. Patterns, predictors, and consequences of dominance in hybrids. bioRxiv, doi: 10.1101/818658. Tsai, I. J., A. Burt, and V. Koufopanou. 2010. Conservation of recombination hotspots in yeast. Proc. Natl. Acad. Sci. U. S. A. 107:7847–7852. Turelli, M., N. H. Barton, and J. A. Coyne. 2001. Theory and speciation. Trends Ecol. Evol. 16:330–343. Turelli, M., and D. J. Begun. 1997. Haldane’s Rule and X-chromosome Size in Drosophila. Genetics 147:1799–1815.


Turelli, M., and H. A. Orr. 1995. The dominance theory of Haldane’s rule. Genetics 140:389–402.
Turner, J. R. G. 1981. Adaptation and evolution in Heliconius: A defense of NeoDarwinism. Annu. Rev. Ecol. Syst. 12:99–121.
Turner, T. L., M. W. Hahn, and S. V. Nuzhdin. 2005. Genomic Islands of Speciation in Anopheles gambiae. PLoS Biol. 3:e285.
Van der Niet, T., R. Peakall, and S. D. Johnson. 2014. Pollinator-driven ecological speciation in plants: new evidence and future perspectives. Ann. Bot. 113:199–211.
Van Wilgenburg, E., G. Driessen, and L. W. Beukeboom. 2006. Single locus complementary sex determination in Hymenoptera: an “unintelligent” design? Front. Zool. 3:1.
Vertacnik, K. L., and C. R. Linnen. 2017. Evolutionary genetics of host shifts in herbivorous insects: insights from the age of genomics. Ann. N. Y. Acad. Sci. 1389:186–212.
Vertacnik, K. L., and C. R. Linnen. 2015. Neodiprion lecontei genome assembly Nlec1.0. NCBI/GenBank.
Via, S. 2012. Divergence hitchhiking and the spread of genomic isolation during ecological speciation-with-gene-flow. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 367:451–60.
Via, S. 2001. Sympatric speciation in animals: The ugly duckling grows up. Trends Ecol. Evol. 16:381–390.
Via, S., and D. J. Hawthorne. 2002. The genetic architecture of ecological specialization: correlated gene effects on host use and habitat choice in pea aphids. Am. Nat. 76–88.
Vicoso, B., and B. Charlesworth. 2009. Effective population size and the faster-X effect: An extended model. Evolution. 63:2413–2426.
Vijay, N., C. M. Bossu, J. W. Poelstra, M. H. Weissensteiner, A. Suh, A. P. Kryukov, and J. B. W. Wolf. 2016. Evolution of heterogeneous genome differentiation across multiple contact zones in a crow species complex. Nat. Commun. 7:1–10.
Wagner, G., J. Kenney-Hunt, M. Pavlicev, and J. Peck. 2008. Pleiotropic scaling of gene effects and the “cost of complexity.” Nature 452:470–472.
Wang, Z., B.-Y. Liao, and J. Zhang. 2010. Genomic patterns of pleiotropy and the evolution of complexity. Proc. Natl. Acad. Sci. U. S. A. 107:18034–9.
Warren, L. O., and J. F. Coyne. 1958. The pine sawfly, Neodiprion taedae linearis Ross, in Arkansas. Univ. Arkansas Agric. Exp. Stn. Bull. 602.
Weiblen, G. D., and G. L. Bush. 2002. Speciation in fig pollinators and parasites. Mol. Ecol. 11:1573–1578.
Weir, B. S., and C. C. Cockerham. 1984. Estimating F-Statistics for the Analysis of Population Structure. Evolution. 38:1358.


West-Eberhard, M. J. 1983. Sexual selection, social competition, and speciation. Q. Rev. Biol. 58:155–183.
Whitlock, M. C., and R. Gomulkiewicz. 2005. Probability of fixation in a heterogeneous environment. Genetics 171:1407–1417.
Wiens, J., R. Lapoint, and N. Whiteman. 2015. Herbivory increases diversification across insect clades. Nat. Commun. 6:1–7.
Wilkinson, R. 1971. Slash-pine sawfly, Neodiprion merkeli. 1. oviposition pattern and descriptions of egg, female larva, pupa, and cocoon. Ann. Entomol. Soc.
Wilkinson, R. C. 1961. The biology and ecology of the Neodiprion virginianus complex in Wisconsin. University of Wisconsin-Madison.
Wilkinson, R. C., G. C. Becker, and D. M. Benjamin. 1966. The biology of Neodiprion rugifrons (Hymenoptera: Diprionidae), a sawfly infesting jack pine in Wisconsin. Ann. Entomol. Soc. 59.
Wilson, L. F., R. C. Wilkinson, and R. C. Averill. 1992. Redheaded Pine Sawfly: Its Ecology and Management.
Winkler, I. S., and C. Mitter. 2008. The Phylogenetic Dimension of Insect-Plant Interactions: A Review of Recent Evidence. Spec. Speciation, Radiat. Evol. Biol. Herbiv. Insects 240–263.
Wolf, J. B. W., and H. Ellegren. 2016. Making sense of genomic islands of differentiation in light of speciation. Nat. Rev. Genet. 18:87–100.
Wood, D. E., and S. L. Salzberg. 2014. Kraken: Ultrafast metagenomic sequence classification using exact alignments. Genome Biol. 15:R46.
Wright, K. M., D. Lloyd, D. B. Lowry, M. R. Macnair, J. H. Willis, and N. H. Barton. 2013. Indirect evolution of hybrid lethality due to linkage with selected locus in Mimulus guttatus. PLoS Biol. 11:e1001497.
Wu, C.-I. 2001. The genic view of the process of speciation. J. Evol. Biol. 14:851–865.
Wu, H., and Z. Hu. 1997. Comparative anatomy of resin ducts of the Pinaceae. Trees 11:135–143.
Yeaman, S. 2013. Genomic rearrangements and the evolution of clusters of locally adaptive loci. Proc. Natl. Acad. Sci. U. S. A. 110:1743–51.
Yeaman, S., and S. Otto. 2011. Establishment and maintenance of adaptive genetic divergence under migration, selection, and drift. Evolution. 65:2123–2129.
Yeaman, S., and M. C. Whitlock. 2011. The genetic architecture of adaptation under migration-selection balance. Evolution. 65:1897–1911.


VITA

Education

University of Kentucky
Ph.D. Candidate, Department of Biology (Anticipated May 2020)
Advisor: Catherine R. Linnen
Dissertation title: From genes to species: Ecological speciation with gene flow between Neodiprion pinetum and N. lecontei

Willamette University
B.A. Biology, minor in classics (2013)
Advisor: Christopher I. Smith
Thesis title: Correlated trait evolution of yuccas (Yucca; Agavaceae) and yucca moths (Tegeticula; Lepidoptera: Prodoxidae)

Professional Experience

University of Kentucky
Teaching Assistant, Bio 303: Evolutionary Biology, 2017
Teaching Assistant, Bio 155: Introductory Biology, 2015
Research Assistant, Linnen Lab, 2015-2020

Willamette University
Teaching Assistant, Bio 333: Gene Structure and Function, 2013
Tutor, Bio 376: Evolutionary Biology, 2013
Teaching Assistant, Bio 125: Evolution, Ecology, and Diversity, 2012
Tutor, Bio 125: Evolution, Ecology, and Diversity, 2012

Professional Honors and Awards

Hamilton Award Finalist, Society for the Study of Evolution, 2019
Morgan Graduate Fellowship, University of Kentucky, 2017
Kentucky Opportunity Fellowship, University of Kentucky, 2014, 2017
Ribble Travel Support, University of Kentucky, 2015
President’s Prize, Entomological Society of America, 2015
Graduate School Travel Support, University of Kentucky, 2015, 2018
Mini Ribble Grant, University of Kentucky, 2015
Gertrude Flora Ribble Research Fellowship, University of Kentucky, 2015
Departmental Honors, Willamette University Biology Department, 2013
Honors Thesis, Willamette University, 2013
G. Herbert Smith Presidential Scholarship, Willamette University, 2009

Publications

Bendall, E.E., Vertacnik, K.L., and C.R. Linnen. 2017. Oviposition traits generate extrinsic postzygotic isolation between two pine sawfly species. BMC Evolutionary Biology 17:26.
* Highlighted on BMC Series Blog

Oral Presentation

Evolution Meetings (Providence, RI), 2019: “Haplodiploidy leads to heterogeneous genomic divergence during speciation with gene flow” *Hamilton Award talk

Gordon Research Seminar on Speciation (Ventura, CA), 2019: “Haplodiploidy leads to heterogeneous genomic divergence during speciation with gene flow”

Current research in biology seminar (University of Kentucky), 2018: “The role of host adaptation in driving speciation of Neodiprion sawflies”

Fourth Year Symposium (University of Kentucky), 2018: “The role of host adaptation in driving speciation of Neodiprion sawflies”

Evolution Meetings (Austin, TX), 2016: “Selection on oviposition traits generates reproductive isolation between two pine sawflies” (Emily E. Bendall, Kim Vertacnik, Catherine R. Linnen)

Evolution Meetings (Snowbird, UT), 2013: “Correlated trait evolution of yuccas (Yucca; Agavaceae) and yucca moths (Tegeticula; Lepidoptera: Prodoxidae)” (Emily E. Bendall, Christopher I. Smith)
