DISENTANGLING THE GENETICS OF COEVOLUTION IN POTAMOPYRGUS

ANTIPODARUM AND MICROPHALLUS SP.

By

CHRISTINA E JENKINS

A dissertation submitted in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

WASHINGTON STATE UNIVERSITY School of Biological Sciences

JULY 2016

© Copyright by CHRISTINA E JENKINS, 2016 All Rights Reserved

To the Faculty of Washington State University:

The members of the Committee appointed to examine the dissertation of CHRISTINA E JENKINS find it satisfactory and recommend that it be accepted.

Mark Dybdahl, Ph.D., Chair

Scott Nuismer, Ph.D.

Joanna Kelley, Ph.D.

Jeb Owen, Ph.D.

Acknowledgement

First and foremost, I need to thank my committee, Mark Dybdahl, Scott Nuismer, Joanna Kelley and Jeb Owen. They have put in a considerable amount of time helping me grow and learn as a scientist, and have consistently challenged me to be better during my Ph.D. studies. I cannot find words to thank them enough, so for now, “thank you” will need to suffice. I especially thank Mark and Scott; coadvising was an adventure and one I embarked on gladly. Thank you for all the input and effort, even when it made all three of us cranky.

I need to thank the undergraduates and field assistants that have worked for and with me to collect data, process samples, plan field seasons and generally make my life easier. Thanks to Jared and Caitlin for their tireless work (seriously, hours upon hours of their time) running flow cytometry to answer questions about polyploidy. Thank you to Meredith and Jordan for collecting snails, through sand flies, rain, hangovers, and occasionally hypothermia. And especially to Jordan: it is not easy traveling with someone for 6 weeks at a time on the far side of the world, but he made our trips productive and fun.

I need to thank all of my lab mates, from both the Nuismer lab at University of Idaho: Ailene McPherson, ET Thornquiste, Anahi Espindola, Virginie Poullain and Bob Week; and from the Dybdahl lab at Washington State University: Jennifer Madrid Thorson, Jon Finger (DIJON!) and Mark Smithson. Being in two different labs is a lot like having two families. Neither of them knows much about each other, and there isn’t much interaction as a whole unit. But similar to having two families, I cannot imagine going through the last six years without the endless support from every one of them.

I need to thank my friends in the graduate student and postdoc community on the Palouse. Six years is a long time to slog through a Ph.D. and I have met some amazing people, many of whom I now consider family. The non-exhaustive list of people I need to thank: Emily Jones, Katie Shine, Diego Morales, Simon Uribe-Convers, Kayla Hardwick, Travis Hagey, Tim McGuin, Natalie Gage, Hannah Marx, Roxy Hickey, Matt Pennell, Matt Singer, Daniel Beck, Tyler Heather, Gen Metzger, TATE (just Tate), Wesly Loftie-Eaton, Thibault Stadler, Amy Worthington, Maribeth Latvis, Sarah Jacobs, Marius Myrvold, Urs Weber, Andy Kramer, Ben Weideback, Chloe Stenkamp-Strahm, Erin Weise and many more. Thank you for supporting me, for taking care of me, for encouraging me, and for believing in me. Especially when I wasn’t able to do any of these things for myself.

I need to thank a few people who have acted in the role of “partner” over the last few years. First, I need to thank Bobbi Johnson. I don’t know how I convinced the coolest person on the planet to be friends with me, but I’m sure glad I did. If I had half of the life skills that she does, I could take over the world, and I’m amazed every day that she hasn’t done so already. Thank you for taking care of all the things I am entirely incapable of.

Kimberly Lackey and I undertook the impossible and succeeded. We were asked to teach a course for which there was no course. We wrote all the labs, the quizzes, the lectures, the exams and any other relevant material. We met constantly for two years, spoke many times a day, produced a cohesive class, and published a text book together. I could not have done this with anyone else, and I will forever be grateful for our partnership.

Finally, I need to thank my “domestic partner” Hannah Marx. We have a wine club membership, a CSA, a storage unit together and have the keys to each other’s apartment. We have done “romantic weekends” in Paris, McCall, Seattle, and many more. I can’t imagine not having you in my life, and can’t wait for our adventures in the future.

A Ph.D. takes a long time, and I simply would not have been able to undertake it without the support of my family and friends, who I would like to thank a million times. Especially my baby niece, Annie Grace, who makes me smile a little bit every day.

DISENTANGLING THE GENETICS OF COEVOLUTION IN POTAMOPYRGUS

ANTIPODARUM AND MICROPHALLUS SP.

Abstract

by Christina E. Jenkins, Ph.D. Washington State University July 2016

Chair: Mark Dybdahl

Host-parasite coevolution is potentially important for many evolutionary transitions, such as the evolution of sexual reproduction, ploidy, and mating systems, according to mathematical models of coevolution. In these models, the host-parasite interaction is characterized by host resistance and parasite infectivity, which are assumed to be based on a matrix of genotype-by-genotype (GxG) specificity. Importantly, recent studies have demonstrated that the GxG matrix assumed in a given theoretical model drastically alters the outcome of these evolutionary transitions. Consequently, determining the form of genetic interaction matrices in natural populations is crucial to understanding both coevolution and the resulting evolutionary transitions. In this dissertation, I explore the genetics of host-parasite interactions in three different ways. First, using the New Zealand snail Potamopyrgus antipodarum and its undescribed trematode parasite Microphallus sp., I tested the fit of different genetic models by comparing the resistance of triploid and tetraploid hosts, which differ in gene dosage, heterozygosity, and abundance of novel alleles. In my second chapter, I explored the molecular and genetic basis of traits associated with infection by assembling and annotating a transcriptome for Microphallus sp. and comparing the results with other similar parasites. First, to facilitate comparisons, I determined the phylogenetic placement of Microphallus sp. among the trematode parasites. Second, I compared the genes expressed in Microphallus sp. with those of other well-studied trematode parasites. Third, because trematodes infect both snail and vertebrate hosts in their life cycle, I used further comparative analyses to determine whether Microphallus sp. are expressing genes to evade the vertebrate or invertebrate immune system. Finally, in my third chapter, I developed a technique for finding the genomic regions involved in coevolution. Using genomic data, we can look for SNPs that covary spatially between the host and parasite, as these will likely mark regions involved in local adaptation. I tested the efficacy of this technique using simulated populations of hosts and parasites with coevolving and neutral loci. I altered models of infection and evolutionary and statistical parameters to determine when we are able to detect coevolving loci.

TABLE OF CONTENTS

Abstract
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
Dedication
Introduction
The role of ploidy in host resistance
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  Acknowledgements
The Microphallus sp. transcriptome and an analysis of its taxonomic relationship to other Digenea parasites
  Abstract
  Introduction
  Materials and Methods
  Results
  Discussion
  Acknowledgements
Identifying genomic hot spots of coevolution in host-parasite systems
  Abstract
  Introduction
  Overview of Approach
  Method testing
  Discussion
Literature Cited

LIST OF TABLES

Table 1 - Five host source populations with the respective percentage of 3N and 4N individuals within each sample and the 3 populations from which parasites were collected and the ploidy of their hosts. The ploidy of the hosts at each parasite population is included.

Table 2 - Results of a generalized linear model predicting host infection based on ploidy level, parasite source, host source and all possible interactions.

Table 3 - Information from both 454 and Illumina reads, and the subsequent hybrid assembly.

Table 4 - Population specific data for reads included in the reference assembly and the percentage of reads that mapped to the reference transcriptome.

Table 5 - GO terms with their associated descriptions of the annotated contigs from the Microphallus sp. transcriptome.

Table 6 - Taxonomic information, NCBI accession numbers, and references for all samples included in our 28s nrDNA phylogeny.

Table 7 - Results of reciprocal blast of Microphallus sp. transcriptome to the assembled transcriptomes of the nematode C. elegans, and the trematode parasites C. sinensis, E. caproni, F. hepatica, O. viverrini, S. mansoni and T. regenti.

Table 8 - Comparison of Microphallus sp. transcriptome with EST libraries of the five stages of the S. mansoni life cycle.

Table 9 - Candidate genes with GO annotations from the SwissProt database that are associated with the Microphallus sp. metacercaria, S. mansoni adult worms (found within vertebrates), and S. mansoni sporocysts (found within invertebrates).

Table 10 - A summary of the evolutionary conditions needed to detect spatially covarying loci, across all eight parameters and four genetic interactions.

LIST OF FIGURES

Figure 1 - Proportion of 3N (white) and 4N (black) individuals infected in each host population with each parasite source. Parasite source populations are columns; host source populations are rows.

Figure 2 - Proportion (+/- 1 sd) of triploid (3N; white) and tetraploid (4N; black) snail individuals infected, pooling infection rates across host populations for each parasite source. Lake Kaniere parasites infect significantly more 4N individuals, Lake Rotoroa parasites infect significantly more 3N individuals and there is no difference in infection between 3N and 4N when inoculated with Lake Poerua parasites.

Figure 4 - Assembled phylogeny for Microphalloidea, including the species Microphallus sp. from New Zealand, colored red.

Figure 5 - Top blast hits by species.

Figure 6 - GO terms associated with annotated genes within the Microphallus sp. metacercaria, S. mansoni adult worms (reside within vertebrates), and S. mansoni sporocysts (reside within invertebrates).

Figure 7 - The four different genetic interactions that were tested. The interactions can be broadly grouped into two categories: escalation versus matching, and continuous versus discrete. The first category addresses how phenotypes will interact with each other, either based on the parasite matching the host immune system (matching) or on an arms race dynamic (escalation). The second category refers to how the infection phenotype is translated from the underlying genotype, either in an additive polygenic fashion (continuous) or with each infection locus pair explicitly interacting (discrete).

Figure 8 - The evolutionary conditions under which the spatial covariance is strong enough to be detected are those that have an impact on local adaptation. We examined the relationship between local adaptation and type II error rate, and found a significant negative relationship. To address the sampling conditions an empiricist might need to find coevolving genomic regions, we found a significant negative relationship between number of populations sampled and type II error rate. Therefore, in order to find coevolving genomic regions, the sampled populations must be locally adapted and many populations must be sampled.

Figure 9 - Under four different models of interaction based on matching, one can determine for each given amount of local adaptation, how many populations need to be sampled to have a low type II error rate.

Dedication

This dissertation is dedicated to my mom and dad who provided both emotional and financial support through the long years of my education, as well as all of my DNA. Especially to my mom, who not only serves as my biggest fan, but is also the best editor a girl could ask for.

Introduction

Coevolution between hosts and parasites has been shown to drive a number of evolutionary transitions such as the evolution of ploidy (Nuismer and Otto 2004), the evolution of outcrossing (Morran et al. 2012; 2014), increased mutation rates (M'Gonigle et al. 2009), and the evolution and maintenance of sexual reproduction (Otto and Nuismer 2004). These theoretical predictions are all based on a fundamental assumption: within a population, each host genotype is resistant to certain parasite genotypes, and each parasite genotype is only able to infect certain host genotypes. These genotype by genotype (GxG) interactions are often characterized in coevolutionary theory as infection matrices.

Two such matrices are commonly used: the matching allele model (MAM) (Frank 1991; 2000) and the gene-for-gene model (GFG) (Flor 1956; Thompson and Burdon 2002). MAM is based on self/non-self recognition molecules such as major histocompatibility complex (MHC) molecules in the vertebrate immune system (Grosberg and Hart 2000). Under this model of infection, parasites that match the host will be viewed by the host immune system as “self” and will not be eliminated. The GFG model was first conceived by observing agricultural plants and their fungi. It is based on an escalatory model of infection, where virulence alleles arise in the parasite populations and correspondingly resistance alleles arise in the host population (Flor 1971). Over time, this generates more and more resistant hosts and virulent parasites. These two often-used matrices are examples of many that are based on an idea of how the immune system may work, and the validity and correct usage of each has long been debated (Parker 1994; Frank 1994; Parker 1996; Frank 1996). However, a commonality for both of these systems is a lack of concrete examples within the immunological literature that they are correct representations of how infection may occur (Dybdahl et al. 2014a).
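As an illustration of the two matrices just described, the following is a minimal sketch of how they are often written for haploid hosts and parasites with one locus. The function names and the 0/1 encoding are assumptions made here for illustration, and costs of resistance and virulence, which many GFG models include, are omitted.

```python
# Rows index host genotypes, columns index parasite genotypes.
# An entry of 1 means the parasite can infect the host; 0 means the host resists.
# Illustrative encoding only; names and layout are assumptions, not the thesis's code.

def matching_allele_matrix(n_alleles):
    """MAM: a parasite infects only the host whose allele it matches ("self")."""
    return [[1 if host == parasite else 0 for parasite in range(n_alleles)]
            for host in range(n_alleles)]

def gene_for_gene_matrix():
    """GFG with one locus and two alleles per species.

    Host allele 0 = susceptible, host allele 1 = resistant.
    Parasite allele 0 = avirulent, parasite allele 1 = virulent.
    Only the resistant host facing the avirulent parasite escapes infection.
    """
    return [[1, 1],   # susceptible host: infected by both parasite types
            [0, 1]]   # resistant host: infected only by the virulent parasite

mam = matching_allele_matrix(2)
gfg = gene_for_gene_matrix()

for name, matrix in (("MAM", mam), ("GFG", gfg)):
    print(name)
    for row in matrix:
        print("  ", row)
```

In this encoding the MAM matrix is symmetric, with no universally infective parasite, whereas the GFG matrix contains a parasite column of all ones, which is the asymmetry behind the escalation described above.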

What’s more, recent studies have demonstrated that the choice of matrices will alter the predicted evolutionary outcome. For example, under the GFG model of infection, sexual reproduction does not evolve, while under the MAM model, it does (Agrawal and Lively n.d.). The MAM model leads to negative frequency-dependent selection, where it is advantageous for the host genotype to be different than common genotypes within the population (Koskella and Lively 2009). Because sexual reproduction consistently produces novel genotypes, under the MAM model, sexual reproduction can evolve (Agrawal 2009). However, because GFG leads to more resistant hosts and more virulent parasites, sexual reproduction will not evolve (Parker 1994; Agrawal and Lively n.d.). Moreover, both of these matrices were originally conceived assuming both host and parasite are haploid. The diploid MAM models that account for heterozygote genotypes are varied in both their conception and their results (Otto and Nuismer 2004; Nuismer and Otto 2005). In this one evolutionary transition that relies on host-parasite coevolution, the variation in the underlying matrix of infection and the variation between different diploid matrices change the outcome. This is just one example of many where theoretical results have demonstrated that understanding the genetics that underlie coevolutionary relationships is important to understanding the impact of host-parasite coevolution in natural populations.
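The contrast between the two models can be sketched numerically. The example below is a toy calculation under stated assumptions, not a model from this dissertation: hosts and parasites are haploid with two genotypes, every host encounters one parasite drawn at random, infection reduces host fitness by a hypothetical amount `s`, and resistance carries no cost.

```python
def host_fitness_mam(parasite_freqs, s=0.5):
    """Fitness of each haploid host genotype under matching alleles.

    Host genotype i is infected only by parasite genotype i, so its fitness
    falls in proportion to the frequency of that matching parasite.
    """
    return [1.0 - s * q for q in parasite_freqs]

def host_fitness_gfg(virulent_freq, s=0.5):
    """Fitness of (susceptible, resistant) hosts under gene-for-gene.

    Susceptible hosts are infected by every parasite; resistant hosts are
    infected only by virulent parasites. No cost of resistance is assumed.
    """
    susceptible = 1.0 - s
    resistant = 1.0 - s * virulent_freq
    return susceptible, resistant

# Assumption: parasites have already adapted to the common host genotype,
# so the parasite matching it is at frequency 0.9.
w_common, w_rare = host_fitness_mam([0.9, 0.1])
print("MAM: common host", round(w_common, 3), "rare host", round(w_rare, 3))

# Under GFG the resistant host is never worse off, whatever the parasite mix.
gfg_pairs = [host_fitness_gfg(v) for v in (0.0, 0.5, 1.0)]
for susceptible, resistant in gfg_pairs:
    print("GFG: susceptible", susceptible, "resistant", resistant)
```

Under these assumptions the rare host genotype is favored in the MAM case, which is the advantage of being different that sexual reproduction can exploit, while selection in the GFG case is directional toward resistance.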

Despite the importance of finding and understanding the genomic regions that underlie coevolution, it has proven to be extremely difficult to do so. There are a number of reasons why this is the case. One problem is that there could be many genes of small effect that are undetectable using traditional genomic measures of selection (Luikart et al. 2003; Storz 2005). This is not an unreasonable assumption for genomic regions involved in coevolution given that both known immune proteins and virulence proteins have been demonstrated to have multiple loci (Dausset 1981; Hughes and Yeager 2003). Another potential difficulty lies in the deluge of data that is gathered from next-generation sequencing experiments like RNA-seq: sorting through the variation associated with coevolution may be constrained by the wealth of variation associated with other sources. Some studies have looked for differentially expressed genomic regions in infected vs. uninfected hosts in hopes that within those differentially expressed genes lie the genomic regions of interest to coevolution (Portillo et al. 2013; Tanaka et al. 2013; Foth et al. 2014; Blomström et al. 2015; Videvall et al. 2015). However, the majority of the genes expressed in infected individuals will not be related to coevolutionary interaction, but rather appear as a result of being infected (i.e., stress response genes, or starvation genes). Without an annotated reference, it is difficult to sort through the genomic regions that are present but involved with response to something besides coevolution. Therefore, despite the increasing availability of a wide array of genomic material in non-model organisms, very few studies have characterized the underlying genetics of host-parasite coevolution in natural populations. There are some systems for which we understand either the genomic regions involved in host resistance (Labrie et al. 2010; Perry et al. 2015) or the parasite’s ability to infect (Barrett et al. 2009; Dy et al. 2014; Burmeister et al. 2015), but not the key combination of both host and parasite genes that determine the outcome of coevolution.

In this dissertation, I aim to fill this obvious gap in our scientific knowledge by developing new techniques to pinpoint coevolving genes, and by exploring the genomics of infection and resistance in a very important system for studying coevolution, the freshwater snail Potamopyrgus antipodarum and its trematode parasite, Microphallus sp. P. antipodarum is native to New Zealand and has a number of genetic and ecological characteristics that make it an ideal system to study coevolution. First, Microphallus sp. has been shown to impose strong enough negative frequency-dependent selection on P. antipodarum to select for the maintenance of sexual reproduction (Dybdahl and Lively 1995b; 1998; Koskella and Lively 2007). It has therefore become one of the primary systems that has consistently demonstrated the Red Queen hypothesis (Salathé et al. 2008). Additionally, P. antipodarum and Microphallus are found in a wide variety of different lakes and streams and much work has been done on the ecology of the hosts and how it is affected by parasites. Finally, there are diploid sexuals and both triploid and tetraploid asexuals, allowing us to examine the role of sex and ploidy in host-parasite coevolution (Neiman et al. 2011). However, despite the importance of this system for studying host-parasite coevolution, the genetics of this interaction remain entirely unknown. To better understand the genetics of this system, and better inform the consequences of host-parasite coevolution across systems, I sought to start disentangling the underlying genetics of coevolution between P. antipodarum and Microphallus.

First, I sought to determine if increased gene dosage, heterozygosity, or abundance of novel alleles alters the resistance of P. antipodarum to Microphallus sp. by comparing infection rates of tetraploids with that of triploids. As mentioned above, the infection matrices that underlie host-parasite coevolution were all conceived under the assumption that both host and parasite are haploid. When considering instead how a diploid host or parasite would behave under these same infection matrices, how resistant the heterozygote host and parasites are will drastically alter the outcome of coevolution (Nuismer and Otto 2005; Agrawal and Otto 2006). Because polyploids have higher heterozygosity than diploids, how ploidy changes resistance to parasitic infection could give us insight into understanding how the genotypic infection matrix treats heterozygotes.

We collected P. antipodarum snails from five populations that are known to have asexual hosts that are either triploid or tetraploid (Neiman et al. 2011). We then exposed these snails to parasites from three different populations of parasites, and after a three-month incubation period, determined infection status. Finally, we determined ploidy of each snail using flow cytometry. Most importantly, there was no overall increase in resistance associated with increased ploidy, contrary to the general view that polyploids are more robust due to increased gene dosage, heterozygosity, or abundance of novel alleles. Overall, this suggests that heterozygotes are either 1) not more resistant or more susceptible than homozygotes or 2) in this system, increasing ploidy does not increase heterozygosity. However, we found a significant interaction between parasite population and ploidy, and speculate that it could be due to a two-step infection process, such that heterozygosity increases resistance in the first step and decreases resistance in the second step. Under coevolutionary cycling, we could then expect that when parasites come from populations that are fixed for the first locus, heterozygotes will be more readily infected, and therefore, triploids will be proportionally infected less. If instead the parasite comes from a population where the second locus is fixed, then heterozygotes will be favored and tetraploids will be infected less.

To better facilitate this endeavor, I sought to find genomic regions involved in infection in the parasite, Microphallus sp., by examining the transcriptome of the metacercariae stage of the parasite. While there is transcriptomic data available for P. antipodarum (Wilton et al. 2012a), little to no data exists for Microphallus sp. I first sequenced and annotated the transcriptome, providing the first transcriptome for the Microphallus sp. parasite. I found key gene ontology (GO) terms that are potentially associated with host-parasite interactions. I then looked for similarity between Microphallus sp. and other well studied or medically important parasites by isolating the 28s nrDNA gene and using it to construct a phylogenetic tree of Microphallidae and trematodes. I also compared the transcriptome of Microphallus sp. with the SwissProt database and with other trematodes in an effort to uncover potentially similar genes involved with infection. I found high similarity with F. hepatica, S. mansoni, and C. sinensis, for which there is some awareness of the genetic basis of infection. This similarity between species will allow us to look for similar infection associated genes within the Microphallus sp. transcriptome. Finally, I did stage specific comparisons to determine which immune response the metacercaria stage of Microphallus sp. may be trying to evade, vertebrate or invertebrate. To this end, I compared the assembled Microphallus sp. transcriptome to the five stages of the S. mansoni life cycle. I found that there is high similarity between my transcriptome and the miracidia stage (found in snails) within S. mansoni. However, when analyzing the GO terms in the vertebrate vs. invertebrate stages in S. mansoni, I found that the Microphallus sp. metacercaria is qualitatively more similar to the S. mansoni stage within their vertebrate host. This suggests that while the metacercaria are evading their snail host immune system, they are also preparing to invade the vertebrate host. The availability of this assembled transcriptome and my subsequent analyses will facilitate future research in the genomics of host-parasite interaction in this system, and thus start disentangling the genetics of host-parasite coevolution.

Finally, I developed a technique for finding the genomic regions involved in host-parasite coevolution and tested this theory using simulated populations of coevolving hosts and parasites. Theory has demonstrated that the biotic component of coevolution is the sum of the spatial covariance of host and parasite genotypes (Nuismer and Gandon 2008). Large spatial covariances are generated by local coevolutionary selection. For example, a mutation providing greater infectivity arising in a parasite population may cause selection on a mutation for resistance in the host population and vice versa. When examined across populations, genetic marker polymorphisms associated with genomic regions that are responding to reciprocal selection will be swept to high frequency together with the beneficial mutations. We can therefore find the genomic regions involved in host-parasite coevolution by looking for single nucleotide polymorphisms (SNPs) that spatially covary between hosts and parasites. The SNPs that have the highest spatial covariance are those that contribute the most to the biotic component of local adaptation, and therefore, coevolution. I tested the infection matrices and parameters under which this technique successfully detected the loci coevolving in simulated populations of hosts and parasites. These results not only demonstrate whether this technique works, but also how robust it is under different parameters that could increase or decrease local adaptation. This technique will provide an important tool for finding genomic regions involved in host-parasite coevolution and will therefore facilitate research in understanding the maintenance of genetic variation, disease and organismal interactions in general.
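The core idea of the spatial covariance scan can be sketched as follows. This is a minimal illustration, not the implementation tested in the third chapter: the SNP names and allele frequencies are invented toy values, ranking by absolute covariance is an assumption made here, and no significance test is included.

```python
def covariance(xs, ys):
    """Population covariance of two equal-length lists of allele frequencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def rank_snp_pairs(host_freqs, parasite_freqs):
    """Rank every host SNP x parasite SNP pair by |spatial covariance|.

    Each dict maps a SNP name to its allele frequency in each sampled
    population, with populations listed in the same order for both species.
    """
    pairs = []
    for host_snp, hf in host_freqs.items():
        for parasite_snp, pf in parasite_freqs.items():
            pairs.append((abs(covariance(hf, pf)), host_snp, parasite_snp))
    return sorted(pairs, reverse=True)

# Hypothetical allele frequencies across five sampled populations.
host_freqs = {
    "host_coevolving": [0.1, 0.3, 0.5, 0.7, 0.9],
    "host_neutral":    [0.5, 0.4, 0.6, 0.5, 0.5],
}
parasite_freqs = {
    "parasite_coevolving": [0.2, 0.3, 0.6, 0.7, 0.9],
    "parasite_neutral":    [0.5, 0.6, 0.4, 0.5, 0.5],
}

ranked = rank_snp_pairs(host_freqs, parasite_freqs)
for score, host_snp, parasite_snp in ranked:
    print(f"{host_snp} x {parasite_snp}: {score:.4f}")
```

With these toy values the pair of SNPs whose frequencies rise together across populations ranks first. In real data the number of pairs is the product of the host and parasite SNP counts, so some correction for multiple comparisons would be needed.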

The role of ploidy in host resistance

Authors: Christina E. Jenkins, Scott Nuismer and Mark Dybdahl

*Corresponding Author

Affiliations:

1 School of Biological Sciences, Washington State University, Pullman WA

2 Department of Biology, University of Idaho, Moscow ID

Abstract:

Polyploidy is common across a wide variety of taxa, which is striking given the many barriers to polyploid establishment. A number of hypotheses compete to explain its prevalence but one intriguing possibility is that an increase in ploidy results in increased resistance to parasitic infection. We expect polyploids to be more resistant to parasites because they could have increased dosage of immune proteins, higher heterozygosity and novel alleles. We tested whether an increase in ploidy is associated with increased resistance using the freshwater snail system Potamopyrgus antipodarum and its trematode parasite Microphallus sp.. Using experimental inoculations, we found that ploidy had no overall effect on infection rates, indicating that polyploids are not consistently more or less resistant to parasites. Instead, our results demonstrated a significant interaction between parasite source population and snail ploidy suggesting that higher ploidy increases snail resistance to parasites from some lakes but decreases snail resistance to parasites from other lakes.

Keywords: Polyploidy, Host-parasite coevolution, Potamopyrgus antipodarum

Introduction

Polyploidy, or whole genome duplication, is particularly widespread in plants but also occurs regularly within some lineages (Zhang and King 1993; Masterson 1994; Beukeboom et al. 1998; Otto and Whitton 2000; Wendel 2000; Langston et al. 2001; Seoighe 2003; D'Souza et al. 2005; Duchemin et al. 2007; Soltis et al. 2009; Ching et al. 2009). The success of polyploid lineages is somewhat surprising given the many hurdles to their establishment including competition with their diploid progenitors (Baack 2005), initial decrease in fertility after polyploidization (Ramsey and Schemske 2002; Rausch and Morgan 2005), and minority cytotype exclusion (Levin 1975; Husband 2000). As a consequence, a range of hypotheses have been developed to explain why polyploids might have an advantage over their diploid progenitors. For instance, it has been hypothesized that polyploids may be more evolutionarily flexible and likely to innovate because they have a redundant set of genes that can potentially diverge without loss of function (reviewed in Adams and Wendel 2005; Comai 2005). It has also been suggested that polyploidy may facilitate masking of deleterious mutations (Otto and Whitton 2000). Another intriguing possibility is that genome duplication increases resistance to parasites (Levin 1983). Due to the ubiquitous nature of parasites, this is a tempting explanation.

An increase in ploidy is associated with many genetic changes (Adams and Wendel 2005; Chen 2007), some of which could cause an increase in resistance to parasites. For example, increased ploidy has been associated with gene redundancy, which produces new proteins or protein subunits resulting from mutational variants in redundant copies (Comai 2005; Birchler 2012). This could potentially produce novel defense proteins and thus could increase capacity to recognize and mount an immune response to a greater diversity of pathogens. Another reason polyploids could have consistently higher resistance is an increase in gene dosage; with more copies of each gene, polyploids may have increased circulating defense proteins and thus greater resistance (reviewed in Wertheim et al. 2013; Dybdahl et al. 2014a). Polyploids could also be more resistant if there is strong coevolution and polyploids have a more rapid response to selection because they are more adaptable under particular patterns of dominance (Otto and Whitton 2000; Choleva and Janko 2013). Finally, heterozygosity has been consistently demonstrated to be greater in populations with higher ploidy, all else being equal (White 1970; Stenberg and Saura 2013; Tayalé and Parisod 2013; Mable et al. 2015), and under some models of infection, heterozygotes are more resistant (Nuismer and Otto 2004; Agrawal and Otto 2006). For example, under the inverse matching allele (IMA) model, host defense involves an array of recognition molecules (e.g., antibodies) that are able to recognize specific antigens and resist parasites carrying those antigens (Frank 1994). Therefore, because heterozygotes are able to recognize more parasite genotypes, they are more resistant than homozygous hosts, and thus polyploids would be more resistant than diploids. If any of these genetic changes result from polyploidization, then individuals with higher ploidy should have higher resistance.

Although increased ploidy could result in increased resistance to parasites, other genetic changes associated with an increase in ploidy could result in a consistent decrease in resistance. For example, under some models of coevolution, heterozygotes are less resistant than homozygotes. In the matching allele (MA) model of coevolution, resistance is based upon a system of self/non-self recognition (Mode 1958). Under this model, heterozygous hosts will be susceptible to both homozygote parasite genotypes (Nuismer and Otto 2004; Agrawal and Otto 2006), and thus polyploids with more heterozygosity should be more susceptible to parasites under an MA model. Additionally, it is possible that polyploids are less resistant due to the dramatic genomic rearrangement that often occurs after polyploidization (Hufton and Panopoulou 2009), which could make newly formed polyploids less resistant (reviewed in Hufton and Panopoulou 2009; King et al. 2012). Under each of these scenarios, increased ploidy would be associated with consistently lower resistance.
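The opposite predictions of the IMA and MA models for heterozygous hosts can be made concrete with a small sketch. It rests on simplifying assumptions that are not taken from this chapter: parasites are haploid and carry a single allele, a host genotype is just the collection of alleles it carries at one locus, and all parasite alleles are equally common. Published diploid versions of these models differ in their details.

```python
def infects_ma(host_alleles, parasite_allele):
    """Matching alleles: a parasite that matches any host allele is read as
    "self" and infects."""
    return parasite_allele in set(host_alleles)

def infects_ima(host_alleles, parasite_allele):
    """Inverse matching alleles: the host resists any parasite it recognizes,
    so infection requires that no host allele matches."""
    return parasite_allele not in set(host_alleles)

def susceptibility(host_alleles, parasite_alleles, infects):
    """Fraction of equally common parasite genotypes able to infect the host."""
    hits = sum(infects(host_alleles, p) for p in parasite_alleles)
    return hits / len(parasite_alleles)

parasites = ["A", "B", "C"]
hosts = {
    "one distinct allele":    ("A", "A", "A"),
    "two distinct alleles":   ("A", "A", "B"),
    "three distinct alleles": ("A", "B", "C"),
}

for label, genotype in hosts.items():
    ma = susceptibility(genotype, parasites, infects_ma)
    ima = susceptibility(genotype, parasites, infects_ima)
    print(f"{label}: MA {ma:.2f}, IMA {ima:.2f}")
```

Under these assumptions, each additional distinct allele raises susceptibility under MA and lowers it under IMA, so a ploidy-driven increase in heterozygosity predicts opposite changes in resistance under the two models.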

Finally, it is also possible that polyploids are sometimes more and sometimes less resistant to parasites. For example, gene frequency fluctuations under coevolutionary cycling can result in transiently greater or lower resistance to parasite infection. Under this scenario, resistance depends on the frequencies of host genotypes and of the parasite genotypes that are coevolved to those specific host genotypes within each population. If, at a given point in these parasite-driven allele frequency fluctuations, the parasite population is adapted to a specific host genotype, then it will be able to infect hosts of that genotype irrespective of ploidy. Another possibility is that the genetic model of infection differs among populations. Under this scenario, ploidy will not necessarily be predictive of resistance.
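A toy simulation can make the idea of coevolutionary cycling concrete. The sketch below assumes a haploid matching-allele interaction with two alleles, discrete generations, and arbitrary selection strengths `s` and `t`; all of these are illustrative assumptions rather than parameters estimated for this system.

```python
# Sketch of coevolutionary cycling under a haploid matching-allele model.
# Assumptions (illustrative, not from the source): two alleles, discrete
# generations, selection strength s on hosts and t on parasites.

def simulate_cycling(h=0.6, p=0.5, s=0.2, t=0.2, generations=200):
    """Return the frequency of allele 1 in hosts and parasites through time."""
    host_freqs, parasite_freqs = [h], [p]
    for _ in range(generations):
        # Hosts lose fitness in proportion to the frequency of matching parasites.
        wh1, wh2 = 1 - s * p, 1 - s * (1 - p)
        # Parasites gain fitness in proportion to the frequency of matching hosts.
        wp1, wp2 = 1 + t * h, 1 + t * (1 - h)
        h_next = h * wh1 / (h * wh1 + (1 - h) * wh2)
        p_next = p * wp1 / (p * wp1 + (1 - p) * wp2)
        h, p = h_next, p_next
        host_freqs.append(h)
        parasite_freqs.append(p)
    return host_freqs, parasite_freqs


host_freqs, parasite_freqs = simulate_cycling()
print(f"host allele 1 ranged from {min(host_freqs):.2f} to {max(host_freqs):.2f}")
```

Because the frequency of each host genotype rises and falls as the parasite tracks it, whether a given genotype appears resistant in such a model depends on when in the cycle it is sampled.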

To date, empirical efforts in plant polyploids have shown that genome duplication can have mixed consequences for species interactions. For instance, some studies have found polyploid plants to be more resistant than diploids (Busey et al. 1992; Zhao et al. 2005; Vleugels et al. 2013), whereas others have found polyploids to be less resistant (Thompson et al. 1997; Munzbergova 2006; Kao 2008). In other cases, no differences in resistance were detected across ploidies (Burdon and Marshall 1981; Schoen et al. 1992; Ohberg et al. 2005; Yli-Mattila et al. 2009; Gottula et al. 2014) or polyploids were found to be more resistant to some parasites but less resistant to others (Nuismer and Thompson 2001; Arvanitis et al. 2007; Halverson et al. 2007). Taken together, these studies suggest that increased ploidy has no consistent impact on resistance to parasites in plant populations. Because almost no empirical investigations of animal polyploids have been conducted to date, we do not yet know whether the impacts of polyploidy are similar in animals (Guégan and Morand 1996).

We investigated the impact of increased ploidy on resistance to the parasitic trematode Microphallus sp. in the snail Potamopyrgus antipodarum. Within New Zealand, snail populations are composed of sexual diploid and asexual triploid or tetraploid individuals, with some lakes containing a mix of asexual triploid and tetraploid snails (Neiman et al. 2011). We exposed triploid and tetraploid individuals of P. antipodarum from five populations to their coevolving parasitic trematode, Microphallus sp., from three allopatric populations of parasites, and asked whether tetraploid individuals were more resistant to parasite infection. Unlike other studies addressing the impact of increased ploidy in this system (Parsons et al. 1986; Lively et al. 2004a; Osnas and Lively 2006; Duchemin et al. 2007), we held mating system constant by studying only asexual populations of this species that differ in ploidy. Further, by challenging snails with trematodes from allopatric lakes, we eliminated the impact of coevolutionary history. Thus, our study was designed to investigate whether increasing ploidy, per se, influences levels of parasite resistance in this system.

Materials and Methods

Study System

Potamopyrgus antipodarum is a small gastropod commonly found in New Zealand lakes and streams (Talbot and Ward 1987; Jokela and Lively 1995), and diploid, triploid, and tetraploid individuals coexist across the species range (Neiman et al. 2011). The predominant parasite of P. antipodarum is an undescribed species of Microphallus (Trematoda: Microphallidae; Lively 1987). Mature Microphallus produce eggs in waterfowl, which pass out of the bird in the feces. The eggs are ingested by P. antipodarum and then develop and encyst in the snail, becoming infective metacercariae approximately 3 months after exposure. During the maturation process, the parasite sterilizes its snail host, rendering it unable to reproduce. When an infected snail is then eaten by a bird host, the parasite life cycle is completed. Decades of empirical study have shown that lake populations of snails and parasites coevolve, and that parasites are consistently adapted to local lake populations of their hosts; these coevolutionary dynamics are broadly consistent with the Red Queen hypothesis (Lively 1987; Dybdahl and Lively 1998; Jokela et al. 2009; Koskella and Lively 2009).

Host and parasite sampling

Our goal was to expose P. antipodarum snails to Microphallus sp. eggs from allopatric parasite populations, so that we could test resistance in P. antipodarum while eliminating the effects of past coevolution and adaptation to specific clonal genotypes. Our snail sampling focused on lakes that contain asexual triploids and tetraploids. Snails were collected from five such lakes on the South Island of New Zealand during January 2014 (see Table 1). One of our sampled lakes, Lake Poerua, was a source for both snails and parasites; we therefore did not expose Lake Poerua snails to sympatric Lake Poerua parasites.

We used parasites collected from three different lakes as sources of allopatric parasites; the three populations differ in the ploidy of their hosts (Table 1). Parasite eggs were collected from wild duck feces using a previously established protocol (King et al. 2011b). Feces were washed with water and the mixture was then filtered through 1 mm mesh. After two weeks of twice-daily washing to remove organic materials and toxins from the feces, the parasite eggs were added to the snail samples.

Experimental Inoculations

We exposed samples of the five different snail populations to Microphallus sp. eggs from three different allopatric parasite populations. All experimental inoculations were conducted at the University of Canterbury in Christchurch, New Zealand. Eggs from each parasite population were equally divided among snail containers housing snails from each host population in a full factorial design, with the exception of the sympatric cross between Lake Poerua hosts and parasites (Table 1). Snails were housed in 2 liter containers for the duration of the inoculation. Temperature was held at a constant 20 °C, with a 12 hour light-dark cycle. Water was changed weekly, using filtered water. Snails were fed spirulina, a standard laboratory diet. After two weeks, the snails were moved to parasite-free water. Experimental snails were returned to Washington State University and dissected three months post exposure to determine infection status. Two inoculated populations experienced high mortality rates in transit back to Washington State University: Lake Gunn snails that were exposed to Lake Rotoroa parasites, and Lake Rotoiti snails that were exposed to Lake Poerua parasites. These two populations were excluded from all analyses. The heads of all individuals were flash frozen for ploidy determination and stored at -80 °C.
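The set of host-by-parasite crosses that entered the analysis can be enumerated with a short sketch. Lake names follow Table 1; the variable names are illustrative, and the code only reproduces the bookkeeping described above rather than any part of the original analysis.

```python
from itertools import product

# Host and parasite source lakes as listed in Table 1.
hosts = ["Gunn", "Haupiri", "Mavora", "Poerua", "Rotoiti"]
parasite_sources = ["Kaniere", "Poerua", "Rotoroa"]

# Crosses excluded in the text, as (host, parasite) pairs:
# one sympatric pairing and two combinations lost to mortality in transit.
sympatric = {("Poerua", "Poerua")}
high_mortality = {("Gunn", "Rotoroa"), ("Rotoiti", "Poerua")}
excluded = sympatric | high_mortality

crosses = [
    (host, parasite)
    for host, parasite in product(hosts, parasite_sources)
    if (host, parasite) not in excluded
]

print(len(crosses), "host x parasite combinations analysed")
for host, parasite in crosses:
    print(f"Lake {host} hosts x Lake {parasite} parasites")
```

Starting from the 15 possible combinations, removing the three excluded crosses leaves the 12 combinations reported in Table 1.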

Flow Cytometry

We determined ploidy by staining snail DNA with propidium iodide (PI) and quantifying PI fluorescence with flow cytometry. For each sample, the frozen P. antipodarum heads were ground in a solution containing 10 µl 1% Triton X, 0.2 µl 0.5 M EDTA, 1 ml PBS, and 50 µl propidium iodide per sample. Propidium iodide binds to DNA and fluoresces, allowing us to treat levels of fluorescence as a proxy for amount of DNA. The solution was allowed to settle and the cells were separated from larger intact tissue by pipetting off the supernatant. Chicken (Gallus gallus) red blood cells were included in each sample as an internal positive control because diploid P. antipodarum contain approximately the same amount of DNA as chicken red blood cells, while triploid snails have 50% more DNA and tetraploid snails have 100% more DNA (Neiman et al. 2011). We therefore determined ploidy for each sample by comparing the fluorescence of snail cells to that of chicken red blood cells. Fluorescence of PI was measured by running all prepared samples through a BD FACSCalibur flow cytometer, using the FL1 channel.
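The comparison against the chicken control amounts to classifying each sample by its fluorescence ratio. The sketch below is one way such a rule could be written; the expected ratios follow the text, while the function name, the example peak values, and the tolerance cutoff are hypothetical and do not describe the gating actually used.

```python
# Expected ratio of snail to chicken red blood cell fluorescence, following
# the text: diploid ~1.0, triploid ~1.5 (50% more), tetraploid ~2.0 (100% more).
EXPECTED_RATIO = {"2N": 1.0, "3N": 1.5, "4N": 2.0}


def call_ploidy(snail_peak, chicken_peak, tolerance=0.2):
    """Assign the ploidy whose expected ratio is nearest to the observed one.

    The tolerance is a hypothetical cutoff: samples farther than this from
    every expected ratio are left uncalled (None).
    """
    if chicken_peak <= 0:
        raise ValueError("chicken control peak must be positive")
    ratio = snail_peak / chicken_peak
    ploidy, expected = min(EXPECTED_RATIO.items(),
                           key=lambda item: abs(item[1] - ratio))
    return ploidy if abs(expected - ratio) <= tolerance else None


# Hypothetical fluorescence peaks (arbitrary units).
print(call_ploidy(310, 200))  # ratio 1.55
print(call_ploidy(400, 200))  # ratio 2.00
print(call_ploidy(600, 200))  # ratio 3.00, outside every expected range
```

Leaving ambiguous samples uncalled, rather than forcing them into the nearest class, is a design choice made for this illustration.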

Statistical Analyses

In order to distinguish among our three hypotheses, we developed a generalized linear model that allowed us to compare infection rate across combinations of snail and trematode populations. Specifically, we fit a model with infection status as the dependent variable, and ploidy, host population, parasite population and their interactions as independent variables.

$$Y = P + H + S + P \times H + P \times S + H \times S + P \times H \times S$$

Here, $Y$ is the infection status, $P$ is the ploidy, $H$ is the source population of the host, and $S$ is the source population of the parasite.

Because infection status is a binary variable (infected or uninfected), we used a generalized linear model with a binomial error distribution. A generalized linear model is similar to ordinary linear regression, but it does not rely on the assumption that the dependent variable is normally distributed; rather, the error structure is designated by the user, binomial in this case. In a binomial model, the sign of the β coefficient (i.e., the slope) for each explanatory variable determines whether that variable is positively or negatively associated with the dependent variable, here individual infection status. All statistical analyses were conducted using the R programming platform (R Development Core Team 2008).
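How the sign of a coefficient is read can be shown with the simplest possible case. For a binomial model with the default logit link and a single two-level predictor, the fitted slope equals the log odds ratio between the two groups, so it can be computed directly from a 2 x 2 table. The sketch below assumes that simplified setting; the counts are hypothetical and are not data from this study, and the full model described above was fit in R, not with this code.

```python
import math


def log_odds_ratio(infected_ref, uninfected_ref, infected_alt, uninfected_alt):
    """Slope of a logistic regression with one binary predictor.

    With the reference group coded 0 and the alternative group coded 1, the
    maximum-likelihood slope equals the log odds ratio of infection.
    All four counts must be greater than zero.
    """
    odds_ref = infected_ref / uninfected_ref
    odds_alt = infected_alt / uninfected_alt
    return math.log(odds_alt / odds_ref)


# Hypothetical counts: reference group 20 of 100 infected,
# alternative group 35 of 100 infected.
beta = log_odds_ratio(infected_ref=20, uninfected_ref=80,
                      infected_alt=35, uninfected_alt=65)
direction = "higher" if beta > 0 else "lower"
print(f"slope = {beta:.3f}, odds ratio = {math.exp(beta):.2f} "
      f"({direction} odds of infection in the alternative group)")
```

A positive slope corresponds to an odds ratio above 1 and a negative slope to an odds ratio below 1; swapping which group is the reference flips the sign without changing the magnitude.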

Results

Analysis of our full model revealed that ploidy had no overall impact on individual infection status (p = 0.899; Table 2). Thus, our results provide no evidence that polyploidy is consistently associated with greater or lower resistance to parasites. Additionally, we found no significant effect of host population (p = 0.871; Table 2; Figure 1), so it does not appear that some host populations are simply more resistant than others. However, there was a significant effect of parasite source (p = 0.0036; Table 2), suggesting that some parasite populations are better able to infect than others.

Only one of the interaction terms, between parasite source and ploidy, was statistically significant (p = 0.002; Figure 2), such that parasites from some source populations infected triploids at a higher rate than tetraploids, while parasites from other source populations infected tetraploids at a higher rate than triploids. Fisher's Exact post-hoc tests were used to determine how parasite source populations differed in their capacity to infect triploids versus tetraploids across all host populations. Among the snails of all five populations exposed to Lake Kaniere parasites, a higher percentage of tetraploids than triploids were infected (χ² = 5.3932; p = 0.0202). Conversely, Lake Rotoroa parasites infected a higher percentage of triploids than tetraploids (χ² = 7.5367; p = 0.0061). Finally, parasites from Lake Poerua did not differ in their ability to infect triploids and tetraploids (χ² = 0.0052; p = 0.9424) (Figure 2).
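The reported p-values can be checked against the reported χ² statistics with a few lines of code. The sketch assumes each comparison has one degree of freedom, as for a 2 x 2 table of ploidy by infection status; that assumption and the function name are ours, and the code does not reproduce the original tests.

```python
import math


def chi2_pvalue_1df(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For 1 df the survival function reduces to erfc(sqrt(x / 2)).
    """
    if statistic < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(statistic / 2.0))


# Test statistics as reported in the text for each parasite source.
reported = {
    "Lake Kaniere": 5.3932,
    "Lake Rotoroa": 7.5367,
    "Lake Poerua": 0.0052,
}

for lake, statistic in reported.items():
    print(f"{lake}: chi2 = {statistic}, p = {chi2_pvalue_1df(statistic):.4f}")
```

Under the one-degree-of-freedom assumption, the computed values agree closely with the p-values given in the text.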

Discussion

The possible greater resistance of polyploids to ubiquitous parasites might explain their relative abundance in the diversity of plants and animals. However, the effect of elevated host ploidy on resistance is uncertain because of the many possible genetic and phenotypic changes that can occur due to polyploidization (Comai 2005; Sémon and Wolfe 2007), and because these changes interact with the various and largely unknown genetic models determining infection and coevolution (REFS). We studied variation in resistance of asexual triploids and tetraploids of the freshwater snail Potamopyrgus antipodarum to its coevolved trematode parasite Microphallus sp. Because we exclusively used asexual snails from populations where ploidy varies (Neiman et al. 2011), we were able to isolate the effects of increased ploidy independent of reproductive mode, a traditional problem due to the association in animals between increased ploidy and asexual reproduction (Otto 2007).

There are a number of potential mechanisms by which higher ploidy in P. antipodarum could alter resistance to Microphallus sp. For example, in snail defense against macroparasites like trematodes, snails with higher ploidy should express more gene copies and produce more effector molecules that attack the parasite and counteract its defenses. Increased heterozygosity due to an increase in ploidy could either increase or decrease resistance, depending on the underlying model of infection. When the host recognizes and attacks the parasite (IMA), greater heterozygosity leads to resistance to a greater diversity of parasite genotypes. On the other hand, when the host is attacked by parasites that match the host genotype (MA), greater heterozygosity leads to susceptibility to a greater diversity of parasites. In contrast to these predictions, we found no significant main effect of host ploidy on the likelihood of infection, suggesting that there is no overall consistent effect of ploidy on resistance. Because we did not see an overall effect of ploidy, we can conclude that gene redundancy, gene dosage, and heterozygosity do not affect resistance in this system. However, whether these genetic changes occur but do not affect resistance, or whether they do not follow polyploidization in this system at all, is largely unknown.

Previous studies in our system have considered the role of host ploidy with respect to parasites but were hindered by differences in reproductive mode among populations. For example, a large meta-analysis of local adaptation studies in this system (Lively et al. 2004a) tested for a difference in resistance between diploids and triploids. While they found some effect of ploidy, the largest predictor of whether a host was infected was the commonness of its genotype (Dybdahl and Lively 1998). Because triploids are asexual, and therefore their genotypes are more common than those of sexually reproducing diploids, triploids tended to be infected more often, independent of ploidy. That the present study showed no effect of ploidy concurs with previous work in this system. However, because we exclusively used asexual snails from populations where ploidy varies, we are able to definitively say that ploidy is not related to infection rate in these snails.

Despite the lack of an overall effect of ploidy, we did find that Microphallus sp. parasites are either better or worse at infecting tetraploids depending on their population of origin. Without genotypic data to determine what mediates the genotype by genotype interaction between host and parasite, and how this is altered by increased ploidy, we are unable to say why some parasites are better at infecting triploids or tetraploids. We considered a number of different models of infection, selection regimes and coevolutionary cycling scenarios; however, we are unable to adequately explain this pattern. Previous work in plant systems has demonstrated a similar effect, where some seed predators preferentially attack diploids and some prefer tetraploids (Nuismer and Thompson 2001), with a similar inability to explain the pattern. Additionally, when addressing the role of increased ploidy in plant populations, results are generally mixed. While our result does not confirm this hypothesis specifically, it does suggest that it might be the case in the P. antipodarum-Microphallus sp. system.

Thus, although it is tempting to think that polyploidy is so prevalent in natural populations because polyploids are more resistant, the general conclusion is that there is no general pattern of resistance associated with either plant or P. antipodarum polyploids.

Acknowledgements

We would like to thank our field assistants, Jordan Erlenbach and Meredith Kee. I would like to thank the National Science Foundation IGERT and the Washington State University Elling Fellowship for funding.

[Figure 1: grid of panels. Columns (parasite source): Lake Kaniere, Lake Poerua, Lake Rotoroa. Rows (host source): Lake Gunn, Lake Haupiri, Lake Mavora, Lake Poerua, Lake Rotoiti. x-axis: Ploidy (3N, 4N). y-axis: Proportion of Individuals Infected (0.0 to 0.6).]

Figure 1- Proportion of 3N (white) and 4N (black) individuals infected in each host population with each parasite source. Parasite source populations are columns; host source populations are rows.

[Figure 2: three panels, one per parasite source (Lake Kaniere, Lake Poerua, Lake Rotoroa). x-axis: Ploidy (3N, 4N). y-axis: Proportion of Individuals Infected (0.0 to 0.5).]

Figure 2- Proportion (+/- 1 sd) of triploid (3N; white) and tetraploid (4N; black) snail individuals infected, pooling infection rates across host populations for each parasite source. Lake Kaniere parasites infect significantly more 4N individuals, Lake Rotoroa parasites infect significantly more 3N individuals and there is no difference in infection between 3N and 4N when inoculated with Lake Poerua parasites.

Table 1 - Infection rates of triploids, tetraploids and the average for each of the three parasite source populations (ploidy of their hosts shown in parentheses), and the five host populations with the frequency of triploid and tetraploid individuals within each sample.

Parasite Source              Host Source     % 3N     % 4N     % 3N Infected   % 4N Infected   % Infection
Lake Kaniere (2N Hosts)      Lake Gunn       52.94    47.05    33.33           43.75           38.23
Lake Kaniere (2N Hosts)      Lake Haupiri    62.66    18.96    42.55           59.09           45.68
Lake Kaniere (2N Hosts)      Lake Mavora     29.23    9.09     30              40              30.90
Lake Kaniere (2N Hosts)      Lake Poerua     47.23    28.70    23.37           48.38           30.55
Lake Kaniere (2N Hosts)      Lake Rotoiti    31.80    45.75    20.48           32.85           26.14
Lake Poerua (3N/4N Hosts)    Lake Gunn       15.81    45.16    29.41           28.57           29.03
Lake Poerua (3N/4N Hosts)    Lake Mavora     31.25    11.76    30              25              29.41
Lake Poerua (3N/4N Hosts)    Lake Haupiri    59.09    14.47    29.23           27.27           28.94
Lake Rotoroa (2N/3N Hosts)   Lake Haupiri    47.65    13.82    35.80           15.38           32.97
Lake Rotoroa (2N/3N Hosts)   Lake Mavora     39.69    24.76    37.97           19.23           33.33
Lake Rotoroa (2N/3N Hosts)   Lake Poerua     25.92    26.31    26.19           13.33           22.80
Lake Rotoroa (2N/3N Hosts)   Lake Rotoiti    32.11    32.69    54.28           29.41           46.15

Table 2 - Results of a generalized linear model predicting host infection based on ploidy level, parasite source, host source and all possible interactions. df: degrees of freedom; resid. dev: residual deviance.

Factor                                                  df    resid. dev    p-value
Ploidy                                                  1     0.02590       0.899
Host population                                         4     1.24140       0.871
Parasite source                                         2     11.95687      0.0036
Host source x Parasite source (interaction)             5     1.38714       0.9304
Ploidy x Host Source (interaction)                      4     1.34272       0.854
Ploidy x Parasite Source (interaction)                  2     12.42784      0.0020
Host Source x Parasite Source x Ploidy (interaction)    5     0.79313       0.9774

The Microphallus sp. transcriptome and an analysis of its taxonomic relationship to other Digenea parasites

Authors: Christina E. Jenkins1,2,*, Diego Morales2, Daniel D. New3, Mark Dybdahl1, and Joanna L. Kelley1

*Corresponding Author

Affiliations:

1 School of Biological Sciences, Washington State University, Pullman WA

2 Department of Biology, University of Idaho, Moscow ID

3 IBEST Genomics Resources Core, University of Idaho, Moscow ID

Abstract

The genetics of host-parasite interactions are important in coevolutionary biology. However, the traits and genes under coevolutionary selection in natural populations are largely unknown. Arguably one of the best natural systems for studying host-parasite coevolution, the freshwater snail Potamopyrgus antipodarum and its trematode parasite Microphallus sp., is no exception. Here we used RNA-seq data to assemble and annotate the first transcriptome of Microphallus sp. to aid in identifying the genes underlying the host-parasite interaction and to facilitate future research in coevolutionary genetics. We sequenced, assembled, and annotated the Microphallus sp. transcriptome using samples of metacercariae dissected from the snail host. To explore the genes associated with successful infection, we studied our transcriptome and designed comparisons with other well-studied trematodes. To facilitate comparative analysis with related trematodes, we first isolated 28S nrRNA and developed a phylogeny to place our undescribed species. To see which genes are expressed in the snail host, we examined gene ontology (GO) terms. We found Microphallus sp. transcripts that are associated with immunological terms and with other pathogen terms. We then looked for similarity between the transcriptome of our parasite and those of related trematode parasites. We found high similarity with other digenean trematodes, including Schistosoma mansoni and Fasciola hepatica. Finally, because the parasite metacercariae form in the snail host but also infect the vertebrate final host, we sought to determine whether the genes expressed were related to evasion of the vertebrate or the invertebrate immune system. We found that although there is high similarity between our transcriptome and the transcripts from the snail portion of the S. mansoni lifecycle, the GO terms are qualitatively similar to those of the portion of the S. mansoni lifecycle found within the vertebrate host. This work represents the first genomic and taxonomic data on Microphallus sp., which will facilitate future work on host-parasite interactions. We found putative genes associated with infection that merit further investigation. The transcriptome is similar to those of other parasitic helminths, which is promising for identifying other genes associated with infection within Microphallus sp. Additionally, the metacercarial stage shows similarity with both the vertebrate and invertebrate portions of the S. mansoni lifecycle. This work will facilitate work on the genotypic interaction that drives the coevolutionary process.

Keywords: Microphallus sp., RNA-seq,

Introduction

The molecular and genetic basis of host-parasite interaction is important in coevolutionary biology (Hamilton 1980; Agrawal and Lively 2001; Nuismer and Otto 2004). Studies using RNA-seq of a few key host or parasite species of medical importance reveal putative genes involved in resistance and suggest mechanisms by which genotype by genotype interactions (GxG) are mediated (Gasnier et al. 2000a; Hurtrez-Bousses et al. 2001; Young et al. 2011; Mitta et al. 2012). But we know of no studies that have looked at both host and parasite transcriptomes, and none that have done so in a coevolutionary context. The molecular mechanisms of infection and the underlying genetics are barely known for any natural system of host-parasite coevolution (Barribeau et al. 2014). It is critical that we find the genomic regions involved in host-parasite coevolution to confirm the validity of theoretical predictions of coevolution and their applicability in naturally coevolving populations.

In order to facilitate coevolutionary analysis, we need genomic resources for both the host and the parasite in a system where we understand the coevolutionary process. One classic natural system for studying coevolution is the freshwater snail, Potamopyrgus antipodarum, and its trematode parasite, Microphallus sp. In this system, we know infection is based partially on the genetic identity of the host. Researchers use this system to examine negative frequency-dependent selection, in which the most common host genotype is consistently the most infected (Dybdahl and Lively 1995b; 1998). This genetic specificity is robust to variation in the physiological condition of the host (Krist et al. 2004). The parasite populations are adapted to their local host populations, and this local adaptation is the result of genotype-specific tracking by the parasite (Lively and Dybdahl 2000; Lively et al. 2004b). This system has been seminal in research on the evolution and maintenance of sexual reproduction (Lively 1989; Koskella and Lively 2007; 2009; King and Lively 2009; King et al. 2011a). However, what little is known of the specific molecular or genetic mechanisms of the interaction comes from experimental infection studies and indirect inference (Dybdahl et al. 2014b).

Genomic resources have been developed for the snail host P. antipodarum as part of an ongoing characterization of its molecular and evolutionary history. Allozymes have been developed and used to track the genotype frequencies of P. antipodarum within populations over time (Dybdahl and Lively 1995a). The phylogenetic relationship and cladal composition of snails has been determined using cytochrome b sequence data, which showed that the spread of the snail across New Zealand corresponds to the receding glaciers of the last ice age (Neiman and Lively 2004; Dybdahl and Drown 2010). Additionally, the transcriptome of P. antipodarum has been sequenced (Wilton et al. 2012b).

On the other hand, despite 30 years of extensive research in this coevolutionary system, very little genetic or genomic data are available to address questions about the evolutionary history or infection genetics of the parasite Microphallus sp. Allozyme variation has been used to estimate gene flow among populations (Dybdahl and Lively 1996). We know that genetic interactions are important because of widespread local parasite adaptation and the breakdown of infection in F1 hybrids between parasite populations (Dybdahl et al. 2008). But the molecular traits and specific genes that determine infectivity are unknown. However, a comparative approach is possible because of the expanding genomic information on helminth parasites, and specifically on other trematode parasites (Gasnier et al. 2000b; Roger et al. 2008; Han et al. 2009; Consortium et al. 2010; Howe et al. 2016).

A comparative study of genes and proteins involved in infection in Microphallus sp. is hampered by a lack of taxonomic placement of the parasite; we do not know its evolutionary history, so we are unable to robustly compare it to other parasite genomes. In fact, the parasite populations in New Zealand represent an undescribed taxon that was given the name Microphallus by a personal communication (Lively and McKenzie 1991). Because of the extensive work in this system done by Curt Lively, it has been suggested that this parasite be called "Microphallus livelyi" (Hechinger 2012), but that work did not describe the morphology or phylogenetic placement of Microphallus sp. Although this is an excellent system for studying host-parasite coevolution, the lack of genomic resources and taxonomic affiliation for the parasite is a barrier to understanding the underlying genetic basis of infection.

We addressed this gap in coevolutionary research by examining genes actively transcribed in the metacercarial stage of Microphallus sp. found within the snail host, and by comparing expressed regions to other well-studied trematode transcriptomes in order to identify active genes. We first sequenced and assembled the transcriptome of Microphallus sp. To facilitate our comparative approach, we isolated the 28S nrRNA gene and used it to determine the phylogenetic placement of this undescribed parasite relative to other well-studied trematode parasites. We then annotated the transcriptome to look for proteins that may be involved in the host-parasite interaction. We looked for similarity between our parasite and well-studied digenean trematode parasites, and between our parasite and medically relevant trematode parasites using NCBI BLAST. Finally, we sought to determine whether Microphallus sp. metacercariae are evading the vertebrate or invertebrate immune system by comparing our transcriptome to transcripts from the five Schistosoma mansoni life cycle stages. This work represents the first molecular and taxonomic characterization of these evolutionarily important parasites and will facilitate future studies into the genetics of host-parasite coevolution in this system.

Materials and Methods

Sampling and Sequencing

Microphallus sp. has a complex lifecycle, infecting both a vertebrate (bird) host and an invertebrate (snail) host. Eggs containing miracidia are shed from the bird into the environment through feces. Snails ingest these eggs during regular feeding. The eggs develop into sporocysts, which reproduce asexually to produce metacercariae within the snail host. Metacercariae are ingested when a duck eats an infected snail, and the adults then undergo sexual reproduction to produce eggs (Galaktionov et al. 2012). Due to ease of sampling, we isolated metacercariae by dissecting infected snails.

All of the Microphallus sp. parasites used in this study were isolated by dissecting P. antipodarum hosts from populations that are known to have high rates of infection (Lively 1987). We isolated Microphallus cysts from four different populations on the South Island of New Zealand (Table 1). We collected parasites from five snails from each of the four populations.

RNA was extracted from each of the parasites using a phenol-chloroform RNA extraction protocol (Gasic et al. 2004). All individuals within populations were pooled and the samples were purified by selective depletion of the ribosomal RNA transcripts from total RNA (RiboMinus Eukaryote Kit for RNA-Seq). We then removed any genomic DNA contamination using the TURBO DNA-free kit (Ambion). We ran an RNA picochip analysis on the Agilent 2100 for quality control and found the RNA to be of sufficient quality to make into libraries. The Lake Rotoroa sample was of lower quality because of RNA degradation, but was of sufficient quality to include. We then made cDNA using both random hexamers and oligo-dT primers to minimize positional bias in the transcriptome data (Chapalamadugu et al. 2014). Approximately half of the mRNA from each population was prepared for sequencing on a 454 sequencer. We used the protocol from the 454 sequencing cDNA rapid library prep manual to prepare the 454 libraries (Roche Diagnostics GmbH 2010). The other half of the cDNA was made into Illumina libraries using the Apollo 324 library prep TruSeq (Integenx). The samples were then run on an Illumina MiSeq platform using a v2 500 cycle kit (2x257 cycles).

To remove any remaining rRNA contamination prior to assembly, the reads were filtered in silico. We created a Microphalloidea ribosomal database by searching the NCBI database for ribosomal sequences within this superfamily of trematodes. Using Bowtie 2 (Langmead and Salzberg 2012), all reads were aligned to our ribosomal database, and reads that did not map (those that were not ribosomal) were collected in a separate fasta file. Of the 5,614,799 raw reads across both sequencing platforms, 64.0% were removed. The resulting 2,022,980 reads were used in transcriptome assembly.
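The proportion of reads removed by the ribosomal filter follows directly from the two read counts given above. The short calculation below uses only those two numbers; the variable names are illustrative.

```python
raw_reads = 5_614_799        # raw reads across both platforms (from the text)
retained_reads = 2_022_980   # reads kept after rRNA filtering (from the text)

removed_reads = raw_reads - retained_reads
pct_removed = 100 * removed_reads / raw_reads
pct_retained = 100 * retained_reads / raw_reads

print(f"removed {removed_reads:,} reads ({pct_removed:.2f}%); "
      f"retained {retained_reads:,} reads ({pct_retained:.2f}%)")
```

The result works out to roughly 64% of reads removed and roughly 36% retained for assembly.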

Assembling Reference Transcriptomes

The reads from both the Illumina and 454 platforms from all populations were combined and assembled using the MIRA Hybrid assembler (Chevreux et al. 1999). MIRA has an internal trimming algorithm that removes the adapters from all reads prior to assembly. Additionally, MIRA removes excess copies of any given site through digital normalization. Although 2,022,980 reads were read into the assembler, only 109,831 reads were used in the reference transcriptome, largely due to normalization (Table 3).

To assess quality, all pooled reads were then mapped back onto the reference transcriptome using the bwa aligner (Li and Durbin 2009; Li et al. 2009). Additionally, reads from each population were independently mapped to the reference transcriptome to determine if each population was adequately represented in the reference (Table 4).

Phylogenetics

The large ribosomal subunit gene, 28S nrRNA, which is widely used in phylogeny reconstruction, has also been specifically informative in determining relationships between species and genera in the superfamily Microphalloidea (Tkach et al. 2003b; Galaktionov et al. 2012; Kudlai et al. 2015). We isolated this region from our reference transcriptome by comparing a generic Microphallus sp. 28S nrRNA sequence to our transcriptome and extracting the hit with the highest similarity score (length = 1785 bp, similarity = 94%). We included species from the families Microphallidae, Prosthogonimidae, Pleurogenidae, and Lecithodendriidae. Additionally, we included a number of undescribed Microphallus species from Russia, Australia, and Japan (Kakui 2011; Galaktionov et al. 2012; Kudlai et al. 2015). All 28S nrRNA sequences were downloaded from the NCBI nucleotide database (Table 5).

Sequences were aligned using MAFFT v7.037b (Katoh and Standley 2013). The best-fit model of sequence evolution for the gene region was determined using a decision-theory (DT) approach (Minin et al. 2003), which was implemented in PAUP* 4.01a147 (Swofford 2002). The maximum likelihood analysis was conducted with RAxML v8.0.3 (Stamatakis 2014) using the GTR + G model. One hundred searches for the best tree were performed and clade support was assessed with 1,000 bootstrap replicates. Bayesian inference analysis was conducted with MrBayes v3.2.6 (Ronquist et al. 2012) on the CIPRES portal (Miller et al. 2010). Each analysis consisted of four independent runs with four Markov chain Monte Carlo (MCMC) chains for 50 million generations and tree sampling every 1,000th generation, using the rate variation selected by the DT approach and using reversible-jump Markov chain Monte Carlo (rjMCMC) to allow sampling across the entire substitution rate model space (nst = mixed). Convergence of the four independent MCMC runs was assessed using Tracer 1.6 (Rambaut and Drummond 2003). A 50% majority rule consensus tree was generated and posterior probabilities (PP) were calculated after removing the first 10% of sampled trees.
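As a rough illustration of the sampling scheme described above, the following sketch works out how many trees a single run would contribute to the consensus. The function name is hypothetical, and the calculation assumes that only the sampled generations are counted (whether the starting tree is also saved is ignored here).

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (sampled, discarded, retained) tree counts for one MCMC run.

    Assumption: one tree is saved every `sample_every` generations and the
    first `burnin_fraction` of the saved trees is discarded as burn-in.
    """
    sampled = generations // sample_every
    discarded = round(sampled * burnin_fraction)
    return sampled, discarded, sampled - discarded


# Values taken from the text: 50 million generations, sampling every
# 1,000th generation, first 10% of sampled trees removed.
sampled, discarded, retained = retained_samples(50_000_000, 1_000, 0.10)
print(sampled, discarded, retained)
```

Under these assumptions each run would retain 45,000 of its 50,000 sampled trees.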

Transcriptome Annotation

We used BLASTx (Camacho et al. 2009) to compare our reference transcriptome to the NCBI non-redundant database (Pruitt 2004). We also compared the transcriptome to the SwissProt database, but did not find hits that were not captured by the NCBI nr database BLAST results. We limited our BLAST search to hits with e-value < 10⁻⁵ and matched the resulting NCBI Entrez IDs with UniProtKB gene ontology (GO) terms (Table 5).
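The e-value filter can be sketched as follows. This is a minimal illustration, not the pipeline used in this study: it assumes the BLAST results were written in the standard 12-column tabular format (-outfmt 6), it keeps one hit per contig by highest bit score (an assumed tie-breaking rule), and the contig and protein names are invented.

```python
def best_hits(lines, max_evalue=1e-5):
    """Keep the highest-bit-score hit per query among hits with e-value < max_evalue.

    Assumes BLAST tabular output (-outfmt 6): 12 tab-separated columns with
    query ID 1st, subject ID 2nd, e-value 11th and bit score 12th.
    """
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # hit does not pass the e-value cut-off
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical hits, for illustration only.
example = [
    "contig1\tprotA\t85.0\t120\t18\t0\t1\t360\t1\t120\t1e-40\t150.0",
    "contig1\tprotB\t70.0\t110\t33\t0\t1\t330\t1\t110\t1e-20\t90.0",
    "contig2\tprotC\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t25.0",
]
hits = best_hits(example)
print(hits)
```

In this example contig1 keeps only its stronger hit, and contig2 is dropped because its only hit fails the cut-off.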

We found that our parasite was similar to a number of other helminth parasites, and so we compared our transcriptome to assembled trematode transcriptomes from the WormBase database (Howe et al. 2016). Reciprocal similarity searches were conducted using blastn with an e-value threshold of 0.0001. We then calculated the percent coverage as the number of unique hits divided by the number of sequences within the database. The higher the percent coverage, the more similar the two transcriptomes.
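The percent coverage calculation can be written out as a short sketch. The function name and the example identifiers are hypothetical, and "unique hits" is interpreted here as the number of distinct database sequences that received at least one hit, which is an assumption about the definition.

```python
def percent_coverage(hit_subject_ids, n_database_sequences):
    """Percent of database sequences matched by at least one hit.

    hit_subject_ids: subject (database) sequence IDs, one per BLAST hit;
    repeated IDs are counted once.
    """
    if n_database_sequences <= 0:
        raise ValueError("database must contain at least one sequence")
    unique_hits = len(set(hit_subject_ids))
    return 100.0 * unique_hits / n_database_sequences


# Hypothetical example: three hits to two distinct sequences in a
# database of four sequences gives 50% coverage.
print(percent_coverage(["seq1", "seq2", "seq1"], 4))
```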

To determine whether the Microphallus sp. metacercaria is expressing genes that might be involved in evasion of the vertebrate or invertebrate immune system, we blasted our transcriptome against ESTs from each of the five stages of the S. mansoni lifecycle. Sequences were downloaded from the NCBI EST database and assembled into a blast database using the NCBI blast toolkit. Similarity searches were conducted using blastn with an e-value threshold of 0.0001. However, while blast comparisons show how similar sequences are between our parasite and the different stages of S. mansoni, they fail to capture what those similarities might be, and whether they are related to evading the immune system. As a result, we also compared the GO terms from the Microphallus sp. metacercaria with those of ESTs from the snail stage and the vertebrate stage of S. mansoni. We specifically focused on the GO terms related to the immune system and immune response to determine whether the metacercaria stage is evading the invertebrate or vertebrate immune system.

Results

Assessing the quality of the Transcriptome

Across all four populations and both sequencing platforms, we generated a total of 2,022,980 reads: 320,415 from the 454 platform and 1,702,565 from the Illumina platform. After quality checks, a total of 109,831 reads were used in the reference assembly, out of the 2,022,980 total reads that were imported into MIRA. In total, 18,000 contigs were assembled with a mean length of 504 bp, ranging in size from 362 bp to 4,446 bp. The general quality of a de novo transcriptome assembly can be assessed by quantifying the proportion of reads that map back to the transcriptome, where greater than 50% is considered high quality (O'Neil and Emrich 2013). Of the combined trimmed 454 and Illumina reads, 83% mapped to our assembled transcriptome, signifying high quality. We then mapped the trimmed reads from each population to the reference transcriptome. The percentage of reads mapped ranged from 18.63% (Lake Rotoroa) to 83.70% (Lake Ianthe) (Table 4).
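The 50% mapping criterion can be applied to each population with a few lines of code. The sketch below uses the per-population percentages listed in Table 4; the function name is hypothetical, and treating a value of exactly 50% as not high quality is an assumption.

```python
# Percentage of reads mapped to the reference, as listed in Table 4.
MAPPED_PERCENT = {
    "Lake Rotoroa": 18.63,
    "Lake Kaniere": 62.36,
    "Lake Alexandrina": 78.18,
    "Lake Ianthe": 83.70,
}


def below_threshold(mapped_percent, threshold=50.0):
    """Return the populations whose mapping percentage is not above the threshold."""
    return sorted(pop for pop, pct in mapped_percent.items() if pct <= threshold)


print(below_threshold(MAPPED_PERCENT))
```

With these values only Lake Rotoroa falls below the threshold.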

Putative proteins involved in infection

For annotation, putative gene sequences were first searched using the BLASTx tool against the SwissProt database using a cut-off e-value of 10⁻⁵ and limiting our search to one result per contig. Using this approach, 1,864 genes (10.4% of all contigs assembled) returned a BLAST hit that passed the cut-off (see supplementary file). GO assignments were used to classify the function of the predicted Microphallus sp. genes (Table 5). In the three main categories of the GO classification (biological process, cellular component and molecular function), “cellular process”, “cell” and “binding” are the most represented, respectively. Not surprisingly, these are all housekeeping processes and are likely not specific to host-parasite interactions.

Interestingly, a few contigs were linked to GO terms that may be associated with the capacity to infect. The “leukocyte activation” GO term (0.37% of our contigs) is of some interest. Leukocyte activation is defined as a change in morphology and behavior of a leukocyte resulting from exposure to a specific antigen. Extracellular parasites have been known to evade the host immune system by mimicking cell surface receptors. As another example, we found ~1% of our contigs linked with GO terms that are associated with viruses, “viral entry into host cell” and “virion attachment to host cell,” both of which are involved in a viral pathogen infecting the host. While our parasite is far removed from viruses, it is possible that these contigs are linked to similar immune evasion proteins and thus merit further investigation. Finally, ~1% of contigs were linked to immune effector process, which could reflect either an immune response of Microphallus sp. to infection by other organisms or a means of evading the immune system of its host. While these genes are not definitively involved in infection, they provide a suite of candidates that are worth investigating.

Finding similar parasites

One possible avenue to find genomic regions involved in the interaction between Microphallus and P. antipodarum is to look for genes involved in infection in similar parasites. To that end, we sought to understand how closely related our parasite is to other digenean trematodes and other Microphalloidea parasites by generating a phylogeny. Our tree contains representative species from all families of the subclass Digenea (Table 6). Our parasite is found within the Microphalloidea clade, which comprises four major clades with strong support (Figure 3). The first clade consisted of the family Lecithodendriidae (7 species), the second clade contained the family Pleurogenidae (8 species), and the third clade contained the family Prosthogonimidae (12 species). The final clade contained all the undescribed species of Microphallus, including our species, collected from Potamopyrgus antipodarum, and largely contained the family Microphallidae. Our species is a close sister taxon to an undescribed Microphallidae species collected from the Brisbane River in Australia (Kudlai et al. 2015), and to Microphallus fusiformis. Together these three form a clade distinct from the genus clade, and either sister to or part of the larger clade Microphallus. While our tree has them within the Microphallus clade, the nodal support is too low to definitively place them within this genus (Figure 3).

Previous work has placed the two other species within our identified clade, Microphallus fusiformis and the Microphallidae species from Brisbane, closer to the Maritrema clade (Kudlai et al. 2015), as opposed to the Microphallus clade. To determine if these three species are within the Maritrema clade, we conducted a topology test. Briefly, we conducted ML searches with the topology constrained to include the three species within the Maritrema clade. The difference in likelihood scores between the constrained and unconstrained ML trees was then calculated as the test statistic, and a null distribution for this test statistic was generated by simulating new datasets using the topology and parameter estimates from the constrained likelihood search. After 100 simulations, we rejected the hypothesis that these three trematode species are within the Maritrema clade (p = 0.009).
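The final step of such a simulation-based test, turning the observed statistic and the simulated null distribution into a p-value, can be sketched as below. This is an illustration under stated assumptions rather than the procedure used here: the function name is hypothetical, the test is taken to be one-sided (larger likelihood differences count as more extreme), and the add-one correction is one common convention that the text does not specify.

```python
def simulation_p_value(observed, simulated):
    """One-sided p-value from a simulated null distribution.

    observed: likelihood difference between unconstrained and constrained
        ML trees for the real data.
    simulated: the same statistic computed on each simulated dataset.
    The +1 in numerator and denominator counts the observed value as one
    draw from the null, so the p-value can never be exactly zero.
    """
    n_as_extreme = sum(1 for s in simulated if s >= observed)
    return (n_as_extreme + 1) / (len(simulated) + 1)


# Hypothetical values: if none of 100 simulated differences reaches the
# observed one, the smallest attainable p-value is 1/101.
print(simulation_p_value(25.0, [1.0] * 100))
```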

We looked for similarity with other related parasites to determine whether we could expect to find proteins known to be involved in other well-studied host-parasite interactions within our system. Within the Microphalloidea clade, there are no sequenced and annotated transcriptomes. However, our blast results demonstrated that our parasite was similar to several medically important, and therefore well-studied, parasites (Figure 4). We compared our transcriptome to other digenean trematodes for which a full transcriptome is available (Clonorchis sinensis, Echinostoma caproni, Fasciola hepatica, Opisthorchis viverrini, Schistosoma mansoni and Trichobilharzia regenti) as well as the well-described nematode Caenorhabditis elegans. While the species that were similar are not taxonomically close, they are very well characterized. Knowing how similar our parasite is to these well-studied trematodes may allow us to find similar proteins involved in the host-parasite interaction from these systems within our system. We used reciprocal blast searches, and analyzed the number of similar contigs (unique hits) vs. the number of unique contigs to determine percent coverage or similarity of each trematode with Microphallus sp. (Table 7). We found that there was high similarity with S. mansoni (66%), C. sinensis (51%), and F. hepatica (45%). The genetic basis of infection in both S. mansoni and F. hepatica has been investigated for the vertebrate host, and in S. mansoni, the molecular basis of infection has been discovered in the snail, Biomphalaria glabrata.

Evading the vertebrate or invertebrate immune system

Our final approach to identifying genes that might be important to host-parasite interactions was to determine whether expressed transcripts map to genes that might be involved in immune evasion in the invertebrate or vertebrate host. Microphallus sp. has a multistage life cycle involving both a vertebrate host (waterfowl) and an invertebrate host (P. antipodarum). To determine the genetics of infection with the snail, we needed to discern whether the collected metacercaria stage is primarily expressing genes to evade the vertebrate or the invertebrate immune system. We compared our transcriptome to ESTs from each of the five stages in the similar parasite, S. mansoni. The stages can be categorized as being within the vertebrate (humans) or the invertebrate (B. glabrata). We found that our transcriptome had the most similarity with the miracidia portion of the lifecycle (51.76%), which is the portion of the S. mansoni lifecycle where the trematode is infecting the snail (Table 7). However, the blast similarity does not tell us the function of the similar genes. Therefore, we considered the GO terms from the stage that lives within the vertebrate (adult worms) and within the invertebrate (sporocysts), compared to the GO terms associated with our metacercaria transcriptome (Figure 5). Qualitatively, the adult stage of S. mansoni is more similar to our parasite than the sporocyst stage (Table 8).
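The comparison above is qualitative. One way it could be made quantitative, offered here only as a sketch and not as the method used in this study, is to compute the Jaccard index between sets of GO terms. The term sets below are hypothetical placeholders, not the observed annotations for any stage.

```python
def jaccard(terms_a, terms_b):
    """Jaccard index: shared terms divided by all distinct terms in either set."""
    a, b = set(terms_a), set(terms_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical GO term sets, for illustration only.
metacercaria = {"immune response", "leukocyte activation", "cell adhesion"}
adult = {"immune response", "leukocyte activation", "locomotion"}
sporocyst = {"cell adhesion", "growth", "reproduction"}

print(jaccard(metacercaria, adult))      # 2 shared of 4 distinct terms
print(jaccard(metacercaria, sporocyst))  # 1 shared of 5 distinct terms
```

A higher index indicates that two stages share a larger fraction of their annotated GO terms.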

Discussion

Over the past 30 years, the trematode parasite Microphallus sp. has been used extensively in studies of the evolutionary ecology of host-parasite interactions. Despite the extensive research done in this system, the genetics of the host-parasite interaction remain unknown. Here we present the first sequenced and assembled transcriptome for this parasite, and build a foundation for determining the genetic basis of the host-parasite interaction.

We examined the GO terms for potential genomic regions that are involved in infection. The majority of the GO terms represented general housekeeping genes; however, there were some interesting exceptions. Notably, there were genes associated with immune function, and with virion capsid and binding. The first group, immune function genes, could be one of two things. The trematodes themselves could be expressing immune proteins to prevent infection by bacteria or other pathogens. However, these proteins could also serve to evade the host immune system and, given that some are associated with the vertebrate immune system (leukocyte activation), they represent possible candidates for host-parasite interaction. Additionally, virion-associated genes, especially those associated with virion capsid and attachment, are potentially involved in infection. While our parasite does not have an intracellular stage, the capsid proteins could be used to evade the snail or vertebrate immune systems in a similar fashion as virion proteins. These make up a promising suite of proteins that may be involved in the host-parasite interaction.

In addition to not knowing the genetics of infection, the evolutionary history and taxonomy of our parasite are entirely unknown. Hechinger (2012) described this species as “Microphallus ‘livelyi’”, but his analysis did not provide any molecular or morphological evidence to place this species within the Microphallus genus. Finding taxonomically similar parasites will allow us to look within our parasite for infection-associated proteins known from other parasites. Our phylogenetic analysis demonstrates that while this parasite clearly falls within the Microphallidae clade, it is not clear that it falls within the Microphallus genus. Rather, it is closely related to another undescribed species of Microphallidae found in Australia and to Microphallus fusiformis. These three either make up a separate clade of Microphallidae, or are part of the Microphallus genus.

Previous work, which also utilized the 28S nrRNA gene, determined that the Brisbane River Microphallidae and Microphallus fusiformis were an independent clade, but placed them closer to Maritrema rather than Microphallus (Kudlai et al. 2015). However, support for this relationship was low; other studies have found Microphallus fusiformis to be closer to the Microphallus clade (Tkach et al. 2003b). Our topology test demonstrated that Microphallus sp., Microphallidae Brisbane and Microphallus fusiformis are definitively not within the Maritrema clade, but we cannot confirm that they are within the Microphallus clade. In order to better resolve the position of these three species, a few key pieces of data are needed. First, we need to include more than the one genomic region used to construct our phylogeny. However, the majority of previous phylogenies examining Microphallidae taxonomy were constructed using only the 28S nrRNA gene region (Olson et al. 2003; Tkach et al. 2003a; Galaktionov et al. 2012), as this is the only genomic region available for most Microphalloidea. While we sequenced and assembled a transcriptome for Microphallus sp., to our knowledge, this is the first and only full transcriptome available for this clade. The paucity of transcriptomes for other Microphalloidea species, or even of data from gene regions other than 28S nrRNA, is impeding the generation of a tree that may ultimately resolve the evolutionary history of our parasite.

In addition, our phylogenetic analysis demonstrated that our parasite is related to other trematode parasites that are medically relevant and well-studied. Utilizing our blast search of the SwissProt database, we found that our parasite had strong similarity with other helminth parasites such as Schistosoma mansoni, Echinococcus granulosus, Oxytricha trifallax, Trichuris trichiura and Hymenolepis microstoma. We further looked for similarity between our parasite and medically relevant parasites using reciprocal blast, and found that our parasite shares high similarity with S. mansoni, F. hepatica and C. sinensis. The genetic basis of infection in both S. mansoni and F. hepatica has been investigated for the vertebrate host, and in S. mansoni, the molecular basis of infection has been discovered in the snail, Biomphalaria glabrata. Therefore, the similarity between Microphallus sp. and these well-studied trematodes allows us to look for similar proteins associated with infection within Microphallus sp., and could lead to finding the genomic region involved in infection.

We also examined whether the metacercaria stage used in this study is evading the invertebrate immune system (within the snail), or preparing to evade the vertebrate immune system (within the waterfowl). We found that our transcriptome has high similarity with the miracidia stage in S. mansoni, which is the portion of the S. mansoni lifecycle that penetrates and actively infects the snail Biomphalaria glabrata. However, qualitatively, the GO terms associated with the adult worm were more similar to those of the Microphallus sp. metacercaria than were the GO terms associated with the sporocysts. Given that the sporocysts reside for multiple generations within their snail host, and our transcriptome shares similarity with another portion of the S. mansoni lifecycle within the snail, this is surprising. It appears that during the metacercaria stage, the parasite is expressing genes to evade the snail immune system, given the similarity with the miracidia, but that it is also preparing to evade the vertebrate immune system. It is worth examining the transcriptomic differences between different stages of the Microphallus sp. life cycle, to determine the potential role those differences play in the genomic basis of host-parasite interaction.

Overall, this work represents the first molecular data available for the trematode parasite Microphallus sp. of the freshwater snail P. antipodarum. From these sequences, we can look at variation within and between populations to understand the evolutionary history of the parasite population, and pursue further studies of the genetic basis of host-parasite coevolution within this system.

Acknowledgements

Data collection and analyses were performed by the IBEST Genomics Resources Core at the University of Idaho and were supported in part by NIH COBRE grant P30GM103324.

[Phylogenetic tree of Microphalloidea with support values at the nodes. Clade labels: Microphallidae, Lecithodendriidae, Prosthogonimidae, Pleurogenidae. Scale bar: 0.005.]

Figure 3 - Assembled phylogeny for Microphalloidea, including the species Microphallus sp. from New Zealand, colored red.

[Bar chart. Vertical axis: species names (Schistosoma mansoni, Hungatella hathewayi, Stylonychia lemnae, Crassostrea gigas, Trichuris trichiura, Nematostella vectensis, Echinococcus granulosus, Medicago truncatula, Hymenolepis microstoma, Oxytricha trifallax). Horizontal axis: number of blast hits (0 to 400).]

Figure 4 - Top blast hits by species.

[Chart of GO terms, grouped as Biological Process, Cellular Component and Molecular Function, for three groups: Metacercariae, Adult, Sporocysts.]

Figure 5 - GO terms associated with annotated genes within the Microphallus sp. metacercaria, S. mansoni adult worms (reside within vertebrates), and S. mansoni sporocysts (reside within invertebrates).

Table 3 - Information from both 454 and Illumina reads, and the subsequent hybrid assembly.

Library Information
Total number of reads                   2,022,980
Total number of reads after trimming    1,719,533
Total number of reads assembled         98,259
Average trimmed 454 read length         396 bp
Shortest trimmed 454 read               246 bp
Longest trimmed 454 read                706 bp
Average trimmed Illumina read length    228 bp
Shortest trimmed Illumina read          79 bp
Longest trimmed Illumina read           256 bp
Total number of contigs                 18,000
Mean length of contigs                  504 bp
Contig N50                              492 bp

Table 4 - Population specific data for reads included in the reference assembly and the percentage of reads that mapped to the reference transcriptome.

Parasite Population   454 Trimmed Reads   Illumina Trimmed Reads   Total Number of Trimmed Reads   Reads Mapped to Reference Transcriptome
Lake Rotoroa          91,941              437,469                  529,410                         18.63%
Lake Kaniere          85,008              279,225                  364,233                         62.36%
Lake Alexandrina      93,588              517,072                  610,660                         78.18%
Lake Ianthe           49,878              468,799                  518,677                         83.70%

Table 5 - GO terms with their associated descriptions of the annotated contigs from the Microphallus sp. transcriptome.

GO Term                                                       Percentage

GO Category: Biological Processes
cellular process                                              17.89
metabolic process                                             15.41
single-organism process                                       14.26
biological regulation                                         9.51
response to stimulus                                          6.00
localization                                                  5.88
cellular component organization or biogenesis                 5.63
multicellular organismal process                              5.28
developmental process                                         5.13
single organism signaling                                     2.75
multi-organism process                                        2.13
reproduction                                                  1.31
reproductive process                                          1.29
locomotion                                                    1.29
immune system process                                         1.08
biological adhesion                                           1.04
cell adhesion                                                 0.98
immune response                                               0.55
behavior                                                      0.46
growth                                                        0.43
immune system development                                     0.39
immune effector process                                       0.20
rhythmic process                                              0.20
activation of immune response                                 0.18
leukocyte activation                                          0.17
leukocyte migration                                           0.12
detoxification                                                0.10
myeloid cell homeostasis                                      0.08
antigen processing and presentation                           0.07
virion attachment to host cell                                0.07
presynaptic process involved in synaptic transmission         0.04
leukocyte mediated cytotoxicity                               0.03
somatic diversification of immune receptors                   0.03
production of molecular mediator of immune response           0.03
cartilage condensation                                        0.01
T cell selection                                              0.01

GO Category: Cellular Component
cell                                                          18.54
cell part                                                     18.49
organelle                                                     15.61
organelle part                                                9.85
membrane                                                      8.12
macromolecular complex                                        7.36
membrane part                                                 5.35
extracellular region                                          5.00
extracellular region part                                     3.94
membrane-enclosed lumen                                       3.32
cell junction                                                 1.23
polymeric cytoskeletal fiber                                  0.94
synapse                                                       0.63
extracellular matrix                                          0.47
synapse part                                                  0.42
other organism                                                0.16
other organism part                                           0.16
virion                                                        0.15
extracellular matrix component                                0.12
virion part                                                   0.09
mitochondrial nucleoid                                        0.06

GO Category: Molecular Function
binding                                                       44.42
catalytic activity                                            28.32
structural molecule activity                                  7.86
transporter activity                                          5.45
molecular function regulator                                  3.99
receptor activity                                             1.92
signal transducer activity                                    1.54
electron carrier activity                                     1.39
transcription factor activity, sequence-specific DNA binding  1.09
transcription factor activity, transcription factor binding   1.02
signaling receptor activity                                   0.98
transmembrane receptor activity                               0.64
cargo receptor activity                                       0.56
antioxidant activity                                          0.26
signaling pattern recognition receptor activity               0.19
translation regulator activity                                0.11
virus receptor activity                                       0.08
protein tag                                                   0.08
laminin receptor activity                                     0.04
apolipoprotein A-I receptor activity                          0.04
chemorepellent activity                                       0.04

Table 6 - Taxonomic information, NCBI accession numbers, and references for all samples included in our 28S nrDNA phylogeny.

Taxonomic Group
  Species, Accession Number, Source

Class Trematoda
Subclass
Order Aspidogastrida
Family Aspidogastridae
  Aspidogaster conchicola, AY222162, Olson 2003
  Cotylaspis sp, AY222165, Olson 2003
  Cotylogaster basiri, AY222164, Olson 2003
  Lobatostoma manteri, AY157177, Olson 2003
  Multicotyle purvisi, AY222166, Olson 2003
Family Multicalycidae
  Multicalyx elegans, AY222163, Olson 2003
Order Stichocotylida
Family Rugogastridae
  Rugogaster hydrolagi, AY157176, Olson 2003
Subclass Digenea
Order Echinostomida
Superfamily Echinostomoidea
Family Atractotrematidae
  Atractotrema sigani, AY222267, Olson 2003
Family Echinostomatidae
  Echinostoma revolutum, AY222246, Olson 2003
  Euparyphium melis, AF151941, Olson 2003
Family Fasciolidae
  Fasciola gigantica, AY222245, Olson 2003
  Fasciola hepatica, AY222244, Olson 2003
Family Haploporidae
  Hapladena nasonis, AY222265, Olson 2003
  Pseudomegasolena ishigakiense, AY222266, Olson 2003
Family Haplosplanchnidae
  Hymenocotta mulli, AY222239, Olson 2003
  Skrjabinoeces similis, AY222279, Olson 2003
Family Philophthalmidae
  Cloacitrema narrabeenensis, AY222248, Olson 2003
  Unidentified philophthalmid, AY222247, Olson 2003
Family Psilostomidae
  Psilochasmus oxyurus, AF151940, Olson 2003
Superfamily Heronimoidea
Family Heronimidae
  Heronimus mollis, AY116878, Olson 2003
Superfamily Paramphistomoidea
Family Cladorchiidae
  Solenorchis travassosi, AY222213, Olson 2003
Superfamily Pronocephaloidea
Family Labicolidae
  Labicola elongata, AY222221, Olson 2003
Family Notocotylidae
  Notocotylus sp, AY222219, Olson 2003
Family Opisthotrematidae
  Lankatrema mannarense, AY222222, Olson 2003
Family Pronocephalidae
  Macrovestibulum obtusicaudum, AY116877, Olson 2003
Family Rhabdiopoeidae
  Rhabdiopoeus taylori, AY222218, Olson 2003
  Taprobanella bicaudata, AY222112, Olson 2003
Superfamily Microscaphidioidea
Family Mesometridae
  Mesometra sp, AY222216, Olson 2003
Family Microscaphidiidae
  Hexangium sp, AY222215, Olson 2003
Order
Superfamily Allocreadioidea
Family
  Gaevskajatrema halosauropsi, AY222207, Olson 2003
  Macvicaria macassarensis, AY222208, Olson 2003
  Peracreadium idoneum, AY222209, Olson 2003
Family Opistholebetidae
  Maculifer sp., AY222211, Olson 2003
  Opistholebes amplicoelus, AY222210, Olson 2003
Superfamily Lepocreadioidea
Family
  Cableia pudica, AY222251, Olson 2003
  Stephanostomum baccatum, AY222256, Olson 2003
Family Apocreadiidae
  Homalometron armatum, AY222241, Olson 2003
  Homalometron synagris, AY222243, Olson 2003
  Neoapocreadium splendens, AY222242, Olson 2003
  Schistorchis zancli, AY222240, Olson 2003
Family Brachycladiidae
  Zalophotrema hepaticum, AY222255, Olson 2003
Family Enenteridae
  Enenterum aureum, AY222232, Olson 2003
  Koseiria xishaense, AY222233, Olson 2003
Family Gorgocephalidae
  Gorgocephalus kyphosi, AY222234, Olson 2003
Family Gyliauchenidae
  Paragyliauchen arusettae, AY222235, Olson 2003
Family Lepocreadiidae
  Preptetos caballeroi, AY222236, Olson 2003
  Preptetos trulla, AY222237, Olson 2003
Superfamily Microphalloidea
Family Microphallidae
  Microphallidae gen. (Australia), KT355820, Kudlai 2015
  Microphallidae sp. (Okinawa), AB974360, Kakui 2011
  Microphallus abortivus, AY220626, Galaktionov 2012
  Microphallus basodactylophallus, AY220628, Galaktionov 2012
  Microphallus fusiformis, AY220633, Galaktionov 2012
  Microphallus primas, AY220627, Galaktionov 2012
  Microphallus similis, AY220625, Galaktionov 2012
  Microphallus sp. (NZ), KJ868217, O'Dwyer 2014
  Microphallus sp. (Russia), HM584142, Galaktionov 2012
  Maritrema arenaria, AY220629, Galaktionov 2012
  Maritrema neomi, AF151927, Galaktionov 2012
  Maritrema oocysta, AY220630, Galaktionov 2012
  Maritrema prosthometra, AY220631, Galaktionov 2012
  Maritrema subdolum, AF151926, Galaktionov 2012
  Floridatrema heardi, AY220632, Galaktionov 2012
Superfamily Opisthorchioidea
Family Cryptogonimidae
  Caecincola parvulus, AY222231, Olson 2003
  Siphodera vinaledwardsii, AY222230, Olson 2003
  Mitotrema anthostomatum, AY222229, Olson 2003
Family
  Cryptocotyle lingua, AY222228, Olson 2003
  lacteum, AY222227, Olson 2003
  Haplorchoides sp., AY222226, Olson 2003
Family
  Amphimerus ovalis, AY116876, Olson 2003
Superfamily Plagiorchioidea
Family Auridistomidae
  Auridistomum chelydrae, AY116872, Olson 2003
Family Brachycoeliidae
  Brachycoelium salamandrae, AF151935, Olson 2003
  Mesocoelium sp., AY222277, Olson 2003
Family Cephalogonimidae
  Cephalogonimus retusus, AY222276, Olson 2003
Family Choanocotylidae
  Choanocotyle hobbsi, AY116865, Olson 2003
Family Dicrocoeliidae
  Brachylecithum lobatum, AY222260, Olson 2003
  Lyperosomum collurionis, AY222259, Olson 2003
  Dicrocoelium dendriticum, AY222261, Olson 2003
Family Encyclometridae
  Encyclometra colubrimurorum, AF184254, Olson 2003
Family Gorgoderidae
  Degeneria halosauri, AY222257, Olson 2003
  Gorgodera cygnoides, AY222264, Olson 2003
  Nagmia floridensis, AY222262, Olson 2003
  Xystretrum sp, AY222263, Olson 2003
Family Lecithodendriidae
  Lecithodendrium linstowi, AF151919, Olson 2003
  Prosthodendrium longiforme, AF151921, Olson 2003
Family Omphalometridae
  Rubenstrema exasperatum, AY222275, Olson 2003
Family Pachypsolidae
  Pachypsolus irroratus, AY222274, Olson 2003
Family Lecithodendriidae
  Lecithodendrium linstowi, AF151919, Tkach 2003
  Ophiosacculus mehelyi, AF480167, Tkach 2003
  Prosthodendrium chilostomum, AF151920, Tkach 2003
  Prosthodendrium hurkovaae, AF151922, Tkach 2003
  Prosthodendrium longiforme, AF151921, Tkach 2003
  Prosthodendrium parvouterus, AY220617, Tkach 2003
  Pycnoporus heteroporus, AF151918, Tkach 2003
  Pycnoporus megacotyle, AF151917, Tkach 2003
Family Plagiorchiidae
  Plagiorchis vespertilionis, AF151931, Tkach 2003
  Lecithopyge rastellus, AF151932, Tkach 2003
  Haematoloechus varioplexus, AF387798, Tkach 2003
  Haematoloechus longiplexus, AY222280, Olson 2003
  Glypthelmins quieta, AY222278, Olson 2003
Family Pleurogenidae
  Parabascus duboisi, AY220618, Tkach 2003
  Parabascus joannae, AY220619, Tkach 2003
  Parabascus semisquamosus, AF151923, Tkach 2003
  Brandesia turgida, AY220622, Tkach 2003
  Loxogenes macrocirra, AY220624, Tkach 2003
  Pleurogenes claviger, AF151925, Olson 2003
  Pleurogenoides medians, AF433670, Olson 2003
Family
  Prosthogonimus cuneatus, AY220634, Tkach 2003
  Prosthogonimus ovatus, AF151928, Olson 2003
  Schistogonimus rarus, AY116869, Olson 2003
Family Telorchiidae
  Opisthioglyphe ranae, AF151929, Olson 2003
  Telorchis assula, AF151915, Olson 2003
Superfamily Renicoloidea
Family Renicolidae
  Renicola sp, AY116871, Olson 2003
Superfamily Troglotrematoidea
Family Orchipedidae
  Orchipedum tracheicola, AY222258, Olson 2003
Family Paragonimidae
  Paragonimus iloktsuenensis, AY116875, Olson 2003
  Paragonimus westermani, AY116874, Olson 2003
Family Troglotrematidae
  Nephrotrema truncatum, AF151936, Olson 2003
  Nanophyetus salminicola, AY116873, Olson 2003
Superfamily Zoogonoidea
Family Faustulidae
  Antorchis pomacanthi, AY222268, Olson 2003
  Trigonocryptus conus, AY222270, Olson 2003
  Bacciger lesteri, AY222269, Olson 2003
Family Lissorchiidae
  Lissorchis kritskyi, AY222250, Olson 2003
Family Monorchiidae
  Ancylocoelium typicum, AY222254, Olson 2003
  Provitellus turrum, AY222253, Olson 2003
  Diplomonorchis leiostomi, AY222252, Olson 2003
Family Zoogonidae
  Deretrema nahaense, AY222273, Olson 2003
  Lecithophyllum botryophorum, AY222205, Olson 2003
  Diphterostomum sp., AY222272, Olson 2003
  Zoogonoides viviparus, AY222271, Olson 2003
Order Strigeida
Superfamily Azygioidea
Family Azygiidae
  Otodistomum cestoides, AY222187, Olson 2003
Superfamily Bivesiculoidea
Family Bivesiculidae
  Bivesicula claviformis, AY222182, Olson 2003
  Bivesicula unexpecta, AY222181, Olson 2003
  Bivesiculoides fusiformis, AY222183, Olson 2003
Superfamily Brachylaimoidea
Family
  Brachylaima sp., AY222167, Olson 2003
  Brachylaima thompsoni, AF184262, Olson 2003
  Zeylanurotrema spearei, AY222170, Olson 2003
Family Leucochloridiidae
  Leucochloridium perturbatum, AY222169, Olson 2003
  Urogonimus macrostomus, AY222168, Olson 2003
Family Bucephalidae
  Prosorhynchoides gracilescens, AY222224, Olson 2003
  Rhipidocotyle galeata, AY222225, Olson 2003
Superfamily Clinostomoidea
Family Clinostomidae
  Clinostomum sp., AY222175, Olson 2003
  Clinostomum sp., AY222176, Olson 2003
Superfamily Cyclocoeloidea
Family Cyclocoelidae
Family Eucotylidae
  Tanaisia fedtschenkoi, AY116870, Olson 2003
Superfamily Diplostomoidea

52 Family Diplostomidae Alaria alata AF184263 Olson 2003 Diplostomum phoxinib AY222173 Olson 2003 Family Strigeidae Apharyngostrigea cornu AF184264 Olson 2003 Cardiocephaloides longicollis AY222171 Olson 2003 Ichthyocotylurus erraticus AY222172 Olson 2003 Superfamily Gymnophalloidea Family Callodistomidae Prosthenhystera obesa AY222206 Olson 2003 Family Fellodistomidae Fellodistomum fellis AY222282 Olson 2003 Olssonium turneri AY222283 Olson 2003 Proctoeces maculatus AY222284 Olson 2003 Steringophorus margolisi AY222281 Olson 2003 Family Tandanicolidae Prosogonarium angelae AY222285 Olson 2003 Superfamily Hemiuroidea Family AY222190 Olson 2003 Family Derogenidae Derogenes varicus AY222189 Olson 2003 Hemiperina manteri AY222196 Olson 2003 Family Didymozoidae Unidentified didymozoid AY222192 Olson 2003 Unidentified didymozoid AY222193 Olson 2003 Unidentified didymozoid AY222194 Olson 2003 Didymozoon scombri AY222195 Olson 2003 Family Hemiuridae Dinurus longisinus AY222202 Olson 2003 Lecithochirium caesionis AY222200 Olson 2003 Lecithocladium excisum AY222203 Olson 2003 Machidatrema chilostoma AY222197 Olson 2003 Merlucciotrema praeclarum AY222204 Olson 2003 Opisthadena dimidia AY222198 Olson 2003 Plerurus digitatus AY222201 Olson 2003 Family Lecithasteridae Lecithaster gibbosus AY222199 Olson 2003

Family Sclerodistomidae Prosogonotrema bilabiatum AY222191 Olson 2003 Superfamily Schistosomatoidea Family Sanguinicolidae Unidentified sanguinicolid AY157174 Olson 2003 Aporocotyle spinosicanalis AY222177 Olson 2003 Chimaerohemecus trondheimensis AY157239 Olson 2003 Neoparacardicola nasonis AY222179 Olson 2003 Plethorchis acanthus AY222178 Olson 2003 Sanguinicola inermis AY222180 Olson 2003 Family Schistosomatidae Austrobilharzia terrigalensisb AY157249 Olson 2003 Bilharziella polonica AY157240 Olson 2003 Dendritobilharzia pulverulenta AY157241 Olson 2003 Gigantobilharzia huronensis AY157242 Olson 2003 Heterobilharzia americana AY157246 Olson 2003 Ornithobilharziella AY157248 Olson 2003

53 canaliculata Schistosoma japonicum AY157607 Olson 2003 Schistosoma mansoni AY157173 Olson 2003 Schistosomatium douthitti AY157247 Olson 2003 Schistosoma haematobium AY157263 Olson 2003 Family Spirorchiidae Spirorchis scripta AY222174 Olson 2003

Superfamily Transversotrematoidea Family Crusziella formosa AY222185 Olson 2003 Prototransversotrema steeri AY222184 Olson 2003 haasi AY222186 Olson 2003

Table 7 – Results of the reciprocal BLAST of the Microphallus sp. transcriptome against the assembled transcriptomes of the nematode C. elegans and the trematode parasites C. sinensis, E. caproni, F. hepatica, O. viverrini, S. mansoni and T. regenti.

                          Microphallus sp. to focal species       Focal species to Microphallus sp.
Focal species             Unique        Unique hits   % Coverage  Unique hits   Unique        % Coverage
                          Microphallus  in focal      in focal    in focal      Microphallus  in focal
                          sp.           species       species     species       sp.           species
                          transcripts                                           transcripts
Caenorhabditis elegans    460           31191         1.45        400           17600         2.27
Clonorchis sinensis       6911          6723          50.69       1462          16538         8.12
Echinostoma caproni       144           18463         0.77        141           17859         0.78
Fasciola hepatica         7016          8723          44.57       1713          16287         9.52
Opisthorchis viverrini    515           15841         3.15        507           17493         2.82
Schistosoma mansoni       7831          3997          66.21       2414          15586         20.41
Trichobilharzia regenti   1015          21170         4.58        538           17462         2.99

Table 8 – Comparison of the Microphallus sp. transcriptome with EST libraries of the five stages of the S. mansoni life cycle.

Host species   Life cycle stage in    Unique hits to            Unique Microphallus   % Coverage in
               Schistosoma mansoni    stage-specific database   sp. transcripts       each stage
               Eggs                   4831                      13169                 26.84
B. glabrata    Miracidia              9316                      8684                  51.76
               Sporocysts             4286                      13714                 23.81
H. sapiens     Cercariae              5714                      12286                 31.74
               Adult worm             4831                      13169                 26.84

Table 9 – Candidate genes with GO annotations from the SwissProt database that are associated with the Microphallus sp. metacercaria, S. mansoni adult worms (found within vertebrates), and S. mansoni sporocysts (found within invertebrates).

GO Microphallus sp. S. mansoni S. mansoni Category metacercaria Adult worms Sporocysts GO Term % GO Term % GO Term % cellular process 17.89 cellular process 21.44 cellular process 29.19 metabolic process 15.41 metabolic process 19.87 metabolic process 27.36 single-organism single-organism single-organism process 14.26 process 15.56 process 9.14 biological biological biological regulation 9.51 regulation 8.26 regulation 6.19 cellular cellular component component response to organization or organization or stimulus 6 biogenesis 6.20 biogenesis 5.35 response to response to localization 5.88 stimulus 5.61 stimulus 4.01 cellular component

organization or biogenesis 5.63 localization 5.59 localization 3.70 multicellular organismal multicellular multi-organism process 5.28 organismal process 3.71 process 2.58 developmental developmental developmental process 5.13 process 3.66 process 2.50 multicellular single organism single organism organismal signaling 2.75 signaling 2.29 process 2.32 multi-organism multi-organism immune system process 2.13 process 1.46 process 1.52 Biological Processes Biological reproduction 1.31 reproduction 1.01 reproduction 1.25 reproductive reproductive reproductive process 1.29 process 0.99 process 1.20 immune system single organism locomotion 1.29 process 0.85 signaling 0.80 antigen immune system processing and process 1.08 locomotion 0.60 presentation 0.53 hematopoietic or biological lymphoid organ adhesion 1.04 immune response 0.41 development 0.53 cell adhesion 0.98 detoxification 0.32 immune response 0.49 immune response 0.55 behavior 0.32 locomotion 0.45 activation of behavior 0.46 biological adhesion 0.31 immune response 0.27 growth 0.43 growth 0.27 rhythmic process 0.27 immune system 0.39 immune system 0.23 neurotransmitter 0.09

57 development development secretion immune effector circadian process 0.2 rhythmic process 0.19 sleep/wake cycle 0.09 immune effector developmental rhythmic process 0.2 process 0.15 growth 0.09 activation of activation of erythrocyte immune response 0.18 immune response 0.13 differentiation 0.04 leukocyte antigen processing leukocyte activation 0.17 and presentation 0.12 migration 0.04 leukocyte myeloid cell migration 0.12 homeostasis 0.10 leukocyte detoxification 0.1 activation 0.10 presynaptic process involved in myeloid cell synaptic homeostasis 0.08 transmission 0.10 antigen processing and leukocyte presentation 0.07 migration 0.06 virion attachment leukocyte to host cell 0.07 homeostasis 0.03 presynaptic process involved in synaptic transmission 0.04 cell killing 0.02 leukocyte mediated cartilage cytotoxicity 0.03 condensation 0.01 somatic diversification of immune receptors 0.03 T cell costimulation 0.01 production of molecular somatic mediator of diversification of immune response 0.03 immune receptors 0.01 production of molecular mediator cartilage of immune condensation 0.01 response 0.01 T cell selection 0.01 differentiation cell 18.54 cell 20.55 cell part 23.61111111

cell part 18.49 cell part 20.47 organelle 21.46164021 macromolecular organelle 15.61 organelle 15.65 complex 20.3042328 organelle part 9.85 organelle part 10.10 organelle part 13.26058201 macromolecular membrane 8.12 complex 8.41 membrane 4.96031746 macromolecular membrane- complex 7.36 membrane 7.39 enclosed lumen 4.728835979 extracellular membrane part 5.35 membrane part 4.82 region 3.538359788 extracellular membrane- extracellular

Cellular Component Cellular region 5 enclosed lumen 3.89 region part 3.472222222 extracellular 3.94 extracellular region 3.12 cell junction 2.149470899

58 region part membrane- extracellular region polymeric enclosed lumen 3.32 part 2.67 cytoskeletal fiber 0.992063492 cell junction 1.23 cell junction 1.01 membrane part 0.760582011 polymeric supramolecular cytoskeletal fiber 0.94 fiber 0.67 plasmodesma 0.562169312 synapse 0.63 synapse 0.39 synapse part 0.099206349 extracellular matrix 0.47 synapse part 0.35 synapse 0.099206349 synapse part 0.42 plasmodesma 0.16 other organism 0.16 extracellular matrix 0.14 other organism part 0.16 nucleoid 0.06 extracellular matrix virion 0.15 component 0.05 extracellular matrix component 0.12 other organism 0.03 virion part 0.09 other organism part 0.03 mitochondrial nucleoid 0.06 virion 0.02 virion part 0.01 structural binding 44.42 binding 43.94 molecule activity 42.29 catalytic activity 28.32 catalytic activity 35.22 binding 39.43 structural structural molecule molecule activity 7.86 activity 7.86 catalytic activity 15.52 transporter transporter activity 5.45 transporter activity 5.09 activity 1.14 molecular molecular function signal transducer function regulator 3.99 regulator 2.60 activity 0.38 electron carrier translation receptor activity 1.92 activity 1.11 regulator activity 0.38

signal transducer signal transducer molecular activity 1.54 activity 0.87 function regulator 0.38 electron carrier virus receptor activity 1.39 antioxidant activity 0.77 activity 0.19 transcription factor activity, transcription factor sequence-specific activity, protein transcription DNA binding 1.09 binding 0.67 cofactor activity 0.19 transcription factor activity, transcription Molecular Function Molecular factor binding 1.02 receptor activity 0.52 protein tag 0.10 nucleic acid binding signaling receptor transcription factor activity 0.98 activity 0.52 transmembrane signaling receptor receptor activity 0.64 activity 0.28 cargo receptor transmembrane activity 0.56 receptor activity 0.18 antioxidant translation activity 0.26 regulator activity 0.09 signaling pattern 0.19 protein tag 0.08

59 recognition receptor activity translation cargo receptor regulator activity 0.11 activity 0.08 virus receptor laminin receptor activity 0.08 activity 0.05 virus receptor protein tag 0.08 activity 0.03 laminin receptor metallochaperone activity 0.04 activity 0.02 apolipoprotein A- nutrient reservoir I receptor activity 0.04 activity 0.01 chemorepellent apolipoprotein A-I activity 0.04 receptor activity 0.01 chemoattractant activity 0.01

Identifying genomic hot spots of coevolution in host-parasite systems

Authors: Christina E. Jenkins1,2,*, Mark Dybdahl1, and Scott L. Nuismer1

*Corresponding Author

Affiliations:

1 School of Biological Sciences, Washington State University, Pullman WA

2 Department of Biology, University of Idaho, Moscow ID

Keywords: Coevolution, molecular ecology

Abstract

The genetics of host-parasite interactions are central to coevolutionary biology, yet finding the genomic regions that contribute to coevolution has proven prohibitively difficult. One promising avenue is to find loci that are involved in local adaptation, which theory predicts should spatially covary between the host and parasite. Here, we formalize and evaluate the power of using spatial covariances to identify genomic regions involved in coevolution. We used individual-based simulations to test the robustness of this technique by determining our ability to detect coevolving loci under different models of infection (escalating models and matching models), when multiple loci contribute to infection, and under varying degrees of local adaptation. Under escalating models of infection, we were not able to find coevolving regions. However, when interactions were mediated by matching models of infection, we consistently detected coevolving loci, provided that the sampled populations were locally adapted (high selection and low migration) and that many populations were sampled. Additionally, the combination of strong local adaptation and a large number of sampled populations allowed us to detect coevolving regions when multiple loci are involved in the infective or resistant phenotype. Overall, we developed a powerful tool to find the genomic regions involved in host-parasite coevolution.

Introduction

Host–parasite coevolution has the potential to drive many evolutionary transitions: from asexual to sexual (Hamilton 1980; Hamilton et al. 1990), from haploid to diploid (Nuismer and Otto 2004; Oswald and Nuismer 2007), and from selfing to outcrossing (Agrawal and Lively 2001). However, one emerging theme from theoretical studies is that the outcome of these evolutionary transitions depends on the genetics of the interaction. Theoretical models investigate a small handful of genetic models of interaction between host and parasite, which are designed to mimic the genes that underlie molecular interactions between host defense and parasite attack. However, the extent to which these theoretical genotype x genotype (G x G) interactions are realistic, or how they alter coevolution in natural populations, is subject to debate (Dybdahl et al. 2014). In order to understand how host–parasite coevolution can cause evolutionary transitions in natural populations, we need a deeper understanding of the genes and the genetic models that mediate coevolution.

Despite the importance of coevolving genes in evolutionary biology, finding them has been prohibitively difficult. Standard approaches typically focus on either host or parasite but rarely both (Barribeau et al. 2014). One problem is that there could be many loci of small effect that together make up the resistant or infective genotype. If this were the case, then the genomic regions involved in coevolution would be undetectable using traditional genomic measures of selection (Visscher et al. 2012; Lehner 2013). Given that known immune proteins often have multiple loci contributing to each protein, this is not an unreasonable concern (Klein 1986; Kasahara 1999). Additionally, the large number of candidate loci generated in an RNA-seq experiment poses the problem of separating the genomic regions responding to coevolutionary selection from those responding to other factors (Romero et al. 2012). However, the greatest challenge is identifying the specific loci contributing to GxG interactions and driving coevolution in both the host and parasite. Standard approaches find genes in the host (Hacquard et al. 2011) or in the parasite (Barrett et al. 2009; Sperschneider et al. 2015) independently, but not the key combinations of genes in both the host and parasite that work together to determine the outcome of the GxG interactions. Without connecting both sides of the coevolutionary interaction, the outcomes of coevolution in natural populations remain unpredictable at best.

One promising area of research to find genotypes associated with phenotypes is to look for genomic regions that are adapted to their local environment (Coop et al. 2010; Keller et al. 2012). One definition of local adaptation is the covariance between a genotype and its environment, and therefore genotypes that covary with an environmental variable across populations are locally adapted to that variable (Blanquart et al. 2013). This principle can be extended to coevolutionary local adaptation, because parasites within a population are often locally adapted to their hosts due to reciprocal selection (Kawecki and Ebert 2004; Greischar and Koskella 2007). Theoretically, the biotic component of local adaptation has been described as the spatial covariance between host and parasite genotypes, and this theoretical prediction suggests a potentially powerful approach to detecting functionally linked loci (Nuismer and Gandon 2008). Utilizing locally adapted populations, we can look for genomic regions (e.g., SNPs or microsatellites) that spatially covary between the host and parasite, as these will be the regions responding to coevolutionary selection. Thus, established theory suggests scanning the genomes of hosts and parasites for regions that covary across space, which should allow us to isolate the genomic regions involved in coevolution and provides a powerful alternative to conventional approaches.

Here, we formalize and evaluate the power of using spatial covariances to identify coevolving genomic regions. To this end, we addressed the following specific questions: 1) Is the ability to detect the loci undergoing coevolution affected by the genotypic model of infection? 2) Are we able to detect multiple coevolving loci that contribute to the coevolving genotype? and 3) Does the strength of local adaptation alter our ability to detect coevolving genomic regions? Overall, we determined how robust this technique is and developed a powerful new calculation to find the genomic regions involved in host-parasite coevolution.

Overview of Approach

Our Approach

Our technique utilizes spatial covariance to determine which loci are involved in host-parasite coevolution. Because coevolutionary selection is reciprocal, as a host allele for resistance increases in frequency within a population, the corresponding parasite allele for infectivity will also increase in frequency within the same population. Thus, the coevolving alleles will have similar allele frequencies within populations, and will spatially covary across populations (Nuismer and Gandon 2008). We can therefore look for pairs of genomic regions that are significantly spatially correlated, as these will be candidate loci for coevolution.

The reciprocal selection that defines coevolution can generate local adaptation, which ultimately underlies our technique. While an organism can, and will, adapt to both biotic and abiotic conditions, for simplicity we focus here on the GxG interactions that drive biotic local adaptation. We know that this biotic component of local adaptation, $\Delta$, can be expressed as:

\[ \Delta = \sum_{i,j} \alpha_{i,j}\,\mathrm{Cov}(X_i, Y_j) \tag{1} \]

where $\alpha_{i,j}$ is the fitness consequence of interactions between host and parasite genotypes, $X_i$ is the frequency of host genotype $i$, and $Y_j$ is the frequency of parasite genotype $j$ (Gandon and Nuismer 2009). More specifically, this term measures the spatial covariance between host and parasite genotype frequencies. If this covariance is positive for the parasite, then the parasite genotypes are found more frequently than expected with host genotypes that confer the largest fitness benefit to the parasite. Under this condition, the parasite is locally adapted. As a result, in locally adapted populations, there must be genes that covary across space, and our technique is most likely to work in systems that show strong patterns of local adaptation.
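As an illustration of how a fitness-weighted sum of spatial covariances behaves, the following minimal sketch assumes that the local adaptation term is the sum, over host genotype i and parasite genotype j, of a fitness weight times the covariance of their frequencies across populations. The names `alpha`, `host_freqs` and `parasite_freqs`, and the example frequencies, are hypothetical and chosen only for illustration.

```python
from statistics import fmean

def spatial_cov(xs, ys):
    """Covariance of two frequency lists, taken across populations."""
    mx, my = fmean(xs), fmean(ys)
    return fmean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def local_adaptation(alpha, host_freqs, parasite_freqs):
    """Sum of alpha[i][j] * Cov(host genotype i, parasite genotype j)."""
    return sum(
        alpha[i][j] * spatial_cov(host_freqs[i], parasite_freqs[j])
        for i in range(len(host_freqs))
        for j in range(len(parasite_freqs))
    )

# Hypothetical example: two genotypes per species, three populations,
# and a parasite that benefits only when its genotype matches the host's.
alpha = [[1.0, 0.0], [0.0, 1.0]]
host_freqs = [[0.2, 0.5, 0.8], [0.8, 0.5, 0.2]]
parasite_freqs = [[0.2, 0.5, 0.8], [0.8, 0.5, 0.2]]
delta = local_adaptation(alpha, host_freqs, parasite_freqs)
```

In this toy case the matching genotypes rise and fall together across populations, so the weighted covariance is positive, which is the signature of a locally adapted parasite.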

Required Data

To find genomic regions involved in coevolution, we must start with a basic understanding of the segregating genetic variation within both host and parasite populations. Because our technique relies on the covariance between host and parasite genotypes, we need to understand the variance within both host and parasite populations. Therefore, one would need to know the allele frequencies of SNPs or microsatellites for both the host and the parasite in the study system of interest. Additionally, because statistical significance depends on sample size, and here the sample size is the number of populations sampled, our technique relies on sampling many populations. In short, one would need to know the allele frequency of each molecular marker, and therefore the variation, across a number of different populations.

While these two requirements may seem insurmountable, there are already systems for which these data are available or could easily be acquired. For example, in the New Zealand snail Potamopyrgus antipodarum and its trematode parasite Microphallus sp., SNPs have been found within the host and will soon be evaluated in the parasite (Wilton et al. 2012b; Paczesniak et al. 2013). There are dozens of lakes across New Zealand in which the snail and trematode are coevolving, giving us the ability to sample many different populations (Lively et al. 2004c). Another system that would immediately be able to utilize this technique is Daphnia magna and its bacterial pathogen, Pasteuria ramosa (Andras and Ebert 2012). In both of these systems, there are a multitude of populations, and either the variation within host and parasite populations is already understood or the molecular tools are currently being developed.

Method testing

In order to evaluate how well our method might work, we developed genetically explicit individual based simulations and then searched for coevolving genes by identifying loci whose allele frequencies covary across space. In the following sections, we describe the details of our simulations and the method used to identify coevolving genes; we then quantify the performance of our method across a wide range of parameters and genetic models of coevolution.

Individual Based Simulations

We used individual-based simulations to generate populations of hosts and parasites that have coevolved, and used them to test for conditions under which coevolving loci covary. These simulations assume that all hosts and parasites are diploid and carry “genomes” made up of a set number of loci. We assume each locus within our host and parasite genomes is biallelic, and therefore each allele is either 0 or 1. Because each allele is determined randomly, the initial allele frequency is approximately 0.5 for each locus.

We then picked which of the loci across our host and parasite genomes would be coevolving. Loci were picked at random without replacement, such that for each simulation a set number of distinct loci were involved in infection and resistance. All remaining unpicked loci were not involved in the coevolving phenotype or under any other form of selection.
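The initialization described above can be sketched as follows. This is a minimal illustration and not the simulation code used in this study: the function names, the genome size of 20 loci, the population size of 500 and the choice of 3 coevolving loci are all assumptions made for the example.

```python
import random

def make_individual(n_loci, rng):
    """Diploid individual: two haplotypes of 0/1 alleles, each drawn with probability 0.5."""
    return [[rng.randint(0, 1) for _ in range(n_loci)] for _ in range(2)]

def pick_coevolving_loci(n_loci, n_coevolving, rng):
    """Choose distinct locus indices at random, without replacement."""
    return sorted(rng.sample(range(n_loci), n_coevolving))

rng = random.Random(1)          # seeded only so the example is repeatable
hosts = [make_individual(20, rng) for _ in range(500)]
host_loci = pick_coevolving_loci(20, 3, rng)
```

Because every allele is an independent fair draw, the starting frequency at each locus sits near 0.5, and the unpicked loci act as neutral background against which the coevolving loci must be detected.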

Each generation, individuals proceed through a series of life-cycle steps. First, we allowed individual hosts and parasites to encounter one another at random, and the outcome of each encounter was determined by the probability of infection. The probability of infection in turn depends on the model of interaction (Figure 7).

Both discrete and continuous escalation interactions assume that fitness depends on the phenotypic difference between host and parasite. We modeled a realistic scenario in host-parasite interactions where a parasite fails to infect a host if that host has more circulating defense proteins than the parasite has infection proteins. This scenario generates selection for hosts to have increasingly more circulating immune proteins and parasites to have increasingly more infection proteins. To model this kind of escalation scenario, we used the following equation to evaluate the probability of successful infection, $P_{x,y}$, of an individual host (phenotype $= x$) in an encounter with a parasite (phenotype $= y$) (Abrams 2000):

\[ P_{x,y} = \frac{1}{1 + \exp[-\alpha (y - x)]} \tag{2} \]

where $\alpha$ determines the sensitivity of the probability of successful infection to changes in host and parasite phenotypes. Thus, as the host phenotype, $x$, increases, the host becomes more resistant, and as the parasite phenotype, $y$, increases, the parasite becomes more virulent (Figure 7).
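The escalation scenario can be illustrated with a short sketch. It assumes the logistic form commonly used for phenotypic-difference models, in which the probability of infection rises with the parasite phenotype and falls with the host phenotype; the function and argument names are hypothetical.

```python
import math

def p_infect_escalation(x_host, y_parasite, alpha):
    """Probability of infection under escalation (assumed logistic form).

    x_host     -- host phenotype (e.g., amount of defense protein)
    y_parasite -- parasite phenotype (e.g., amount of infection protein)
    alpha      -- sensitivity to the phenotypic difference
    """
    return 1.0 / (1.0 + math.exp(-alpha * (y_parasite - x_host)))

equal = p_infect_escalation(2.0, 2.0, alpha=1.5)           # phenotypes tie
parasite_ahead = p_infect_escalation(2.0, 3.0, alpha=1.5)  # parasite exceeds host
host_ahead = p_infect_escalation(3.0, 2.0, alpha=1.5)      # host exceeds parasite
```

Under this form, an evenly matched pair has an even chance of infection, and either party gains by increasing its own phenotype, which is what drives the escalation.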

A discrete model that causes escalation is the gene-for-gene (GFG) model, where infection is based on the presence of resistance genes in the host and avirulence genes in the pathogen. Empirical studies have demonstrated that virulence alleles are generally recessive and resistance alleles are dominant, so the infection matrix we utilized reflects this (Figure 7).

In the interactions mediated by phenotypic matching, fitness outcomes depend on the similarity of the interacting host and parasite phenotypes. The matching models are largely based on the immune system being able to distinguish self from nonself. When a parasite is able to match the host genotype, the host immune system sees this as “self” and will not mount an immune response, allowing the parasite to infect. Hosts can therefore successfully defend against parasites that are deemed nonself, that is, genotypes that do not match their own. For the continuous form of this interaction, we used the following equation to evaluate the probability of infection, $P_{x,y}$, when an individual host with phenotype $x$ encounters an individual parasite with phenotype $y$ (Abrams 2000):

\[ P_{x,y} = \exp[-\alpha (x - y)^{2}] \tag{3} \]

where $\alpha$ sets how quickly the probability of infection declines as the two phenotypes diverge.
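A minimal sketch of the continuous matching interaction, assuming the Gaussian form in which infection is certain for a perfect match and decays with the squared phenotypic distance. The function and argument names are hypothetical.

```python
import math

def p_infect_matching(x_host, y_parasite, alpha):
    """Probability of infection under phenotypic matching (assumed Gaussian form)."""
    return math.exp(-alpha * (x_host - y_parasite) ** 2)

perfect_match = p_infect_matching(1.0, 1.0, alpha=2.0)
near_miss = p_infect_matching(1.0, 1.5, alpha=2.0)
far_miss = p_infect_matching(1.0, 3.0, alpha=2.0)
```

Unlike the escalation model, this function is symmetric in the two phenotypes, so neither party gains simply by increasing its trait value; what matters is how closely the parasite tracks the host.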

The discrete matching model is simply the matching-alleles model (MAM). We utilized a version of the MAM matrix that assumes the host immune system is codominant (Nuismer and Otto 2004). Similar to natural examples of self-nonself recognition, heterozygous hosts express both alleles and therefore can be infected by both parasite genotypes, while homozygous hosts can be infected only by homozygous parasites (Figure 7).

The outcome of the interaction, either host infection or resistance, determined each individual’s fitness. All hosts and parasites within the population had an initial fitness of 1. If the host was infected, then host fitness ($W_H$) was reduced by a set amount, $s$. If the host was not infected, then the unsuccessful parasite’s fitness ($W_P$) was reduced by the same set amount $s$:

\[ W_H = W_P = 1 - s \tag{4} \]

Individuals were then killed if their fitness fell below a certain threshold.
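The encounter and survival step can be sketched as below. This is an illustration under stated assumptions rather than the code used here: the names `resolve_encounter` and `survives`, the cost `s` and the survival `threshold` are hypothetical, and the threshold is treated as a simple deterministic cutoff.

```python
import random

def resolve_encounter(p_infect, s, rng):
    """Return (host_fitness, parasite_fitness) after one host-parasite encounter.

    Both start at fitness 1; the loser of the encounter pays the cost s.
    """
    infected = rng.random() < p_infect
    host_w = 1.0 - s if infected else 1.0
    parasite_w = 1.0 if infected else 1.0 - s
    return host_w, parasite_w

def survives(fitness, threshold):
    """Individuals whose fitness falls below the threshold are removed."""
    return fitness >= threshold

rng = random.Random(7)
host_w, parasite_w = resolve_encounter(p_infect=0.6, s=0.5, rng=rng)
```

Exactly one of the two individuals pays the cost in each encounter, which is what makes selection on the two species reciprocal.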

The hosts and parasites that successfully survived the interaction were allowed to reproduce. We assumed all individuals within both populations are diploid and sexual. Recombination occurs between chromosomes within individuals at a set rate r. Recombination then leads to the production of a haploid gamete. Haploid gametes of two randomly selected parents unite to form a zygote. These assumptions allowed us to simulate random mating in a diploid sexual population by including both recombination and segregation. Generations were non-overlapping in that adults died after reproduction.
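One way to picture gamete formation is the sketch below. It assumes that loci lie in a fixed order along the chromosome and that a crossover occurs between each pair of adjacent loci with probability r; the text specifies only a set recombination rate, so this mapping and the function names are assumptions made for illustration.

```python
import random

def make_gamete(individual, r, rng):
    """Build a haploid gamete from a diploid parent (two haplotypes of equal length)."""
    current = rng.randint(0, 1)           # segregation: start on a random chromosome
    gamete = []
    for locus in range(len(individual[0])):
        if locus > 0 and rng.random() < r:
            current = 1 - current         # crossover between adjacent loci
        gamete.append(individual[current][locus])
    return gamete

def make_offspring(parent1, parent2, r, rng):
    """Unite one gamete from each of two parents into a diploid zygote."""
    return [make_gamete(parent1, r, rng), make_gamete(parent2, r, rng)]

rng = random.Random(0)
parent = [[0, 0, 0, 0], [1, 1, 1, 1]]
child = make_offspring(parent, parent, r=0.5, rng=rng)
```

With r = 0 a gamete is an intact copy of one parental chromosome, and with r = 0.5 adjacent loci are inherited independently, which spans the range between complete linkage and free recombination.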

After reproduction, all alleles were allowed to mutate at rate µ. Additionally, individuals migrated at a set rate m. Migration is symmetric: if an individual from one population migrates to another population, then a replacement must migrate back to the migrant’s starting population.
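Mutation and symmetric migration can be sketched as follows. The sketch assumes that mutation flips a biallelic allele (0 to 1 or 1 to 0) and that each migrant is exchanged with a randomly chosen individual from a randomly chosen other population; both choices and all names are assumptions made for illustration.

```python
import random

def mutate(haplotype, mu, rng):
    """Flip each 0/1 allele independently with probability mu."""
    return [1 - a if rng.random() < mu else a for a in haplotype]

def migrate_symmetric(populations, m, rng):
    """Swap migrants between populations in place, so every population keeps its size.

    Requires at least two populations.
    """
    for src in range(len(populations)):
        others = [p for p in range(len(populations)) if p != src]
        for idx in range(len(populations[src])):
            if rng.random() < m:
                dst = rng.choice(others)
                jdx = rng.randrange(len(populations[dst]))
                populations[src][idx], populations[dst][jdx] = (
                    populations[dst][jdx],
                    populations[src][idx],
                )

rng = random.Random(3)
pops = [[(name, i) for i in range(5)] for name in ("a", "b", "c")]
before = sorted(ind for pop in pops for ind in pop)
migrate_symmetric(pops, m=0.5, rng=rng)
```

Swapping rather than moving individuals keeps local population sizes constant, so changes in allele frequency reflect gene flow and selection rather than changes in demography.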

Ultimately, the simulations generate populations of hosts and parasites with “genomes” that have been coevolving for 800 generations. We then altered the models of infection and evolutionary parameters to allow us to address what an empiricist might see when they sample natural populations.

Identifying Coevolving Loci

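Our technique utilizes spatial correlations to determine which loci are involved in host-parasite coevolution. We start by calculating the correlation between the allele frequencies for each host and parasite pair of loci. We used Pearson’s correlation coefficient to calculate the spatial correlation:

\[ r = \frac{\sum_{k=1}^{n} (h_k - \bar{h})(p_k - \bar{p})}{\sqrt{\sum_{k=1}^{n} (h_k - \bar{h})^{2}}\,\sqrt{\sum_{k=1}^{n} (p_k - \bar{p})^{2}}} \tag{5} \]

where $h_k$ and $p_k$ are the allele frequencies at the host locus and the parasite locus in population $k$, and $\bar{h}$ and $\bar{p}$ are their means across the $n$ sampled populations. We calculated $r$ for each possible host-parasite locus pair, resulting in a correlation matrix with dimensions equal to the number of host loci by the number of parasite loci.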
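The correlation matrix can be computed with a few lines of code. In this minimal sketch, `host_freqs[a]` and `parasite_freqs[b]` are hypothetical lists holding the allele frequency of one locus in each sampled population; returning 0 for a locus with no spatial variation is an assumption made here to avoid division by zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length lists of allele frequencies."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a locus fixed everywhere carries no spatial signal
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(host_freqs, parasite_freqs):
    """Rows are host loci, columns are parasite loci."""
    return [[pearson_r(h, p) for p in parasite_freqs] for h in host_freqs]

# Hypothetical frequencies for two host loci and two parasite loci in three populations.
host_freqs = [[0.1, 0.5, 0.9], [0.5, 0.5, 0.5]]
parasite_freqs = [[0.2, 0.6, 1.0], [0.9, 0.5, 0.1]]
r_matrix = correlation_matrix(host_freqs, parasite_freqs)
```

Each entry of the matrix is one candidate host-parasite locus pair, so a genome scan with many markers in each species produces a large number of pairwise tests.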

To determine significance, which indicates the loci are coevolving, we transformed the r value for each host and parasite locus pair into its corresponding t-value using the following:

\[ t = r \sqrt{\frac{n - 2}{1 - r^{2}}} \tag{6} \]

where $n$ is the number of populations sampled. We then determined whether each locus combination was significantly correlated by defining a significance threshold. We determined the t-value for which p=0.05 given the number of populations sampled, which sets the degrees of freedom. All t-values whose magnitude exceeds this threshold are significant, and therefore those loci are significantly correlated.
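The significance test can be sketched as below. The sketch assumes the standard transformation of a Pearson correlation to a t statistic with n - 2 degrees of freedom, where n is the number of populations sampled, and takes the critical value as an input rather than computing it. The value 2.306 used in the example is the standard two-tailed 5% critical value for 8 degrees of freedom (10 populations); the function names are hypothetical.

```python
import math

def r_to_t(r, n_pops):
    """Convert a correlation across n_pops populations to a t statistic."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n_pops - 2) / (1.0 - r * r))

def is_significant(r, n_pops, t_crit):
    """A locus pair is significant when |t| exceeds the critical value."""
    return abs(r_to_t(r, n_pops)) > t_crit

T_CRIT_DF8 = 2.306  # two-tailed, p = 0.05, 8 degrees of freedom

strong = is_significant(0.8, n_pops=10, t_crit=T_CRIT_DF8)
weak = is_significant(0.5, n_pops=10, t_crit=T_CRIT_DF8)
```

With 10 populations a correlation of 0.8 passes the threshold while a correlation of 0.5 does not, which illustrates why the number of populations sampled governs the power of the scan.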

At the end of each simulation, we calculated the spatial correlation for each host-parasite locus pair. The calculated spatial correlations were then used to determine the efficacy of our technique across different models of infection and parameters. We defined efficacy as the rate of false negatives. When two loci involved in coevolution are not significantly correlated, we refer to this as a false negative (type II error). The type II error rate was calculated per locus at the end of each simulation by counting the number of times a coevolving locus was not significant, then dividing that count by the number of coevolving loci within that run of the simulation:

\[ \text{Type II error rate} = \frac{\text{number of non-significant coevolving loci}}{\text{number of coevolving loci}} \tag{7} \]

To account for stochasticity within the simulation, we then averaged over 100 simulations with the same parameters to determine the type II error rate under any given set of parameters. We consider the technique successful when the type II error rate is low.
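The error-rate calculation can be sketched as follows. The sketch assumes that each coevolving host locus is paired with one coevolving parasite locus and that `significant[h][p]` records whether host locus h and parasite locus p were significantly correlated; this pairing and all names are assumptions made for illustration.

```python
def type_ii_error_rate(significant, coevolving_pairs):
    """Fraction of truly coevolving locus pairs that were not detected."""
    missed = sum(1 for h, p in coevolving_pairs if not significant[h][p])
    return missed / len(coevolving_pairs)

def mean_error_rate(rates):
    """Average the per-run error rates over replicate simulations."""
    return sum(rates) / len(rates)

# Hypothetical run: two coevolving pairs, only the first one detected.
significant = [[True, False], [False, False]]
run_rate = type_ii_error_rate(significant, [(0, 0), (1, 1)])
overall = mean_error_rate([run_rate, 0.0, 1.0, 0.5])
```

A rate of 0 means every coevolving pair was recovered and a rate of 1 means none was, so averaging over replicate runs gives the expected miss rate for a given parameter set.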

Performance of method

To determine how robust our technique is in different populations, we quantified the parameters that could alter the spatial covariance between all host and parasite loci. For all infection models, simulations were run to quantify the effect of selection intensity, mutation rate, number of coevolving loci, number of loci in each genome, recombination rate, population size, and migration rate on the ability to detect coevolving loci.

A number of these parameters explicitly alter the amount of local adaptation across populations. As a result, we also measured the magnitude of local adaptation under each set of parameters. We used the sympatric vs. allopatric measure of local adaptation, $\Delta_{SA}$ (Blanquart et al. 2013), calculated as the average infection rate within each population, $\bar{I}_{S}$, minus the global infection rate, $\bar{I}_{G}$:

\[ \Delta_{SA} = \bar{I}_{S} - \bar{I}_{G} \tag{8} \]

To this end, we simulated what would happen in a laboratory full-factorial cross-inoculation experiment using the populations at the end of each simulation. This quantification of local adaptation allowed us to examine explicitly which parameters affect local adaptation and which do not but may still alter the ability to detect coevolving loci (type II error rate).
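A cross-inoculation summary of this kind can be sketched as below. The sketch assumes a square matrix in which `infection[i][j]` is the infection rate of parasites from population i on hosts from population j, so that the diagonal holds the sympatric (within-population) combinations; the matrix layout, names and example values are hypothetical.

```python
def local_adaptation_sa(infection):
    """Mean sympatric infection rate minus the mean over all combinations."""
    k = len(infection)
    sympatric = sum(infection[i][i] for i in range(k)) / k
    overall = sum(sum(row) for row in infection) / (k * k)
    return sympatric - overall

# Hypothetical two-population experiment: parasites do best on their own hosts.
adapted = local_adaptation_sa([[0.9, 0.1], [0.1, 0.9]])
# No pattern: every combination has the same infection rate.
unadapted = local_adaptation_sa([[0.5, 0.5], [0.5, 0.5]])
```

A positive value indicates that parasites infect their own hosts more often than they infect hosts in general, and a value of zero indicates no local adaptation.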

For both escalation models, the discrete genetic interaction (GFG model) and the continuous genetic interaction, we failed to find the loci involved in coevolution. That is, the type II error rate was consistently at or close to 1, demonstrating that the coevolving loci do not significantly covary (Table 1).

Under both matching models of infection, the parameters we tested had a similar impact (Table 10). The number of populations sampled, selection intensity, migration, number of coevolving loci, and number of individuals within each population all affect the type II error rate. The remaining parameters (recombination, mutation, and genome size) do not alter our ability to detect the genomic regions involved in coevolution.

In both matching models, the amount of local adaptation was determined largely by the amount of selection vs. migration. Additionally, across all the parameters we tested, there was a strong negative correlation between type II error rate and the amount of local adaptation (p<0.001), such that as local adaptation increased, the ability to detect coevolving loci increased.

Additionally, across all parameters measured, the type II error rate decreased as the number of populations increased (Figure 8).

When compared to the continuous matching model, the same patterns held true for all parameters in the discrete matching allele model, with one important exception (Table 10). When more than one locus is coevolving, the amount of local adaptation plummets and the type II error rate is elevated. This likely results from our assumption that all coevolving loci must match for a host to be infected. Thus, if any of the host loci fix while the parasite loci are lost or vice versa, then all the remaining coevolving loci are not under selection because no infection can occur. We relaxed the assumption of absolute epistasis, and instead measured our ability to detect coevolving loci when the probability of infection, $P$, depends on the number of loci that do not match, $n$, and the amount of epistasis, $\epsilon$:

$$P = (1 - \epsilon)^{n} \qquad (9)$$

We found that when epistasis is low, we are again able to consistently detect the loci involved in coevolution when more than one locus is coevolving. However, when epistasis is high, we are unable to find genomic regions involved in coevolution, similar to the discrete matching model with absolute epistasis.
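The relaxed infection rule can be illustrated with a short sketch. It assumes that the probability of infection falls off as (1 − epistasis) raised to the number of mismatched loci, which is one reading of Equation 9 that reproduces the absolute-epistasis case; the function name and the allele encoding are hypothetical.

```python
def infection_probability(host_alleles, parasite_alleles, epistasis):
    """Probability of infection given the mismatches between host and parasite loci.

    Assumed form: (1 - epistasis) ** (number of mismatched loci).
    With epistasis = 1 a single mismatch blocks infection entirely.
    """
    if len(host_alleles) != len(parasite_alleles):
        raise ValueError("host and parasite must have the same number of loci")
    if not 0.0 <= epistasis <= 1.0:
        raise ValueError("epistasis must lie between 0 and 1")
    mismatches = sum(h != p for h, p in zip(host_alleles, parasite_alleles))
    return (1.0 - epistasis) ** mismatches


host = ["A", "b", "C", "d", "E"]
parasite = ["A", "b", "c", "D", "E"]  # two of five loci do not match

print(infection_probability(host, parasite, epistasis=0.1))  # low epistasis
print(infection_probability(host, parasite, epistasis=1.0))  # absolute epistasis
```

Under this form, low epistasis leaves every coevolving locus with some effect on infection even after another locus has fixed, which is consistent with selection continuing to act on the remaining loci.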

We have determined that we can consistently find the genomic regions involved in coevolution when there is sufficient local adaptation and enough populations are sampled. Thus, rather than look across all evolutionary parameters that we tested, which may or may not be known in each population, we instead focused on the relationship between local adaptation and number of populations and their effect on type II error rate. As statistical power increases (that is, as more populations are sampled), less local adaptation is needed to detect spatially correlated coevolving loci. Thus, if empiricists know the magnitude of local adaptation, they can determine how many populations need to be sampled to consistently detect the genomic regions involved in coevolution (Figure 9).
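The kind of lookup an empiricist might perform can be sketched as below. The grid values are invented placeholders chosen only to make the example run; they are not the simulation results reported here, and the 0.05 error threshold is likewise an assumption.

```python
def min_populations(error_by_design, local_adaptation, max_error=0.05):
    """Smallest number of sampled populations whose type II error is acceptable.

    error_by_design maps (local_adaptation, n_populations) to a type II error rate.
    Returns None when no sampled design meets the threshold.
    """
    acceptable = sorted(
        n_pops
        for (la, n_pops), error in error_by_design.items()
        if la == local_adaptation and error <= max_error
    )
    return acceptable[0] if acceptable else None


# PLACEHOLDER grid (hypothetical numbers, not results from this study).
toy_grid = {
    (0.05, 3): 0.6, (0.05, 10): 0.3, (0.05, 30): 0.0,
    (0.15, 3): 0.6, (0.15, 10): 0.0, (0.15, 30): 0.0,
}

print(min_populations(toy_grid, 0.05))
print(min_populations(toy_grid, 0.15))
```

With these placeholder values, weaker local adaptation requires more sampled populations before the error threshold is met, which is the qualitative pattern described above.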

Discussion

We developed a novel technique to find the genomic regions involved in host-parasite coevolution based on finding locally adapted regions. We found that coevolving loci have a statistically significant spatial correlation, thus allowing us to filter those loci that are likely coevolving from those that are not. Specifically, if the coevolving loci are locally adapted, as occurs in matching models of infection, they will spatially covary, allowing our technique to consistently detect these loci. Additionally, the ability to consistently detect the loci involved in coevolution increases with the number of populations sampled; that is, the type II error rate is inversely related to the number of populations. Therefore, for this technique to work consistently, one would need to sample many locally adapted populations.
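The idea of screening for spatially covarying loci can be illustrated with a minimal sketch. It assumes that each locus is summarized by one allele frequency per population and that covariation is scored with a Pearson correlation across populations; the statistic, the locus names, and the frequencies are illustrative assumptions and not the exact test used in this chapter.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        return 0.0  # a fixed allele carries no spatial signal
    return cov / math.sqrt(ss_x * ss_y)


def rank_locus_pairs(host_freqs, parasite_freqs):
    """Rank every host-parasite locus pair by absolute correlation across populations."""
    scores = [
        (abs(pearson(h), ) if False else abs(pearson(h, p)), h_name, p_name)
        for h_name, h in host_freqs.items()
        for p_name, p in parasite_freqs.items()
    ]
    return sorted(scores, reverse=True)


# Hypothetical allele frequencies in five populations.
host_freqs = {
    "host_locus_1": [0.1, 0.3, 0.5, 0.7, 0.9],
    "host_locus_2": [0.5, 0.5, 0.4, 0.6, 0.5],
}
parasite_freqs = {
    "parasite_locus_1": [0.15, 0.25, 0.55, 0.65, 0.95],
    "parasite_locus_2": [0.5, 0.4, 0.6, 0.5, 0.5],
}

ranked = rank_locus_pairs(host_freqs, parasite_freqs)
for score, host_locus, parasite_locus in ranked:
    print(f"{host_locus} x {parasite_locus}: |r| = {score:.3f}")
```

A real analysis would add a significance test and a correction for the number of locus pairs compared, both of which are omitted here.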

While our technique was tested using simulated populations of hosts and parasites, the results identify a subset of conditions that are required for our technique to be applied to natural populations. First, the populations examined need to be locally adapted to the biotic environment. While we can imagine that most empirical researchers do not know the amount of migration or selection between each of their populations, a quantification of local adaptation is a common step in analyzing the relationship between hosts and parasites across populations.

Additionally, the efficacy of our technique is determined by both local adaptation and the number of populations sampled. Therefore, if the populations being studied have lower amounts of local adaptation, then more populations will need to be sampled to accurately detect the genomic regions involved in coevolution.

Unlike the matching models of infection, we found that under both discrete and continuous escalation models, the coevolving loci are not significantly correlated under any of the parameters we examined. This is not surprising, given that escalation models do not result in local adaptation. Because our technique relies at least in part on local adaptation of parasites to their hosts, it is clear why coevolving loci in escalation models do not spatially covary.

One potential caveat of this technique is that all alleles that are undergoing coevolution have to be sequenced in a number of individuals. Other genomic techniques for linking phenotypes to genotypes, such as genome-wide association studies (GWAS), often find the genomic region underlying a trait under selection by relying on linked regions that carry a signature of selection. Thus, it is often the case that the regions surrounding the genomic region linked to a selected trait are sufficient to find the genomic region of interest. This is not the case for our technique, as it relies on multiple populations all undergoing local adaptation. Linkage and recombination are slightly different within each population, so the signature of selection, or covariance, of genomic regions surrounding our region of interest varies among populations. In order to detect genomic regions under coevolution using correlation, it is necessary both to have all genomic regions sequenced and to have enough depth at each locus to detect variation. This requires potentially costly deep sequencing, and sequencing techniques such as RAD sequencing will be ineffective. But as the cost of sequencing decreases and the number of available genomes increases, this hurdle may become minimal over time.

Another potential barrier to using correlation to detect coevolving genomic regions is interference from genomic regions under selection from the abiotic environment. There are two ways this could be problematic. First, when the host and parasite populations are both adapting to the same abiotic environment, the loci involved in that adaptation will exhibit a covariance similar to the one we have outlined here. Second, if a genomic region in one species is under strong positive or negative selection that is concordant with one of the coevolutionary cycles, it will also appear to have a significant correlation, despite not being involved in the coevolutionary interaction. However, both of these issues would depend on all the populations within a study adapting to the same environmental variable, as our technique relies on spatial covariance. It seems unlikely that a single environmental variable will be a consistent problem across many populations, but more testing of this technique is needed to determine how problematic abiotic environmental influence may be.

Finally, there are some promising areas of development that are made possible by our current research. The most obvious follow-up to this work is testing it empirically. As the number of species for which genomic material is available across populations increases, this will likely become possible in the near future. One potential system is agricultural plants, where genomic data are available for both hosts and parasites. With a newfound ability to detect genomic regions undergoing coevolution, this work may become empirically possible.

Importantly, this gives us a tool to find the genomic regions involved in coevolution. This technique can be used across different species and taxa to determine the genomic regions by which hosts are resistant to infection and parasites are able to infect. Once these genomic regions are determined, a whole world of understanding opens up for both theoretical and empirical coevolution. Theoretically, we can better inform the models of coevolution to determine which theoretical predictions are likely representative of what is happening in natural populations and which models are unrealistically simplified. Empirically, we will be better able to detect the strength and variation in selection due to coevolution, and confirm the theoretical predictions produced. Overall, it will enable the field of coevolution to measurably move forward.

[Figure 6: four panels arranged in rows Escalation and Matching and columns Discrete Traits and Continuous Traits. Discrete panels: matrices of Host Genotypes (AA, Aa, aa) by Parasite Genotypes (AA, Aa, aa) with entries I or R. Continuous panels: Fitness of Host (0.00 to 1.00) plotted against Host Phenotype − Parasite Phenotype (−1.0 to 1.0).]

Figure 6 - The four different genetic interactions that were tested. The interactions can be broadly grouped into two categories: escalation and matching, continuous and discrete. The first category addresses how phenotypes will interact with each other, either based on the host matching the immune system, matching, or an arms race dynamic, escalation. The second category refers to how the infection phenotype is translated from the underlying genotype, either in an additive polygenic fashion, continuous, or with each infection locus pair explicitly interacting, discrete.

[Figure 7: two panels, Continuous Matching and Discrete Matching.]

Figure 7 - The evolutionary conditions under which the spatial covariance is strong enough to be detected are those that have an impact on local adaptation. We examined the relationship between local adaptation and type II error rate, and found a significant negative relationship. To address the sampling conditions an empiricist might need to find coevolving genomic regions, we found a significant negative relationship between number of populations sampled and type II error rate. Therefore, in order to find coevolving genomic regions, the population samples must be locally adapted and have many populations sampled.

[Figure 8: grid of heat maps. Columns: 1 Locus Coevolving, 5 Loci Coevolving. Rows: Continuous Matching, Low Epistasis Discrete, High Epistasis Discrete, Discrete Matching. x-axis: Number of Populations (3, 5, 10, 15, 20, 25, 30); y-axis: Local Adaptation (0 to 0.35); cell shading: Type II error (0.00 to 1.00).]

Figure 8 - Under four different models of interaction based on matching, one can determine, for each given amount of local adaptation, how many populations need to be sampled to have a low type II error rate.

Table 10 - A summary of the evolutionary conditions needed to detect spatially covarying loci, across all eight parameters and four genetic interactions.

Evolutionary Conditions Tested

Genetic Interaction     Migration  Selection Intensity  Population Size  Recombination  Mutation   Coevolving Loci  Genome Size  Populations Sampled
Continuous Matching     Low        High                 Large            No effect      No effect  Few (1-~15)      No effect    Many
Discrete Matching       Low        High                 Large            No effect      No effect  1                No effect    Many
Continuous Escalation   Never      Never                Never            Never          Never      Never            Never        Never
Discrete Escalation     Never      Never                Never            Never          Never      Never            Never        Never

Literature Cited

Adams, K. L., and J. F. Wendel. 2005. Novel patterns of gene expression in polyploid plants. Trends Genet. 21:539–543. Elsevier.

Agrawal, A. F. 2009. Differences between selection on sex versus recombination in red queen models with diploid hosts. Evolution 63:2131–2141.

Agrawal, A. F., and C. M. Lively. 2001. Parasites and the evolution of self-fertilization. Evolution 55:869–879.

Agrawal, A. F., and S. P. Otto. 2006. Host-parasite coevolution and selection on sex through the effects of segregation. Am Nat 168:617–629.

Agrawal, A., and C. M. Lively. n.d. Infection genetics: gene-for-gene versus matching-alleles models and all points in between. Evolutionary Ecology Research 4:79–90.

Andras, J. P., and D. Ebert. 2012. A novel approach to parasite population genetics: Experimental infection reveals geographic differentiation, recombination and host-mediated population structure in Pasteuria ramosa, a bacterial parasite of Daphnia. Molecular Ecology 22:972–986.

Arvanitis, L., C. Wiklund, and J. Ehrlén. 2007. Butterfly seed predation: effects of landscape characteristics, plant ploidy level and population structure. Oecologia 152:275–285.

Baack, E. J. 2005. Ecological factors influencing tetraploid establishment in snow buttercups (Ranunculus adoneus, Ranunculaceae): minority cytotype exclusion and barriers to triploid formation. American Journal of Botany 92:1827–1835.

Barrett, L. G., P. H. Thrall, P. N. Dodds, M. van der Merwe, C. C. Linde, G. J. Lawrence, and J. J. Burdon. 2009. Diversity and evolution of effector loci in natural populations of the plant pathogen Melampsora lini. Mol. Biol. Evol. 26:2499–2513. Oxford University Press.

Barribeau, S. M., B. M. Sadd, L. du Plessis, and P. Schmid-Hempel. 2014. Gene expression differences underlying genotype-by-genotype specificity in a host–parasite system. Proceedings of the National Academy of Sciences 111:3496–3501.

Lehner, B. 2013. Genotype to phenotype: lessons from model organisms for human genetics. Nat. Rev. Genet. 14:168–178. Nature Publishing Group.

Beukeboom, L. W., T. F. Sharbel, and N. K. Michiels. 1998. Reproductive modes, ploidy distribution, and supernumerary chromosome frequencies of the Polycelis nigra (Platyhelminthes: Tricladida). Hydrobiologia 383:277–285. Kluwer Academic Publishers.

Birchler, J. A. 2012. Genetic Consequences of Polyploidy in Plants. Pp. 21–32 in P. S. Soltis and D. E. Soltis, eds. Polyploidy and Genome Evolution. Springer Berlin Heidelberg, Berlin, Heidelberg.

Blanquart, F., O. Kaltz, S. L. Nuismer, and S. Gandon. 2013. A practical guide to measuring local adaptation. Ecol Letters 16:1195–1205.

Blomström, A.-L., Q. Gu, G. Barry, G. Wilkie, J. K. Skelton, M. Baird, M. McFarlane, E. Schnettler, R. M. Elliott, M. Palmarini, and A. Kohl. 2015. Transcriptome analysis reveals the host response to Schmallenberg virus in bovine cells and antagonistic effects of the NSs protein. BMC Genomics 16:324. BioMed Central.

Burdon, J. J., and D. R. Marshall. 1981. Inter- and Intra-Specific Diversity in the Disease-Response of Glycine Species to the Leaf-Rust Fungus Phakopsora Pachyrhizi. Journal of Ecology 69:381–390. British Ecological Society.

Burmeister, A., R. Lenski, and J. Meyer. 2015. Selection for Intermediate Genotypes Enables a Key Innovation in Phage Lambda. bioRxiv, doi: 10.1101/018606.

Busey, P., R. M. Giblin-Davis, and B. J. Center. 1993. Resistance in Stenotaphrum to the Sting Nematode. Crop Science 33:1066–1070.

Camacho, C., G. Coulouris, V. Avagyan, N. Ma, J. Papadopoulos, K. Bealer, and T. L. Madden. 2009. BLAST+: architecture and applications. BMC Bioinformatics 10:421–9. BioMed Central.

Chapalamadugu, K. C., C. A. VandeVoort, M. L. Settles, B. D. Robison, and G. K. Murdoch. 2014. Maternal Bisphenol A Exposure Impacts the Fetal Heart Transcriptome. PLoS ONE 9:e89096–9. Public Library of Science.

Chen, Z. J. 2007. Genetic and epigenetic mechanisms for gene expression and phenotypic variation in plant polyploids. Annual Review of Plant Biology 58:377–406.

Chevreux, B., T. Wetter, and S. Suhai. 1999. Genome Sequence Assembly Using Trace Signals and Additional Sequence Information. German conference on bioinformatics.

Ching, B., S. Jamieson, J. W. Heath, D. D. Heath, and A. Hubberstey. 2009. Transcriptional differences between triploid and diploid Chinook salmon (Oncorhynchus tshawytscha) during live Vibrio anguillarum challenge. Heredity 104:224–234. Nature Publishing Group.

Choleva, L., and K. Janko. 2013. Rise and Persistence of Animal Polyploidy: Evolutionary Constraints and Potential. Cytogenet Genome Res 140:151–170.

Comai, L. 2005. The advantages and disadvantages of being polyploid. Nat. Rev. Genet. 6:836–846.

The Schistosoma japonicum Genome Sequencing and Functional Analysis Consortium. 2010. The Schistosoma japonicum genome reveals features of host-parasite interplay. Nature 460:345–351. Nature Publishing Group.

Coop, G., D. Witonsky, A. Di Rienzo, and J. K. Pritchard. 2010. Using Environmental Correlations to Identify Loci Underlying Local Adaptation. Genetics 185:1411–1423.

D'Souza, T. G., M. Storhas, and N. K. Michiels. 2005. The effect of ploidy level on fitness in parthenogenetic flatworms. Biological Journal of the Linnean Society 85:191–198.

Dausset, J. 1981. The major histocompatibility complex in man. Science 213:1469–1474.

Duchemin, M. B., M. Fournier, and M. Auffret. 2007. Seasonal variations of immune parameters in diploid and triploid Pacific oysters, Crassostrea gigas (Thunberg). Aquaculture 264:73–81.

Dy, R. L., C. Richter, G. P. C. Salmond, and P. C. Fineran. 2014. Remarkable Mechanisms in Microbes to Resist Phage Infections. Annu Rev Virol 1:307–331. Annual Reviews.

Dybdahl, M. F., and C. M. Lively. 1995a. Diverse, endemic and polyphyletic clones in mixed populations of a freshwater snail (Potamopyrgus antipodarum). Journal Of Evolutionary Biology 8:385–398. Wiley Online Library.

Dybdahl, M. F., and C. M. Lively. 1998. Host-parasite coevolution: evidence for rare advantage and time-lagged selection in a natural population. Evolution 1057–1066. JSTOR.

Dybdahl, M. F., and C. M. Lively. 1995b. Host-parasite interactions: infection of common clones in natural populations of a freshwater snail (Potamopyrgus antipodarum). Proc. Biol. Sci. 260:99–103. The Royal Society.

Dybdahl, M. F., and C. M. Lively. 1996. The geography of coevolution: comparative population structures for a snail and its trematode parasite. Evolution 50:2264–2275. JSTOR.

Dybdahl, M. F., and D. M. Drown. 2010. The absence of genotypic diversity in a successful parthenogenetic invader. Biol Invasions 13:1663–1672.

Dybdahl, M. F., C. E. Jenkins, and S. L. Nuismer. 2014a. Identifying the Molecular Basis of Host-Parasite Coevolution: Merging Models and Mechanisms. Am Nat 184:1–13.

Dybdahl, M. F., C. E. Jenkins, and S. L. Nuismer. 2014b. Identifying the Molecular Basis of Host-Parasite Coevolution: Merging Models and Mechanisms. Am Nat 184:1–13.

Dybdahl, M. F., J. Jokela, L. F. Delph, B. Koskella, and C. M. Lively. 2008. Hybrid Fitness in a Locally Adapted Parasite. Am Nat 172:772–782.

Flor, H. H. 1971. Current status of the gene-for-gene concept.

Flor, H. H. 1956. The complementary genic systems in flax and flax rust. Adv. Genet.

Foth, B. J., I. J. Tsai, A. J. Reid, A. J. Bancroft, S. Nichol, A. Tracey, N. Holroyd, J. A. Cotton, E. J. Stanley, M. Zarowiecki, J. Z. Liu, T. Huckvale, P. J. Cooper, R. K. Grencis, and M. Berriman. 2014. Whipworm genome and dual-species transcriptome analyses provide molecular insights into an intimate host-parasite interaction. Nature Genetics 46:693–700. Nature Publishing Group.

Frank, S. A. 1991. Ecological and genetic models of host-pathogen coevolution. Heredity 67:73–83.

Frank, S. A. 2000. Polymorphism of attack and defense. Trends in Ecology & Evolution 15:167–171.

Frank, S. A. 1996. Problems inferring the specificity of plant-pathogen genetics. Evol Ecol 10:323–325.

Frank, S. A. 1994. Recognition and polymorphism in host-parasite genetics. Philos. Trans. R. Soc. Lond., B, Biol. Sci. 346:283–293.

Galaktionov, K. V., I. Blasco-Costa, and P. D. Olson. 2012. Life cycles, molecular phylogeny and historical biogeography of the “pygmaeus” microphallids (Digenea: Microphallidae): widespread parasites of marine and coastal birds in the Holarctic. Parasitology 139:1346–1360.

Gandon, S., and S. L. Nuismer. 2009. Interactions between Genetic Drift, Gene Flow, and Selection Mosaics Drive Parasite Local Adaptation. Am Nat 173:212–224.

Gasic, K., A. Hernandez, and S. S. Korban. 2004. RNA extraction from different apple tissues rich in polyphenols and polysaccharides for cDNA library construction. Plant Mol Biol Rep 22:437–438. Springer-Verlag.

Gasnier, N., D. Rondelaud, M. Abrous, and F. Carreras. 2000a. Allopatric combination of Fasciola hepatica and Lymnaea truncatula is more efficient than sympatric ones. International journal for … 30:573–578.

Gasnier, N., D. Rondelaud, M. Abrous, F. Carreras, C. Boulard, and P. Diez-Banos. 2000b. Allopatric combination of Fasciola hepatica and Lymnaea truncatula is more efficient than sympatric ones. International journal for … 30:573–578.

Gottula, J., R. Lewis, S. Saito, and M. Fuchs. 2014. Allopolyploidy and the evolution of plant virus resistance. BMC Evol. Biol. 14.

Greischar, M. A., and B. Koskella. 2007. A synthesis of experimental work on parasite local adaptation. Ecol Letters 10:418–434. Blackwell Publishing Ltd.

Grosberg, R. K., and M. W. Hart. 2000. Mate selection and the evolution of highly polymorphic self/nonself recognition genes. Science 289:2111–2114.

Guégan, J.-F., and S. Morand. 1996. Polyploid Hosts: Strange Attractors for Parasites? Oikos 77:366–370. Wiley.

Hacquard, S., B. Petre, P. Frey, A. Hecker, N. Rouhier, and S. Duplessis. 2011. The Poplar-Poplar Rust Interaction: Insights from Genomics and Transcriptomics. Journal of Pathogens 2011:1–11.

Halverson, K., S. B. Heard, J. D. Nason, and J. O. Stireman III. 2007. Differential attack on diploid, tetraploid, and hexaploid Solidago altissima L. by five insect gallmakers. Oecologia 154:755–761.

Hamilton, W. D. 1980. Sex Versus Non-Sex Versus Parasite. Oikos 35:282–290.

Hamilton, W. D., R. Axelrod, and R. Tanese. 1990. Sexual reproduction as an adaptation to resist parasites (a review). Proceedings of the National Academy of Sciences of the United States of America 87:3566–3573.

Han, Z. G., P. J. Brindley, S. Y. Wang, and Z. Chen. 2009. Schistosoma genomics: new perspectives on schistosome biology and host-parasite interaction. Annual Reviews.

Hechinger, R. F. 2012. Faunal survey and identification key for the trematodes (Platyhelminthes: Digenea) infecting Potamopyrgus antipodarum (Gastropoda: Hydrobiidae) as first …. Zootaxa.

Howe, K. L., B. J. Bolt, S. Cain, J. Chan, W. J. Chen, P. Davis, J. Done, T. Down, S. Gao, C. Grove, T. W. Harris, R. Kishore, R. Lee, J. Lomax, Y. Li, H.-M. Muller, C. Nakamura, P. Nuin, M. Paulini, D. Raciti, G. Schindelman, E. Stanley, M. A. Tuli, K. Van Auken, D. Wang, X. Wang, G. Williams, A. Wright, K. Yook, M. Berriman, P. Kersey, T. Schedl, L. Stein, and P. W. Sternberg. 2016. WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res. 44:D774–D780.

Hufton, A. L., and G. Panopoulou. 2009. Polyploidy and genome restructuring: a variety of outcomes. Current Opinion in Genetics & Development 19:600–606.

Hughes, A. L., and M. Yeager. 2003. Natural selection at major histocompatibility complex loci of vertebrates. Annual Review of Genetics 32:415–435. Annual Reviews.

Hurtrez-Bousses, S., C. Meunier, P. Durand, and F. Renaud. 2001. Dynamics of host-parasite interactions: the example of population biology of the liver fluke (Fasciola hepatica). Microbes and Infection 3:841–849.

Husband, B. C. 2000. Constraints on Polyploid Evolution: A Test of the Minority Cytotype Exclusion Principle. Proceedings: Biological Sciences 267:217–223. The Royal Society.

Jokela, J., and C. M. Lively. 1995. Spatial variation in infection by digenetic trematodes in a population of freshwater snails (Potamopyrgus antipodarum). Oecologia 103:509–517. Springer.

Kakui, K. 2011. A novel transmission pathway: first report of a larval trematode in a tanaidacean crustacean. Bioinformatics 85:553–566. Paleontological Society.

Kao, R. H. 2008. Implications of polyploidy in the host plant of a dipteran seed parasite. Western North American Naturalist, doi: 10.3398/1527-0904(2008)68%5B225:IOPITH%5D2.0.CO;2.

Kasahara, M. 1999. The chromosomal duplication model of the major histocompatibility complex. Immunol. Rev. 167:17–32. Blackwell Publishing Ltd.

Katoh, K., and D. M. Standley. 2013. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Molecular Biology And Evolution 30:772–780.

Kawecki, T. J., and D. Ebert. 2004. Conceptual issues in local adaptation. Ecology Letters 7:1225–1241. Blackwell Science Ltd.

Keller, S. R., N. Levsen, M. S. Olson, and P. Tiffin. 2012. Local adaptation in the flowering-time gene network of balsam poplar, Populus balsamifera L. Mol. Biol. Evol. 29:3143–3152. Oxford University Press.

King, K. C., and C. M. Lively. 2009. Geographic variation in sterilizing parasite species and the Red Queen. Oikos.

King, K. C., J. Jokela, and C. M. Lively. 2011a. Parasites, sex, and clonal diversity in natural snail populations. Evolution 65:1474–1481.

King, K. C., J. Jokela, and C. M. Lively. 2011b. Trematode parasites infect or die in snail hosts. Biology Letters 7:265–268.

King, K. C., O. Seppala, and M. Neiman. 2012. Is more better? Polyploidy and parasite resistance. Biology Letters 8:598–600.

Klein, J. 1986. Natural history of the major histocompatibility complex. Wiley, New York.

Koskella, B., and C. M. Lively. 2007. Advice of the Rose: Experimental Coevolution of a Trematode Parasite and Its Snail Host. Evolution 61:152–159.

Koskella, B., and C. M. Lively. 2009. Evidence for negative frequency-dependent selection during experimental coevolution of a freshwater snail and a sterilizing trematode. Evolution 63:2213–2221.

Kudlai, O., S. C. Cutmore, and T. H. Cribb. 2015. Morphological and molecular data for three species of the Microphallidae (Trematoda: Digenea) in Australia, including the first descriptions of the cercariae of Maritrema brevisacciferum Shimazu et Pearson, 1991 and Microphallus minutus Johnston, 1948. Folia Parasitologica 62:1–13.

Labrie, S. J., J. E. Samson, and S. Moineau. 2010. Bacteriophage resistance mechanisms. Nature Reviews Microbiology 8:317–327. Nature Publishing Group.

Langmead, B., and S. L. Salzberg. 2012. Fast gapped-read alignment with Bowtie 2. Nat Meth 9:357–359.

Langston, A. L., R. Johnstone, and A. E. Ellis. 2001. The kinetics of the hypoferraemic response and changes in levels of alternative complement activity in diploid and triploid Atlantic salmon, following injection of lipopolysaccharide. Fish & Shellfish Immunology 11:333–345.

Levin, D. A. 1975. Minority Cytotype Exclusion in Local Plant Populations. Taxon 24:35–43. International Association for Plant Taxonomy (IAPT).

Levin, D. A. 1983. Polyploidy and Novelty in Flowering Plants. Am Nat 122:1–25. The American Society of Naturalists.

Li, H., and R. Durbin. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.

Li, H., B. Handsaker, A. Wysoker, T. Fennell, J. Ruan, N. Homer, G. Marth, G. Abecasis, R. Durbin, and 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079.

Lively, C. M. 1989. Adaptation by a parasitic trematode to local populations of its snail host. Evolution 43:1663–1671. JSTOR.

Lively, C. M. 1987. Evidence from a New Zealand snail for the maintenance of sex by parasitism. Nature 328:519–521. Nature Publishing Group.

Lively, C. M., and M. F. Dybdahl. 2000. Parasite adaptation to locally common host genotypes. Nature 405:679–681.

Lively, C. M., M. F. Dybdahl, J. Jokela, E. E. Osnas, and L. F. Delph. 2004a. Host sex and local adaptation by parasites in a snail-trematode interaction. Am Nat 164 Suppl 5:S6–18.

Lively, C. M., M. F. Dybdahl, J. Jokela, E. E. Osnas, and L. F. Delph. 2004b. Host sex and local adaptation by parasites in a snail-trematode interaction. Am Nat 164 Suppl 5:S6–18.

Lively, C. M., M. F. Dybdahl, J. Jokela, E. E. Osnas, and L. F. Delph. 2004c. Host Sex and Local Adaptation by Parasites in a Snail-Trematode Interaction. Am Nat 164:S6–S18.

Luikart, G., P. R. England, D. Tallmon, S. Jordan, and P. Taberlet. 2003. The power and promise of population genomics: from genotyping to genome typing. Nat. Rev. Genet. 4:981–994.

M'Gonigle, L. K., J. J. Shen, and S. P. Otto. 2009. Mutating away from your enemies: the evolution of mutation rate in a host-parasite system. Theoretical Population Biology 75:301–311.

Mable, B. K., E. Kilbride, M. E. Viney, and R. C. Tinsley. 2015. Copy number variation and genetic diversity of MHC Class IIb alleles in an alien population of Xenopus laevis. Immunogenetics 67:1–13. Immunogenetics.

Masterson, J. 1994. Stomatal size in fossil plants: evidence for polyploidy in majority of angiosperms. Science.

Miller, J. R., S. Koren, and G. Sutton. 2010. Assembly algorithms for next-generation sequencing data. Genomics 95:315–327. Elsevier Inc.

Minin, V., Z. Abdo, P. Joyce, and J. Sullivan. 2003. Performance-Based Selection of Likelihood Models for Phylogeny Estimation. Systematic Biology 52:674–683.

Mitta, G., C. M. Adema, B. Gourbal, E. S. Loker, and A. Theron. 2012. Compatibility polymorphism in snail/schistosome interactions: From field to theory to molecular mechanisms. Developmental & Comparative Immunology 37:1–8. Elsevier Ltd.

Mode, C. J. 1958. A mathematical model for the co-evolution of obligate parasites and their hosts. Evolution 12:158–165.

Morran, L. T., R. C. I. Parrish, I. A. Gelarden, M. B. Allen, and C. M. Lively. 2014. Experimental Coevolution: Rapid Local Adaptation by Parasites Depends on Host Mating System. American Naturalist 184:S91–S100.

Morran, L. T., R. C. Parrish, I. A. Gelarden, and C. M. Lively. 2012. Temporal dynamics of outcrossing and host mortality rates in host-pathogen experimental coevolution. Evolution 67:1860–1868.

Munzbergova, Z. 2006. Ploidy level interacts with population size and habitat conditions to determine the degree of herbivory damage in plant populations. Oikos 115:443–452. Blackwell Publishing Ltd.

Neiman, M., and C. M. Lively. 2004. Pleistocene glaciation is implicated in the phylogeographical structure of Potamopyrgus antipodarum, a New Zealand snail. Molecular Ecology 13:3085–3098.

Neiman, M., D. Paczesniak, D. M. Soper, A. T. Baldwin, and G. Hehman. 2011. Wide Variation in Ploidy Level and Genome Size in a New Zealand Freshwater Snail with Coexisting Sexual and Asexual Lineages. Evolution 65:1–15.

Nuismer, S. L., and J. N. Thompson. 2001. Plant polyploidy and non-uniform effects on insect herbivores. Proceedings of the Royal Society B: Biological Sciences 268:1937–1940.

Nuismer, S. L., and S. Gandon. 2008. Moving beyond Common-Garden and Transplant Designs: Insight into the Causes of Local Adaptation in Species Interactions. Am Nat 171:658–668.

Nuismer, S. L., and S. P. Otto. 2004. Host-parasite interactions and the evolution of ploidy. Proceedings of the National Academy of Sciences 101:11036–11039.

Nuismer, S. L., and S. P. Otto. 2005. Host–Parasite Interactions and the Evolution of Gene Expression. PLoS Biol. 3:e203.

O'Neil, S. T., and S. J. Emrich. 2013. Assessing De Novo transcriptome assembly metrics for consistency and utility. BMC Genomics 14. BioMed Central.

Ohberg, H., P. Ruth, and U. Bang. 2005. Effect of ploidy and flowering type of red cultivars and of isolate origin on severity of clover rot, Sclerotinia trifoliorum. Journal of Phytopathology 153:505–511.

Olson, P. D., T. H. Cribb, V. V. Tkach, R. A. Bray, and D. T. J. Littlewood. 2003. Phylogeny and classification of the Digenea (Platyhelminthes: Trematoda). International Journal for Parasitology 33:733–755.

Osnas, E. E., and C. M. Lively. 2006. Host ploidy, parasitism and immune defence in a coevolutionary snail-trematode system. J Evolution Biol 19:42–48.

Oswald, B. P., and S. L. Nuismer. 2007. Neopolyploidy and pathogen resistance. Proc. Biol. Sci. 274:2393–2397.

Otto, S. P. 2007. The evolutionary consequences of polyploidy. Cell 131:452–462.

Otto, S. P., and J. Whitton. 2000. Polyploidy Incidence and Evolution. Annual Review of Genetics 34:401–437.

Otto, S. P., and S. L. Nuismer. 2004. Species interactions and the evolution of sex. Science 304:1018–1020.

Paczesniak, D., J. Jokela, K. Larkin, and M. Neiman. 2013. Discordance between nuclear and mitochondrial genomes in sexual and asexual lineages of the freshwater snail Potamopyrgus antipodarum. Molecular Ecology 22:4695–4710.

Parker, M. A. 1994. Pathogens and sex in plants. Evol Ecol 8:560–584.

Parker, M. A. 1996. The nature of plant–parasite specificity. Evol Ecol 10:319–322.

Parsons, J. E., R. A. Busch, G. H. Thorgaard, and P. D. Scheerer. 1986. Increased Resistance of Triploid Rainbow-Trout X Coho Salmon Hybrids to Infectious Hematopoietic Necrosis Virus. Aquaculture 57:337–343.

Perry, E. B., J. E. Barrick, and B. J. M. Bohannan. 2015. The Molecular and Genetic Basis of Repeatable Coevolution between Escherichia coli and Bacteriophage T3 in a Laboratory Microcosm. PLoS ONE 10:e0130639–12.

Portillo, M., J. Cabrera, K. Lindsey, J. Topping, M. F. Andrés, M. Emiliozzi, J. C. Oliveros, G. García-Casado, R. Solano, H. Koltai, N. Resnick, C. Fenoll, and C. Escobar. 2013. Distinct and conserved transcriptomic changes during nematode-induced giant cell development in tomato compared with Arabidopsis: a functional role for gene repression. New Phytologist 197:1276–1290.

Pruitt, K. D. 2004. NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 33:D501–D504.

Rambaut, A., and A. J. Drummond. 2003. Tracer v1.4. 2007.

Ramsey, J., and D. W. Schemske. 2002. Neopolyploidy in Flowering Plants. Annu Rev Ecol Syst 33:589–639.

Rausch, J. H., and M. T. Morgan. 2005. The effect of self-fertilization, inbreeding depression, and population size on autopolyploid establishment. Evolution 59:1867–10.

Roche Diagnostics GmbH. 2010. cDNA Rapid Library Preparation Method Manual. Mannheim.

Roger, E., C. Grunau, R. J. Pierce, H. Hirai, B. Gourbal, R. Galinier, R. Emans, I. M. Cesari, C. Cosseau, and G. Mitta. 2008. Controlled chaos of polymorphic mucins in a metazoan parasite (Schistosoma mansoni) interacting with its invertebrate host (Biomphalaria glabrata). PLoS Negl Trop Dis 2:e330.

Romero, I. G., I. Ruvinsky, and Y. Gilad. 2012. Comparative studies of gene expression and the evolution of gene regulation. Nat. Rev. Genet. 13:505–516. Nature Publishing Group.

Ronquist, F., M. Teslenko, P. van der Mark, D. L. Ayres, A. Darling, S. Hohna, B. Larget, L. Liu, M. A. Suchard, and J. P. Huelsenbeck. 2012. MrBayes 3.2: Efficient Bayesian Phylogenetic Inference and Model Choice Across a Large Model Space. Systematic Biology 61:539–542.

Salathé, M., R. D. Kouyos, and S. Bonhoeffer. 2008. The state of affairs in the kingdom of the Red Queen. Trends in ecology & evolution 23:439–445.

Schoen, D. J., J. J. Burdon, and A. H. Brown. 1992. Resistance of Glycine tomentella to soybean leaf rust Phakopsora pachyrhizi in relation to ploidy level and geographic distribution. Theor. Appl. Genet. 83:827–832. Springer-Verlag.

Seoighe, C. 2003. Turning the clock back on ancient genome duplication. Current Opinion in Genetics & Development 13:636–643.

Sémon, M., and K. H. Wolfe. 2007. Consequences of genome duplication. Current Opinion in Genetics & Development 17:505–512.

Soltis, D. E., V. A. Albert, J. Leebens-Mack, C. D. Bell, A. H. Paterson, C. Zheng, D. Sankoff, C. W. dePamphilis, P. K. Wall, and P. S. Soltis. 2009. Polyploidy and angiosperm diversification. American Journal of Botany 96:336–348.

Sperschneider, J., D. M. Gardiner, L. F. Thatcher, R. Lyons, K. B. Singh, J. M. Manners, and J. M. Taylor. 2015. Genome-Wide Analysis in Three Fusarium Pathogens Identifies Rapidly Evolving Chromosomes and Genes Associated with Pathogenicity. Genome Biology and Evolution 7:1613–1627.

Stamatakis, A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313.

Stenberg, P., and A. Saura. 2013. Meiosis and Its Deviations in Polyploid Animals. Cytogenet Genome Res 140:185–203.

Storz, J. F. 2005. Using genome scans of DNA polymorphism to infer adaptive population divergence. Molecular Ecology 14:671–688.

Swofford, D. L. 2002. PAUP 4.0 b10: Phylogenetic analysis using parsimony.

Talbot, J. M., and J. C. Ward. 1987. Macroinvertebrates associated with aquatic macrophytes in Lake Alexandria, New Zealand. New Zealand Journal of Marine and Freshwater Research 21:199–213.

Tanaka, S., M. Nishimura, F. Ihara, J. Yamagishi, Y. Suzuki, and Y. Nishikawa. 2013. Transcriptome Analysis of Mouse Brain Infected with Toxoplasma gondii. Infection and Immunity 81:3609–3619.

Tayalé, A., and C. Parisod. 2013. Natural Pathways to Polyploidy in Plants and Consequences for Genome Reorganization. Cytogenet Genome Res 140:79–96.

Thompson, J. N., and J. J. Burdon. 2002. Gene-for-Gene Coevolution Between Plants and Parasites. Nature 360:121–125.

Thompson, J. N., B. M. Cunningham, K. A. Segraves, D. M. Althoff, and D. Wagner. 1997. Plant polyploidy and insect/plant interactions. American Naturalist 150:730–743.

Tkach, V. V., D. T. J. Littlewood, P. D. Olson, J. M. Kinsella, and Z. Swiderski. 2003a. Molecular phylogenetic analysis of the Microphalloidea Ward, 1901 (Trematoda: Digenea). Syst. Parasitol. 56:1–15.

Tkach, V. V., D. T. J. Littlewood, P. D. Olson, J. M. Kinsella, and Z. Swiderski. 2003b. Molecular phylogenetic analysis of the Microphalloidea Ward, 1901 (Trematoda: Digenea). Syst. Parasitol. 56:1–15.

Videvall, E., C. K. Cornwallis, V. Palinauskas, G. Valkiūnas, and O. Hellgren. 2015. The Avian Transcriptome Response to Malaria Infection. Molecular Biology and Evolution 32:1255–1267.

Visscher, P. M., M. A. Brown, M. I. McCarthy, and J. Yang. 2012. Five Years of GWAS Discovery. The American Journal of Human Genetics 90:7–24. The American Society of Human Genetics.

Vleugels, T., G. Cnops, and E. van Bockstaele. 2013. Screening for resistance to clover rot (Sclerotinia spp.) among a diverse collection of red clover populations (Trifolium pratense L.). Euphytica 194:371–382.

Wendel, J. F. 2000. Genome evolution in polyploids. Plant Mol. Biol. 42:225–249.

Wertheim, B., L. W. Beukeboom, and L. van de Zande. 2013. Polyploidy in Animals: Effects of Gene Expression on Sex Determination, Evolution and Ecology. Cytogenet Genome Res 140:256–269.

White, M. J. D. 1970. Heterozygosity and Genetic Polymorphism in Parthenogenetic Animals. Pp. 237–262 in M. K. Hecht and W. C. Steere, eds. Essays in Evolution and Genetics in Honor of Theodosius Dobzhansky. Springer US, Boston, MA.

Wilton, P. R., D. B. Sloan, J. M. Logsdon Jr, H. Doddapaneni, and M. Neiman. 2012a. Characterization of transcriptomes from sexual and asexual lineages of a New Zealand snail (Potamopyrgus antipodarum). Mol Ecol Resour 13:289–294.

Wilton, P. R., D. B. Sloan, J. M. Logsdon Jr, H. Doddapaneni, and M. Neiman. 2012b. Characterization of transcriptomes from sexual and asexual lineages of a New Zealand snail (Potamopyrgus antipodarum). Mol Ecol Resour, doi: 10.1111/1755-0998.12051.

Yli-Mattila, T., G. Kalko, A. Hannukkala, S. Paavanen-Huhtala, and K. Hakala. 2009. Prevalence, species composition, genetic variation and pathogenicity of clover rot (Sclerotinia trifoliorum) and Fusarium spp. in red clover in Finland. Eur J Plant Pathol 126:13–27.

Young, N. D., A. R. Jex, C. Cantacessi, R. S. Hall, B. E. Campbell, T. W. Spithill, S. Tangkawattana, P. Tangkawattana, T. Laha, and R. B. Gasser. 2011. A Portrait of the Transcriptome of the Neglected Trematode, Fasciola gigantica—Biological and Biotechnological Implications. PLoS Negl Trop Dis 5:e1004–12.

Zhang, L., and C. E. King. 1993. Life-History Divergence of Sympatric Diploid and Polyploid Populations of Brine Shrimp Artemia-Parthenogenetica. Oecologia 93:177–183. Springer-Verlag.

Zhao, J., J. A. Udall, P. A. Quijada, C. R. Grau, J. Meng, and T. C. Osborn. 2005. Quantitative trait loci for resistance to Sclerotinia sclerotiorum and its association with a homeologous non-reciprocal transposition in Brassica napus L. Theor. Appl. Genet. 112:509–516.
