The evolution and ecology of sigma viruses

Ben Longdon

PhD University of Edinburgh 2011

Declaration

The work within this thesis is my own, except where clearly stated. Chapters 2-5 are published or submitted manuscripts and so have kept the use of “we”. Chapters 1 and 6 were written purely as thesis chapters and so use “I”.

Ben Longdon

ii Thesis abstract

Insects are host to a diverse range of vertically transmitted micro-organisms, but while their bacterial symbionts are well-studied, little is known about their vertically transmitted viruses. The sigma virus (DMelSV) is currently the only natural host- specific pathogen to be described in . In this thesis I have examined; the diversity and evolution of sigma viruses in Drosophila, their transmission and population dynamics, and their ability to host shift.

I have described six new rhabdoviruses in five Drosophila species — D. affinis, D. obscura, D. tristis, D. immigrans and D. ananassae — and one in a member of the Muscidae, Muscina stabulans (Chapters two and four). These viruses have been tentatively named as DAffSV, DObsSV, DTriSV, DImmSV, DAnaSV and MStaSV respectively. I sequenced the complete genomes of DObsSV and DMelSV, the L gene from DAffSV and partial L gene sequences from the other viruses. Using this new sequence data I created a phylogeny of the rhabdoviruses (Chapter two). The sigma viruses form a distinct clade which is closely related to the Dimarhabdovirus supergroup, and the high levels of divergence between these viruses suggest that they may deserve to be recognised as a new genus. Furthermore, this analysis produced the most robustly supported phylogeny of the Rhabdoviridae to date, allowing me to reconstruct the major transitions that have occurred during the evolution of the family. This data suggests that the bias towards research into plants and vertebrates means that much of the diversity of rhabdoviruses has been missed, and rhabdoviruses may be common pathogens of .

In Chapter three I examined whether the new sigma viruses in Drosophila affinis and Drosophila obscura are both vertically transmitted. As is the case for DMelSV, both males and females can transmit these viruses to their offspring. Males transmit lower viral titres through sperm than females transmit through eggs, and a lower proportion of their offspring become infected. I then examined natural populations of D. obscura in the UK; 39% of were infected and the viral population shows clear evidence of a recent expansion, with extremely low genetic diversity and a large excess of rare polymorphisms. Using sequence data I estimate that the virus has

iii swept across the UK within the last ~11 years, during which time the viral population size doubled approximately every 9 months. Using simulations based on lab estimates of transmission rates, I show that the biparental mode of transmission allows the virus to invade and rapidly spread through populations, at rates consistent with those measured in the field. Therefore, as predicted by the simulations, the virus has undergone an extremely rapid and recent increase in population size.

In Chapter four I investigated for the first time whether vertically transmitted viruses undergo host shifts or cospeciate with their hosts. Using a phylogenetic approach I show that sigma viruses have switched between hosts during their evolutionary history. These results suggest that sigma virus infections may be short-lived in a given host lineage, so that their long-term persistence relies on rare horizontal transmission events between hosts.

In Chapter five I examined the ability of three Drosophila sigma viruses to persist and replicate in 51 hosts sampled across the phylogeny. I used a phylogenetic mixed model to account for the non-independence of host taxa due to common ancestry, which additionally allows integration over the uncertainty in the host phylogeny. In two out of the three viruses there was a negative correlation between viral titre and genetic distance from the natural host. Additionally the host phylogeny explains an extremely high proportion of the variation (after considering genetic distance from the natural host) in the ability of these viruses to replicate in novel hosts (>0.8 for all viruses). There were strong phylogenetic correlations between all the viruses (>0.65 for all pairs), suggesting a given species’ level of resistance to one virus is strongly correlated with its resistance to other viruses. This suggests the host phylogeny, and genetic distance from the natural host, may be important in determining viruses ability to host switch.

This work has aimed to address fundamental questions relating to host-parasite coevolution and pathogen emergence. The data presented suggests that sigma viruses are likely to be widespread vertically transmitted viruses, which have dynamic interactions with their hosts. These viruses appear to have switched between hosts

iv during their evolutionary history and it is likely the host phylogeny is a determinant of such host shifts.

v Acknowledgments

“That’s a fool’s experiment. But I love fools' experiments. I am always making them” Charles Darwin.

“One becomes a gourmet chef for the diptera” Terri Markow

It is traditional to start a thesis with a quote (or two) and I thought these summed up my PhD rather nicely. I have tried to be succinct in writing these acknowledgements, but realised that this thesis would not have been possible without the help and support of a great number of people. Specific acknowledgments with regards to experimental work are made in each chapter.

I owe a huge thanks to Frank Jiggins for his help, constant advice and reassurance (even if he did leave me and run away to Cambridge!). I am also greatly indebted to Darren Obbard who has always been friendly, keen to help, and has shown constant enthusiasm (coupled with extreme pessimism and whinging). I could not have asked for a better pair of supervisors.

I would like to thank the following to whom I am hugely grateful: Lena Wilfert for her help, advice and occasional accommodation in Cambridge. Jarrod Hadfield for being the source of knowledge for all things statistical, his musings over a pint, and unrepeatable quotes. Tom Little for being my official supervisor, and for offering useful advice. Mark Blaxter for being my post-graduate carer and also providing money towards my external examiner travel expenses. Claire Webster who has provided continuous supplies of gossip and helped with work. Mike Magwire for going out of his way to be helpful. Andrew Rambaut for numerous informal discussions on the correct phylogenetic practices. Fergal Waldrin, Jewelna Osei- Poku, Floh Bayer and Jenny Carpenter for being great lab mates. The Edinburgh coevolution group for numerous interesting, and often heated, discussions. The French sigma virus researchers of days gone by for providing many snippets of useful information. I would also like to thank various IEB folk for offering words of

vi advice or kind words (in particular Sarah Reece, Dan Nussey, Penny Haddrill and Desiree Allen). Thanks to everyone who put me up on fieldwork (Chapter 2 and 3) and supplied flies at various points during my PhD. Thank you to Nigel Franks for his help and encouragement in my undergraduate degree. I would like to thank the folk at the CIE at Deakin University for allowing me to work in the department whilst writing up. Thanks to Helen Borthwick and Helen Cowan for making me laugh and helping with fly food preparation. Thanks to the other IEB PhD students for putting things into perspective (and making me realise flies are relatively good compared to daphnia, plasmodium or some long term “wild” dataset).

I received funding from the BBSRC, Royal Entomological Society, Genetics Society UK and the James Rennie bequest.

On a personal note I would like to thank my parents, for their enduring support, even when I was forced to abandon them on their visit to Edinburgh for lab work. Thanks to Matt Boyce and Saskia Pedersen for providing a very welcome distraction from work. Finally I want to thank Laura Kelley for her kindness and unwavering faith in me, for putting up with my grumpiness and for generally being brilliant.

Lastly thank you to Andrew Rambaut and John Jaenike for taking the time to read this thesis.

vii Table of Contents Declaration ii Thesis Abstract iii Acknowledgments vi

1. General Introduction

1.1 General background 1 1.2 Questions 2 1.3 Sigma virus general biology 12

2. Sigma virus discovery and phylogeny 25

2.1 Introduction 26 2.2 Methods 28 2.3 Results 33 2.4 Discussion 41

3. Vertical transmission and a recent sweep 46

3.1 Introduction 47 3.2 Methods 48 3.3 Results 55 3.4 Discussion 65

4. Host-switching 70

4.1 Introduction 71 4.2 Methods 72 4.3 Results 73 4.4 Discussion 75

5. Phylogenetic determinants of host shifts 78

5.1 Introduction 79 5.2 Methods 82 5.3 Results 89 5.4 Discussion 93

6. General discussion

6.1 Summary of field 98 6.2 Overview 99 6.3 Future directions 104 6.4 Conclusions 109

References 110 Appendix/Supplementary material 130

viii Chapter 1: General Introduction 1. General Introduction

I wrote this chapter with comments on drafts from Darren Obbard and Frank Jiggins.

1.1 Background

Parasites are a major cause of mortality, being responsible for 13 million deaths in humans each year (WHO 1999). Similarly, parasites are responsible for diseases affecting crops, livestock and natural populations of plants and (Altizer and Pedersen 2008). However, parasites are also of interest to evolutionary biologists, as antagonistic reciprocal interactions between hosts and parasites drive rapid coevolutionary dynamics (Woolhouse et al. 2002). In turn, understanding the coevolutionary processes that occur between hosts and parasites may have implications for the control and prediction of infectious disease.

Research into host-parasite interactions is valuable for its contributions to biomedicine. However, non-vertebrate systems are particularly valuable in providing experimental flexibility and are the focus of this thesis (Little 2002; Schmid-Hempel 2005). Studies of host-parasite interactions aim to understand how the coevolutionary processes occurs (Lively and Dybdahl 2000; Buckling and Rainey 2002; Little 2002; Hornett et al. 2006; Decaestecker et al. 2007; Jokela et al. 2009); the genetics underlying such processes (Agrawal and Lively 2002); how parasite virulence evolves (Alizon et al. 2009); how coinfections may alter the infection success of a given parasite (Pedersen and Fenton 2007; Haine 2008; Hurst and Hutchence 2010) and how environmental factors can affect these interactions (Lazzaro and Little 2009).

1 Chapter 1: General Introduction Study system

In this thesis I aim to address three questions in host-parasite evolutionary biology which I describe in turn below. I have examined these questions using Drosophila and its natural pathogen the sigma virus. The sigma virus of Drosophila melanogaster has been studied for over 70 years (L'Heritier and Teissier 1937), and is the best-studied natural virus of Drosophila (Brun and Plus 1980). The tractable nature of both host and parasite genetics make this system ideal for addressing questions relating to host-parasite coevolution. Not only this, but viruses are responsible for important infectious diseases, for example HIV (Sharp and Hahn 2010), SARs coronavirus (Holmes and Rambaut 2004), Influenza (Webby and Webster 2001), Ebola (Leroy et al. 2005), Dengue (Vasilakis and Weaver 2008) and many other viruses (Parrish et al. 2008), so studying their evolution in tractable model systems is of particular relevance to understanding viral emergence. I firstly provide background to each of the three topics studied and then describe how each of these topics relates to sigma virus biology and the questions I aim to address using this system (1.2). Finally, I provide an overview of sigma virus general biology (1.3).

1.2 Questions

1.2.1 Virus diversity, evolution and phylogenetics

General background

Viruses are perhaps the most abundant organisms on earth. Recent metagenomic surveys of marine algae found up to 10 viral genomes per host cell (Suttle 2005; Koonin et al. 2008). Almost all organisms are parasitized by viruses including animals (Holmes 2009), plants (Hull 2009), bacteria (Hambly and Suttle 2005), protozoans (Wang and Wang 1991) and archaea (Prangishvili et al. 2006). Even viruses themselves are parasitized by virus-like elements (satellite elements; Roossinck et al. 1992).

2 Chapter 1: General Introduction

However, we are currently hugely underestimating the total viral diversity (the virosphere; Suttle 2005), with an array of new viruses being discovered through next generation sequencing technologies (Edwards and Rohwer 2005; Suttle 2005), including entirely new families of viruses. While we are aware of this diversity of RNA viruses we know very little about their origins (Holmes 2009). Viral taxa fall into families based on nucleotide or amino-acid sequence data, but evolve so rapidly that ancient relationships between viral families are lost, meaning viral-archaeology is difficult (Holmes 2003). Examining the protein structure of viruses may offer an alternate method of reconstructing ancient relationships, although the methods and techniques have yet to be realised (Holmes 2009). Additionally viral insertions in host genomes that occurred prior to host speciation events could be studied as molecular viral-fossils (Gilbert and Feschotte 2010; Katzourakis and Gifford 2010)

One particular gap in our knowledge of viral diversity is in viruses in non-vertebrate hosts. Such sampling bias may affect our understanding of virus evolution. Indeed Rossmann and Tao (1999) state “Range and variety of insect species are probably greater than in most other classes, yet the study of insect viruses is quite limited”. While this is beginning to be rectified there are still far fewer recognised genera of invertebrate than vertebrate viruses (ICTVdB 2006).

Drosophila viruses

The genus Drosophila contains over 2000 species (Markow and O'Grady 2007) and as a genus is perhaps one of the most studied group of organisms (Clark et al. 2007), yet we know very little about the viruses that naturally infect these flies. However, excluding the sigma virus of D. melanogaster (discussed below), fewer than ten viruses have previously been reported to infect Drosophila species (Brun and Plus 1980; Huszar and Imler 2008), most of which occur in Drosophila melanogaster. Drosophila C virus (DCV) is a positive sense RNA virus in the family Dicistroviridae (ICTVdB 2006) that naturally infects Drosophila melanogaster in the wild (Brun and Plus 1980; Johnson and Christian 1999). DCV commonly infects

3 Chapter 1: General Introduction laboratory stocks of other Drosophila species (Kapun et al. 2010), and can replicate when injected into a wide range of insects (Jousset 1976). Drosophila A virus (DAV) is an unclassified positive sense RNA virus (Ambrose et al. 2009) that has been isolated from D. melanogaster. Drosophila F virus (DFV) is a Reovirus from D. melanogaster and is thought to relate to other Dipteran Reoviruses (Brun and Plus 1980; Plus et al. 1981). Nora virus is positive sense RNA picorna-like virus of a previously uncharacterised family that infects D. melanogaster and Drosophila simulans (Habayeb et al. 2006). Drosophila P virus (DPV) is a picorna-like viruses that infects D. melanogaster (Brun and Plus 1980). Outside of D. melanogaster, Iota virus infects Drosophila immigrans and is serologically related to DPV (Jousset 1972; Jousset and Plus 1975; Plus et al. 1975). A reovirus named Drosophila S virus have been found to infect D. simulans and is reported to be vertically-transmitted (Lopez-Ferber et al. 1989; Lopez-Ferber et al. 1997). Similarly, virus-like particles have been reported in the testes of Drosophila virilis (Tandler 1972), and in D. melanogaster (Virus G; Brun and Plus 1980).

Other unidentified particles have been found (Felluga et al. 1971), although some of these may be any of the previously mentioned viruses as this work seems to neglect to discuss any of the past literature. Other viruses have been found that infect Drosophila cell culture in the laboratory, so it is questionable whether they are natural Drosophila viruses. Drosophila X virus (DXV) is a double stranded RNA virus in the family Birnaviridae which infects D. melanogaster and causes flies exposed to anoxic conditions to die (Teninges et al. 1979). However, this virus has only been found in flies injected with cell culture extracts so is likely a lab contaminant. Recently, sequencing of short RNAs in Drosophila cell culture found three novel viruses (an alphanodavirus related to the Coleopteran isolated Flock House Virus, a totivirus and a birnavirus) plus DAV and DXV (Wu et al. 2010). While DAV, Nora virus and DCV are known from wild populations (Christian 1987 and D. Obbard personal communication), it is far from clear whether the other viruses are natural pathogens of Drosophila or whether they are just lab contaminants.

4 Chapter 1: General Introduction Sigma viruses

The sigma virus of Drosophila melanogaster (DMelSV) is perhaps the best-studied virus of Drosophila, having being discovered over 70 years ago (L'Heritier and

Teissier 1937). It causes flies to become CO2 sensitive; infected flies become paralysed and die after exposure to CO2, while uninfected flies recover after a short period (described in detail in section 1.3; (L'Heritier 1948). DMelSV is vertically- transmitted by both eggs and sperm (section 1.3). DMelSV is unique amongst the known Drosophila viruses in that is has been well studied in both the laboratory and the field (L'Heritier 1957; Brun and Plus 1980; Fleuriet 1988; Carpenter et al. 2007).

The bullet shaped morphology, antigenic profile and partial genome sequences of DMelSV identify it as a rhabdovirus (Berkalof et al. 1965; Teninges 1968; Calisher et al. 1989; Teninges et al. 1993). Rhabdoviruses infect a wide breadth of host taxa, including plants, invertebrates and vertebrates, in both aquatic and terrestrial environments (Fu 2005). Only two other viral families are known to infect such a wide group of host taxa (Holmes 2009). This diverse range of host species is reflected by an equally diverse group of viruses, with the rhabdoviruses falling into six recognised genera (Bourhy et al. 2005; ICTVdB 2006). The relationship of these viruses to DMelSV is uncertain, as its genome sequence was previously lacking the RNA-dependant RNA polymerase (RDRP), which composes about half of its ~12.5kb genome (Huszar and Imler 2008). The RDRP is the most suitable of the five standard rhabdovirus genes for examining family level phylogenies, as it contains conserved domains that allow robust sequence alignments to be created (Poch 1989; Poch et al. 1990). Previous phylogenies of DMelSV have placed it in conflicting positions to other rhabdoviruses (Hogenhout et al. 2003; Kuzmin et al. 2006; Kuzmin et al. 2009). It seems likely DMelSV is related to a group of vectored viruses of vertebrates named the dimarhabdoviruses, based on both phylogenetic and antigenic relationships (Calisher et al. 1989; Bjorklund et al. 1996; Hogenhout et al. 2003; Fu 2005; Kuzmin et al. 2006; Ammar et al. 2009); although it displays considerable differences to any known rhabdovirus in protein sequence and structure (Teninges et al. 1993; Walker and Kongsuwan 1999).

5 Chapter 1: General Introduction

The first question I aim to address is; are there other sigma-like viruses in different hosts species, and how are they related? There is evidence to suggest sigma viruses are a common group of viruses in Drosophila, if not all Diptera. CO2 sensitivity has been reported in 15 species of Drosophila. In two species, Drosophila affinis and

Drosophila athabasca, the CO2 sensitivity was shown to be inherited independently of the host chromosomes in a biparental manner similar to that of DMelSV (Williamson 1959; Williamson 1961). Similar patterns have been observed in Culex mosquitoes (Shroyer and Rosen 1983). As CO2 sensitivity is a common trait of rhabdoviruses in insects (Bussereau and Contamine 1980; Rosen 1980; Shroyer and Rosen 1983; Sylvester and Richardson 1992), it seems likely that these viruses infect a range of different dipterans. My first aim was to screen natural populations for sigma-like viruses, sequence the RDRP of these viruses and DMelSV, and examine how they relate to other rhabdoviruses and one another.

1.2.2 Transmission and dynamics of vertically transmitted parasites

General background

Perhaps some of the most dynamic and rapid coevolutionary interactions observed to date have been in parasitic bacterial endosymbionts of . These bacteria are widespread in arthropods (Duron et al. 2008) and as they are vertically transmitted (maternally) have evolved various strategies to ensure their own spread (Engelstadter and Hurst 2009). These can include being mutualistic and providing a benefit to the host, such as the breakdown of nutrients, or manipulating host reproduction to favour their own spread (Buchner 1965; Douglas 1989; Hurst and Majerus 1993; Douglas 1998; Engelstadter and Hurst 2009; Hurst and Hutchence 2010).

Vertically transmitted symbionts may sweep through host populations either due to a selective sweep of an advantageous mutation through an existing symbiont population, or due to a sweep of a novel symbiont through a previously uninfected

6 Chapter 1: General Introduction host population. In an arms race scenario, parasite infectivity alleles sweep through the population, following sweeps of host resistance genes (Woolhouse et al. 2002). In the butterfly Hypolimnas bolina, Polynesian populations are infected with a Wolbachia strain which causes male killing, but in southeast Asia, both males and females are infected with no apparent sex-ratio distortion (Hornett et al. 2006). The lack of male-killing is due to the evolution of a suppressor gene in the host (Hornett et al. 2006), which has been shown to spread very rapidly through host populations, with a population in Samoa changing from 100:1 to 1:1 sex-ratios in less than ten generations (Charlat et al. 2007). Therefore, it is possible the sweeps observed in parasites are part of an ongoing arms race such as that described above, with advantageous mutations sweeping through both the host and parasite populations.

Alternatively, such sweeps could be explained by a novel parasites spreading through previously uninfected host populations. Parasitic vertically transmitted bacteria are known to switch between host species (O'Neill et al. 1992; Weinert et al. 2009; Haselkorn 2010), which will result in the spread of parasites through uninfected host species. Such patterns have been observed in P transposable elements in Drosophila (Clark and Kidwell 1997). P elements are widespread in host taxa (Pinsker et al. 2001) and show clear evidence of host switching (Clark and Kidwell 1997). Previously uninfected populations of D. melanogaster have recently acquired P elements from Drosophila willistoni as the result of a host switch and subsequent spread through populations (Kidwell 1994). This invasion has occurred in the past ~60 or so years, and has resulted in the parasite sweeping globally through D. melanogaster populations to fixation (Kidwell 1994). It is unclear what the mechanism of transfer between species was, but parasitic mites have been suggested to be possible vectors (Houck et al. 1991). Therefore, sweeps of vertically transmitted symbionts could also be due to ongoing invasions of host populations due to host shifts. Once established in a host population these parasites may undergo selective sweeps as part of arms races with the hosts as described above.

A number of maternally-transmitted symbiont-driven sweeps have been observed in insects, because they are readily detected as the linkage between host and symbiont

7 Chapter 1: General Introduction means a sweep of the symbiont also results in a sweep of mtDNA (Jiggins 2003; Charlat et al. 2009; Raychoudhury et al. 2010). Similarly, in Drosophila simulans a strain of Wolbachia has swept through populations in California at a rate of 100km a year (Turelli and Hoffmann 1991). Likewise, Spiroplasma bacteria have been shown to be sweeping through populations of Drosophila neotesteca in North America as it protects is host from a recently acquired sterilising nematode parasite (Jaenike et al. 2010).

Sigma viruses

While a great deal is known about vertically transmitted bacteria in arthropods, we know very little about vertically transmitted viruses. Having asked whether other sigma-like viruses are common (1.2.1), I then examined if such viruses are vertically transmitted like DMelSV. It seems likely they are, as CO2 sensitivity ⎯ a common symptom of rhabdovirus infection ⎯ has been shown to be transmitted vertically through both males and females in Drosophila and mosquitoes (see 1.2.1; Williamson 1959; Williamson 1961; Shroyer and Rosen 1983).

The biparental mode of transmission of DMelSV has been shown to be an alternate method of spread than those used by purely maternally transmitted bacteria (L'Heritier 1970), and can allow the virus to persist even if it is costly to the host and if it is transmitted imperfectly. In DMelSV, a rapid spread of a viral genotype able to overcome a known host resistance gene was observed (Fleuriet et al. 1990; Fleuriet and Sperlich 1992; Fleuriet and Periquet 1993), consistent with the arms race scenario discussed above (described in detail in 1.3). DMelSV has been reported to show a high degree of population structuring with low levels of within population diversity (KST=0.73) and European isolates of the virus have been estimated to have shared a common ancestor in the last few hundred years (Carpenter et al. 2007). Therefore, by sampling natural populations of sigma viruses in other species I have examined the population structure and dynamics in other sigma virus populations.

8 Chapter 1: General Introduction 1.2.3 Host shifts

General background

Host parasite coevolution is generally thought of as a long-term process of reciprocal adaptation between host species. However, many diseases are the result of recent host switching events, with a parasite jumping from one host species to another. For example HIV (Hahn et al. 2000), Influenza (Webby and Webster 2001), and Plasmodium (Liu et al. 2010) have all come to infect humans as the result of a host shift. Only a few cases are known of parasites cospeciating with their hosts at a rate greater than we would expect by chance (Hafner and Page 1995; McGeoch et al. 2000; Switzer et al. 2005), with many contemporary host-parasite interactions being the result of a host switching event (Woolhouse and Gowtage-Sequeria 2005).

There are likely to be multiple determinants of which host species parasites switch between, with sympatry between the natural (or donor) and novel (or recipient) host species being a pre-requisite (Woolhouse et al. 2005; Parrish et al. 2008). Parasites must then be able to replicate in a host, and then be transmitted at a sufficient rate for the pathogen to spread in the novel host population (Woolhouse and Gowtage- Sequeria 2005), which may require adaptation to the novel host (Anishchenko et al. 2006; Pepin et al. 2010). However, once in sympatry it is unclear what determines the likelihood of a successful host shift. Understanding these determinants is critical for predicting disease emergence as they may help predict which pathogens are likely to emerge with relatively little contact with the novel host.

One factor that may be a major determinant of host switching is the host phylogeny. The reasoning behind this is that closely related hosts will offer a more similar environment, increasing the chance of successful infection by a host-switching parasite (Engelstadter and Hurst 2006). Such patterns have been observed in nature with rabies virus in bats showing greater rates of cross species transmission between closely related species (Streicker et al. 2010). Likewise, host relatedness is the best predictor for pathogen sharing of protozoan and helminth parasites in primates

9 Chapter 1: General Introduction (Davies and Pedersen 2008). However, viruses infect more distantly related host species with geographic overlap being a greater predictor of pathogen sharing (Davies and Pedersen 2008). The phylogenies of primate infecting lentiviruses and rodent-insectivore infecting hantaviruses show high levels of congruence with their hosts phylogeny, but the common ancestor of these viruses in thought to be much more recent than the host speciation events; preferential host switching between closely related species has been demonstrated to be a possible explanation for such patterns (Charleston and Robertson 2002; Ramsden et al. 2009). In experimental studies that artificially infected novel host species, similar patterns have been found with negative relationships between infection success and genetic distance from the natural host (Perlman and Jaenike 2003; Gilbert and Webb 2007; de Vienne et al. 2009; Russell et al. 2009).

Additionally, the host phylogeny may determine infection success due to certain clades of hosts being particularly resistant or susceptible to a parasite due to common ancestry. This could be due to the common ancestor of a group having acquired or lost a certain immune or cellular component. For example, obscura group Drosophila lack the lamellocytes (a type of hemocyte) found in other Drosophila species, and are so unable to encapsulate or melanise foreign bodies (Havard et al. 2009). This loss of immune component may result in a clade of hosts particularly susceptible to macro-parasites such as parasitoids. Such patterns are likely common- place, with antifungal peptides in Drosophila (drosomycins) being found only in the melanogaster group (Sackton et al. 2007) and RNAi components (Obbard et al. 2009) showing lineage specific distributions across host taxa. Also, gains of immune components are apparent across greater evolutionary time scales (Litman et al. 2005).

Vertically-transmitted parasite host shifts

One might expect a priori that vertically transmitted symbionts should form long- term coevolutionary associations and cospeciate with their hosts. It has been well established that vertically transmitted mutualistic bacteria that provide fitness

10 Chapter 1: General Introduction benefits to the host (such as providing nutrients) cospeciate with their hosts (Moran et al. 1993; Bandi et al. 1995; Chen et al. 1999; Sauer et al. 2000). However, vertically transmitted parasitic bacteria seem to rarely cospeciate with their hosts (O'Neill et al. 1992; Weinert et al. 2009; Haselkorn 2010). Such associations may become unstable due to the evolution of host resistance (Hornett et al. 2006), which may lead to parasite extinction and result in failure to persist over host speciation events (Koehncke et al. 2009). Genomic parasites show similar patterns with transposable element and homing-endonuclease phylogenies both showing evidence of host switching (Goddard and Burt 1999; Loreto et al. 2008).

There have been few studies examining the factors affecting host-shifts of vertically transmitted symbionts, but anecdotal evidence suggests genetic distance from the natural host may also be important, with artificial transfers of bacterial symbionts being generally more successful between closely related host species (reviewed in Engelstadter and Hurst 2006; meta-analysis in Russell et al. 2009). Additionally, a study examining such effects between cocccinelid beetles and a male-killing bacteria found the bacteria were less successful at male-killing as genetic distance from the natural host increased (Tinsley and Majerus 2007). Similarly, the phylogenies of vertically transmitted bacteria in spiders and butterflies show evidence in support of preferential host shifts between closely related species (Jiggins et al. 2002; Baldo et al. 2008; Russell et al 2009; Stahlhut et al. 2010).

Sigma viruses

As it seems likely sigma viruses infect multiple host species, it is of interest as to how sigma viruses pass between host lineages. As DMelSV is known be costly to its host (discussed in 1.3) it is perhaps unlikely that it cospeciates with its hosts like mutualistic vertically transmitted symbionts. It has been suggested that the evolution of host resistance may lead to extinctions of other vertically transmitted symbionts (Hornett et al. 2006; Koehncke et al. 2009). The coevolutionary dynamics observed in natural populations (described in detail in 1.3(Fleuriet et al. 1990; Fleuriet and Sperlich 1992; Fleuriet and Periquet 1993) suggest that sigma virus associations with

11 Chapter 1: General Introduction their hosts may be similarly unstable. This may result in sigma virus associations with a host being relatively short lived and so ⎯ like other parasitic vertically transmitted symbionts (O'Neill et al. 1992; Koehncke et al. 2009; Weinert et al. 2009; Haselkorn 2010) ⎯ sigma viruses long term persistence may rely on switching between host lineages. Therefore, I have examined whether sigma viruses have undergone host shifts. For other vertically transmitted endosymbionts, parasitic mites and parasitoids have been shown to have the potential to act as vectors (Houck et al. 1991; Vavre et al. 1999; Jaenike et al. 2007; Loreto et al. 2008), and so it may be possible sigma viruses also switch between host species in this manner. If horizontal transfer does occur, which species do sigma viruses switch between? It may be they simply switch between hosts in the same habitat or on the same continent, or the success of establishment may be determined by other host factors such as genetic distance from the natural host (discussed above). I have examined this by carrying out a cross-inoculation experiment, injecting sigma viruses into hosts from across the Drosophila phylogeny. This question has broad applicability as it has been suggested that the high mutation rates and short generation times make RNA viruses the most likely pathogen group to host shift (Woolhouse et al. 2005; Parrish et al. 2008).

1.3 Sigma virus general biology

1.3.1 Genome

DMelSV is in the family Rhabdoviridae (Berkalof et al. 1965; Teninges 1968; Teninges et al. 1993). Rhabdoviruses infect the cytoplasm of host cells, have a ~12kb genome, and usually composed of 5 main genes 3’- N-P-M-G-L - 5’ (Fu 2005). The N gene encodes the nucleocapsid protein which forms a tight helical structure of ~1200 sub-units forming the viral capsid. The P gene encodes a phosphoprotein that interacts with the L gene protein ⎯ the RNA-dependant RNA polymerase⎯ to transcribe and replicate the viral genome. The M protein encodes the matrix protein whilst the G protein encodes the glycoprotein, which forms spikes coating the virion.

12 Chapter 1: General Introduction The M and G proteins are involved in assembly of the virus and budding from the host cell (Lyles and Rupprecht 2007). The genome organisation of DMelSV is the same as other rhabdoviruses, but with the addition of an additional gene of unknown function (the X gene) located between the P and M genes (Figure 1 and (Teninges et al. 1993; Contamine and Gaumer 2008).

Figure 1. 1 Diagram of sigma virus particle and viral genome.

1.3.2 Rhabdovirus replication

In rhabdovirus replication the negative sense genomic strand is used for transcription of mRNAs, with the polymerase stopping and starting mRNA production of each gene at conserved transcription initiation and termination sites, but with a 70-80% chance of transcription continuing at each gene junction (Lyles and Rupprecht 2007; Huszar and Imler 2008). This results in a greater number of transcripts of genes at the 3’ end of the genome, which is presumably a mechanism to control gene expression. The genomic strand also acts as a template to produce a positive sense

13 Chapter 1: General Introduction replication intermediate (anti-genome), which is used to produce new genomes for packaging (Lahaye et al. 2009). Finally virions then bud from the infected host cells.

1.3.3 CO2 sensitivity

DMelSV causes flies to become CO2 sensitive, with infected flies becoming paralysed and dying after exposure to CO2, whilst uninfected flies recover after a short period (L'Heritier and Teissier 1937). The paralysis seems to be due to damage caused to nervous tissue, in particular the thoracic ganglia (L'Heritier 1948; Busserea 1970; Busserea 1970). Other rhabdoviruses are known to replicate in nervous tissue, with the glycoprotein of rabies virus preferentially attaching to several neural cell receptors (Lyles and Rupprecht 2007), suggesting this may be a general rhabdovirus trait. Indeed, other rhabdoviruses injected into Drosophila (and other insects) also cause CO2 paralysis (Bussereau and Contamine 1980; Rosen 1980; Shroyer and Rosen 1983) and it has been noted rhabdovirus infected aphids have reduced longevity after CO2 exposure (Sylvester and Richardson 1992). Recent studies have confirmed DMelSV is found at high titres in nervous tissues using fluorescent-tagged antibodies (Tsai et al. 2008; Ammar et al. 2009).

The CO2 paralysis is brought on by relatively short exposures to CO2 (<1 min) and is specific to CO2, with other gases and injected chemicals having no effect (L'Heritier

1948; Brun and Plus 1980). The paralysis is also sensitive to changes in CO2 concentration and temperature, with lower temperatures requiring lower CO2 concentrations to cause paralysis. At close to 0°C low concentrations of CO2 are required (<20%), increasing to 50% at 10°C, with no CO2 induced paralysis observed above 23°C (Brun and Plus 1980).

Although we know the thoracic ganglion is involved in CO2 induced paralysis, the exact mechanism is unknown. Interestingly, CO2 exposure will lower the pH of a fly’s haemolymph, and both vesicular stomatitis virus (VSV) and rabies show enhanced binding at reduced pH, and G proteins induce membrane fusion in a pH- dependant manner, suggesting a potential mechanism. However, crude attempts to

14 Chapter 1: General Introduction manipulate fly pH by injecting acidic solutions (hydrogen-cyanide, formic acid and hydrochloric acid) were found to have no effect (L'Heritier 1948; Brun and Plus 1980). Additionally, rhabdoviruses bind to acetylcholine receptors on nervous tissue, which could also be involved in paralysis, but assays of acetylcholinesterase (which degrades acetylcholine at synapses) activity found no different between infected and uninfected flies (Brun and Plus 1980). Additionally, a dose of CO2 below the threshold required to cause paralysis can protect the flies from a subsequent dose that would normally be lethal (L'Heritier 1948). While injection of infected fly haemolymph into uninfected flies will cause paralysis ~10-15 days post-inoculation, there is no immediate effect suggesting it is not due to changes in the haemolymph but is induced by viral replication itself (L'Heritier 1970).

It is unclear if the CO2 trait has any relevance in nature, although rotting fruit or plants ⎯ which are often Drosophila oviposition sites (Basden 1954) ⎯ will produce CO2 , which could have (sub-lethal) effects on flies in enclosed spaces at low temperatures (where a lower concentration of CO2 is required for paralysis (Brun and Plus 1980), and could make flies more susceptible to predators or parasitoids.

Additionally, the effect of CO2 is not just specific to adults, but also causes death in larvae, although early stage embryos and pupae do not seem to be affected (L'Heritier 1948).

1.3.4 Mode of transmission

DMelSV is transmitted vertically through both sperm and eggs, with horizontal transmission being rare or absent (L'Heritier 1957; Brun and Plus 1980; Fleuriet 1988). The pattern of DMelSV transmission differs between the sexes (L'Heritier 1948; L'Heritier 1957; L'Heritier 1970; Fleuriet 1988). Firstly, females transmit the virus at a higher rate than males. Secondly, the transmission rate of a fly is reduced when it is infected by its father rather than its mother. Females inheriting the virus from their mother are termed ‘stabilised’, as their rate of transmission is close ~100%. However, if a female is infected by her father (termed ‘unstablised’), her average transmission drops to a much lower rate. Similarly while a ‘stabilised’ male

15 Chapter 1: General Introduction infected by his mother can transmit the virus (albeit at a lower rate than females); if a male is infected by his father, he does not transmit the virus at all. Therefore, the virus cannot be transmitted through males for two successive generations. This peculiarity seems be because sperm transfer a lower viral titre to the developing embryo, leading to a failure to infect the early-stage germ line (Plus 1955; L'Heritier 1957; L'Heritier 1970). Similar outcomes to unstabilised transmission are observed when the virus is injected into flies (L'Heritier 1957; L'Heritier 1970).

Interestingly, viral titres can be greater by ~3-5 times in non-stabilised flies (L'Heritier 1957; L'Heritier 1970), which have lower rates of transmission (Plus 1955; L'Heritier 1957); this may be a viral response to failure to invade the germ line pole cells in the zygote (L'Heritier 1970), causing over replication in the host in an “attempt” to get in (see Figure 1 in L'Heritier 1970). A similar process is thought to occur through artificial inoculation of the virus (L'Heritier 1970). As evidence for this, in stabilised flies about one half of the total viral titre is found in the ovarian cysts (clusters of germ line cells) of egg laying females (L'Heritier 1970), whereas in non-stabilised females only a few cysts harbour similar viral titres. This means the majority of the viral titre in non-stabilised flies is in somatic tissues, which may represent a failure to invade and replicate to high levels in the germ line. Females with high ovarian viral titres have high rates of transmission (Bregliano 1970; Brun and Plus 1980). In stabilised females, this is thought to be the result of early invasion of the immature ovaries, which can also occur in non-stabilised females, but only when an ovarian cyst happens to get infected early enough in development (L'Heritier 1970). The proportion of sensitive offspring decline over time in crosses of infected males mated once to uninfected virgin females, which could suggest infected sperm have reduced longevity or the virus is lost from sperm over time (Plus 1955; L'Heritier 1957).

1.3.5 Costs of infection

Host populations kept under relaxed selection (by reducing density) rapidly accumulated DMelSV to high levels (~70%), whereas if these populations were

16 Chapter 1: General Introduction placed under strong natural selection the virus showed a decline in frequency (Yampolsky et al. 1999). Given the virus is only found at low prevalence in the field, it is likely this difference is due to DMelSV causing a fitness cost to the host. Based on the rate of expansion under relaxed selection this has been estimated at a cost of 20-30% (Yampolsky et al. 1999). A similar cost (23%) has been estimated from transmission rates and prevalence in the field (Wilfert and Jiggins in review), although this approach assumes the virus population is at equilibrium, which it is most likely not.

Several studies have examined the physiological basis of this cost. Infected flies have reduced egg viability in the lab, (Seecof 1964; Fleuriet 1981) and may take longer to develop from egg to adult by ~5-10 hours (Seecof 1964). Seecof (1964) used only a single fly line but found a 20% reduction in egg viability. Fleuriet (1981) used multiple fly lines and her data suggest an overall reduction in egg viability of ~10% (averaging each line over all crosses, Wilcox-exact test, P=0.049, following reanalysis by B. Longdon). Similarly it has been reported that infected ovarian cysts develop slower than uninfected ones, which can result in infected eggs accumulating and blocking the ovarian tubules (L'Heritier 1970). Additionally a study in semi- natural conditions reported that virus infected flies declined in frequency if over- wintered as adults (Fleuriet 1981). However, the patterns observed are not clear, with only five out of the six populations tested showing this trend, and no statistical analysis has been carried out on this data (Fleuriet 1981). Studies of CO2 sensitivity in the wild find a relationship consistent with this, but the data is inadequate to be conclusive (Herforth and Westphal 1966; Felix et al. 1977). Additionally, it has been reported that DMelSV reduces flies’ resistance to the non-natural fungal pathogen Beauvaria bassiana (~10% increase in mortality in DMelSV infected flies exposed to fungi), although no suppression of the Toll pathway (which is known to be involved in anti-fungal resistance) was observed (Carpenter 2008).

17 Chapter 1: General Introduction 1.3.6 Host resistance

Given the cost of DMelSV to its host, it is not surprising the host has evolved resistance mechanisms towards the virus. Variation in resistance can be through mutations in immune genes or changes to the host cellular machinery that is used by the virus during its replication cycle (Coustau et al. 2000). In DMelSV this has been studied by mapping resistance alleles to the virus (Gay 1978), aided by the large genetic toolbox in D. melanogaster and the recent advances in sequencing technology which has enabled these polymorphisms to be mapped to a fine scale.

1.3.7 Ref(2)P

There is considerable variation in the transmission efficiency of DMelSV (Bangham et al. 2008). In females, most of this variation can be explained by the presence or absence of an allele of ref(2)P, with the greatest effect being when offspring are homozygotes (Bangham et al. 2008). The mean female transmission rate to offspring homozygous for the resistance allele was 8% compared to 99% in offspring homozygous for the susceptible allele. The resistance allele is associated with a 59% decline in male transmission in homozygous offspring, with other genes also affecting paternal transmission. Female flies sampled in the field were found to have 28% lower transmission rates if they carried the ref(2)P resistance allele (combined homozygous and heterozygous flies), whereas it had no detectable effect on male transmission (Wilfert and Jiggins 2010). Being able to replicate lab results in the field is a crucial finding if results are to be extrapolated to the real world. It is unclear why the resistance allele has a greater effect in females, as ref(2)P is expressed in both ovaries and testes. As the resistance allele is known to reduce viral titre (Brun and Plus 1980; Carre-Mouka et al. 2007) it is likely it acts to clear virus from zygotes; why this is less effective in sperm than eggs in unclear, particularly as embryos infected by sperm carry a lower viral titre (Plus 1955; L'Heritier 1957; L'Heritier 1970).

18 Chapter 1: General Introduction The resistance is due to two amino-acid changes in the ref(2)P gene, from a Glutamine-Asparagine to a single Glycine (Dru et al. 1993; Wayne et al. 1996; Bangham et al. 2007). The resistant allele of Ref(2)P appears to have recently increased in frequency under positive selection as the resistant haplotypes show less variation than expected under a neutral model (Bangham et al. 2007). The resistant allele appears to have spread through the host population in the last few thousand years, based on the degree of linkage disequilibrium with flanking markers (Bangham et al. 2007).

The resistant allele may be a transient polymorphism, which is currently sweeping through host populations. This is at odds however, to the much more recent spread of the virulent strain of DMelSV able to overcome this resistance (see 1.3.9). This lag could be explained by the mutation only recently becoming frequent enough to select for viral counter adaptation, as it is at frequencies of ~24% in natural populations (Wilfert and Jiggins 2010) and is recessive, so only ~5% of flies will be homozygous and so resistant (Bangham et al. 2007).

Alternatively, the resistance allele could also be a polymorphism that is being maintained by negative frequency dependant selection. Negative frequency dependant selection can occur in a non-matching alleles scenario (which seems to be the case here; Bangham et al. 2008) if host resistance is costly (Agrawal and Lively 2002). There is no clear evidence for a cost to the ref(2)P resistance allele, although some possible mechanisms for a cost exist. Male flies suffer sterility if the gene is knocked-out (Dezelee et al. 1989) with sperm being non-motile and suffering mitochondrial degeneration, hinting it may be involved in gamete production and so it is possible the resistance allele may carry a fertility cost. Additionally ref(2)P is involved in the Toll pathway, which is part of Drosophila’s defence against gram positive bacteria and fungi (Lemaitre and Hoffmann 2007), suggesting resistance to sigma virus may trade-off with resistance to other parasites. However, a test for an association between bacterial resistance and ref(2)P resistance to DMelSV found no effect (Bangham et al. 2007).

19 Chapter 1: General Introduction The molecular function of ref(2)P DMelSV resistance is unknown, but based on known functions of ref(2)P there are a number of possible mechanisms. Firstly, as knocking out ref(2)P results in male sterility (Dezelee et al. 1989) it is likely involved in gamete formation, so it may prevent the virus entering the gametes. It is also involved in the toll pathway, which is involved in resistance to other pathogens, but is not known to affect DMelSV.

Perhaps the most interesting possible mechanism is the fact ref(2)P plays a role in autophagy (Nezis et al. 2008); the mammalian homolog of ref(2)P (p62) is a scaffold protein which binds to a protein from the autophagy pathway called Atg8 (Pankiv et al. 2007). Autophagy is the compartmentalisation and degradation of cellular components to recycle and reallocate nutrients, and can be involved in supporting viral replication (viruses may hijack the autophagy pathway to produce physical scaffolds or “virus factories” to support replication (Wileman 2006; Lahaye et al. 2009)). However, it can also have an antiviral role, with viral particles being isolated and degraded in a similar manner to cellular components (Levine and Deretic 2007). In D. melanogaster the role of autophagy in cell-culture infected with VSV (which is closely related to DMelSV) has been well-examined (Shelly et al. 2009). The knockout of autophagy genes (including Atg8) results in increased levels of VSV replication. VSV infection causes the formation of virally induced autophagic compartments (autophagosomes), and Atg8, which is known to correlate with the number of autophagosomes, is upregulated upon infection. The production of autophagosomes are induced by replication defective viral particles, suggesting it is the incoming virus that activates this rather than viral replication, and the VSV-G protein appears to be responsible for the induction. In adult flies, knocking out genes in the autophagy pathway results in higher viral titres and increased mortality, confirming the antiviral role of autophagy. Infection activates the pathway by decreasing PI3K/Akt signalling (which normally suppresses autophagy), leading to increased antiviral autophagy (Shelly et al. 2009).

Ref(2)P is also involved in the formation of ubiquitin protein aggregates in the brain of adult D. melanogaster ⎯ which are a characteristic feature of neurodegenerative

20 Chapter 1: General Introduction diseases ⎯ and autophagy is critical to prevent their accumulation (Nezis et al. 2008). In mammals the ref(2)P homolog p62 is thought to deliver these protein aggregates to autophagosomes for degradation (Bjorkoy et al. 2005). As DMelSV affects the nervous system causing paralysis on exposure to CO2, and like other rhabdoviruses (Lyles and Rupprecht 2007) is found at high levels in nervous tissue (L'Heritier 1948; Tsai et al. 2008; Ammar et al. 2009), it is possible ref(2)P may be involved in sigma virus infection of nervous tissue.

Ref(2)P has been reported to be associated with the polymerase associated protein (P gene) of DMelSV(Wyers et al. 1993), and has antigenic cross-reactivity with the nucleocapsid protein (N gene) (Wyers et al. 1993; Wyers et al. 1995). These associations are apparently specific to DMelSV, with proteins from another rhabdovirus (VSV) showing no interaction with ref(2)P (Wyers et al. 1993).

Genes other than ref(2)P are also involved in resistance to male transmission (Gay 1978; Bangham et al. 2008). However, it seems resistance through these other genes may not always involve impeding viral replication, as there is little correlation between resistance through injection and paternal transmission (in ref(2)P susceptible flies; (Bangham et al. 2008). This may be due to the virus having to invade and replicate to high levels in the germ line early in host development to be efficiently transmitted (Bregliano 1970; L'Heritier 1970). High viral titres in the soma (which inflate total viral titre) could therefore simply represent a failure to invade and replicate to high levels in the germ line.

1.3.8 Immune activation

While the immune responses to DMelSV have been examined in two studies (Tsai et al. 2008; Carpenter et al. 2009), there is no clear trend of immune genes or pathways being switched on or upregulated in DMelSV infected flies, suggesting that like other vertically transmitted symbionts (Bourtzis et al. 2000; Vasilakis and Weaver 2008) DMelSV may not induce an immune response. Carpenter et al (2009) observed effects of sigma virus infection on gene expression. These included down

21 Chapter 1: General Introduction regulation of genes involved in translation, a strategy related rhabdoviruses also use to allow preferential translation of viral mRNAs (Lyles and Rupprecht 2007). Interestingly, in females genes encoding chorion proteins were upregulated which, given DMelSV’s mode of vertical transmission through eggs, may be involved in the mechanism underlying this.

ADARs are editing enzymes that introduce mutations in double stranded RNA (changing adenosine to guanosine) and are expressed in the central nervous system where they edit host mRNAs, but have also been shown to edit viruses in vertebrates (reviewed in Carpenter et al. 2009). Similarly strains of DMelSV have been isolated with hyper-mutation characteristic of ADAR editing. It is thought this may act to disrupt viral gene function or may mark or tag them for degradation, possibly by the RNAi silencing complex (Carpenter et al. 2009).

It has recently been shown that the Drosophila RNAi pathway provides resistance to another rhabdovirus, VSV (Mueller et al. 2010). Whether sigma viruses induce an RNAi antiviral response is of particular interest, as genes in the RNAi pathways are amongst the most rapidly evolving in the Drosophila genome (Obbard et al. 2006; Obbard et al. 2009), suggesting they may be locked in coevolutionary arms races with their natural viral parasites (Obbard et al. 2009).

1.3.9 Coevolutionary dynamics

Rapid sweeps of viral isolates through host populations have been observed in DMelSV. The evolution of the resistant ref(2)P allele has led to reciprocal coevolution in DMelSV, leading to virulent viral isolates able to overcome this host resistance. This virulent strain rapidly swept through natural host populations in France and Germany over a period of ~10 years in the 1980’s, replacing the avirulent (ref(2)P susceptible) viral strain (Fleuriet and Sperlich 1992; Fleuriet and Periquet 1993). European isolates of DMelSV have been estimated to have shared a common ancestor ~200 years ago (Carpenter et al. 2007), suggesting the mutation that allowed the virus to overcome host resistance is very recent, as current DMelSV isolates

22 Chapter 1: General Introduction include both the virulent and a small proportion of avirulent strains (~16-27% L. Wilfert, personal communication). While most DMelSV strains fall into the recent clade described above, one isolate (AP30 collected in Florida) shows high levels of divergence from the other isolates (Ks, the number of substitutions per synonymous site, equals 0.4; Carpenter et al. 2007). It has been postulated this strain may be a recent acquisition from another host species, or may be the remnant from a past selective sweep (Carpenter et al. 2007).

Fleuriet (1988) attempted to examine this process in experimental populations of D. melanogaster containing only susceptible ref(2)P alleles and infected with the avirulent strain of DMelSV. If resistant ref(2)P alleles were introduced into the population a decrease in the frequency of infected hosts was observed, and the resistant ref(2)P allele frequency increased to levels similar to that observed in natural populations (~0.3). In an analogous second population the virulent strain was introduced into the population. An increase in the frequency of infected flies was observed, with the avirulent viral type being replaced by the infective viral type, along with a decrease in resistant ref(2)P allele frequency (Fleuriet 1988). It has been suggested this may be due to a cost of resistance as in the presence of the infective viral type (reducing the benefit of resistance), the ref(2)P resistance allele decreases in frequency. However, these experiments were founded with only a few flies, resulting in linkage-disequilibrium between ref(2)P and surrounding sites, which is made worse by the fact is ref(2)P in a region of low recombination, meaning the results may not be due to ref(2)P.

It has also been reported that there is a cost for the virus to be infective and overcome host resistance (Fleuriet 1999). Fleuriet (1999) attempted to examine this by measuring the frequency of competing infective and avirulent viral types in populations of flies lacking of the ref(2)P resistance allele. It was reported that avirulent viral types had a slight advantage to infective viral types in some cases, with a ~20% reduction in male transmission in the virulent strain (Fleuriet 1999). However, as this study used only one or two isolates of each of the infective and

23 Chapter 1: General Introduction avirulent viruses, the effect may be due to linkage disequilibrium with other mutations that affect viral fitness.

24 Chapter 2: Sigma virus discovery and phylogeny Chapter 2: Sigma virus discovery and phylogeny

This chapter has been published as:

LONGDON, B., OBBARD, D. J. & JIGGINS, F. M. 2010. Sigma viruses from three species of Drosophila form a major new clade in the rhabdovirus phylogeny. Proceedings of the Royal Society B, 277, 35-44. http://dx.doi.org/10.1098/rspb.2009.1472

It was written in collaboration with Darren Obbard and Frank Jiggins.

Summary

The sigma virus (DMelSV), which is a natural pathogen of Drosophila melanogaster, is the only arthropod-specific rhabdovirus that has been described. We have discovered two new rhabdoviruses in Drosophila obscura and Drosophila affinis, which we have named DObsSV and DAffSV respectively. We sequenced the complete genomes of DObsSV and DMelSV, and the L gene from DAffSV. Combining these data with sequences from a wide range of other rhabdoviruses, we found that the three sigma viruses form a distinct clade which is a sister group to the Dimarhabdovirus supergroup, and the high levels of divergence between these viruses suggest that they deserve to be recognised as a new genus. Furthermore, our analysis produced the most robustly supported phylogeny of the Rhabdoviridae to date, allowing us to reconstruct the major transitions that have occurred during the evolution of the family. Our data suggest that the bias towards research into plants and vertebrates means that much of the diversity of rhabdoviruses has been missed, and rhabdoviruses may be common pathogens of insects.

25 Chapter 2: Sigma virus discovery and phylogeny 2.1 Introduction

Rhabdoviruses are single stranded negative sense RNA viruses in the order Mononegavirales. The family is diverse and has a wide host range, infecting plants, invertebrates and vertebrates (ICTVdB 2006). Rhabdoviruses were originally classified as a family based on their shared bullet-shaped morphology and on serological evidence, but genome sequencing has since confirmed their shared ancestry (Fu 2005). The rhabdoviruses are divided into six genera. The genus Lyssavirus infects a range of mammals and includes the Rabies virus. The genera Cytorhabdovirus and Nucleorhabdovirus are arthropod-vectored and infect plants, while the genus Novirhabdovirus infects various species of fish. Members of the genera Vesiculovirus and Ephemerovirus infect a wide range of animals including fish, invertebrates and mammals, and together form the dimarhabdovirus super group (Bourhy et al. 2005). A large proportion of the known dimarhabdoviruses have been isolated from vertebrates and arthropods, which are thought to vector them.

The full diversity of the rhabdovirus family is unknown due to a strong sampling bias towards lineages of agronomic and medical importance (Fu 2005; Ammar et al. 2009). One area of neglect is the study of rhabdoviruses in arthropod hosts. As the majority of known dimarhabdoviruses, cytorhabdoviruses and nucleorhabdoviruses are arthropod- (often insect-) vectored, by studying arthropod-specific rhabdoviruses we may be able to understand how and why these viruses evolved traits such as virulence towards vertebrates.

The only arthropod-specific rhabdovirus that has been described to-date is the sigma virus (DMelSV), which is a natural pathogen of Drosophila melanogaster (L'Heritier and Teissier 1937; Contamine and Gaumer 2008). Sigma has an unusual mode of transmission, in that it is only transmitted vertically (through both eggs and sperm), and does not move horizontally between hosts. It was initially placed in the Rhabdoviridae based on its bullet shaped viral particles (Berkalof et al. 1965; Teninges 1968), and this has subsequently been confirmed using sequence data (Bjorklund et al. 1996). However, only about half of DMelSV’s ~12.7kb genome has previously been sequenced (Teninges et al. 1993), and the full sequence of the L

26 Chapter 2: Sigma virus discovery and phylogeny gene — which encodes the RNA dependant RNA polymerase (RDRP) — is unknown (Huszar and Imler 2008). This has hampered phylogenetic analyses of DMelSV because the L gene contains conserved domains that are useful in determining the evolutionary relationships between distantly related viruses (Poch 1989; Poch et al. 1990; Bourhy et al. 2005). Previous phylogenies that have included DMelSV have been based on the less conserved N gene, but many have lacked strong statistical support or only included a few closely related viruses. This may explain why the different studies have found conflicting results, either placing DMelSV as a sister group to the vesiculoviruses, or as an out-group to the ephemeroviruses and vesiculoviruses (Bjorklund et al. 1996; Hogenhout et al. 2003; Fu 2005; Kuzmin et al. 2006)

It is possible that rhabdoviruses may be common pathogens in insect populations. Flies infected with DMelSV become paralysed or die on exposure to high concentrations of CO2 , whereas uninfected flies recover, and similar symptoms occur when other rhabdoviruses are injected into mosquitoes or Drosophila (Rosen 1980; Shroyer and Rosen 1983). It has also been noted aphids have reduced longevity after CO2 exposure following rhabdovirus injection (Sylvester and

Richardson 1992). There have been reports of CO2 sensitivity occurring in at least fifteen other species of Drosophila (Brun and Plus 1980) and in Culex mosquitoes (Shroyer and Rosen 1983), suggesting that rhabdoviruses may be common in insects.

The most extensive of these studies looked at CO2 sensitivity in Drosophila affinis and Drosophila athabasca, and found the sensitivity was caused by a vertically transmitted infectious agent (Williamson 1959; Williamson 1961). However, it was not known if this agent is a rhabdovirus, as other viruses (e.g. DXV which was isolated from cell culture) can also cause sensitivity to anoxia in Drosophila (Teninges et al. 1979).

In this study we have identified two new rhabdoviruses associated with CO2 sensitivity in Drosophila obscura and D. affinis. To see where these new Drosophila rhabdoviruses are placed within the phylogeny, we sequenced the L gene from all viruses. Additionally we have completed the genome sequence of DMelSV and the new virus in Drosophila obscura. The L gene of these viruses was combined with all

27 Chapter 2: Sigma virus discovery and phylogeny the rhabdoviruses L gene sequences available from public databases to produce the most comprehensive phylogeny of the Rhabdoviridae published to date.

2.2 Methods

Identifying and sequencing viruses

Drosophila affinis were collected from Raleigh NC, USA and D. obscura were collected from Essex, UK in the summer/autumn of 2007. Flies were collected by netting from yeasted fruit baits, and isofemale lines were created by placing a single female in a vial of Drosophila medium and allowing them to lay eggs. Offspring 0 were then exposed to pure CO2 for 15mins at 12 C, then placed at room temperature and examined 30mins later. The lines where the flies were dead or paralysed were used for RNA extractions. CO2 sensitive lines were stabilised (Brun and Plus 1980) by selecting female offspring which transmitted the virus to 100% of their offspring and maintained in the laboratory for over 15 generations. RNA was also extracted from two lines of D.melanogaster infected with the Hap23 and Ap30 isolates of DMelSV (Gay 1978; Carpenter et al. 2007). Ap30 is known to be genetically distinct from all the other DMelSV isolates that have been sequenced. Total RNA was extracted using Trizol reagent (Invitrogen Corp, San Diego) in a chloroform- isoproponal extraction. RNA was then reverse transcribed with MMLV reverse transcriptase (Invitrogen Corp, San Diego) using random hexamer primers.

The L gene of rhabdoviruses contains highly conserved domains (Poch 1989; Poch et al. 1990), and is the most conserved gene in rhabdoviruses and other non-segmented negative sense RNA viruses (Fu 2005). This conservation is useful in designing PCR primers which will work on a range of rhabdoviruses. Rhabdovirus L gene sequences were downloaded from Genbank and were aligned (as amino acids) using ClustalW. We manually designed degenerate primers that are conserved across most of the dimarhabdoviruses (table S1). PCR reactions with all primer combinations were carried out using a touchdown PCR cycle. PCR products were treated with exonuclease 1 and shrimp alkaline phosphatase to remove unused PCR primers and

28 Chapter 2: Sigma virus discovery and phylogeny dNTPs, and then sequenced directly using BigDye reagents (ABI, Carlsbad California) on an ABI capillary sequencer. A sequence's similarity to rhabdoviruses was confirmed using a tBLASTN search of Genbank.

Once a small region of the L gene had been sequenced, 3' RACE (rapid amplification of cDNA ends) was used to reach the 3' end of the L gene mRNA. RNA was reverse transcribed using superscript (Invitrogen Corp, San Diego) and a T linker primer (5’- GATCGAT[17]VN -3’). Products were then purified using a PCR purification column kit (Qiagen Corp, Maryland), and concentrated to a volume of 10-20 µl in a rotoevaporator. A PCR reaction (Long Range PCR kit, Invitrogen Corp, San Diego) was carried out using 2 µl of the cDNA using a T-linker primer and a gene specific forward primer. In some cases a nested PCR was required on the first PCR (which was diluted 1:10 first). Products were then sequenced by primer walking and sequences were assembled using Sequencher (version 4.5/4.8; Gene Codes Corp).

To obtain the remainder of the L gene and to attempt to obtain the rest of the genome, 3’ RACE was carried out on the viral genome itself. A polyA tail was added to the 3’ end of the virus using polyA polymerase (PAP). Approximately 5 µg of total RNA, 4units (0.8µl) PAP (New England Biolabs), 2 µl 10x PAP buffer, 2 µl rATP (10mM) (Promega corporation) and RNase free water to 20 µl, was incubated at 370C for 40mins. The RNA was then purified using a spin column kit (Zymo clean, Cambridge Biosciences). The eluted RNA was then reverse transcribed using superscript (Invitrogen Corp, San Diego) and a T linker primer. A PCR reaction was carried out using 2 µl of the cDNA using a T-linker and a gene specific primer (Long Range PCR kit, Invitrogen Corp, San Diego). In some cases a nested PCR was required on the first PCR (which was diluted 1:10 first). Products were then sequenced by primer walking using the methods as described previously. Although most of the 3’ end of the DMelSV genome has already been sequenced, the 3’ leader sequence is unknown. Therefore, we also use this approach to acquire the DMelSV leader sequence.

To obtain the 5’ genomic trailer sequence, and to determine the N gene transcription initiation site 5’ RACE was used. For the 5’ RACE on the viral genome, a gene

29 Chapter 2: Sigma virus discovery and phylogeny specific primer was used for a reverse transcription reaction using superscript RT (Invitrogen Corp, San Diego), whereas for the 5’ RACE on mRNA sequences a T- linker primer was used. Two 5’ RACE methods were used. In the first 1µl of BSA (20x) and 1µl of Manganese (20x) were added to the reverse transcription reaction. 20µl of the cDNA was then incubated overnight at 160C with 5 µl buffer 2 (New England Biolabs), 6µl dNTPs (2mM), 1µl klenow enzyme (5000U/ µl) (New England Biolabs),1 µl (50 µM) TS-short primer (5'- GGTCTGGAGCTAGTGTTGTGGG-3') and 17 µl water. This was then purified in a spin column PCR purification kit (Qiagen Corp, Maryland, USA), and was used with a gene specific primer and the TS short primer for PCR amplification. In the second method the cDNA was first purified using a spin column purification kit (Qiagen Corp, Maryland, USA). A polyA tail was added to the cDNA by incubating 21.5µl of the purified cDNA with 1µl of terminal transferase (30U/µl ) (Promega Corp), 6µl 5x terminal transferase buffer and 1.5 µl of dATP (2mM) at 370C for 40min, then at 700C for 10min. This was then used for PCRs with a gene specific primer and a T- linker primer. In both methods nested PCRs on the first round of PCRs was required, after diluting the samples 1:10.

Once the initial sequences from the RACE were obtained, new primers were designed along the length of the gene. These were used for PCR reactions on random-hexamer reverse transcribed cDNA, which were then sequenced in both directions (see above) to obtain high quality sequence data. The Genbank accession numbers for our new sequences are GQ375258 (DMelSV- HAP23), AM689309 (DMelSV- Ap30), GQ410979 (DObsSV) and GQ410980 (DAffSV).

Phylogenetic analysis

To infer the phylogeny, we obtained all available full length L-gene sequences from Genbank. L-gene coding sequences and the three sigma virus L gene sequences were aligned as translated amino acid sequences using ClustalW. As some of the sequences are highly divergent, we employed three different approaches for aligning the sequences to ensure our results were robust and not sensitive to the alignment used. Firstly, we aligned full length L gene sequences from all of the viruses, then

30 Chapter 2: Sigma virus discovery and phylogeny the most conserved region of the L gene from all of the viruses (corresponding to nucleotides 1284-3862 of rabies virus L gene coding region- Genbank accession NC_001542 ), and finally the full L-gene sequences of the dimarhabdoviruses (with three lyssaviruses as an outgroup). The conserved region and the alignments of the dimarhabdoviruses alignments are likely to be the most robust, as they do not include very different sequences that are hard to align. Human parainfluenza virus 1 was also included in the first two of these alignments as an out-group to root the tree.

The phylogeny of these sequences was reconstructed using both a Bayesian and a maximum likelihood approach. Bayesian posterior support values are less conservative than maximum likelihood bootstrap support, and so both values can be used as an upper and lower support for nodes (Douady et al. 2003). In addition to nucleotide models, we ran Bayesian analysis with amino acid sequences and models of protein evolution. In total we carried out nine analyses, using different three different methods of inference and three different sequence alignments.

Phylogenies were created from the amino acid alignments translated back into nucleotides. For the maximum likelihood trees, Model test (v3.7) (Posada and Crandall 1998) was used to estimate the model of sequence evolution and the analysis was run in PAUP (v4.0b10) (Swofford 1993). A parsimony tree created from tree bisection and reconnection with a heuristic search was used as a starting tree for the maximum likelihood analysis. A general time reversible model with a gamma distribution of rate variation and proportion of invariable sites was used. The maximum likelihood analysis used a heuristic search with a nearest neighbour interchange algorithm. The substitution rate parameters, shape of the gamma distribution and proportion of invariable sites used were those estimated by Model test. Support for the nodes was calculated by bootstrapping and trees were drawn using FigTree (v1.2; http://tree.bio.ed.ac.uk/software/figtree/).

Bayesian trees were created using the MrBayes programme (v3.1.2) (Huelsenbeck and Ronquist 2001). A general time reversible model was used with a gamma distribution and a proportion of invariable sites, with parameters estimated from the data during the analysis. As there is likely to be a considerable amount of noise from

31 Chapter 2: Sigma virus discovery and phylogeny third codon positions between these divergent sequences, a site specific rate model was used allowing each codon position to have its own rate. Two runs of four chains were run for 2,000,000 MCMC generations (20,000,000 for the conserved region tree), with trees being sampled every 100 generations.

Additionally, Bayesian amino acid trees were created using the MrBayes programme (v3.1.2) (Huelsenbeck and Ronquist 2001). A fixed rate model of protein evolution was assumed, and the phylogeny was reconstructed using a model jumping method. This allows switching between different models of amino acid substitution during the MCMC process, and all the models contribute to the final result and are weighted according to their posterior probability. A gamma distribution of rate variation among sites was used, with the shape estimated from the data. Two runs of four chains were run for 5,000,000 MCMC generations (1,000,000 for the dimarhabdovirus alignment tree), with trees being sampled every 100 generations.

The average standard deviation of split frequencies between the two runs approaching zero, and the log likelihood values of the cold chain becoming stable, were used to assess when to stop the run. The first 25% of the trees were discarded to ensure the chains had reached stationarity, and a consensus tree was created from the remaining trees. Figures were created using FigTree(v1.2) (http://tree.bio.ed.ac.uk/software/figtree/).

To compare the topology of the sigma virus phylogeny with the Drosophila phylogeny, we reconstructed the maximum likelihood phylogeny of Dimarhabdoviruses using the full length L gene alignment under the constraint that the sigma virus phylogeny follows that of the hosts (i.e. D. affinis and D. obscura form a monophyletic group). We then tested whether the likelihood of the constrained tree was significantly less than the unconstrained tree using a Shimodaira-Hasegawa test (SH test) in PAUP using the maximum likelihood dimarhabodvirus trees (Shimodaira and Hasegawa 1999).

Accession numbers for the sequences used in the phylogenetic analysis are available as supplementary materials.

32 Chapter 2: Sigma virus discovery and phylogeny

2.3 Results

Identification of two new Drosophila sigma viruses

In samples from wild populations, we detected 1 line of D. affinis from Raleigh NC, U.S. and 2 lines of D. obscura from Essex, U.K that were paralysed or died after exposure to CO2. To test whether these lines were infected with a rhabdovirus, we created cDNA from the flies and attempted to amplify a region of the RDRP gene using PCR primers designed in conserved sequences. The CO2 sensitive lines all produced a PCR product, which was sequenced and confirmed to be rhabdovirus-like by BLAST searches. These viruses were tentatively named as Drosophila affinis sigma virus (DAffSV) and Drosophila obscura sigma virus (DObsSV).

Genome sequences

We next attempted to sequence the genomes of these newly discovered viruses and the D. melanogaster sigma virus (DMelSV). Our strategy was to first use the short sequences produced with the conserved primers as the basis for 3' RACE on both the L gene mRNA and the negative sense genome, and then to sequence the trailer sequence using 5’RACE. This allowed us to completely sequence the genome of one DMelSV isolate, and to sequence all of the genome except the short 3’ leader and 5’ trailer sequences from a second DMelSV isolate. We sequenced the whole genome except the short 5’ trailer sequence for one DObsSV isolate. We also sequenced the entire L gene of DAffSV, but were unable to retrieve the remainder of the genome, possibly due to a poly-A region causing mispriming of the T-linker primer.

DObsSV genome

The genome of DObsSV (excluding the 5' trailer) is 12676 base pairs long. There are six open reading frames which, based on their predicted protein sequence and gene order, appear to be homologous to the N-P-X-M-G-L genes in DMelSV (Figure 2.1).

33 Chapter 2: Sigma virus discovery and phylogeny The N, P and X genes are in reading frame one, the M and the L genes are in frame two and the G gene is in frame three.

To annotate the coding sequence, we have assumed that each open reading frame starts at the first AUG occurring after the previous transcription termination sequence, and continues to the first stop codon. These coding regions make up 96% of the genome, with the L gene covering 51% of the total genome (Figure 2.1). A tBLASTn search of the NCBI nucleotide collection using the predicted protein sequences of these genes returned significant alignments (blast alignment scores over 80) with homologous genes from other rhabdoviruses for the N, G and L genes. The M gene (which is thought to encode the matrix protein in DMelSV) had a weakly significant alignment to Flanders virus M gene and no significant alignments were found with the P or X genes, which are the least conserved genes in the genome.

To predict the structure and function of the P and X, we used PHYRE (Kelley and Sternberg 2009), which compares the query sequence with proteins of known structure and function, Interproscan, which searches for protein signatures in the InterPro database (Zdobnov and Apweiler 2001), and SignalP, which predicts signal peptides (Bendtsen et al. 2004). The X gene contains a signal peptide (SignalP: P=0.98), but no predicted transmembrane regions, and has regions that are similar to viron RNA polymerases (PHYRE: 90% estimated precision), as has been reported for its homolog in DMelSV (Landesdevauchelle et al. 1995). However, it also shares similarities to topoisomerases, signal proteins and toxin molecules. We identified structures in the P gene as a viral RNA polymerase (PHYRE: 85% estimated precision).

In the non-coding regions, the motif 3’-GGUACUUUUUUU-5’ is found after all of the first five open reading frames in the genome. Based on its homology to other rhabdoviruses it is likely that it acts as the transcription termination sequence, and the seven U residues trigger polyadenylation of mRNAs (Huszar and Imler 2008). At the 3’ end of the genome there is a 3’ leader sequence of 99 bases before the first ATG. 5’ RACE on viral mRNAs failed, possibly due to the T-linker primer

34 Chapter 2: Sigma virus discovery and phylogeny annealing to polyA regions in the positive sense genomic strand, meaning we were unable to confirm the transcription initiation sequence.

DMelSV genome

The genome of DMelSV is 12625 base pairs long, the first five genes of which have been published previously (Teninges et al. 1993; Bras et al. 1994; Landesdevauchelle et al. 1995). The newly sequenced L gene open reading frame is 6389 bases long and compromises 51% of the total genome (Figure 2.1). By using RACE to confirm the sequence at the of the start of the N gene and end of the L gene, we were able to annotate the 54 base pair 3’ leader sequence and 180 base pair trailer sequence. In DMelSV, the transcription initiation site is 3’- GUUGUNG -5’ (Teninges et al. 1993) for all the genes bar the N, where it is 3’- UUGUUG -5’. The transcription initiation site occurs shortly after the previous transcription termination signal, with the exception of the M/G gene junction where the M gene and G gene mRNAs overlap by 33 bases (Teninges et al. 1993). The protein coding regions, however, do not overlap. Additionally we found the G gene to be 21 amino acids longer than described in its original genbank annotation due to what was possibly a sequencing error causing a false stop codon (accession number X91062) (Landesdevauchelle et al. 1995). In sequencing the 5’ trailer region and comparing this to the L gene mRNA we found the same transcription termination sequence (3’- GUACUUUUUUU -5’) as previously reported (Teninges et al. 1993), at the end of the L gene.

DAffSV L gene

We sequenced the entire L gene and the 5’ trailer of DAffSV (Figure 2.1). The predicted protein coding sequence of the L gene has significant tblastn alignments to other Rhabdovirus L genes. The transcription termination sequence is the same as DMelSV (3’-GAUCUUUUUUU-5’) based on comparing where the L gene mRNA terminates (sequenced by 3’ RACE on the mRNA) with the genome sequence (sequenced by 5’ RACE on genomic RNA).

35 Chapter 2: Sigma virus discovery and phylogeny Sequence conservation

The amount of protein sequence divergence between the three sigma viruses is very similar, suggesting that they all diverged at a similar time (table S2). However, the different genes in the genome have very different levels of amino acid sequence conservation (table S2), with the L gene being the most conserved, and the P, X and M genes the least conserved.

There is a high level of amino acid sequence divergence between the three Drosophila sigma viruses (table S3). Comparing the amino acid sequences of the L genes of the sigma viruses with those from related clades, we see that DMelSV, DObsSV and DAffSV share only a slightly higher sequence identity to one another than they do to viruses in a range of different rhabdovirus genera (table S3). Furthermore, the amino acid sequence divergence between the three sigma viruses is only slightly less than that seen when rhabdoviruses in different genera are compared, and is similar to the maximum divergence seen between rhabdoviruses in the same genus.

Phylogeny of the Rhabdoviridae

The phylogeny of the Rhabdoviridae was reconstructed from both full length and conserved regions of L gene sequence. The three sigma viruses form a well supported monophyletic group that is distinct from the other rhabdoviruses (Figures 2.2 and 2.3). As was seen in the analysis of sequence identity, the divergence between the viruses is substantial, and similar to that between the most divergent members of some genera. Therefore, these viruses constitute a major new group of rhabdoviruses.

Although the sigma viruses form a well supported monophyletic clade, the relationships between the three viruses are uncertain. While the Bayesian analysis gives significant posterior support for relationships shown in figure 2.3, the more conservative maximum likelihood bootstrapping does not support this topology (Figure 2.3). As DMelSV is vertically transmitted, we were interested in whether the

36 Chapter 2: Sigma virus discovery and phylogeny nt the stop codon. as gene 3 in some some in gene 3 as The sigma virus genomes. Numbers below show the position of the start codon, numbers above represe the show numbers position codon, above Numbers the sigma below of virus The start genomes.

Figure 2.1 the Dotted unable for tothe the G lines DMelSV represent we In M were and parts mRNA genes sequence. transcripts the of genome not been also referred has the tooverlapoverlap. Note X that frames by butopen gene reading do 33bps, the literature.

37 Chapter 2: Sigma virus discovery and phylogeny topology of the virus phylogeny differs from that of the host, which would indicate that the virus has switched hosts during its evolution rather than co-speciating with them. However, when we forced the topology of the sigma virus phylogeny to match the host phylogeny, there was no significant reduction in the likelihood of the tree (SH test: P=0.173, difference in log likelihood = 5.24). Therefore, we are unable to reject the hypothesis that the host and viral tree topologies are the same.

Our analysis produced a robust and well supported phylogeny of the rhabdoviruses (Figure 2.2). The rhabdoviruses contain two major clades, with the fish-infecting novirhabdoviruses forming a clade basal to all the other genera. In the other group, the arthropod vectored plant viruses (cytorhabdoviruses and nucleorhabdoviruses) form a clade that is a sister group to the lyssaviruses, sigma viruses and the dimarhabdovirus supergroup. The dimarhabdovirus group (Figure 2.3) contains the vesiculoviruses, the ephemeroviruses and some other viruses which are unassigned or have only tentatively been placed to this group (ICTVdB 2006). The sigma virus clade forms a sister group to all the other dimarhabdoviruses.

To assess whether our results are sensitive to the sequence alignment or method of phylogenetic reconstruction, we produced a total of six Bayesian trees and three maximum likelihood trees (see 2.2 Methods). There was greater resolution in the Bayesian nucleotide trees with rate variation between codon positions hence we presented these trees in the figures. However, when different methods of analysis were used, similar tree topologies were inferred. Furthermore, the conserved region alignment (Figure S1) and full length sequence alignment (Figure 2.2) lead to the same general conclusions. In addition to the uncertain relationships among the sigma viruses, there are other minor inconsistencies between the different trees. In the lyssavirus genus, depending on the method and alignment used, the branching order of the clade containing Arravan, Khujand and Rabies viruses switched positions, possibly due to the very short branch lengths in this group (Kuzmin et al. 2006). Also the branching order between Taro vein chlorosis virus, Iranian maize mosaic and Maize mosaic virus was sensitive to the method and alignment used.

38 Chapter 2: Sigma virus discovery and phylogeny

,

uses; uses; , ir , have but ngabel ngabel virus from full length elihood elihood The rooted tree The is Phylogeny of Phylogeny the based on based the , The following viruses followingThe viruses aize aize fine streak virus, Figure 2.2 Rhabdviridae of the nucleotide sequences L gene. tree The was reconstructed using the method, Bayeasian node are labels posterior supports, and in labels are brackets maximum lik bootstrap supports (500 replicates). with human parainfluenza virus 1. rhabdovir are unassigned Flanders virus,Iranian maize virus,mosaic flounder Starry rhabodivurs, chuatsi Siniperca rhabdovirus, Wo Taro viruschlorosis vein M Lettuce yellow virus, mottle Orchid fleck virus been included indifferent the genera forillustrative purposes position in the phylogeny.

39 Chapter 2: Sigma virus discovery and phylogeny

cleotide cleotide

.

)

Bayesian nu Bayesian

e is rooted e with is rooted three

Figure 2.3 tree using full length sequence alignments for the dimarhabdovirus and sigma virus clades. Node are labels posterior supports, in labels are brackets bootstrap supports taken from the maximum tree likelihood with 1000 The bootstrap replicates. tre bat (West Caucasian lyssaviruses virus, 86101RCA virus Mokola virus and Rabies

40 Chapter 2: Sigma virus discovery and phylogeny 2.4 Discussion

Drosophila sigma viruses

We have discovered two new rhabdoviruses in D. affinis and D. obscura, which, together with DMelSV, brings the total number of insect-restricted rhabdoviruses to three. We sequenced the complete genomes of two of these viruses and partial genome of the third, and found that they form a major new clade on the rhabdovirus phylogeny (Figure 2.2; see also (Hogenhout et al. 2003; Kuzmin et al. 2006). This new clade does not fit into any existing genera as it is a sister group to the dimarhabdoviruses, which itself contains two genera. Additionally, the divergence between these viruses is greater than that seen within four of the six previously classified genera. We therefore suggest these three Drosophila viruses be regarded as a new genus.

As DMelSV, and probably the other sigma viruses, are vertically transmitted, it is interesting to ask whether the sigma viruses have cospeciated with their hosts or have moved horizontally between species during their evolution. If parasites have moved between hosts, this can result in incongruence between host and parasite phylogenies. However, the three sigma viruses all diverged from their common ancestor at a similar time, and we are unable to tell whether or not the viral phylogeny matches the host phylogeny. Despite this, it seems likely that the viruses have switched between hosts due to the length of branches on the tree. If these viruses had cospeciated with their hosts, we would expect the DAffSV and DObsSV to be much more closely related to one another than to DMelSV, given that D. obscura and D. affinis diverged from each other ~15-18mya and from D. melanogaster ~30-35mya (Gao et al. 2007). This is not the case, as the viruses all shared a common ancestor at a similar time, suggesting that horizontal transfer has occurred. As these sigma viruses are all probably vertically transmitted, it is not clear how they could move between species. One possibility is that they can be vectored by the parasitic mites that feed on Drosophila. These mites have been shown to vector Spiroplasma bacteria (Jaenike et al. 2007) and are suspected to transfer transposable elements between species of Drosophila (Loreto et al. 2008).

41 Chapter 2: Sigma virus discovery and phylogeny Furthermore, we have found DObsSV in mites removed from wild caught flies (Ben Longdon, unpublished data), although we would highlight it is not known if the virus replicates in the mites or if they can transmit sigma horizontally.

It is possible rhabdoviruses may be common parasites of insects. DAffSV was discovered in D. affinis, where there had been previous reports of CO2 sensitivity

(Williamson 1961) and CO2 sensitivity has been described in fifteen other species of Drosophila (Brun and Plus 1980). Therefore, our results suggest that many of these species may also be infected (although other viruses may cause flies to die in anoxic conditions; see introduction (2.1) and Teninges et al. 1979). The second of these new viruses was found in D. obscura where CO2 sensitivity had not previously been reported. Given that we performed only a limited sampling of a few species, this also suggests that there may be many other insect rhabdoviruses waiting to be discovered.

Although the new sigma virus isolates are anciently divergent from other rhabdoviruses, their genomes are typical of the family. The genomes of DObsSV and DMelSV are similar, both containing six open reading frames, which correspond to the N-P-X-M-G-L genes (3’-5’). Based on sequence conservation and from analysis of predicted proteins, five of the six genes are homologous to genes found in other rhabdoviruses and probably have similar functions. In contrast, the X genes in DObsSV and DMelSV share no detectable sequence similarity either to each other or to other rhabdovirus genes. Although both X genes encode proteins with a signal peptide and domains similar to viral RNA polymerases, their function remains a matter for speculation. Interestingly, the cytorhabdoviruses, the nucleorhabdoviruses, and the Wongabel and Flanders viruses all contain at least one gene between the P and M genes, some of which are similar sizes, raising the possibility that these genes may be orthologous to the X gene, but have diverged to such an extent that there is no detectable sequence similarity between them.

Rhabdovirus phylogeny

Our analysis has produced the most robustly supported phylogeny of the Rhabdoviridae to date, with the members of the various genera forming distinct, well

42 Chapter 2: Sigma virus discovery and phylogeny supported clades. By using conserved L gene sequence, we are able to root our trees using human parainfluenza virus 1, which allows us to examine the branching order at the base of the tree for the first time. Furthermore, in previous phylogenetic analyses the boundaries between genera in the dimarhabdovirus super group have been unclear (Bourhy et al. 2005). This has been resolved in our analyses, which has well supported fine scale resolution within this clade.

It has been suggested that the N gene should be used to get fine scale resolution (Kuzmin et al. 2006). However, we have found that the more rapidly evolving regions of the L gene coupled to its large size (~6kb) provides a much greater phylogenetic resolution than the N gene even when looking at closely related viruses. Furthermore, using sequence from less conserved regions such as the N gene can result in inaccurate sequence alignments, which in turn can result in an incorrect phylogeny (Ogden and Rosenberg 2006). This may explain why previous analyses have sometimes placed DMelSV in very different places in the rhabdoviruses tree (Hogenhout et al. 2003; Kuzmin et al. 2006; Kuzmin et al. 2009). In addition, using the L gene has the benefit of allowing rapid detection of novel rhabdoviruses, by using a diagnostic PCR with conserved degenerate primers.

It is striking how viruses which infect similar hosts have a strong tendency to cluster together on the phylogeny, indicating that it is rare for rhabdoviruses to switch between distantly related hosts. Although it has been suggested that the ancestor of the Rhabdoviridae may have infected insects (Hogenhout et al. 2003), our results suggest that it may be premature to draw any conclusions on the origin of the group, as there appear to be two equally parsimonious models. Specifically, because the fish-infecting novirhabdoviruses are sister to all other groups, it is possible either that (1) the common ancestor of the rhabdovirus infected arthropods (insects or other crustaceans) and switched to fish on the lineage leading to the novirhabdoviruses, or (2), the common ancestor infected fish and switched to arthropods on the lineage leading to the other six clades. Perhaps more importantly, because there are likely to be many undiscovered rhabdoviruses in many different groups of hosts, any inferences about the ancestral ecology of this group would be extremely tentative at best. Nevertheless, it is likely that the ancestor of six of the seven major clades (all

43 Chapter 2: Sigma virus discovery and phylogeny rhabdoviruses other than the novirhabdoviruses) infected arthropods, as these clades (with the exception of the lyssaviruses) include viruses which infect arthropods.

There have been a number of transitions between host taxa. There is evidence for three switches between aquatic and terrestrial habitats — one between the fish infecting novirhabdoviruses and the terrestrial viruses, and two within the dimarhabdoviruses — and a single transition to infect plants in the cytorhabdoviruses and nucleorhabdoviruses. In the clade containing the sigma viruses, dimarhabdoviruses and lyssaviruses, there has either been two events leading to these viruses gaining the ability to infect vertebrates, or one gain followed by a loss in the sigma virus clade. The incomplete sampling of rhabdoviruses means that there may be many more major host switches to be discovered.

In our phylogeny, the sigma viruses are a sister group to the dimarhabdoviruses, which suggests that the common ancestor of this group infected arthropods. Support for this argument comes from the ability of vesicular stomatitis virus to replicate in a range of insects, including sand flies, black flies, Drosophila, leafhoppers and moths (Tesh et al. 1972; Lastra and Esparza 1976; Rosen 1980; Mead et al. 2004). In addition, like the sigma virus, this virus can be transmitted transovarially in sandflies (Tesh et al. 1972). It is even possible that all the dimarhabdoviruses may infect arthropods, including the fish viruses that are usually assumed to be vertebrate- specific. For example, a virus 99% identical to spring viraemia of carp has been found in penaeid shrimps (Johnson et al. 1999), we have found that EST libraries from fish lice contain rhabdovirus like sequences (B. Longdon, unpublished observation), and spring viraemia of carp can be vectored by sea lice in the lab (Ahne et al. 2002)

Conclusions

From a quick survey of Drosophila we have found two new viruses, which together with DMelSV form a major new clade in the Rhabdoviridae. It is a possible that there is a great deal of diversity in this family yet to be discovered, and a more extensive survey for new rhabdoviruses may uncover viruses from a wide diversity

44 Chapter 2: Sigma virus discovery and phylogeny host taxa and further our understanding of the relationships amongst the Rhabdoviridae.

Acknowledgments

Many thanks to Beth Ruedi and members of the Mackay lab for help on fieldwork. Thanks to Roberta Bergero for providing the RACE methods and Lena Bayer- Wilfert and Mike Magwire for help and advice. We thank two anonymous reviewers for their useful comments on the manuscript.

45 Chapter 3: transmission and a recent sweep Chapter 3: Vertical transmission and a recent sweep

This chapter has been published as:

LONGDON, B., WILFERT, L., OBBARD, D. J. & JIGGINS, F. M. 2011. Rhabdoviruses in two species of Drosophila: vertical transmission and a recent sweep. Genetics, Advance online. http://dx.doi.org/10.1534/genetics.111.127696

It was written in collaboration with Lena Wilfert, Darren Obbard and Frank Jiggins. The mathematical model was created by Lena Wilfert and Frank Jiggins, who adapted a model from DMelSV for this purpose (Wilfert and Jiggins submitted).

Summary

Insects are host to a diverse range of vertically transmitted micro-organisms, but while their bacterial symbionts are well-studied, little is known about their vertically transmitted viruses. We have found that two sigma viruses (Rhabdoviridae) recently discovered in Drosophila affinis and Drosophila obscura are both vertically transmitted. As is the case for the sigma virus of Drosophila melanogaster, we find that both males and females can transmit these viruses to their offspring. Males transmit lower viral titres through sperm than females transmit through eggs, and a lower proportion of their offspring become infected. In natural populations of D. obscura in the UK we found that 39% of flies were infected and the viral population shows clear evidence of a recent expansion, with extremely low genetic diversity and a large excess of rare polymorphisms. Using sequence data we estimate that the virus has swept across the UK within the last ~11 years, during which time the viral population size doubled approximately every 9 months. Using simulations based on our lab estimates of transmission rates, we show that the biparental mode of transmission allows the virus to invade and rapidly spread through populations, at

46 Chapter 3: transmission and a recent sweep rates consistent with those measured in the field. Therefore, as predicted by our simulations, the virus has undergone an extremely rapid and recent increase in population size. In light of this and earlier studies of a related virus in D. melanogaster, we conclude that vertically transmitted rhabdoviruses may be common in insects, and that these host-parasite interactions can be highly dynamic.

3.1 Introduction

Insects have a diverse range of vertically transmitted symbionts (Buchner 1965). Of these the best studied are bacteria, which are usually transmitted exclusively by females and have evolved a range of strategies to spread through host populations (such as distorting the sex ratio towards females or providing a metabolic benefit to their hosts (Douglas 1989; Hurst et al. 1993). Far less is known about vertically transmitted viruses in insects. Some viruses are both horizontally and vertically transmitted (Mims 1981; Bezier et al. 2009). Other species contain endogenous retroviruses or polydnaviruses which have integrated into the germline and are inherited with the host genome (Fleming and Summers 1991; Heredia et al. 2007; Bezier et al. 2009). However, very few free living and purely vertically transmitted viruses have been described in insects.

One such virus is the Drosophila melanogaster sigma virus (DMelSV), which infects ~4 % of wild flies (Brun and Plus 1980; Carpenter et al. 2007). DMelSV is a negative-sense RNA virus in the family Rhabdoviridae that is found in the cytoplasm of infected cells. Unlike bacterial symbionts, this virus is transmitted vertically through both sperm and eggs (Fleuriet 1988), so it is able to spread through populations despite being costly to infected flies (Seecof 1964; L'Heritier 1970; Fleuriet 1981). The pattern of DMelSV transmission differs between the sexes, with male flies transmitting at a lower rate than females (Brun and Plus 1980). Additionally, the transmission rate is reduced when the fly is infected by its father rather than its mother—if a female is infected by her father, her average transmission rate drops from ~100% to a much lower rate (Brun and Plus 1980), and if a male is

47 Chapter 3: transmission and a recent sweep infected by his father, he does not transmit the virus at all. Therefore, the virus cannot be transmitted through males for two successive generations.

We have recently discovered two new sigma viruses in Drosophila obscura and Drosophila affinis — DObsSV and DAffSV (Chapter 2; Longdon et al. 2010). Along with DMelSV, these viruses form a deep-branching clade in the Rhabdoviridae, which we have suggested be recognised as a new genus. However, important questions about their biology remain unanswered, including whether these new viruses are vertically transmitted. There is some evidence that DAffSV is;

Williamson (1961) found that CO2-sensitivity was vertically transmitted in some lines of D. affinis in a way similar to that seen in DMelSV-infected flies (CO2 paralysis is a symptom of sigma viruses on their hosts, Chapter 2; Longdon et al. 2010). We also do not know anything about the prevalence, population dynamics or population genetics of these viruses. This work aims to address these questions by examining the transmission of these viruses in the lab and the dynamics of DObsSV in natural populations.

3.2 Methods

Vertical transmission of viruses

To test the mode of transmission of these newly discovered viruses, we carried out crosses between infected and uninfected virgin flies. The crosses used infected isofemale lines of Drosophila affinis and Drosophila obscura that were collected from Raleigh, North Carolina (U.S.A) and Essex (UK), respectively, as described in (Chapter 2; Longdon et al. 2010). The crosses began with infected flies that had both an infected mother and father (from a stock that was close to 100% infected). When both parents are infected, it has been shown for DMelSV that the viral type in the offspring is that of the mother (Brun and Plus 1980). The uninfected D. affinis isofemale lines were collected from the same location at the same time as the infected lines, and the uninfected D. obscura were isofemale lines collected during

48 Chapter 3: transmission and a recent sweep the present study (see below). D. affinis were reared on a banana-malt based Drosophila medium (see supplementary materials), whilst D. obscura were reared on a cornmeal medium (Lewis 1960) with a piece of peeled mushroom (Agaricus bisporus) on the surface.

To test whether flies were infected with sigma virus, we exposed them to CO2 at 12°C for 15mins, and recorded flies dead or paralysed 30mins later as infected. To confirm that CO2 sensitivity was linked to viral infection, we crossed infected males to uninfected females, carried out the CO2 assay on their offspring, and tested 15 paralysed and 15 non paralysed/recovered offspring for sigma virus infection by quantitative reverse transcription PCR (qRT-PCR) (40 cycles: 950C 15 sec, 600C 1min) on an Applied Biosystems StepOnePlus system using a Power SYBR Green PCR Master-Mix (Applied Biosystems, CA, USA). Three technical replicates were carried out for each sample and primer pair, and samples were run in a blocked design across plates. The amount of virus was standardised to a housekeeping gene RpL32 (Rp49) to account for RNA extraction and reverse transcription efficiencies using the ΔΔCt (critical threshold) method. Viral primers were designed to cross gene boundaries so only viral genomes were quantified (rather than mRNA), primer sequences are shown in table S2. RpL32 endogenous control primers also crossed an intron-exon boundary so should not amplify gDNA contamination.

The following crosses were used to measure vertical transmission (diagram in Figure 3.1A) and to determine whether horizontal transmission occurred. In cross 1, infected females were crossed to uninfected males. Cross 2 took the daughters from cross 1 and crossed them to uninfected males. In cross 3 infected males were crossed to uninfected females. Cross 4 mated the daughters from cross 3 with uninfected males. Cross 5 mated the sons from cross 3 to uninfected females. Uninfected partners were assayed for infection to determine whether horizontal transmission had occurred.

For D. affinis, multiple flies were placed in each vial, as the flies appear more likely to lay eggs when maintained at a higher stocking density. In cross 1, one to three

49 Chapter 3: transmission and a recent sweep females were placed in a vial with two to three males, and allowed to lay eggs. For cross 3, two or three infected males were placed in a vial with one to three uninfected females. For crosses 4 and 5, the cross 3 offspring were placed individually in a vial with one or two uninfected flies of the opposite sex. Once eggs or larvae were visible, the adults were exposed to CO2 to confirm their infection status. In all crosses uninfected partners were assayed for infection to test whether horizontal transmission had occurred.

For D. obscura, all crosses were carried out with a single pair of flies in each vial. Once eggs or larvae were visible, the parents were tested for DObsSV using a PCR assay on reverse transcribed RNA (RT-PCR) (Chapter 2; Longdon et al. 2010). PCR primers that amplify the RpL32 gene were used to check extractions were successful and products were run on an EtBr-stained 1% agarose gel. In crosses 1 and 3 the uninfected partners were also assayed for DObsSV to test whether horizontal transmission occurs. The emerging offspring were collected as virgins, aged and mated as above to the appropriate uninfected lines. A mean of 25 replicates were set up per cross and a mean of three offspring were assayed by RT-PCR from each replicate. To examine whether males and females differ in their chances of being infected, we used a binomial test to examine whether the proportion of replicates where the majority of infected flies were female was significantly different from 50%.

To investigate the viral titres transmitted through eggs and sperm, the viral titre in early stage embryos was examined by qRT-PCR. Virgin females and males (with either the female or male being infected) were placed together and allowed to lay in bottles with a small amount of yeast paste on the surface of apple or grape juice agar. Embryos were collected twice daily and homogenised in Trizol (Invitrogen), using a microscope to ensure the embryo was successfully crushed. 30 embryos were collected for each cross. RNA was extracted and reverse transcribed (see above) and qRT-PCR was used to measure the viral titre relative to an endogenous control (RpL32) using the ΔΔCt method. If one or two of the technical replicates failed to amplify virus these were given Ct values of 40 for the statistical analysis. Any

50 Chapter 3: transmission and a recent sweep samples which all three technical replicates failed to amplify using the viral PCR primers were classed as uninfected and excluded from the statistical analysis (i.e. the viral titres of infected embryos were compared). These samples are still present in figures 2 and 3. Two data points were removed from the DAffSV cross where the cDNA was of poor quality (RpL32 Ct values >30). This did not affect the outcome of the analysis.

Population samples of DObsSV

Drosophila obscura were collected from 6 woodland locations around the UK, with 1-3 sites at each location (Longitude, Latitude: Falmouth A 50.149411, -5.106007, Falmouth B 50.170063, -5.122495; Bristol A 51.455615, -2.639748, Bristol B- 51.446629, -2.641035, Bristol C 51.340004,-2.782853; Essex 51.881352, 0.502710; Sussex 51.028827, -0.028390; Kent 51.099703, 0.164456 and 51.096517, 0.173151; and Derbyshire A 52.978411,-1.439769, Derbyshire B 52.883423, -1.398956). Flies were collected in the morning and evening from fruit baits. Males and females were separated, and females were placed in vials to establish isofemale lines. Flies from Kent were collected over two sites, using both bait traps and ground baits, and then combined. Flies at all other locations were collected using hanging bait traps. All samples were collected in August 2008, except for the Kent sample, which was collected in June 2008. To examine whether the prevalence of the virus varied between sites we used Fisher’s exact test and obtained P values by Monte Carlo simulation conditional on the row and column totals (10,000 replicates).

The wild-collected males and any females that did not lay eggs were tested for

DObsSV infection by exposing them to CO2 as described above. The females that produced fertile eggs were not directly tested, but their infection status was inferred from whether their offspring were infected.

51 Chapter 3: transmission and a recent sweep Fly identification

Drosophila obscura can be difficult to distinguish morphologically from Drosophila subobscura, and both species are common in the UK (Basden 1954; Shorrocks 1975). Therefore all obscura group flies were collected and a diagnostic PCR assay was used to distinguish the species. RNA was extracted from flies paralysed by the

CO2 assay using Trizol reagent (Invitrogen Corp, San Diego) in a chloroform- isopropanol extraction. RNA was then reverse transcribed (RT) with MMLV reverse transcriptase (Invitrogen Corp, San Diego) using random hexamer primers. DNA was extracted from flies that did not display CO2 sensitivity using chelex DNA extractions (Hurst et al. 2001). To confirm that nucleic acid extractions were successful, we amplified RpL32 from all samples. To identify the species, two sets of diagnostic PCR primers were used, which amplify the mitochondrial cytochrome b (Cyt-b) gene and the nuclear alcohol dehydrogenase (Adh) gene (table S2). In both cases a conserved forward primer was used. Two species-specific reverse primers were designed for Cyt-b and Adh, by placing the 3’ end of the primer on a species- specific single nucleotide difference, and the penultimate 3’ base mismatching all of the available species sequences. Under suitably stringent PCR conditions (Table S1) these primers anneal to D. obscura or D. subobscura in different positions, resulting in different sized products for the two species (Cyt-b and Adh give bands of 230 and 359 bp respectively for D. obscura and 575 and 194bp for D. subobscura). The primers were designed such that they should not anneal to other common obscura group Drosophila found in the UK, and an agreement between assays was required for firm identification. To confirm reliability we sequenced Cyt-b and/or COI from 28 wild flies (a mixture of D. obscura and other UK obscura-group Drosophila species), and in all cases the PCR test had correctly identified the species.

Viral sequencing and sequence analysis

To investigate the genetic diversity of DObsSV, we sequenced two regions of the virus, located in the N and L gene coding sequence, of 634 bp and 648 bp respectively. These genes were selected as they reside at opposite ends of the

52 Chapter 3: transmission and a recent sweep genome, so in the unlikely case of a recombination event (Chare 2003) we would have more power to detect it. For the L gene we used a variable region outside of the conserved motifs. These regions were amplified by PCR, the PCR products were treated with exonuclease 1 and shrimp alkaline phosphatase to remove unused PCR primers and dNTPs, and then sequenced directly using BigDye reagents (ABI, Carlsbad California) on an ABI 3730 capillary sequencer (provided by the Gene Pool Sequencing Facility, University of Edinburgh) in both directions. Sequences were edited in Sequencher (version 4.8; Gene Codes Corp), and any polymorphisms were manually checked by eye. Direct sequencing of PCR products is expected to reduce the error rate to a negligible level (as compared to cloned sequences), and we confirmed this by repeating RT, PCR and resequencing for 22 of the 67 SNPs (no errors were found). Any heterozygous sites (suggesting more than one viral infection in the host) were randomly assigned one of the possible base pairs (6/103 sequences contained a single ambiguity). Only one sequence contained more than one heterozygous site, and this was removed from the analysis as the phase of the haplotypes was unknown. If the heterozygous sites are removed from the BEAST or Tajima’s D analyses, this makes no difference to the conclusions (data not shown). The N and L gene sequences were concatenated, and median joining networks created using the programme Network (Bandelt et al. 1999). To assess whether Tajima’s D statistic (Tajima 1989) was significantly different from that expected under the standard neutral model, we produced a null distribution by recalculating the statistic from 1000 coalescent simulations conditional on the number of segregating sites observed in our data. We tested for recombination with a four gamete test (Hudson and Kaplan 1985). To assess whether there was genetic differentiation between populations, we used the statistic KST (an analogue of FST;

Hudson et al. 1992). The statistical significance of KST was calculated by permuting the sequences across the populations and recalculating the statistic 10,000 times to produce a null distribution. These analyses were performed in DNA SP v5.0 (Librado and Rozas 2009).

To reconstruct past changes in the size of the viral population we used a Bayesian coalescent genealogy sampler (BEAST; Drummond and Rambaut 2007). The

53 Chapter 3: transmission and a recent sweep substitution rate between viral sequences was assumed to be the same as in DMelSV (9.9x10-5 substitutions/site/year), which has been recently estimated from laboratory strains (Lena Wilfert, unpublished data) and is similar to previous rate estimates for DMelSV (Carpenter et al. 2007) and other related viruses (Sanjuan et al. 2010); (Furio et al. 2005). Carpenter et al (2007) have previously shown that the lab derived substitution rate in DMelSV does not differ significantly from that observed in the field. To account for uncertainty in this substitution rate estimate, we approximated its distribution with a truncated normal distribution (mean=9.9x10-5, standard deviation=3.6x10-5, lower limit=1x10-10, upper limit=1 substitutions/site/year), and this distribution was used as a fully-informative prior. The model assumed a strict molecular clock model and an HKY85 substitution model (Hasegawa et al. 1985), which was selected after comparing Bayes factors with the more complex GTR model. Bayes factors were calculated from the marginal likelihoods by importance sampling, as implemented in Tracer (v1.4.1; Rambaut and Drummond 2007) using the method of Suchard et al. (2001). Sites were partitioned into two categories by codon position (1+2, 3), and separate substitution rates were estimated for each category. Such codon partition models have been shown to be equivalent to more complex non codon-partitioned models (Shapiro et al. 2006). We first fitted a model of an exponentially expanding population (parameterised in terms of growth rate, rather than doubling time). This allowed us to exclude a constant population size, as the growth-rate parameter was significantly greater than zero (based on the 95% highest posterior density interval). The population doubling time was calculated from the growth rate as ln(2)/growth rate. We also fitted a model which allows population size to vary freely over time (Bayesian skyline plot). Two runs of 500 million MCMC generations with sampling every 50000 generations were run for each model, and a 10% burnin was used for all parameter estimates. The two runs were combined and examined for convergence using Tracer (v1.4.1; Rambaut and Drummond 2007). Posterior distributions were also examined using Tracer (v1.4.1) (Rambaut and Drummond 2007) to ensure an adequate number of independent samples. The 95% credible interval (C.I.) was taken as the region with the 95% highest posterior density. The two population size models were then compared by calculating Bayes factors as described above. Our analyses are based

54 Chapter 3: transmission and a recent sweep on the combined sequences of two genes, but when each gene was analysed independently the results are very similar (data not shown).

3.3 Results

CO2 paralysis and infection

To examine whether DObsSV and DAffSV cause paralysis and death when infected flies are exposed to CO2, we crossed infected male flies to uninfected female flies and measured both viral titres and the effects of CO2 in the offspring. We found that in both species permanent paralysis is only ever seen in infected flies (Figure 3.2).

Almost all flies contained some detectable virus, but flies that recovered after CO2 exposure had extremely low viral titres. In D. affinis, paralysed flies had on average 80.7 times greater viral titre (Exact Wilcoxon Rank Sum Test: W=225, P<0.0001), although 13 out of 15 of flies that recovered after CO2 exposure also contained detectable amounts of virus. In D. obscura the viral titre was on average 9.4 times greater in paralysed flies (Exact Wilcoxon Rank Sum Test: W=219, P<0.0001), although all the flies contained detectable amounts of virus. Two of the flies which recovered after CO2 exposure had similar viral titres to the paralysed flies.

Vertical Transmission

We found that DAffSV is transmitted in a similar way to DMelSV. In D. affinis male flies transmit DAffSV to their offspring at a lower rate than females (Figure 3.1. cross 1 and cross 3, Exact Wilcoxon Rank Sum Test: W = 147.5, P <0.001). Infected females transmitted the virus to 98% of their offspring over two successive generations (Figure 3.1, crosses 1 and 2), while males transmitted the virus to only 45% of their offspring (Figure 3.1, cross 3). The rate at which a fly transmits the virus to its offspring is also affected by whether the fly itself was infected by its mother or father. If a female was infected by her father rather than her mother, then the average rate of transmission drops from 98% to 20% (Cross 2 vs Cross 4, Exact

55 Chapter 3: transmission and a recent sweep

Figure 3.1 Top: A) The cross diagram represents the fly crosses which were carried out to measure the transmission of each virus. Bottom: B) Histograms showing the

56 Chapter 3: transmission and a recent sweep proportion of infected offspring from each of the 5 crosses for D. affinis and D. obscura.

Figure 3.2 Viral titres in flies that were paralysed or recovered after exposure to

CO2. Titres were measured by quantitative RT-PCR on genomic viral RNA and are expressed relative to the copy number of the housekeeping gene RpL32. Error bars show the standard deviation of technical replicates.

Wilcoxon Rank Sum Test W = 31, P =0.01). If a male was infected by his father alone rather than his mother (and father), then the average rate of transmission drops from 45% to 0% (Cross 3 vs Cross 5, Exact Wilcoxon rank sum test W = 208, P <0.001). Therefore, the virus cannot be transmitted through males for two successive generations. None of the uninfected parental flies in the crosses were paralysed by

CO2 suggesting horizontal transmission is either rare or absent.

We found that DObsSV in D. obscura is also vertically transmitted, but there are some important differences from DAffSV. As D. obscura can occasionally have a high viral titre yet recover from CO2 exposure (see above), we used RT-PCR rather than the CO2 assay to test flies for infection. Sex was not found to affect the likelihood of infection (binomial test, P=0.35) so both sexes were analysed together.

57 Chapter 3: transmission and a recent sweep Unlike in D. affinis, male flies transmit the virus to their offspring at a similar rate to females — infected females transmitted the virus to 92% of their offspring, while males transmitted the virus to 88% of their offspring (Figure 3.1. cross 1 and cross 3, Exact Wilcoxon rank sum test: W = 357, P=0.259). Furthermore, the rate at which a female transmits the virus to her offspring was not affected by whether she received the infection from her mother or father (Figure 1. cross 2 and cross 4, Exact Wilcoxon rank sum test: W = 3125.5, P=0.435), with females infected by their mothers or fathers having transmission of 63% and 61% respectively. Note that female transmission also declined from cross 1 to cross 2 (Figure 3.1. cross 1 and cross 2, Exact Wilcoxon rank sum test: W = 376, P=0.022) from 92% to 61%. If a male was infected by his father rather than his mother (and father), then the average rate of transmission drops from 88% to 0% (Figure 3.1. cross 3 vs cross 5, Exact Wilcoxon rank sum test W = 765, P<0.001). Therefore, this virus cannot be transmitted through males for two successive generations. To check for any horizontal transmission, we also tested the uninfected parents in Crosses 2 and 3. We found that horizontal transmission was rare or absent, as only one female had a very faint viral band and this could be due to the presence of infected sperm.

Viral titres transmitted in eggs and sperm

The different rates that males and females transmit to their offspring may be because eggs and sperm contain different numbers of virions. To investigate this hypothesis, we measured the viral titres of early stage embryos that had either an infected mother or father. Considering only the embryos where there were detectable amounts of virus, in both species the embryonic viral titre is less when the virus was paternally transmitted than if it was maternally transmitted (Figure 3.3; DAffSV Exact Wilcoxon Rank Sum Test: W=61, P<0.0001; DObsSV Exact Wilcoxon Rank Sum Test: W=17, P<0.0001). In addition, more embryos contained no detectable virus after paternal transmission (47% of DObsSV and 4% of DAffSV paternally infected eggs, and 3% of DObsSV and 0% of DAffSV maternally infected eggs had no detectable virus).

58 Chapter 3: transmission and a recent sweep

Figure 3.3 Viral titres in embryos which were infected either maternally or paternally. Titres were measured as in Figure 2. Error bars show the standard deviation of technical replicates.

Population dynamics of DObsSV

To examine whether the biparental pattern of vertical transmission that we have observed can explain the invasion and maintenance of the virus in populations, we simulated the spread of DObsSV based on the transmission rates seen in our experiments. Pi is the prevalence in the adult population in generation i. We can calculate the prevalence among adults in generation i+1 (Pi+1) from the proportion of infected and uninfected sperm (I♂ and U♂ ) and the proportion of infected and

uninfected eggs (I♀ and U♀ ) produced in the previous generation. The frequency of infected and uninfected gametes can be calculated from the rate of vertical transmission and any change in the fertility of infected flies relative to uninfected flies (table 3.1). Because male D. obscura who inherit the virus from their father do not transmit the virus to the next generation, we split the infected population into a fraction si that has inherited the virus from their mother, and a fraction (1 – si) that inherited the virus solely from their father.

59 Chapter 3: transmission and a recent sweep

(1) Pi+1 = I♀ + I♂ U♀

I♀ (2) si +1 = I + I U ♀ ♂ ♀

Gametes Equation

Pi Ct♀ Infected eggs I = ♀ w ♀ (1 ! P ) + PC(1 ! t ) Uninfected eggs U = i i ♀ ♀ w ♀ Ps Ct Infected sperm I = i i ♂ ♂ w ♂ (1 ! P ) + P (1 ! s )C + Ps C(1 ! t ) Uninfected sperm U = i i i i i ♂ ♂ w ♂ Table 3.1 Proportions of infected and uninfected eggs and sperm. Pi is the proportion of adults infected in generation i, which we assume to be equal for males and females. The virus is transmitted from mother to offspring at a rate

t♀ and father to offspring at rate t♂ . We assume that female and male fertility

are both changed by a factor C in infected flies relative to uninfected flies.

We split the infected population in generation i into a fraction si that has

inherited the virus from their mother, and a fraction (1 – si) that inherited the virus from their father. Male D. obscura who inherit the virus from their father do not transmit it to the next generation (i.e. they produce uninfected

sperm). To obtain proportions, we divide by w♀ , the sum of the numerators

of I♀ and U♀ , and w♂ , the sum of the numerators of I♂ and U♂ .

Using the transmission rates estimated in Generation 1 (t♂ = 0.88, t♀ = 0.92), we found that the virus can rapidly invade a population and reach a high prevalence (Figure 3.4). Yampolsky et al. (1999) estimated that DMelSV reduces the fitness of

60 Chapter 3: transmission and a recent sweep D. melanogaster in the wild by ~20 – 30 %. If DObsSV causes a similar reduction in the fertility of infected flies, our simulations suggest that the virus can still rapidly invade a population (Figure 3.4). We repeated this analysis using the lower female transmission rate measured in the second generation (cross 2). This causes the virus to spread much slower and it can only invade if the virus reduces the fertility of infected flies by less than 10 % (data not shown).

Figure 3.4 Simulations of DObsSV spreading through a population based on lab estimates of transmission rates and a range of possible fertility reductions. Colours represent the different fertilities of infected flies relative to uninfected flies (C), with blue, black, red and yellow representing C=1, 0.9, 0.8 and 0.75 respectively. The virus failed to invade if fertility is reduced by more than 25% (C<0.75). The dashed horizontal line represents the mean prevalence of the virus

in our samples. The transmission rates were t♂ = 0.88 and t♀= 0.92, and the starting frequency of infected flies was 10-6. In the UK there are ~3-4 generations of D. obscura each year (Begon 1976).

61 Chapter 3: transmission and a recent sweep

Prevalence of DObsSV

We tested 267 D. obscura collected from sites across the UK for infection with

DObsSV using the CO2 assay, and found that 103 (39%) were infected. The prevalence of DObsSV varied widely between sites (Table 3.2 and Figure S1; Fisher’s exact test, P =0.0001). This is primarily caused by a low prevalence in Kent, but even if this location is excluded from the analysis there is still significant variation between sites (Fisher’s exact test, P=0.04). As the results from the qRT-

PCR linking CO2 paralysis and infection (see above) found not all infected flies are

CO2 sensitive these values may be an underestimate of prevalence.

Site Prevalence N Bristol A 33% 33 Bristol B 62% 21 Bristol C 50% 2 Derbyshire A 48% 66 Derbyshire B 73% 11 Kent 22% 83 Sussex 48% 42 Essex 0% 3 Falmouth A 0% 5 Falmouth B 0% 1 Table 3.2 Percentage of flies infected and number of flies collected at each field site.

We confirmed that the CO2 assay accurately identifies infected flies by qRT-PCR, and out of 105 lines which were paralysed by CO2, 103 were infected. In all these samples we sequenced two regions of the viral genome covering 634 bp of the N gene at the 3’ end of the viral genome and 648 bp of the L gene towards the 5’ end of the genome, and all the sequences were clearly DObsSV (Genbank accession numbers: HQ149099 to HQ149304).

62 Chapter 3: transmission and a recent sweep Viral sequence analysis

The genetic diversity of DObsSV is very low. There were only 67 segregating sites over 1282bp of sequence from all 103 viral isolates (30 in 634 bases of the N gene and 37 in 648 bases of the L gene with no obvious clustering of the SNPs within the sequences), and the average number of pairwise differences per site (π) was 0.002 across all sites and 0.006 at synonymous sites. We verified a third of our SNPs by repeating RT, PCR and sequencing reactions and found no errors.

The phylogenetic network of the sequences is a star shape (Figure 3.5) suggesting a recent selective sweep or population expansion. This is caused by a large excess of rare variants in the dataset—of the 67 segregating sites, 51 are singletons. For this reason, estimates of θW, (which are derived from the number of segregating sites and are insensitive to their frequency (Watterson 1975)) are greater than π (θW,= 0.011 for all sites and θW,= 0.036 for synonymous sites, compared to 0.002 and 0.006). This excess of rare polymorphisms is significantly greater than expected under the neutral model (Tajima’s D = -2.75; P<0.001).

An excess of rare polymorphisms could result from either a recent sweep of the virus through the host population, or purifying selection on the SNPs in our dataset. If the latter hypothesis were true, then we would expect the frequency of different functional classes of polymorphisms to be different as they are likely to have different effects on fitness. However, when analysed independently, the N and L genes, each had a significant excess of rare polymorphisms relative to the neutral expectation (Tajima’s D= -2.54, -2.68, respectively, P<0.001 for each). Additionally, Tajima’s D differed very little between synonymous sites (-2.47, -2.55 for N and L, respectively, P<0.001 for each) and non synonymous sites (-1.77 and -2.35, P<0.001 for each), and was not more negative at non-synonymous sites. It therefore seems unlikely that the departure from neutrality is driven by purifying selection.

There is very little genetic differentiation between viral sequences from different geographic locations (Figure 3.5). This is reflected in a low KST value of 0.015 (this

63 Chapter 3: transmission and a recent sweep is an analogue of FST that measures the proportion of the genetic variation contained in subpopulations relative to the population as a whole). Despite the very low value of KST it was significantly greater than zero (Permutation test: P= 0.0013).

Figure 3.5 Phylogenetic network of DObsSV sequences. Nodes are colour coded based on location and their size is proportional to the frequency of viral sequences. Branches are approximately sized to the number of mutations.

To check whether there might have been any recombination between our sequences we used the four-gamete test. There was only a single pair of sites where all four gametes exist, suggesting that recombination is either very rare or absent. Given the apparent lack of recombination in negative-sense RNA viruses (Chare 2003), this is most likely to result from homoplasy rather than recombination.

64 Chapter 3: transmission and a recent sweep

We reconstructed past changes in the size of the population using the coalescent sampler BEAST. A comparison of the Bayes factors indicates that the model of an exponentially expanding population was preferred over the skyline model in which the population size is free to vary through time (log10 Bayes factors averaged over two runs for each model= 54 in favour of exponential growth over skyline coalescent model). Using the exponential model, we were able to reject a constant population size as the posterior distribution of the growth rate parameter does not include zero (P<0.0001). We estimated that the effective population size of the virus has doubled approximately every 9 months (mean doubling time = 0.76 years, 95% CI: 0.24-1.51 years), and all the genotypes in our sample shared a common ancestor 11 years ago (95% C.I : 4-19 years). When analysed independently, the N and L genes gave similar estimates to each other and the combined analysis (data not shown).

These estimates are compatible with the very rapid invasion of the virus that is predicted by our simulations (Figure 3.4). Assuming that the host undergoes 3-4 generations per year in the UK (Begon 1976), the virus could reach its current prevalence of 39 % within 12 –16 years, even if infected hosts suffer a fertility reduction of 10% compared to uninfected individuals. The simulations also suggest that DObsSV may still be spreading in the UK, as the equilibrium prevalence in the simulations is higher than we have observed in the wild (Figure 3.4).

3.4 Discussion

Vertical transmission

We have found that two recently discovered rhabdoviruses from D. affinis and D. obscura are both vertically transmitted, and horizontal transmission is rare or absent over experimental timescales. To our knowledge, aside from viruses that are integrated into insect genomes, the only other obligately vertically transmitted virus

65 Chapter 3: transmission and a recent sweep that has been reported in animals is DMelSV from D. melanogaster. Our results suggest that sigma viruses may be common vertically-transmitted insect symbionts.

If a vertically-transmitted symbiont is transmitted solely by females, then anything less than perfect transmission is expected to lead to a decline in prevalence and ultimately extinction (L'Heritier 1970). Vertically transmitted bacteria use various different strategies to avoid this, including distorting the sex ratio towards females, spitefully reducing the fitness of uninfected individuals by causing cytoplasmic incompatibility (Hurst et al. 1993), or providing a fitness benefit to the host such as nutrients (Douglas 1989) or protection from pathogens (Hedges et al. 2008; Teixeira et al. 2008; Brownlie and Johnson 2009; Jaenike et al. 2010). An alternative strategy to spread through host populations is to be transmitted through both sperm and eggs. This is rarely seen in bacterial symbionts, probably because sperm contain little cytoplasm and hence few bacteria (Hurst 1990). However, although sigma viruses infect the cytoplasm of host cells, they have evolved biparental vertical transmission (L'Heritier 1970); the sigma viruses may not be unique in this mode of transmission. Rhabdoviruses have been found in Hemipteran sperm cells (Afzelius et al. 1989), and in Culex mosquitoes, CO2 sensitivity (a common phenotype of rhabdovirus infection that causes infected insects to become paralysed after CO2 exposure) is inherited extra-chromosomally in a biparental manner (Shroyer and Rosen 1983). Furthermore, other virus-like particles have been found in the sperm of a range of different insects (Tandler 1972; Schrankel and Schwalm 1975; Degrugillier et al. 1991; Bao et al. 1996; Lopez-Ferber et al. 1997; Wolf 1997; Longdon and Jiggins 2010). Together these results suggest that viruses may be transmitted vertically by both males and females much more often than is the case for bacterial symbionts.

The mode of transmission of the viruses we studied is similar to that of DMelSV. In D. affinis, males transmit the virus at a lower rate than females and the rate of transmission is reduced in flies that have inherited the virus from their father rather than their mother (females have a reduced transmission rate and males do not transmit the virus at all). In D. obscura, males have comparable transmission rates to females, but males infected by their fathers do not transmit the virus. Curiously, we

66 Chapter 3: transmission and a recent sweep also found a reduction in transmission after two successive female generations, which may be due to the first generation of females being infected by both parents, or the uninfected flies being partially resistant. Although this high level of paternal transmission could potentially act to aid the spread of the virus, the reduced transmission by maternally infected daughters means that over all it is transmitted at a similar rate to DAffSV.

We found that DObsSV and DAffSV embryos that were infected by their fathers have a lower viral titre than those infected from their mother. The small size of the sperm coupled with the fact that sperm cells undergo 64 more divisions than eggs (Williamson and Lehman 1996) and that cytoplasm is removed from sperm (peristaltic squeezing), may act to limit the amount of virus transmitted to offspring. This pattern has previously been observed in DMelSV (Brun and Plus 1980), and probably explains why paternal transmission is less efficient than maternal transmission in DMelSV and DAffSV. DObsSV has comparable paternal and maternal transmission rates (Figure 3.1, crosses 1 and 3) even though paternally infected embryos have much lower viral titres. We would hypothesise that although the low viral titre may not be limiting for one generation, over two generations this two fold dilution effect means that sons infected by their fathers do not transmit the virus. In DMelSV the viral titres in paternally infected flies have recovered to normal levels, and yet flies infected from their father still transmit the virus at lower rates (Brun and Plus 1980). It has therefore been suggested that it is critical for the virus to infect the germ line cells early in development if it is to be transmitted efficiently (Fleuriet 1988). This may also explain why DAffSV and DObsSV are transmitted less efficiently by flies that were infected from their father rather than their mother.

Population dynamics

Parasitic bacterial symbionts often have highly dynamic associations with their hosts, with new strains frequently spreading though host populations. Comparisons of insect and bacterial phylogenies have shown that symbionts rarely co-speciate with their hosts, but instead frequently switch between different host species (Werren et

67 Chapter 3: transmission and a recent sweep al. 1995; Weinert et al. 2009). These bacteria can spread very rapidly through populations. For example, Wolbachia spread at a rate of more than 100km per year through uninfected populations in Drosophila simulans on the West coast of the United States (Turelli and Hoffmann 1991). After a symbiont has invaded a population, co-evolution with the host can cause the turnover of strains within the population. For example, after Wolbachia had invaded US populations of D. simulans, it evolved from a parasitic relationship towards a mutualistic one (Weeks et al. 2007). There can be similarly rapid evolution of the host population, where genes that make the host resistant to the pathogenic effects of symbionts can rapidly spread (Hornett et al. 2006).

Similar processes have occurred in D. melanogaster and its sigma virus DMelSV. A naturally occurring polymorphism in the Drosophila gene ref(2)P blocks the transmission of the virus through females, and natural selection has caused the resistant allele of this gene to spread through natural populations (Wayne et al. 1996; Bangham et al. 2008). In response, from the early 1980s to the early 1990s, DMelSV genotypes that are able to overcome this resistance were observed to sweep through two different European populations (Fleuriet et al. 1990; Fleuriet and Sperlich 1992). More recent molecular data confirms that the DMelSV type currently found in Europe has recently spread through the host population, with all the viral isolates in Europe sharing a common ancestor ~200 years ago (Carpenter et al. 2007), either due to a selective sweep or due to D. melanogaster acquiring the virus from another species. This raises the question as to how common such sweeps of vertically transmitted parasites are in nature.

We found that DObsSV has very recently swept though British populations of D. obscura. This sweep has occurred in the past ~11 years, with the frequency of this strain doubling every 9 months. Our model shows that the biparental transmission of the virus can explain these rapid changes in prevalence, by creating a considerable drive through the host population. Furthermore, the virus can still rapidly spread even when it reduces fertility by up to 25% (although a reduction of ~10% most closely matches the timescale of the sweep estimated using the sequence data).

68 Chapter 3: transmission and a recent sweep

As the virus does not recombine, we cannot tell whether the spread of the virus was caused by a selective sweep of an advantageous mutation through an existing viral population, or the spread of a new virus from a different species or population through an uninfected population. If this was a selective sweep, the new strain must have had a large selective advantage over existing viruses to spread so rapidly, and must have almost totally replaced those viruses as there were no more divergent genotypes in our sample. To separate these hypotheses, we would need to either discover closely related viruses in other species or populations, or remnants of a more diverse viral population that existed before a selective sweep.

In conclusion, our results suggest that vertically transmitted viruses may prove to be common in insect populations. Our simulations based on estimates of the transmission rates predict that this mode of transmission can drive very rapid changes in prevalence. In natural populations we have found this to be the case, with DObsSV sweeping rapidly through populations over the last decade.

Acknowledgements

Thanks to everyone who provided fieldwork accommodation and lab space: Jonny Turner, Tom Price, Clare Stamper, Matthew Boyce, Sue and Keith Obbard, Neil and Jacky Longdon. Thanks to Andrew Rambaut who provided numerous useful comments and suggestions. Many thanks to Jim Bull for his comments and suggestions which greatly improved the manuscript. Thanks to four anonymous reviewers for useful comments.

69 Chapter 4: Host-switching Chapter 4: Host-switching

This chapter has been published as:

LONGDON, B., WILFERT, L., OSEI-POKU, J., CAGNEY, H., OBBARD, D. J. & JIGGINS, F. M. 2011. Host switching by a vertically-transmitted rhabdovirus in Drosophila. Biology Letters, Advance online. http://dx.doi.org/10.1098/rsbl.2011.0160

It was written in collaboration with Darren Obbard and Frank Jiggins. Lena Wilfert, Jewelna Osei-Poku, Heather Cagney, Darren Obbard and Claire Webster collected flies. Lena Wilfert provided comments on the manuscript. Darren Obbard wrote the R script to examine the posterior tree topology and to calculate the R-F distance metrics.

Summary

A diverse range of endosymbionts are found within the cells of animals. As these endosymbionts are normally vertically transmitted, we might expect their evolutionary history to be dominated by host-fidelity and cospeciation with the host. However, studies of bacterial endosymbionts have shown that while this is true for some mutualists, parasites often move horizontally between host lineages over evolutionary timescales. For the first time we have investigated whether this is also the case for vertically transmitted viruses. Here we describe four new sigma viruses, a group of vertically transmitted rhabdoviruses previously known in Drosophila. Using sequence data from these new viruses, and the previously described sigma viruses, we show that they have switched between hosts during their evolutionary history. Our results suggest that sigma virus infections may be short-lived in a given host lineage, so that their long-term persistence relies on rare horizontal transmission events between hosts.

70 Chapter 4: Host-switching 4.1 Introduction

Many animals have intimate associations with protists, bacteria and viruses, which live within the cytoplasm of their cells and are transmitted vertically between generations (Buchner 1965). Vertical transmission and an inability to survive for long outside of the host mean that endosymbionts might be expected to show extreme host-fidelity and cospeciate with their hosts. Indeed, phylogenies of bacterial endosymbionts show obligate mutualists have remarkably stable associations with their hosts. For example, Buchnera bacteria, which synthesise amino acids lacking from the diet of aphids, have been stably vertically transmitted for ~150-250 million years (Moran et al. 1993). Similar patterns have been found in other mutualists such as Wigglesworthia in tsetse flies (Chen et al. 1999), Blochmannia in carpenter ants (Sauer et al. 2000) and Blattabacterium in cockroaches and termites (Bandi et al. 1995). In contrast, parasitic endosymbionts persist for relatively short periods in a given host lineage and frequently switch host species. For example, there is little or no congruence between the phylogenies of Wolbachia, (O'Neill et al. 1992), Rickettsia (Weinert et al. 2009) and Spiroplasma bacteria (Haselkorn 2010) and their arthropod hosts. These associations may be unstable as hosts can evolve resistance and drive the parasite to extinction (Koehncke et al. 2009).

In contrast to bacterial endosymbionts, little is known about the evolutionary history of vertically-transmitted viruses. Sigma viruses are vertically-transmitted rhabdoviruses previously known from three species of Drosophila —Drosophila melanogaster (DMelSV; Brun and Plus 1980), Drosophila obscura (DObsSV) and Drosophila affinis (DAffSV) (Chapter 2; Longdon et al. 2010). These viruses are unusual in that they are transmitted vertically through both eggs and sperm (Brun and Plus 1980; Chapter 3; Longdon et al. 2011). Here we describe four new sigma viruses that each infect a different species of Diptera, and use a phylogenetic approach to show that sigma viruses have switched between host species during their evolution.

71 Chapter 4: Host-switching 4.2 Methods

Viral discovery and sequencing

We collected Drosophila tristis in Derbyshire, UK; Drosophila immigrans in Marktredwitz, Germany; Drosophila ananassae in Kilifi, Kenya; and Muscina stabulans in Cambridge, UK. Infected flies were detected by exposing them to pure

CO2 at 12°C for 15mins. Uninfected flies recover after ~30mins while infected flies remain paralysed (Brun and Plus 1980). RNA was extracted from paralysed flies, reverse transcribed (see Chapter 2; Longdon et al. 2010), and amplified by PCR using multiple degenerate primers targeted to conserved regions of the viral RNA- dependent RNA polymerase gene (RDRP) (table S1). PCR products were sequenced using BigDye reagents (GenePool facility, University of Edinburgh, UK) and once a small region of the RDRP gene had been sequenced, 3' RACE (rapid amplification of cDNA ends) was used to obtain further sequence (see Chapter 2; Longdon et al. 2010). To obtain high-quality sequences, new primers were designed to amplify the fragment sequenced by RACE, and this was resequenced in both directions. The host species was confirmed by sequencing mitochondrial COI and/or Cytb genes.

Additional species were also collected and tested with the CO2 assay, but we only report those species from which we were able to amplify a sigma virus.

Inferring the virus phylogeny

The nucleotide sequence of the RDRP genes from sigma viruses and other rhabdoviruses was aligned based on the translated amino-acid sequence using ClustalW. Alignments were trimmed to contain only a conserved region of the RDRP that could be robustly aligned. Phylogenies were inferred using maximum- likelihood (ML) in PAUP (Swofford 1993) and Bayesian methods in MrBayes (Huelsenbeck and Ronquist 2001). The ML analysis used a heuristic search with a nearest neighbour interchange algorithm and a general time reversible model with a gamma distributed rate variation and a proportion of invariable sites. This model of sequence evolution was selected by comparing alternative models using Akaike

72 Chapter 4: Host-switching information criterion (AIC) in Model Test (Posada and Crandall 1998). Node- support was estimated by non-parametric bootstrapping. The Bayesian analysis used the same model of sequence evolution and the Markov chain Monte Carlo (MCMC) was run for 1 million generations, sampled every 100 steps with the first 25% of samples being discarded as burn-in.

Detecting incongruent tree topologies

To detect topological incongruence between host and parasite phylogenies we used a Shimodaira-Hasegawa test (SH-test; Shimodaira and Hasegawa 1999), which compares the likelihood of the viral phylogeny inferred from the data with one constrained to match the host topology (Gao et al. 2007; van der Linde et al. 2010). We also used a Bayesian approach which identifies the proportion of the posterior sample of viral topologies that match the host phylogeny (e.g. Weinert et al. 2009). As these approaches compare only topologies (and not branch lengths) they are a conservative test for host switching. Even when topologies are incongruent, some cospeciation or switching between related hosts may make host and virus topologies more similar than expected by chance. To test for topological similarity, we compared the distribution of Robinson-Foulds (Robinson and Foulds 1981) distance metrics provided by 104 random viral topologies to that derived from the posterior sample of viral topologies.

4.3 Results

We detected novel sigma viruses in 4 dipteran species, including three species of Drosophila — D. tristis, D. immigrans and D. ananassae — and one member of the Muscidae, Muscina stabulans. We have tentatively named these new viruses as DTriSV, DImmSV, DAnaSV and MStaSV, respectively. This brings the total number of sigma viruses described to seven, and for the first time extends their distribution outside the genus Drosophila.

73 Chapter 4: Host-switching We sequenced 1845-3006 bp of the RDRP gene from the four viruses (accession- numbers: JF311399-JF211402). The sequences are highly divergent from one another, with the most closely related pair, DAffSV and DTriSV, having an amino- acid sequence identity of 0.73. The other 5 genes of sigma viruses have previously been shown to have even greater divergence (Chapter 2; Longdon et al. 2010). The new sigma viruses form a clade of dipteran-infecting viruses that also contains the previously described sigma viruses DMelSV, DObsSV and DAffSV (Figure 1). In common with previous phylogenies (Dacheux et al. 2010; Chapter 2; Longdon et al. 2010), the sigma virus clade is most closely related to the Ephemerovirus and Vesiculovirus clades (which together form the Dimarhabdoviruses).

Figure 4.1 Bayesian phylogeny of the sigma viruses (left) and their hosts (right). Node labels represent Bayesian posterior supports with maximum likelihood bootstrap support in brackets. The tree is rooted with the Lyssavirus clade. Non- sigma virus clades are collapsed. Nodes marked with an asterisk had bootstrap support of <50%. D. melanogaster and D. ananassae shared a common ancestor ~20 million years ago (mya) but both fall within the D. melanogaster group, which is separated from the obscura group (D. obscura, D. affinis and D. tristis) by ~25mya. Both of these groups fall within the subgenus , while D. immigrans is in the subgenus Drosophila, which separated from Sophophora ~40mya (dates from Russo et al. 1995).

74 Chapter 4: Host-switching To test whether the sigma viruses have exclusively cospeciated with their hosts we compared the host and virus phylogenies. The phylogeny of these host species is extremely well resolved (Gao et al. 2007; van der Linde et al. 2010). We found the likelihood of the virus tree constrained to follow the topology of the host taxa was significantly reduced (SH-test: 2ΔlnL=328, P<0.005). For the Bayesian trees, we also found that the viral phylogeny differed significantly from the host topology, with none of the topologies in the posterior sample of trees matching that of the hosts. Therefore, both methods suggest these viruses have switched between host species. Incongruence is due to two factors; firstly, the presence of DImmSV in a clade of viruses with hosts from a different subgenus of Drosophila to D. immigrans (van der Linde et al. 2010). Also D. obscura is much more closely related to D. tristis than D. affinis (Gao et al. 2007), yet a viral clade comprising DTriSV and DAffSV is well supported.

However, although the trees were not congruent, we found that the inferred virus topology was more similar to the host topology than expected by chance. Only 2 per cent of random viral topologies were closer to the host topology than the posterior sample of actual viral topologies were to the host topology (Robinson and Foulds 1981). This may imply cospeciation events, but could be due to other factors such as preferential host switching between closely related species.

4.4 Discussion

We have discovered four new sigma viruses in D. ananassae, D. immigrans, D. tristis and Muscina stabulans. Together with the three existing sigma viruses in other Drosophila, they form a clade of Dipteran infecting rhabdoviruses. It is likely that these viruses are vertically-transmitted as not only are all of the previously known sigma viruses vertically-transmitted (Chapter 3; Longdon et al. 2011), but vertical- transmission of CO2 sensitivity — the hallmark of sigma virus infection — is known from other Diptera (Williamson 1961; Shroyer and Rosen 1983). The phylogeny of the viruses reflects neither the phylogeny of the hosts, nor the region of the world

75 Chapter 4: Host-switching where they were collected (these viruses were isolated in Europe, Africa and America). Therefore, sigma viruses have switched between host lineages during their evolution.

Sigma viruses have highly dynamic interactions with their hosts. In D. melanogaster populations there has been a recent selective sweep of a gene conferring resistance to DMelSV (Bangham et al. 2007), and this was followed by the sweep of a viral genotype that overcomes host resistance (Fleuriet et al. 1990). DObsSV also shows evidence of a recent and rapid sweep (Chapter 3; Longdon et al. 2011). Such rapid changes in host resistance are expected to drive fluctuations in viral prevalence, and may make virus-host associations unstable and short-lived (Koehncke et al. 2009). If so, then the virus will only persist in the long-term by switching between host species. This appears to be a general phenomenon among vertically-transmitted parasites, as similar patterns are seen among bacterial endosymbionts (see introduction(O'Neill et al. 1992; Weinert et al. 2009; Haselkorn 2010), and genomic parasites such as transposable elements (Loreto et al. 2008) and homing endonucleases (Goddard and Burt 1999). Although the transfer mechanism is unclear, we have previously suggested that parasitic mites could act as vectors of sigma viruses (Chapter 2; Longdon et al. 2010), and arthropod vectors may be responsible for other endosymbionts and genomic parasites switching between host lineages (Vavre et al. 1999; Jaenike et al. 2007; Loreto et al. 2008).

A sigma-like virus outside of the genus Drosophila suggests that these viruses may be widespread in Dipterans, if not insects as a whole. Unlike bacterial endosymbionts, the rapid evolution of the sigma virus genome makes it impossible to design a single pair of diagnostic PCR primers that can be used to test for new strains of the virus. In the course of this study, we encountered CO2 sensitive individuals of other species of flies from which we were unable to amplify virus using our primers, and these may harbour other sigma-like viruses (table S2). CO2 sensitivity has also been reported in 13 other Drosophila species (Brun and Plus 1980), and in Culex mosquitoes (Shroyer and Rosen 1983). It is also possible some rhabdoviruses may not cause CO2 sensitivity. Additionally, rhabdovirus sequences have inserted into the

76 Chapter 4: Host-switching genomes of various insect species (Katzourakis and Gifford 2010) and rhabdovirus- like particles have been found in firebug testes (Afzelius et al. 1989). The non- Drosophilid sigma virus we found is of particular interest, as the closely-related Dimarhabdoviruses are vector-borne diseases of vertebrates (some of which are vectored by other dipterans (Dacheux et al. 2010)). The discovery of other rhabdoviruses in insects that do not blood-feed may make it possible to understand how viruses may have switched between being vector-borne pathogens of vertebrates and being purely entomopathogenic.

Acknowledgments We thank two anonymous reviewers for useful comments.

77 Chapter 5: Phylogenetic determinants of host shifts Chapter 5: Phylogenetic determinants of host shifts

This chapter is in press at PLoS Pathogens.

LONGDON, B, HADFIELD, J.D., WEBSTER, C.L., OBBARD, D.J., AND JIGGINS, F.M. Host phylogeny determines viral persistence and replication in novel hosts.

It was written in collaboration with Jarrod Hadfield, Darren Obbard and Frank Jiggins. Jarrod Hadfield provided extensive help with model fitting and defining priors. Darren Obbard wrote the R code to integrate over the host phylogeny. Darren Obbard and Claire Webster helped with fly rearing during the experiment.

Summary

The emergence of new infectious diseases often results from pathogens switching to a new host, and determining which species are likely to be sources of such host shifts is essential to understand disease threats to both humans and wildlife. However, the factors that determine whether a pathogen can infect a novel host are poorly understood. We have examined the ability of three host-specific RNA-viruses (Drosophila sigma viruses from the family Rhabdoviridae) to persist and replicate in 51 different species of Drosophilidae. Using a novel analytical approach we found that the host phylogeny could explain most of the variation in viral replication and persistence between different host species. Part of this effect was because the viruses tended to reach higher titres when a novel host is more closely related to the original host. However, there is also a strong effect of host phylogeny that is independent of the distance from the original host, with viral titres being similar in groups of related hosts. Most of this effect could be explained by variation in general susceptibility to all three sigma viruses, as there is a strong phylogenetic correlation in the titres of the three viruses. These results suggest that the source of new emerging diseases may

78 Chapter 5: Phylogenetic determinants of host shifts often be predictable from the host phylogeny, but that the effect may be more complex than simply causing most host shifts to occur between closely related hosts.

5.1 Introduction

A major source of emerging infectious diseases are host shifts, where the parasite originates from a different host species. In humans, HIV (Hahn et al. 2000), influenza (Webby and Webster 2001) and Plasmodium (Liu et al. 2010) have all been recently acquired from other species. Host shifts can also have devastating effects on wildlife; for example Ebola epidemics have resulted in marked declines in some primate populations (Leroy et al. 2004) and canine distemper virus has jumped from dogs into Serengeti lions and caused considerable mortality (Roelke-parker et al. 1996). As we have come to realise that the sources of human, domestic or crop pathogens are likely to be from wild species (Woolhouse et al. 2005; Jones et al. 2008), understanding what causes these parasite host shifts to occur has become increasingly important.

For a host shift to occur, the new host must first be exposed to the parasite, the parasite must then be able to replicate in the new host, and finally there must be sufficient onward transmission in the new host for the infection to spread in the population (Woolhouse et al. 2005). Exposure is clearly important in determining whether a host shift occurs, and some cases of disease emergence have followed changes in the geographic range of species that have brought parasites in contact with new hosts (Roelke-parker et al. 1996; Lanciotti et al. 1999; Lips et al. 2006; Vasilakis and Weaver 2008). However, once exposure has occurred, the factors that determine whether the pathogen can replicate in a new host are poorly understood.

One factor that can potentially affect whether a parasite can replicate in a new host species is host relatedness — parasites may be more likely to replicate in species closely related to the original host (Engelstadter and Hurst 2006). The reason for this is that closely related hosts will tend to present a more similar environment to the

79 Chapter 5: Phylogenetic determinants of host shifts parasite. Parasites must evade an elaborate array of host defences and rely on the host for their physiological needs, and this will result in specialised adaptations (Turner and Elena 2000; Duffy et al. 2007). These adaptations have in turn resulted in many parasites being extreme specialists that are only able to survive in a narrow range of similar host species (Thompson 1994). If this is the case, host shifts may occur most frequently between closely related species.

Here we use a new analytical approach to analyse host shifts, which allows us to separate two different ways in which the host phylogeny might affect the ability of a parasite to infect a new host species. The first of these we term the ‘distance effect’, and reflects the fact that the chances of successful infection may be higher in species that are more closely related to the natural host. However, it is also likely that related species share similar levels of susceptibility independently of how related they are to the natural host, a process that we term the ‘phylogenetic effect’. The two effects may generate very different patterns of host switching. The distance effect would result in most host shifts infecting species closely related to the natural host. In contrast, the phylogenetic effect might mean that host clades distantly related to the natural host are susceptible to a parasite, and this could cause parasites to jump between distantly related species.

Previous studies have examined the distance effect only. While there is evidence that parasites most often shift between related hosts from correlative studies of parasite- incidence in wild animals (Davies and Pedersen 2008), experimental evidence has been surprisingly rare. Cross-infection experiments using plants and fungi (Gilbert and Webb 2007; de Vienne et al. 2009), Drosophila and nematode worms (Perlman and Jaenike 2003), and beetles and Spiroplasma bacteria (Tinsley and Majerus 2007) have all found that the ability of a parasite to establish an infection declines as a novel host’s relatedness to the natural host declines. However, some of these studies have limited phylogenetic coverage of the hosts, and few control for the effects of phylogeny (but see; Perlman and Jaenike 2003).

80 Chapter 5: Phylogenetic determinants of host shifts The extent to which host relatedness influences host switching varies between different groups of parasites, and it has been suggested that RNA viruses may be particularly prone to jump between distantly related hosts (Woolhouse and Gowtage- Sequeria 2005). Reviewing emerging viral diseases in vertebrates, Parrish et al (2008) observed that “Spillover or epidemic infections have occurred between hosts that are closely or distantly related, and no rule appears to predict the susceptibility of a new host.” Viruses are more likely than other groups of parasites to be shared between distantly related primates (Davies and Pedersen 2008), and many human diseases that have been recently acquired from other species are RNA viruses (Parrish et al. 2008). The ability of certain viruses to infect distantly related hosts may result from the use of conserved host receptors to enter cells (Baranowski et al. 2001; Woolhouse 2002), or the existence of hosts that do not posses broad resistance mechanisms to that type of parasite (Kuiken et al. 2006; Havard et al. 2009). However, some studies have found evidence for the importance of the host phylogeny; rabies virus strains have higher rates of cross species transmission between closely related host species in the wild (Streicker et al. 2010) and primate lentivirus phylogenies show signs of preferential switching between closely related hosts (Charleston and Robertson 2002).

To explore this question we have conducted a large cross-infection experiment in which three sigma viruses were injected into 51 different species of Drosophilidae. Sigma viruses are a clade of rhabdoviruses (RNA viruses with single-stranded negative-sense genomes), which infect various species of Diptera (Chapters 2 and 4; Longdon et al. 2010; Longdon et al. 2011). They are normally vertically transmitted (Brun and Plus 1980; Chapter 3; Longdon et al. 2011), leading to extreme specialisation on just a single host species. However, the sigma virus of Drosophila melanogaster (DMelSV) will replicate in a range of different dipteran hosts (Jousset 1969), and differences between the host and virus phylogenies show that sigma viruses have switched between distantly related host lineages during their evolution (Chapter 4; Longdon et al. 2011). We find that the host phylogeny explains most of the variation in the ability of sigma viruses to replicate in novel hosts, with both the distance and phylogenetic effects being large. These results not only allow us to

81 Chapter 5: Phylogenetic determinants of host shifts explore the different ways in which the host phylogeny may affect host switching, but they are also, to our knowledge, the first study to experimentally test the effect of host genetic distance on infection success in RNA viruses — the most important source of emerging diseases.

5.2 Methods

We measured the ability of three Drosophila sigma viruses to persist and replicate following injection into 51 fly species sampled from across the phylogeny of the Drosophilidae (Figure 5.1). The three viruses were DAffSV, DMelSV and DObsSV, which naturally occur in D. affinis, D. melanogaster and D. obscura respectively (Chapter 2; Longdon et al. 2010).

Virus isolates

We extracted DAffSV, DMelSV and DObsSV from infected stocks of D. affinis, D. melanogaster and D. obscura. To clear these stocks of any bacterial or other viral infections, they were aged for at least 20 days, before collecting embryos (see Chapter 3; Longdon et al. 2011) and de-chorionating them in ~2.5% w/v sodium hypochlorite solution for one minute (Brun and Plus 1980). The embryos were then rinsed in distilled water and placed onto clean food. To collect flies infected with a sigma virus, the adults were exposed to 100% CO2 at 12°C for 15 mins and the paralysed individuals were retained (Brun and Plus 1980; Wilfert and Jiggins 2010; Chapter 3; Longdon et al. 2011). These were frozen at -80°C to rupture cells, homogenised in Ringer’s solution (Sullivan et al. 2000) (2.5µl/fly), and then briefly centrifuged twice, each time retaining the supernatant. This was passed through Millex PVDF 0.45µM and 0.22µM syringe filters (Millipore, Billerica, MA, USA) to remove any remaining host cells or bacteria, before being stored in aliquots at -80°C.

82 Chapter 5: Phylogenetic determinants of host shifts Injections

Stocks of each fly species were kept in half pint bottles of staggered ages, and each day freshly eclosed flies were sexed, males were removed, and females were aged at 18°C for 3 days on agar (recipe in supplementary materials) before injection. At the same time we stored remaining flies in ethanol for wing size measurements. The food medium, rearing temperature and whether each species was composed of single or multiple lines can be found Table S1.

Female flies were injected with 69nl of the virus extract intra-abdominally using a Nanoject II micro-injector (Drummond scientific, Bromall, PA, USA). Half the flies were frozen immediately in liquid nitrogen as a reference sample to control for relative dose size, and the rest were kept on agar medium at 18°C for 15 days to allow the virus to replicate before being frozen in liquid nitrogen. The day 15 time point was chosen based on pilot time-course data, and we note that the change in viral titre includes a decline in the virus following injection, followed by a growth/replication phase (Figure S1). Frozen flies were then homogenised in Trizol reagent (Invitrogen Corp, San Diego, CA, USA). Based on quantitative reverse- transcription PCR (qRT-PCR), the dose of the three viruses was similar (with a maximum of a 1.6x difference between viruses).

The injections were carried out over a period of 18 days, with the aim of completing 3 biological replicates for each virus per fly species (3 replicates each of the day 0 and day 15 treatments). The virus used (DAffSV, DMelSV or DObsSV) was rotated on a daily basis, whilst treatment (frozen immediately or on day 15) and the injection order of fly species were randomised each day. On average we injected and quantified viral titre in 10 flies per replicate (range of across species means=5-15). Out of the 153 fly-virus combinations, 126 had 3 biological replicates, 24 had 2 biological replicates and 3 had 1 biological replicate.

83 Chapter 5: Phylogenetic determinants of host shifts Other factors

Wolbachia endosymbionts have recently been shown to provide resistance to a range of positive sense RNA viruses (Hedges et al. 2008; Teixeira et al. 2008; Moreira et al. 2009; Bian et al. 2010). Although it does not affect the replication of DMelSV (L. Wilfert and M. Magwire, unpublished data), we nonetheless tested each species for Wolbachia using PCR primers that amplify the wsp gene (Zhou et al. 1998).

We also checked that the body size of the different species did not affect our results. To do this, we measured wing length, which is commonly used as a body size measure in Drosophila and strongly correlates with thorax length (Sokoloff 1966; Huey et al. 2006). Wings were removed from ethanol-stored flies, photographed under a dissecting microscope and the length of the IV longitudinal vein from the tip of the proximal segment to where the distal segment joins vein V (Gilchrist et al. 2001) was measured (relative to a standard measurement) using ImageJ software (NIH, U.S.A, v1.43u).

Measuring change in viral titre

Viral titres were estimated using qRT-PCR. To ensure that we only amplified viral genomic RNA and not messenger RNA, the PCR primers were designed to amplify a region spanning two different genes. The copy number of viral genomic RNA was expressed relative to the endogenous control housekeeping gene RpL32 (Rp49). We designed different RpL32 primers specific for each species. First, we sequenced the RpL32 gene from all of the species (we were not able to amplify RpL32 from Drosophila busckii, see Table S2). We then designed species-specific primers in two conserved regions (Table S3).

Total RNA was extracted from our samples using Trizol reagent, reverse-transcribed with Promega GoScript reverse transcriptase (Promega Corp, Madison, WI, USA) and random hexamer primers, and then diluted 1:4 with DEPC treated water. The qRT-PCR was performed on an Applied Biosystems StepOnePlus system using a

84 Chapter 5: Phylogenetic determinants of host shifts Power SYBR Green PCR Master-Mix (Applied Biosystems, CA, USA) and 40 PCR cycles (950C for 15 sec followed by 600C for 1min). Two qRT-PCR reactions (technical replicates) were carried out per sample with both the viral and endogenous control primers. Each qRT-PCR plate contained a standard sample, and all experimental samples were split across plates in a randomised block design. A linear model was used to correct for the effect of plate. We repeated any samples where the two technical replicates had cycle threshold (Ct) values more than 1.5 cycles apart after the plate correction.

To estimate the change in viral titre, we first calculated ΔCt as the difference between the cycle thresholds of the sigma virus qRT-PCR and the endogenous control. The viral titre of day 15 flies relative to day 0 flies was then calculated as 2- ΔΔCt , where ΔΔCt = ΔCtday0 – ΔCtday15, where ΔCtday0 and ΔCtday15 are a pair of ΔCt values from a day 0 biological replicate and a day 15 biological replicate for a particular species-virus combination. We used a dilution series to calculate the PCR efficiency of the three sets of viral primers and thirteen of the RpL32 primer combinations (covering 40 of the 51 Drosophila species). The efficiencies of the three virus primers were 95%, 97%, and 100%, (DAffSV, DMelSV and DObsSV) and the average efficiency of RpL32 primers across species was 106%, with all being within a range of 98-112%.

Host phylogeny

The host phylogeny was inferred using the COI, COII, 28S rDNA, Adh, SOD, Amyrel and RpL32 genes. We downloaded all the available sequences from Genbank, and attempted to sequence COI, COII, 28S rDNA, Adh and Amyrel in those species from which they were missing (details in Table S4). This resulted in sequence for all species for COI, COII and 28S and partial coverage for the other genes (50 out of 357 species-locus combinations were missing from the data matrix). The sequences of each gene were aligned using ClustalW (alignments and accession numbers in supplementary materials). To reconstruct the phylogeny we used BEAST (Drummond and Rambaut 2007), as this allows construction of an ultrametric (time- based) tree using a relaxed molecular clock model. The genes were partitioned into 3

85 Chapter 5: Phylogenetic determinants of host shifts groups each with their own substitution and molecular clock models. The three partitions were: mitochondrial (COI, COII); ribosomal (28S); and nuclear (Adh, SOD, Amyrel, RpL32). Each of the partitions used a HKY substitution model (which allows transitions and transversions to occur at different rates) with a gamma distribution of rate variation with 4 categories and estimated base frequencies. Additionally the mitochondrial and nuclear data sets were partitioned into codon positions 1+2 and 3, with unlinked substitution rates and base frequencies across codon positions. Empirical studies suggest that HKY models which gives a good fit to most protein coding data sets (Shapiro et al. 2006). A random starting tree was used, with a relaxed uncorrelated lognormal molecular clock and we used no external temporal information, so all dates are relative to the root age. The tree-shape prior was set to a speciation-extinction (birth-death) process. The BEAST analysis was run for 100 million MCMC generations sampled every 1000 steps (additionally a second run was carried out to ensure convergence). The MCMC process was examined using the program Tracer (Rambaut and Drummond 2007) to ensure convergence and adequate sampling. Trees were visualised using FigTree (v. 1.3; http://tree.bio.ed.ac.uk/software/figtree/).

Statistical Analysis

We used a phylogenetic mixed model to examine the effects of host relatedness on viral persistence and replication in a new host (Lynch 1991; Housworth et al. 2004; Hadfield and Nakagawa 2010). This framework allows (random) phylogenetic effects to be included in the model, with the correlation in phylogenetic effects between two host species being inversely proportional to the time since those two host species shared a common ancestor. We fitted the model using a Bayesian approach in the R package MCMCglmm (Hadfield 2010, R Foundation for Statistical Computing, Vienna, Austria) and REML in ASReml. The two methods gave similar results so we only report the Bayesian analysis. The model had the form:

yvsi = "v + dv:s"v:d + uv:p + uv:s + evsi

!

86 Chapter 5: Phylogenetic determinants of host shifts

th where yvsi is the viral titre of the i biological replicate of host species s infected with

virus v. "v is the intercept term for virus v, and can be interpreted as the viral

replication rate in the species at the root of the phylogeny. dv:s is the phylogenetic (patristic) distance between the original host of virus v and species s, and the ! associated regression coefficient ("v:d ) determines the degree to which viral replication rate changes as the phylogenetic distance increases. The random effect

uv:p is the deviation from the expected viral replication rate for virus v due to ! historical processes (i.e. the host phylogeny). The random effect uv:s is the deviation from the expected viral replication rate of virus v in host s that is not accounted for

by the host phylogeny. The residual is evsi, which included within-species genetic effects, individual and micro-environment effects and measurement/experimental error. The random effects (including the residual) are assumed to come from multivariate normal distributions with zero mean vectors (because they are

deviations) and structured covariance matrices. Denoting up:v as a vector of

phylogenetic effects across species for virus v, and A as a matrix with elements ajk representing the proportion of time that species j and k have had shared ancestry ! since the root of the phylogeny:

-u * ' -0* - . 2 A . A . A*$ p:v A % p:v A p:v A ,vM p:v A ,vO " + ( + 2 ( u ~ N% +0(, . A . A . A " = N(0,V ! A) + p:vM ( + ( + p:v A ,vM p:vM p:vM ,vO ( p % 2 " +u ( % +0( +. A . A . A(" , p:vO ) & , ) , p:v A ,vO p:vM ,vO p:vO )# where 2 is the variance of phylogenetic effects for DAffSV, and is the " p:vA " p:vA ,v M covariance between phylogenetic effects for DAffSV and DMelSV.

! Similar distributions are assumed for species effects: !

" u % $ s:vA ' u ~ N(0,V ( I) $ s:v M ' s $ u ' # s:vO & where I is an identity matrix indicating that species effects are independent of each other. The posterior modes for " 2 were close to zero for viruses DAffSV and ! v:s

!

87 Chapter 5: Phylogenetic determinants of host shifts

2 DObsSV and these were omitted from the model (except for the calculation of σ p/( 2 2 σ p+ σ s), see below).

The residuals are distributed as:

" e % $ vA ' e ~ N(0,V ( I) $ v M ' e $ e ' # vO &

The off-diagonal elements of Ve (i.e. the covariances) were set to zero since viruses were not replicated within biological replicates. In a Bayesian analysis prior ! probability distributions have to be specified for the fixed effects and the covariance matrices. As described in detail in the Supplementary Methods, we used several different priors to check if the results are sensitive to the choice of prior. The results

presented were obtained using parameter expanded priors for the Vp and Vs matrices.

The P-values reported (PMCMC) correspond to 2pmin, where pmin is the smaller of the two quantities a) the proportion of iterations in which the posterior distribution is positive or b) the proportion of iterations in which the posterior distribution is negative. The 95% credible intervals (CI) were taken to be the 95% highest posterior density intervals. The parameter estimates are the marginal means of the posterior distribution. Significance of the fixed effects was inferred if the 95% CI of the posterior distribution did not cross zero, and the P-values were equal to or less than 0.05.

We also checked whether several additional factors affected viral replication by repeating the analysis with these factors included in the model as fixed effects. There was no significant effect of wing size (an average of 33 measured per species,

PMCMC=0.50), Wolbachia (table S4, PMCMC =0.51) or rearing temperature (PMCMC =0.55). We also repeated the analysis with three outliers removed, so that the distribution of the residuals was not significantly different from normal according to an Anderson-Darling test (A=0.61, P=0.11). The parameter estimates were very similar to those obtained when including all the taxa (as reported in the results).

88 Chapter 5: Phylogenetic determinants of host shifts

5.3 Results

We measured the change in viral titre over 15 days for three sigma viruses each injected into 51 species of Drosophila, including their natural hosts (see Figure 5.1). In total we injected and quantified viral titre in 887 biological replicates (a total of 8762 flies). To allow us to investigate how the host phylogeny affects the ability of the virus to persist and replicate in the different species, we reconstructed the phylogeny of all 51 species using the sequences of seven different genes. The resulting tree broadly corresponds to previous studies (O'Grady and Desalle 2008; van der Linde et al. 2010), with the close phylogenetic relationships being generally well supported and more ancient nodes were less well supported (Figure 5.1).

There are two ways in which the host phylogeny could affect the ability of the three viruses to infect new host species. First, the chances of successful infection may be higher in species that are more closely related to the natural host. Second, related species may share similar levels of susceptibility independently of how related they are to the natural host — an effect that we refer to as the ‘phylogenetic effect’. To separate these two processes we fitted a phylogenetic mixed model to our data.

All three viruses have greater viral titres in fly species that are more closely related to their natural host (Figure 5.2). If we assume that titres of all three viruses decline with genetic distance from their natural host at the same rate, then there is a significant negative relationship between titre and distance (slope: βd= -1.96; 95%

CI= -3.66, -0.43; PMCMC =0.022). If we instead allow the effect to differ between viruses, the negative effect of genetic distance from the natural host on replication is greatest for DObsSV (Figure 5.2; slope: βO:d = -4.03; 95% CI =-6.11, -0.94; PMCMC =0.005), is smaller and only marginally significant for DAffSV (Figure 5.2; slope:

βA:d = -1.82; 95% CI =-3.99, 0.37; PMCMC =0.095), and not significant for DMelSV

(Figure 5.2; slope: βM:d = -0.47; 95% CI =-3.06, 1.94; PMCMC =0.692). These effects were still present when the natural host species were removed from the analysis (data

89 Chapter 5: Phylogenetic determinants of host shifts not shown). Therefore, the rate at which viral titres decline with genetic distance of the new host from the natural host differs between the individual viruses.

There is also a strong influence of host phylogeny on viral replication that could not be explained by the distance of the novel host from the original host. The between- 2 species variance consists of two components; σ p, which is the variance that can be 2 explained by the host phylogeny, and a species-specific component σ s which cannot be explained by the host phylogeny. These statistics do not include the effects of the distance from the natural host, as this was included as a fixed effect in the model (Wilson 2008). To assess the importance of the host phylogeny, we calculated the proportion of the between-species variance that can be explained by the phylogeny 2 2 2 (σ p/( σ p+ σ s), which is similar to Pagel’s λ (Pagel 1999; Freckleton et al. 2002) or phylogenetic heritability (Lynch 1991; Housworth et al. 2004)). The phylogeny explained almost all of the between-species variance in viral titre for DAffSV and 2 2 2 DMelSV (σ p/( σ p+ σ s)=0.86, 95% CI=0.53-1 and 0.91, 95% CI=0.74-1, 2 2 2 respectively), and most of the between-species variation for DObsSV (σ p/( σ p+ σ s) = 0.72, 95% CI =0.43-0.98). Therefore, most of the differences between species in viral titres can be explained either by the host phylogeny or the distance from the natural host.

Is it the distance from the natural host, or host phylogeny per se, that is most important in determining viral replication and persistence in a new host? To allow a direct comparison of these two effects, we calculated the expected amount of change in viral titre from the root to the tips of the tree that will result from the phylogenetic effect. This was done by taking the product of the standard deviation of the phylogenetic effect and 2 " , which is the mean of a folded normal distribution with mean of zero. The expected difference between the ancestral trait value and a descendant trait value under Brownian motion is zero. However, there ! will be some variance around this expectation and so the expected divergence (i.e. absolute difference) will be non-zero. Under Brownian motion the distribution of differences is normal with zero mean and variance equal to the phylogenetic variance. The absolute difference is

90 Chapter 5: Phylogenetic determinants of host shifts therefore a folded normal (where the folding is around zero), which has mean 2*standard deviation/π which is the expected divergence. This gave values of 2.15, 3.28 and 2.69 viral-titre-units for DAffSV, DMelSV and DObsSV respectively. These can be compared directly to the estimates described above of the amount of change in viral titre as the genetic distance from the natural host increases (-1.82, - 0.47 and -3.70 viral-titre-units for DAffSV, DMelSV and DObsSV, respectively). The time from the root to tip of the phylogeny is ~40 million years (Russo et al. 1995), so for every ~40 million year

91 Chapter 5: Phylogenetic determinants of host shifts Figure 5.1 Phylogeny of host species and the respective mean change in viral titre (log2 scale) for each species-virus combination. Natural host-virus combinations are shaded in red. The phylogeny is mid-point rooted, node labels are posterior supports, the scale bar is number of substitutions per site and the scale axis represents the approximate age since divergence in millions of years (my) based on estimates from (Russo et al. 1995).

Figure 5.2 The effect of the genetic distance of a novel host from the natural host on the titre of three sigma viruses 15 days after injection. The estimates of viral titre have been corrected for phylogenetic effects and are plotted on a log2 scale. Genetic distance is relative to the distance from root to tip (root to tip equals 1). Trend line is for illustrative purposes. travelled along the phylogeny, or from the natural host, we expect to see the above changes in viral titre. From these estimates it is clear that over this timescale the two processes are of similar importance for DAffSV and DObsSV, but that the host- phylogeny is much more important than distance-from-the-original-host in determining the replication and persistence of DMelSV in a new host.

Differences between hosts in viral replication and persistence could either reflect differences in susceptibility to all three viruses (‘general susceptibility’), or the effects on the three viruses could be independent (‘specific susceptibility’). We found that most of the phylogenetic effect was caused by species differing in their

92 Chapter 5: Phylogenetic determinants of host shifts level of general susceptibility, as there were strong phylogenetic correlations between viruses (table 5.1). Furthermore, the correlation is not greater between the two viruses that naturally infect closely related hosts (DAffSV and DObsSV). Therefore, the phylogenetic effects mean that a given host species’ susceptibility to one virus is strongly correlated to its susceptibility to another sigma virus, regardless of whether the virus originated from a closely or distantly related host.

Viruses Phylogenetic correlation 95% CI r DAffSV,DObsSV 0.67 0.33-0.96 DAffSV-DMelSV 0.74 0.50-0.95 DObsSV-DMelSV 0.78 0.54-0.98 Table 5.1 Phylogenetic correlations and 95% CI between each pair of viruses.

The analysis above assumes that we have the correct phylogeny, but some of the relationships are poorly resolved (Figure 5.1). To check whether this affected our results, we repeated the analysis integrating over the posterior sample of trees generated during the phylogenetic analysis (Pagel 1994). This was achieved by fitting the phylogenetic mixed model to 2000 different trees from the posterior sample (from 100,000 trees we used a burn-in of 30,000 trees and then used every 35th tree). This gave very similar results to our main analysis, suggesting that phylogenetic uncertainty does not affect our conclusions. We would note however, 2 that σ p is biased downwards whenever the tree is incorrect, and this bias is not removed by this procedure.

5.4 Discussion

We found that the ability of three sigma viruses to persist and replicate in 51 different species of Drosophila is largely explained by the host phylogeny. The effect of phylogeny can be broken down into two components; not only did viral

93 Chapter 5: Phylogenetic determinants of host shifts titres tend to decline with increasing genetic distance from the natural host, but there is also a tendency for related hosts to have similar titres, independent of this effect.

The decline in viral titres with increasing distance from the natural host suggests that the greater the change in the cellular environment, the less well adapted the virus is. This might be caused by changes in the cellular machinery used by the virus in its replication cycle, or the virus being less adept at avoiding or suppressing the immune response. Regardless of the causes of this effect, it suggests that successful host shifts may be more likely between closely related hosts (Woolhouse et al. 2005). This is supported by the observation that the titre of DMelSV in D. melanogaster correlates with the rate at which the virus is transmitted (Bregliano 1970; Brun and Plus 1980) and therefore the ability of the virus to spread though the populations. It has also been reported that although DMelSV will replicate in a range of Drosophila, it was stably transmitted only in the closely related Drosophila simulans and not the more distantly related Drosophila funebris (L'Heritier 1957). Furthermore, there is tentative evidence that host shifts of sigma viruses may often occur between closely related species in natural populations. Although comparisons of Drosophila and sigma virus phylogenies show evidence of past host shifts, the host and virus phylogenies are more similar than expected by chance (Chapter 4; Longdon et al. 2011). This may be the result of more frequent host switches between closely related species, as would be predicted by our results (although cospeciation would produce the same pattern).

This result is interesting because it has previously been questioned whether the genetic distance between host species plays an important role in predicting the source of host shifts, especially for RNA viruses (Woolhouse et al. 2005; Parrish et al. 2008). Indeed, some plant viruses can replicate in an enormous range of species; Cucumber mosaic virus can infect 1, 300 species in over 100 families of plants and Tomato spotted wilt virus can infect 800 species in 80 plant families (Hull 2009). The use of conserved receptors to enter host cells may be key to such large potential host ranges (Baer et al. 1990; Baranowski et al. 2001; Woolhouse 2002). However, although a virus may be able to enter the cells of many different species, it then relies

94 Chapter 5: Phylogenetic determinants of host shifts on numerous different components of the cellular machinery to replicate efficiently, and this may make shifts to hosts that are distant from the natural host unlikely.

A factor that could lead to changes in host suitability across the phylogeny is selection for resistance to viruses. One reason to suspect that this may be important is that genes involved in antiviral immunity often evolve exceptionally rapidly in Drosophila (Obbard et al. 2006; Obbard et al. 2009; Obbard et al. 2011), and this may translate into rapid phenotypic changes in host susceptibility. If this process is driving the patterns that we see, then the observation that natural host-parasite combinations tend to be more susceptible would suggest that the viruses have been able to overcome these host defences, resulting in viruses that are well adapted to their natural hosts, rather than vice versa

After accounting for the effect of distance from the natural host, the host phylogeny still explains most of the remaining variation in viral titre between species. This ‘phylogenetic effect’ means that that closely related host species will have similar levels of resistance due to their non-independence as a result of common ancestry. For two of the viruses (DAffSV and DObsSV), we found that this phylogenetic effect was of comparable importance to the effect of genetic distance from the natural host, and for the third virus (DMelSV) it was more important. While the distance effect may be explained by increasing differences in the host cellular environment as described above, a phylogenetic effect could also produce this pattern. The establishment of a parasite in a host species may be determined by a phylogenetic effect (i.e. a natural host may have become infected in the first place as it was particularly susceptible compared to other species), which could also explain the decline in viral titre in species distantly related to the natural hosts.

The phylogenetic effect is mostly caused by variation in susceptibility to all three viruses (there is a strong phylogenetic correlation in the titres of the three viruses). Such patterns may arise if the common ancestors of different host clades have acquired or lost immune or cellular components that affect susceptibility to all sigma viruses. The frequent gain and loss of immune components is well established. For

95 Chapter 5: Phylogenetic determinants of host shifts example, Drosophila species in the obscura group have lost a type of blood cell (lamellocytes) that are found in other Drosophila, which means they are particularly susceptible to parasitoid wasps (Havard et al. 2009). Similarly, a class of antifungal peptides (drosomycins) are found only in the melanogaster group of Drosophila (Jiggins and Kim 2005; Sackton et al. 2007), and components of antiviral RNAi pathways have lineage-specific distributions (Obbard et al. 2009). The strong phylogenetic correlation between the three viruses we studied might seem surprising as these viruses are very different to one another at the sequence level (amino-acid identities are ~20%-40%; Chapters 2 and 4; Longdon et al. 2010; Longdon et al. 2011). However, even viruses which show no similarities at the sequence level often share elements of protein structure (Rossmann and Tao 1999; Walker and Kongsuwan 1999; Koonin et al. 2008), and different rhabdoviruses are known to have similar modes of action (for example, infecting nervous tissue; Brun and Plus 1980; Fu 2005).

The strong phylogenetic effect that we found also has practical implications for comparative studies of resistance in different species. It means that observations on related species will not be independent, so it is essential to account for these effects in the analysis of comparative data (Felsenstein 1985). For example, the decline in the resistance of novel hosts with genetic distance from the natural hosts that has been observed in some previous studies may be attributable to a phylogenetic effect, rather than distance itself.

In conclusion, our results show that the host phylogeny is an important determinant of viral persistence and replication in novel hosts, and therefore may also be an important influence on the source of new emerging diseases. The effect is more subtle than simply leading to a decline in infection success with genetic distance from the original host, because the strong phylogenetic effect may sometimes result in susceptible hosts being grouped in phylogenetically distant clades, allowing parasites to jump great phylogenetic distances. Indeed, here we found the most distantly related clade to all of the natural hosts (the Scaptodrosophila) have one of the highest viral titres before correcting for the effect of host phylogeny. The

96 Chapter 5: Phylogenetic determinants of host shifts importance of these phylogenetic effects on replication and persistence relative to factors affecting exposure and onward transmission requires further study if we are to understand how they affect a parasites ability to host shift in nature.

Acknowledgments

We would like to thank the following people who kindly supplied flies; Mike Ritchie, Gil Smith, Paris Veltsos, Shuo-yang Wen, Bryant McAllister, John Roote, Rhonda Snook, Sarah Fahle, John Jaenike, Penny Haddrill, Andrea Betancourt, Lena Wilfert, and Terry Markow and Sergio Castrezana at the San Diego stock centre. Kim van der Linde and Greg Spicer provided advice about sequencing genes for the fly phylogeny. We thank Tom Little, Amy Pedersen and John Welch for helpful comments. Andrew Rambaut provided useful advice on creating the phylogeny. Greg Hurst first made us realise this is an interesting question.

97 Chapter 6: General Discussion 6. General Discussion

I wrote this chapter, with comments on structure from Darren Obbard and Frank Jiggins.

6.1 Summary of field

Understanding the diversity of pathogens, how they spread through host populations (or novel host species), and what factors affect their ability to host switch are all important questions in addressing disease emergence (Daszak et al. 2000; Woolhouse et al. 2005), but also raise intriguing questions in evolutionary biology (Woolhouse et al. 2002; Holmes 2009). In particular RNA viruses are widely thought to be one of the most likely sources of emerging disease (Parrish et al. 2008) and with their high mutation rates, offer the chance to observe evolution occurring on an experimental time scale. However, in addition to the plentiful field studies and in vitro cell culture experiments (Novella et al. 1995; Turner and Elena 2000; Greene et al. 2005; Anishchenko et al. 2006; Elena and Sanjuan 2007; Zhao 2007; Elena et al. 2009; Holmes 2009; Sharp and Hahn 2010), in vivo experimental studies of animal RNA viruses are needed to address questions relating to host-switching and host- parasite coevolution (Coffey et al. 2008). Drosophila has been well utilised as a model for understanding the mechanistic basis of pathogen resistance (Lemaitre and Hoffmann 2007), but comparatively few studies have examined its interactions with natural pathogens (Brun and Plus 1980; Lopez-Ferber et al. 1997; Ebbert et al. 2001; Hurst et al. 2001; Jaenike and Perlman 2002; Polak 2003; Lazzaro et al. 2006; Weeks et al. 2007; Juneja and Lazzaro 2009; Jaenike et al. 2010). While Drosophila’s ability to resist viral infections has been examined (Lemaitre and Hoffmann 2007; Huszar and Imler 2008), many of these studies have utilised non-natural viruses. It is unclear as to how resistance may differ between non-natural and natural coevolving pathogens (see 6.3 Future Directions). To study the co-evolutionary and ecological interactions between host and parasites, studying naturally occurring host-parasite combinations is essential (Little 2002). Sigma viruses and Drosophila therefore have

98 Chapter 6: General Discussion great potential as a model to study the fundamental questions relating to the evolution, ecology and emergence of infectious diseases.

6.2 Overview

6.2.1 Viral discovery and phylogenetics

Viruses are ubiquitous in nature, yet we have hardly begun to scratch the surface of the total viral biodiversity, with most of the reported viruses being in vertebrate hosts with direct impacts on human health and the economy (Kitchen et al. 2011). While this is clearly an essential focus, by additionally expanding our search to viruses infecting other organisms, major insights into the evolution of viruses can be made. Indeed there are currently only 32 reported genera of viruses infecting invertebrates with over 3 times as many reported genera in vertebrates (ICTVdB 2006), suggesting viruses in non-vertebrates may be under-represented.

The genus Drosophila is composed of over 2000 species (Markow and O'Grady 2007), yet we are currently aware of fewer than 10 viruses in this group (Felluga et al. 1971; Brun and Plus 1980; Lopez-Ferber et al. 1997; Huszar and Imler 2008), yet up to 40% of D. melanogaster populations have been reported to be infected with a virus (Brun and Plus 1980). Of these viruses some are simply reports of virus-like particles in flies, most are in D. melanogaster and whether some are natural pathogens or lab contaminants is unclear (for example Drosophila X virus has never been found in natural populations; Huszar and Imler 2008). Perhaps the best studied Drosophila virus is the sigma virus, a naturally occurring rhabdovirus found in D. melanogaster. Rhabdoviruses are unusual as they are one of only three known families of viruses to infect plants, invertebrates and vertebrate organisms (Fu 2005; Ammar et al. 2009) (the other two being Reoviridae and Bunyaviridae (Holmes 2009)). Therefore by increasing our understanding of the diversity of this group we will get a better grasp on its evolutionary history, and more generally how these transitions between different host taxa occur (Bourhy et al. 2005).

99 Chapter 6: General Discussion

In this thesis I attempted to uncover a small amount of the diversity in rhabdoviruses that infect Drosophila. In doing so I discovered six novel sigma viruses (one outside of the genus Drosophila), bringing the total number of these viruses to seven (Chapters 2 and 4). These viruses were isolated from three continents, suggesting sigma viruses may be widely distributed across insect species and found on a global scale. By completing the genome sequence of DMelSV, its phylogenetic position has been resolved for the first time. Together with the other sigma viruses it forms a clade of dipteran-infecting viruses, which are closely related to the dimarhabdoviruses, a group of vertebrate pathogens vectored by arthropods. The sigma virus clade is ancient, showing a greater level of diversity than four of the six previously described rhabodovirus genera, suggesting this is a diverse group of viruses, perhaps with many more sigma-like viruses waiting to be discovered. From this, and reports of CO2 sensitivity and rhabdovirus particles in other insects (Brun and Plus 1980; Shroyer and Rosen 1983; Afzelius et al. 1989), it seems likely sigma viruses are common pathogens of insects (Chapters 2 and 4). More generally, by discovering such viruses in non-vertebrate species, we may broaden our understanding of the relationships between viruses, how viruses evolve and how transitions between host taxa occur.

6.2.2 Transmission and population dynamics

While vertically transmitted bacteria which are inherited maternally are common (Duron et al. 2008), we know very little about vertically transmitted viruses. Even less common are paternally transmitted viruses (Longdon and Jiggins 2010), with only a few reported cases of transmission through sperm, or pollen (Frosheiser 1974; Afzelius et al. 1989; Mink 1993; Lopez-Ferber et al. 1997; Li et al. 2007). The simplest explanation for the scarcity of reports of paternally transmitted parasites is that sperm/pollen are small and contribute little cytoplasm to the zygote, so parasites are physically excluded from male gametes. This is consistent with the observation that even among paternally transmitted parasites the rate of transmission through male gametes is usually lower than through eggs (Frosheiser 1974; Brun and Plus

100 Chapter 6: General Discussion 1980; Li et al. 2007). One theory to explain why sperm have evolved to be small compared to eggs is for this very purpose; by excluding paternal cytoplasm they prevent transmission of microbes and organelles to the next generation to avoid cytoplasmic conflict (Hurst 1990). Due to this, there is often a misconception that sperm cannot transmit parasites at all. Vertical transmission through both sperm and eggs can be a potent driving force to allow a parasite to spread through a host population (L'Heritier 1970). DMelSV is a rare example of a biparentally transmitted parasite, but it was previously unclear whether this was an odd curiosity or whether such parasites are more common than we might expect.

In Chapter 3 I found the newly discovered DAffSV and DObsSV are both vertically transmitted, and like DMelSV this transmission is through both eggs and sperm. Horizontal transmission appears rare or absent. Males have reduced transmission compared to females, which appears to be due to the reduced viral titre in zygotes infected by sperm rather than eggs. Using simulations based on these experimental transmission rates, I found that this biparental mode of transmission allows the virus to invade and rapidly spread through populations, even it is carries a cost to the host. I then described such a spread in natural populations of DObsSV in the UK, and based on analysis of sequence data suggested that this spread has been very recent, within approximately the last decade, which is consistent with the simulations results

Similarly DMelSV has undergone a recent sweep through its host population (Fleuriet et al. 1990; Fleuriet and Sperlich 1992) in response to the evolution of a host resistance gene (Bangham et al. 2007), and similar sweeps have been observed in a number of other vertically transmitted symbionts (Turelli and Hoffmann 1991; Hornett et al. 2006; Charlat et al. 2009; Jaenike et al. 2010; Himler et al. 2011). It therefore seems vertically-transmitted rhabdoviruses are common pathogens of insects, and this mode of transmission offers an alternate strategy of spread than those observed in endosymbiotic bacteria (Engelstadter and Hurst 2009).

101 Chapter 6: General Discussion 6.2.3 Host-switching

While some parasite species co-speciate with their hosts at a rate greater than we would expect by chance (Hafner and Page 1995; McGeoch et al. 2000; Switzer et al. 2005), many contemporary host-parasite interactions are the result of a host switching event, where a parasite jumps from one host species to another (Woolhouse and Gowtage-Sequeria 2005). Vertically-transmitted arthropod endosymbionts are widespread and might be expected, a priori, to display extreme host-fidelity and thus co-speciate with their hosts.

In Chapter 4 I examined whether the vertically-transmitted sigma viruses have undergone host switches. There was significant incongruence between host and viral phylogenies suggesting that sigma viruses have host switched in their evolutionary history. The incongruence was due to two major inconsistencies between the host and viral topologies, and the switches were between flies collected on different continents suggesting multiple switching events have occurred.

Other vertically transmitted endosymbionts, most notably parasitic Wolbachia, do not seem to co-diverge with their hosts (Baldo et al. 2006; though see Opijnen et al. 2005 for a Nasonia-Wolbachia cospeciation as an exception). This may be due to the evolution of host repressors of Wolbachia (Hornett et al. 2006) causing extinctions, so preventing persistence across speciation events (Koehncke et al. 2009). Similar processes may be occurring in sigma viruses with infections in a given host being relatively short-lived, possibly due to the spread of host resistance genes (Bangham et al. 2007). The ability of sigma viruses to survive in the long term may rely on rare horizontal transmission events between host lineages, possibly through parasitic mites (see Chapter 2).

6.2.4 Host-shift determinants

A large number of emerging diseases are the result of a host-shift, where a parasite jumps from one host species to another (Woolhouse et al. 2005). However, we still

102 Chapter 6: General Discussion have a great deal to learn about predicting which pathogens are likely to undergo host shifts. Sympatry is clearly a prerequisite, but what other factors affect a parasite’s ability to host shift? Both parasite and host properties may affect parasites’ ability to host shift. For example the high mutation rates and large population sizes of RNA viruses have been suggested to make them the pathogen group with the greatest potential for host switching (Woolhouse et al. 2005; Parrish et al. 2008). Likewise the genetic distance between donor and recipient host species may affect a parasites ability to infect a novel host (Daszak et al. 2000; Davies and Pedersen 2008).

Experimental studies of host-shifts may highlight factors that play a role in disease emergence. Given sigma viruses have undergone host switches, I examined the effect of host phylogeny on the ability of these viruses to host shift (chapter 5). I found the host phylogeny plays a major role in determining these viruses ability to persist and replicate in a novel host. Viral titre decreased with increased genetic distance from the natural host. Similarly, there was a large phylogenetic effect (after accounting for the effect of genetic distance from the natural host) for all of the viruses. There were also strong phylogenetic correlations between the viruses, suggesting the phylogenetic effect is mostly caused by general susceptibility to all three viruses.

Previously the importance of host phylogeny has been questioned as a predictor of host shifts (Woolhouse et al. 2005). Here I have shown it is highly important for sigma viruses; this seems to be a general trend also found elsewhere (Perlman and Jaenike 2003; Streicker et al. 2010), with infection success declining in more genetically distant hosts. Additionally, the large amount of variation explained by the phylogenetic effect suggests some clades of host species may be very resistant or susceptible to certain pathogens. Such trends may be observed in nature as preferential host switching has been suggested to be responsible for closely related viruses infecting clusters of closely related species (Charleston and Robertson 2002; De Vienne et al. 2007; Kitchen et al. 2011). By teasing apart these questions, through

103 Chapter 6: General Discussion experiments and collecting data in the field, we can try and explain such patterns and improve our understanding of disease emergence.

6.3 Future Directions

In this thesis I have examined the diversity, ecology and evolution of sigma viruses in Drosophila. The data presented suggests that sigma viruses are likely to be widespread vertically transmitted insect viruses, which have dynamic interactions with their hosts. However, many questions still remain, with these viruses offering great potential to answer a number of questions relating to pathogen evolution.

6.3.1 Host shifts

Many host-parasite interactions are the result of a host-switching event (Chapters 4 and 5), where a parasite jumps from one host species to another (Woolhouse and Gowtage-Sequeria 2005). If estimates of incidence across host species can be obtained, then inferences can be made about which host species parasites are switching between (Davies and Pedersen 2008). For example, do parasites switch between hosts in the same niche, species on the same continent, or perhaps preferentially between closely related species? By discovering more sigma-like viruses we may better understand the relationships of these viruses to one another, which in turn would give us a greater understanding of which species they have switched between (Chapter 4). This will also uncover a greater array of the diversity in this group of viruses and the rhabdoviruses as a whole, which may improve our understanding of the evolutionary history of these viruses, and help resolve their phylogeny, for example by breaking up long branches (Hillis 1996).

6.3.2 Sweeps

The sigma viruses of D. melanogaster and D. obscura both appear to have undergone a recent sweep though their host populations (Fleuriet and Periquet 1993;

104 Chapter 6: General Discussion Fleuriet 1994; Chapter 3). This raises the question as to whether such sweeps are common in sigma viruses? If it is possible to catch a sweep occurring in a species in which we have already sampled the virus, we can try to understand the selective advantage of the virus that has swept through host population. Similarly by monitoring a number of species of Drosophila we can examine whether sigma viruses are host switching and sweeping through novel host populations. This could offer insights into host-parasite coevolution in the wild, but also allow controlled laboratory experiments to try to unravel the processes that are occurring.

6.3.3 Costs of being paternally transmitted

A number of questions remain to be answered about sigma virus’ basic biology. In particular we do not yet understand how the virus invades the host gametes (although see: (Teninges 1968; Bregliano 1970). Additionally, examining the costs on male fertility of harbouring these viruses would also be of interest as a contrast to costs associated with other parasitic endosymbionts; do these viruses effect sperm competition? (Champion de Crespigny and Wedell 2006); do infected sperm have reduced survival, motility or longevity? (Snook et al. 2000; Holman and Snook 2008); do the non-fertilising “short” sperm of the obscura group hosts carry the virus? (Holman et al. 2008), and can repeated copulation temporarily rid males sperm of the of the virus as in Wolbachia infections (Karr et al. 1998)?

6.3.4 RNAi as a host defence

It has recently been suggested the RNAi pathway is involved in resistance to another rhabdovirus (vesicular stomatitis virus) in Drosophila (Mueller et al. 2010). It was previously thought negative sense RNA viruses do not produce double stranded (ds) RNA accessible to host RNAi components (dsRNA is a prerequisite of RNAi silencing (Obbard et al. 2009). While no detectable amounts of double stranded RNA were detected, viral short interfering RNAs were detected (Mueller et al. 2010). These short RNA equally matched both positive and negative genomic copies of the virus, suggesting RNAi processing of double stranded RNA from the host genome

105 Chapter 6: General Discussion and its replication intermediate, rather than as a result of secondary structure. It would be of significant interest to see whether this is a general response to negative sense RNA viruses, and whether naturally occurring viruses such as sigma viruses induce an RNAi antiviral response, in particular as RNAi pathways are amongst the most rapidly evolving genes in the Drosophila genome (Obbard et al. 2006; Obbard et al. 2009), suggesting they may be locked in coevolutionary arms races with their viral parasites. If sigma viruses do elicit an RNAi response, it is of interest as to whether this is a response against inherited infection or invading viruses, and whether sigma viruses try to suppress the RNAi pathway as is observed for other RNA viruses, including Drosophila C virus (van Rij et al. 2006).

6.3.5 Host resistance and disease emergence

While we know genetic variation to non-natural pathogens exists (Lazzaro et al. 2004; Tinsley et al. 2006) we do not know how it differs to that of pathogens which have been naturally coevolving with their hosts. Notably, resistance to non-natural pathogens may be due to standing genetic variation not currently under selection, and so may behave differently to those arising through co-evolution (for example; do they explain less of the variation in resistance and how do they behave when selection is weak post-reproductive senescence).

It could be possible to test this by examining D. melanogaster’s ability to resist sigma viruses isolated from different host species. Using genome wide association studies, polymorphisms could be identified which are involved in resistance to novel sigma viruses. Are these the same polymorphisms involved in resistance to its own virus (the phylogenetic correlations from Chapter 5 suggests this may be possible), or different ones (in the same or different genes)? This could be done on several scales, for example using sigma viruses of varying distances from DMelSV, and perhaps even more distantly related rhabdoviruses. For example vesicular stomatitis virus (VSV) and other vesiculoviruses have previously been shown to replicate in, and adapt to culture in D. melanogaster (Bussereau and Contamine 1980; Laurent 1983), but the ref(2)P resistance allele is not thought to affect replication of VSV

106 Chapter 6: General Discussion (Wyers et al. 1993). Similarly insects exposed to a single insecticide have been shown to evolve cross resistance to a number of insecticides, as they share the same target (although bind to it by different mechanisms; Cochrane et al. 1998).

Not only could this inform us on how much variation there is in resistance to a non- natural pathogen, but it may also enable us to detect the “ghosts of coevolution past” (Vale and Little 2010). These are genes that have previously swept through the host population in response to sigma virus infection, but following reciprocal adaptations from the virus, no-longer play a role in viral resistance. This could also provide insights into how coevolving with a related parasite may provide cross-resistance to parasites from other host species. For example, D. melanogaster lines selected for resistance to certain parasitoid species were cross resistant to other parasitoids (Fellowes et al. 1999); however, it remains to be seen as to whether hosts are more resistant to parasites closely related to the parasite they have coevolved with.

6.3.6 Other factors affecting host shifts

It is not only the phylogeny of the host that may be a determinant of host shifts; the parasite phylogeny has also been suggested to have an effect (Perlman and Jaenike 2003; de Vienne et al. 2009). Parasites infection success may be negatively correlated with genetic distance from the natural pathogen (of that host) to the novel- infecting parasite. While the host phylogeny is thought to explain other aspects of parasite ecology (Alizon et al. 2010), further research is needed to examine the generality of parasite phylogeny on the success host shifts. The sigma viruses of which there are currently lab isolates (DAffSV, DMelSV and DObsSV) form a soft polytomy so the isolation of other sigma viruses is needed to address this question.

6.3.7 Experimental evolution in novel hosts

If parasite fitness is reduced in a novel host, to the point where sustained transmission is not possible, then we might suggest this host-parasite combination to possess a low risk of emergence. However, due to their high mutation rates, RNA

107 Chapter 6: General Discussion viruses can adapt to novel hosts (Novella et al. 1995; Turner and Elena 2000; Elena and Sanjuan 2007) given adequate exposure. Determining the amount of genetic change a virus must undergo to adapt to a novel host is critical in assessing whether a virus posses a risk of evolving sustained transmission in a novel host (Pepin et al. 2010).

As sigma viruses appear to have the ability to replicate in a wide range of hosts, it would be interesting to examine the process of “adaptive fine tuning” (Pepin et al. 2010) and the fitness landscape of host switching and emergence (Parrish et al. 2008), in particular in hosts with lower rates of viral replication (see Chapter 5). By experimentally evolving sigma viruses in novel hosts (of varying genetic distances from the natural host) we could examine the virus’ ability to adapt to a novel host, and the genetics underlying this. For example, are a greater number of mutations required to adapt to a more genetically distant host (Holmes 2009), and is this process conserved with the same mutations evolving in different replicates? Such processes seem to occur during host shifts. Strains of SARs coronavirus that were transmitted human-human were found to carry a certain mutation; in human cell culture this mutation was found to increase viral fitness compared to the strain found in its palm civet donor host (Pepin et al. 2010). In Venezuelan encephalitis virus, a single mutation in the envelope glycoprotein allowed the virus to switch from infecting rodents to infect equine hosts (Anishchenko et al. 2006). Similarly in HIV drug resistance, the same mutations occur in a set order in different patients treated with the same antiviral drug (Boucher et al. 1992). If viruses do adapt to a novel host we could also examine whether this leads to a reduction in fitness in the original host and is this due to antagonistic pleiotropy (i.e. a beneficial mutation in the new host is detrimental in the original host) or an alternative process (for example, the accumulation of mutations into genes which are no longer required in the new host). Such results have been found in virus-cell culture experiments (Novella et al. 1995; Greene et al. 2005; Elena et al. 2009), plant-virus experiments (Wallis et al. 2007; Agudelo-Romero et al. 2008) and a mosquito-rodent in vivo system (Coffey et al. 2008).

108 Chapter 6: General Discussion Whilst we have learnt a great deal about the potential for host specialisation from experimental evolution experiments using RNA viruses (Elena and Sanjuan 2007; Elena et al. 2009), most of these studies are in vitro studies in animal cell culture which may represent a very different environment to the multi-cell and tissue types found in whole organisms (although see (Coffey et al. 2008), and note there are numerous studies in plants (Wallis et al. 2007; Agudelo-Romero et al. 2008; Elena et al. 2008; Elena et al. 2009)). In vitro studies are also often limited to a small number of different cell types. The use of in vivo animal studies carried out on a large number of different host species has the potential to provide a great deal of information on the role of natural selection in viral host shifts.

6.4 Conclusions

In this thesis I have used a mixture of fieldwork and lab experiments to examine the evolution of sigma viruses in Drosophila. The data I present suggests vertically transmitted sigma viruses are common pathogens in insects, and these viruses can rapidly sweep through host populations. These viruses have undergone host shifts during their evolutionary history, and I found that the host phylogeny is a major determinant of their ability to persist and replicate in novel hosts. Furthermore, this system has great potential as a model for answering fundamental questions in virus evolution and disease emergence.

109 References References

Afzelius, B. A., G. Alberti, R. Dallai, J. Godula and W. Witalinski (1989). "Virus- Infected and Rickettsia-Infected Sperm Cells in Arthropods." Journal of Invertebrate Pathology 53(3): 365-377. Agrawal, A. and C. M. Lively (2002). "Infection genetics: gene-for-gene versus matching-alleles models and all points in between." Evolutionary Ecology Research 4(1): 79-90. Agudelo-Romero, P., F. de la Iglesia and S. F. Elena (2008). "The pleiotropic cost of host-specialization in Tobacco etch potyvirus." Infect Genet Evol 8(6): 806-814. Ahne, W., H. V. Bjorklund, S. Essbauer, N. Fijan, G. Kurath and J. R. Winton (2002). "Spring viremia of carp (SVC)." Diseases of Aquatic Organisms 52: 261- 272. Alizon, S., A. Hurford, N. Mideo and M. Van Baalen (2009). "Virulence evolution and the trade-off hypothesis: history, current state of affairs and the future." J Evol Biol 22(2): 245-259. Alizon, S., V. von Wyl, T. Stadler, R. D. Kouyos, S. Yerly, B. Hirschel, J. Boni, C. Shah, T. Klimkait, H. Furrer, A. Rauch, P. L. Vernazza, E. Bernasconi, M. Battegay, P. Burgisser, A. Telenti, H. F. Gunthard and S. Bonhoeffer (2010). "Phylogenetic approach reveals that virus genotype largely determines HIV set-point viral load." PLoS Pathog 6(9). Altizer, S. and A. B. Pedersen (2008). Host-pathogen evolution, biodiversity and disease risk for natural populations. Conservation Biology: Evolution in Action. S. Carroll and C. Fox. NY, Oxford University Press: 259-277. Ambrose, R. L., G. C. Lander, W. S. Maaty, B. Bothner, J. E. Johnson and K. N. Johnson (2009). "Drosophila A virus is an unusual RNA virus with a T=3 icosahedral core and permuted RNA-dependent RNA polymerase." Journal of General Virology 90: 2191-2200. Ammar, E. D., C. W. Tsai, A. E. Whitfield, M. G. Redinbaugh and S. A. Hogenhout (2009). "Cellular and Molecular Aspects of Rhabdovirus Interactions with Insect and Plant Hosts." Annual Review of Entomology 54: 447-468. Anishchenko, M., R. A. Bowen, S. Paessler, L. Austgen, I. P. Greene and S. C. Weaver (2006). "Venezuelan encephalitis emergence mediated by a phylogenetically predicted viral mutation." Proc Natl Acad Sci U S A 103(13): 4994-4999. Baer, G. M., J. H. Shaddock, R. Quirion, T. V. Dam and T. L. Lentz (1990). "Rabies susceptibility and acetylcholine receptor." Lancet 335(8690): 664-665. Baldo, L., N. A. Ayoub, C. Y. Hayashi, J. A. Russell, J. K. Stahlhut and J. H. Werren (2008). "Insight into the routes of Wolbachia invasion: high levels of horizontal transfer in the spider genus Agelenopsis revealed by Wolbachia strain and mitochondrial DNA diversity." Mol Ecol 17(2): 557-569. Baldo, L., J. C. Dunning Hotopp, K. A. Jolley, S. R. Bordenstein, S. A. Biber, R. R. Choudhury, C. Hayashi, M. C. Maiden, H. Tettelin and J. H. Werren (2006). "Multilocus sequence typing system for the endosymbiont Wolbachia pipientis." Appl Environ Microbiol 72(11): 7098-7110. Bandelt, H. J., P. Forster and A. Rohl (1999). "Median-joining networks for inferring intraspecific phylogenies." Molecular Biology and Evolution 16(1): 37-48.

110 References Bandi, C., M. Sironi, G. Damiani, L. Magrassi, C. A. Nalepa, U. Laudani and L. Sacchi (1995). "The establishment of intracellular symbiosis in an ancestor of cockroaches and termites." Proc Biol Sci 259(1356): 293-299. Bangham, J., K. W. Kim, C. L. Webster and F. M. Jiggins (2008). "Genetic variation affecting host-parasite interactions: Different genes affect different aspects of sigma virus replication and transmission in Drosophila melanogaster." Genetics 178(4): 2191-2199. Bangham, J., S. A. Knott, K. W. Kim, R. S. Young and F. M. Jiggins (2008). "Genetic variation affecting host-parasite interactions: major-effect quantitative trait loci affect the transmission of sigma virus in Drosophila melanogaster." Molecular Ecology 17(17): 3800-3807. Bangham, J., D. J. Obbard, K. W. Kim, P. R. Haddrill and F. M. Jiggins (2007). "The age and evolution of an antiviral resistance mutation in Drosophila melanogaster." Proceedings of the Royal Society B-Biological Sciences 274(1621): 2027-2034. Bao, S. N., E. W. Kitajima, G. Callaini and R. Dallai (1996). "Virus-like particles and rickettsia-like organisms in male germ and cyst cells of Bemisia tabaci (Homoptera, Aleyrodidae)." Journal of Invertebrate Pathology 67(3): 309-311. Baranowski, E., C. M. Ruiz-Jarabo and E. Domingo (2001). "Evolution of cell recognition by viruses." Science 292(5519): 1102-1105. Basden, E. B. (1954). "The distribution and biology of Drosophilidae (Diptera) in Scotland, including a new species of Drosophila." Transaction of the Royal Society of Edinburgh 62: 602-654. Begon, M. (1976). "Temporal Variations in Reproductive Condition of Drosophila- Obscura Fallen and Drosophila-Subobscura Collin." Oecologia 23(1): 31-47. Bendtsen, J. D., H. Nielsen, G. von Heijne and S. Brunak (2004). "Improved prediction of signal peptides: SignalP 3.0." Journal of Molecular Biology 340(4): 783-795. Berkalof, A., J. Breglian and A. Ohanessi (1965). "Mise En Evidence De Virions Dans Des Drosophiles Infectees Par Le Virus Hereditaire Sigma." Comptes Rendus Hebdomadaires Des Seances De L Academie Des Sciences 260(22): 5956-&. Berkalof, A., J. Breglian and A. Ohanessi (1965). "Mise En Evidence Des Virions Dans Des Drosophiles Infectees Par Le Virus Hereditaire Sigma." Pathologie Biologie 13(19-2): 976-&. Bezier, A., M. Annaheim, J. Herbiniere, C. Wetterwald, G. Gyapay, S. Bernard- Samain, P. Wincker, I. Roditi, M. Heller, M. Belghazi, R. Pfister-Wilhem, G. Periquet, C. Dupuy, E. Huguet, A. N. Volkoff, B. Lanzrein and J. M. Drezen (2009). "Polydnaviruses of Braconid Wasps Derive from an Ancestral Nudivirus." Science 323(5916): 926-930. Bian, G., Y. Xu, P. Lu, Y. Xie and Z. Xi (2010). "The endosymbiotic bacterium Wolbachia induces resistance to dengue virus in Aedes aegypti." PLoS Pathog 6(4): e1000833. Bjorklund, H. V., K. H. Higman and G. Kurath (1996). "The glycoprotein genes and gene junctions of the fish rhabdoviruses spring viremia of carp virus and hirame rhabdovirus: Analysis of relationships with other rhabdoviruses." Virus Research 42(1-2): 65-80. Bjorkoy, G., T. Lamark, A. Brech, H. Outzen, M. Perander, A. Overvatn, H. Stenmark and T. Johansen (2005). "p62/SQSTM1 forms protein aggregates degraded

111 References by autophagy and has a protective effect on huntingtin-induced cell death." J Cell Biol 171(4): 603-614. Boucher, C. A., E. O'Sullivan, J. W. Mulder, C. Ramautarsing, P. Kellam, G. Darby, J. M. Lange, J. Goudsmit and B. A. Larder (1992). "Ordered appearance of zidovudine resistance mutations during treatment of 18 human immunodeficiency virus-positive subjects." J Infect Dis 165(1): 105-110. Bourhy, H., J. A. Cowley, F. Larrous, E. C. Holmes and P. J. Walker (2005). "Phylogenetic relationships among rhabdoviruses inferred using the L polymerase gene." Journal of General Virology 86: 2849-2858. Bourtzis, K., M. M. Pettigrew and S. L. O'Neill (2000). "Wolbachia neither induces nor suppresses transcripts encoding antimicrobial peptides." Insect Molecular Biology 9(6): 635-639. Bras, F., D. Teninges and S. Dezelee (1994). "Sequences of the N-Gene and M-Gene of the Sigma-Virus of Drosophila and Evolutionary Comparison." Virology 200(1): 189-199. Bregliano, J. C. (1970). "Study of Infection of Germ Line in Female Drosophila Infected with Sigma Virus .2. Evidence of a Correspondance between Ovarian Cysts with Increased Virus Yield and Stabilized Progeny." Annales De L Institut Pasteur 119(6): 685-&. Brownlie, J. C. and K. N. Johnson (2009). "Symbiont-mediated protection in insect hosts." Trends in Microbiology 17(8): 348-354. Brun, G. and N. Plus (1980). The viruses of Drosophila. The genetics and biology of Drosophila. M. Ashburner and T. R. F. Wright. New York, Academic Press: 625- 702. Buchner, P. (1965). Endosymbiosis of Animals with Plant Microorganisms New York, Interscience, Inc. Buckling, A. and P. B. Rainey (2002). "Antagonistic coevolution between a bacterium and a bacteriophage." Proceedings of the Royal Society of London Series B-Biological Sciences 269(1494): 931-936. Busserea, F. (1970). "Study of Co2 Sensitivity Symptom Produced by Sigma Virus in Drosophila .2. Comparative Development of Efficiency of Nerve Centers and Different Organs after Intraabdominal and Intrathoracic Inoculation." Annales De L Institut Pasteur 118(5): 626-&. Busserea, F. (1970). "Symptons of Co2-Sensitivity Induced by Sigma Virus in Drosophila .1. Effect of Inoculation Site on Delay in Symptom Appearance." Annales De L Institut Pasteur 118(3): 367-&. Bussereau, F. and D. Contamine (1980). "Infectivity of two rhabdoviruses in Drosophila melanogaster: Rabies and isfahan viruses." Annales de virologie 131(1): 3-12. Calisher, C. H., N. Karabatsos, H. Zeller, J. P. Digoutte, R. B. Tesh, R. E. Shope, A. P. Travassos da Rosa and T. D. St George (1989). "Antigenic relationships among rhabdoviruses from vertebrates and hematophagous arthropods." Intervirology 30(5): 241-257. Carpenter, J. (2008). The evolution and genetics of Drosophila melanogaster and the sigma virus, University of Edinburgh. PhD thesis. Carpenter, J., S. Hutter, J. F. Baines, J. Roller, S. S. Saminadin-Peter, J. Parsch and F. M. Jiggins (2009). "The transcriptional response of Drosophila melanogaster to infection with the sigma virus (Rhabdoviridae)." Plos One 4(8): e6838.

112 References Carpenter, J. A., L. P. Keegan, L. Wilfert, M. A. O'Connell and F. M. Jiggins (2009). "Evidence for ADAR-induced hypermutation of the Drosophila sigma virus (Rhabdoviridae)." BMC Genet 10: 75. Carpenter, J. A., D. J. Obbard, X. Maside and F. M. Jiggins (2007). "The recent spread of a vertically transmitted virus through populations of Drosophila melanogaster." Molecular Ecology 16(18): 3947-3954. Carre-Mouka, A., S. Gaumer, P. Gay, A. M. Petitjean, C. Coulondre, P. Dru, F. Bras, S. Dezelee and D. Contamine (2007). "Control of sigma virus multiplication by the ref(2)P gene of Drosophila melanogaster: An in vivo study of the PB1 domain of Ref(2)P." Genetics 176(1): 409-419. Champion de Crespigny, F. E. and N. Wedell (2006). "Wolbachia infection reduces sperm competitive ability in an insect." Proc Biol Sci 273(1593): 1455-1458. Chare, E. R., Gould, E.A., and Holmes, E.C. (2003). "Phylogenetic analysis reveals a low rate of homologus recombinaion in negative-sense RNA viruses." Journal of General Virology 84: 2961-2703. Charlat, S., A. Duplouy, E. A. Hornett, E. A. Dyson, N. Davies, G. K. Roderick, N. Wedell and G. D. D. Hurst (2009). "The joint evolutionary histories of Wolbachia and mitochondria in Hypolimnas bolina." Bmc Evolutionary Biology 9: -. Charlat, S., E. A. Hornett, J. H. Fullard, N. Davies, G. K. Roderick, N. Wedell and G. D. Hurst (2007). "Extraordinary flux in sex ratio." Science 317(5835): 214. Charleston, M. A. and D. L. Robertson (2002). "Preferential host switching by primate lentiviruses can account for phylogenetic similarity with the primate phylogeny." Systematic Biology 51(3): 528-535. Chen, X. A., S. Li and S. Aksoy (1999). "Concordant evolution of a symbiont with its host insect species: Molecular phylogeny of genus Glossina and its bacteriome- associated endosymbiont, Wigglesworthia glossinidia." Journal of Molecular Evolution 48(1): 49-58. Christian, P. D. (1987). Studies of Drosophila C and A viruses in Australian populations of Drosophila melanogaster, Australian National University. PhD thesis. Clark, A. G., M. B. Eisen, D. R. Smith, C. M. Bergman, B. Oliver, T. A. Markow, T. C. Kaufman, M. Kellis, W. Gelbart, V. N. Iyer, D. A. Pollard, T. B. Sackton, A. M. Larracuente, N. D. Singh, J. P. Abad, D. N. Abt, B. Adryan, M. Aguade, H. Akashi, W. W. Anderson, C. F. Aquadro, D. H. Ardell, R. Arguello, C. G. Artieri, D. A. Barbash, D. Barker, P. Barsanti, P. Batterham, S. Batzoglou, D. Begun, A. Bhutkar, E. Blanco, S. A. Bosak, R. K. Bradley, A. D. Brand, M. R. Brent, A. N. Brooks, R. H. Brown, R. K. Butlin, C. Caggese, B. R. Calvi, A. Bernardo de Carvalho, A. Caspi, S. Castrezana, S. E. Celniker, J. L. Chang, C. Chapple, S. Chatterji, A. Chinwalla, A. Civetta, S. W. Clifton, J. M. Comeron, J. C. Costello, J. A. Coyne, J. Daub, R. G. David, A. L. Delcher, K. Delehaunty, C. B. Do, H. Ebling, K. Edwards, T. Eickbush, J. D. Evans, A. Filipski, S. Findeiss, E. Freyhult, L. Fulton, R. Fulton, A. C. Garcia, A. Gardiner, D. A. Garfield, B. E. Garvin, G. Gibson, D. Gilbert, S. Gnerre, J. Godfrey, R. Good, V. Gotea, B. Gravely, A. J. Greenberg, S. Griffiths-Jones, S. Gross, R. Guigo, E. A. Gustafson, W. Haerty, M. W. Hahn, D. L. Halligan, A. L. Halpern, G. M. Halter, M. V. Han, A. Heger, L. Hillier, A. S. Hinrichs, I. Holmes, R. A. Hoskins, M. J. Hubisz, D. Hultmark, M. A. Huntley, D. B. Jaffe, S. Jagadeeshan, W. R. Jeck, J. Johnson, C. D. Jones, W. C. Jordan, G. H. Karpen, E. Kataoka, P. D. Keightley, P. Kheradpour, E. F. Kirkness, L. B. Koerich, K. Kristiansen, D. Kudrna, R. J. Kulathinal, S. Kumar, R. Kwok, E. Lander, C. H. Langley, R. Lapoint, B. P.

113 References Lazzaro, S. J. Lee, L. Levesque, R. Li, C. F. Lin, M. F. Lin, K. Lindblad-Toh, A. Llopart, M. Long, L. Low, E. Lozovsky, J. Lu, M. Luo, C. A. Machado, W. Makalowski, M. Marzo, M. Matsuda, L. Matzkin, B. McAllister, C. S. McBride, B. McKernan, K. McKernan, M. Mendez-Lago, P. Minx, M. U. Mollenhauer, K. Montooth, S. M. Mount, X. Mu, E. Myers, B. Negre, S. Newfeld, R. Nielsen, M. A. Noor, P. O'Grady, L. Pachter, M. Papaceit, M. J. Parisi, M. Parisi, L. Parts, J. S. Pedersen, G. Pesole, A. M. Phillippy, C. P. Ponting, M. Pop, D. Porcelli, J. R. Powell, S. Prohaska, K. Pruitt, M. Puig, H. Quesneville, K. R. Ram, D. Rand, M. D. Rasmussen, L. K. Reed, R. Reenan, A. Reily, K. A. Remington, T. T. Rieger, M. G. Ritchie, C. Robin, Y. H. Rogers, C. Rohde, J. Rozas, M. J. Rubenfield, A. Ruiz, S. Russo, S. L. Salzberg, A. Sanchez-Gracia, D. J. Saranga, H. Sato, S. W. Schaeffer, M. C. Schatz, T. Schlenke, R. Schwartz, C. Segarra, R. S. Singh, L. Sirot, M. Sirota, N. B. Sisneros, C. D. Smith, T. F. Smith, J. Spieth, D. E. Stage, A. Stark, W. Stephan, R. L. Strausberg, S. Strempel, D. Sturgill, G. Sutton, G. G. Sutton, W. Tao, S. Teichmann, Y. N. Tobari, Y. Tomimura, J. M. Tsolas, V. L. Valente, E. Venter, J. C. Venter, S. Vicario, F. G. Vieira, A. J. Vilella, A. Villasante, B. Walenz, J. Wang, M. Wasserman, T. Watts, D. Wilson, R. K. Wilson, R. A. Wing, M. F. Wolfner, A. Wong, G. K. Wong, C. I. Wu, G. Wu, D. Yamamoto, H. P. Yang, S. P. Yang, J. A. Yorke, K. Yoshida, E. Zdobnov, P. Zhang, Y. Zhang, A. V. Zimin, J. Baldwin, A. Abdouelleil, J. Abdulkadir, A. Abebe, B. Abera, J. Abreu, S. C. Acer, L. Aftuck, A. Alexander, P. An, E. Anderson, S. Anderson, H. Arachi, M. Azer, P. Bachantsang, A. Barry, T. Bayul, A. Berlin, D. Bessette, T. Bloom, J. Blye, L. Boguslavskiy, C. Bonnet, B. Boukhgalter, I. Bourzgui, A. Brown, P. Cahill, S. Channer, Y. Cheshatsang, L. Chuda, M. Citroen, A. Collymore, P. Cooke, M. Costello, K. D'Aco, R. Daza, G. De Haan, S. DeGray, C. DeMaso, N. Dhargay, K. Dooley, E. Dooley, M. Doricent, P. Dorje, K. Dorjee, A. Dupes, R. Elong, J. Falk, A. Farina, S. Faro, D. Ferguson, S. Fisher, C. D. Foley, A. Franke, D. Friedrich, L. Gadbois, G. Gearin, C. R. Gearin, G. Giannoukos, T. Goode, J. Graham, E. Grandbois, S. Grewal, K. Gyaltsen, N. Hafez, B. Hagos, J. Hall, C. Henson, A. Hollinger, T. Honan, M. D. Huard, L. Hughes, B. Hurhula, M. E. Husby, A. Kamat, B. Kanga, S. Kashin, D. Khazanovich, P. Kisner, K. Lance, M. Lara, W. Lee, N. Lennon, F. Letendre, R. LeVine, A. Lipovsky, X. Liu, J. Liu, S. Liu, T. Lokyitsang, Y. Lokyitsang, R. Lubonja, A. Lui, P. MacDonald, V. Magnisalis, K. Maru, C. Matthews, W. McCusker, S. McDonough, T. Mehta, J. Meldrim, L. Meneus, O. Mihai, A. Mihalev, T. Mihova, R. Mittelman, V. Mlenga, A. Montmayeur, L. Mulrain, A. Navidi, J. Naylor, T. Negash, T. Nguyen, N. Nguyen, R. Nicol, C. Norbu, N. Norbu, N. Novod, B. O'Neill, S. Osman, E. Markiewicz, O. L. Oyono, C. Patti, P. Phunkhang, F. Pierre, M. Priest, S. Raghuraman, F. Rege, R. Reyes, C. Rise, P. Rogov, K. Ross, E. Ryan, S. Settipalli, T. Shea, N. Sherpa, L. Shi, D. Shih, T. Sparrow, J. Spaulding, J. Stalker, N. Stange-Thomann, S. Stavropoulos, C. Stone, C. Strader, S. Tesfaye, T. Thomson, Y. Thoulutsang, D. Thoulutsang, K. Topham, I. Topping, T. Tsamla, H. Vassiliev, A. Vo, T. Wangchuk, T. Wangdi, M. Weiand, J. Wilkinson, A. Wilson, S. Yadav, G. Young, Q. Yu, L. Zembek, D. Zhong, A. Zimmer, Z. Zwirko, P. Alvarez, W. Brockman, J. Butler, C. Chin, M. Grabherr, M. Kleber, E. Mauceli and I. MacCallum (2007). "Evolution of genes and genomes on the Drosophila phylogeny." Nature 450(7167): 203-218.

114 References Clark, J. B. and M. G. Kidwell (1997). "A phylogenetic perspective on P transposable element evolution in Drosophila." Proc Natl Acad Sci U S A 94(21): 11428-11433. Cochrane, B. J., M. Windelspecht, S. Brandon, M. Morrow and L. Dryden (1998). "Use of recombinant inbred lines for the investigation of insecticide resistance and cross resistance in Drosophila simulans." Pesticide Biochemistry and Physiology 61(2): 95-114. Coffey, L. L., N. Vasilakis, A. C. Brault, A. M. Powers, F. Tripet and S. C. Weaver (2008). "Arbovirus evolution in vivo is constrained by host alternation." Proc Natl Acad Sci U S A 105(19): 6970-6975. Contamine, D. and S. Gaumer (2008). "Sigma Rhabdoviruses." Encyclopeida of Virology 5: 576-581. Coustau, C., C. Chevillon and R. ffrench-Constant (2000). "Resistance to xenobiotics and parasites: can we count the cost?" Trends Ecol Evol 15(9): 378-383. Dacheux, L., N. Berthet, G. Dissard, E. C. Holmes, O. Delmas, F. Larrous, G. Guigon, P. Dickinson, O. Faye, A. A. Sall, I. G. Old, K. Kong, G. C. Kennedy, J. C. Manuguerra, S. T. Cole, V. Caro, A. Gessain and H. Bourhy (2010). "Application of broad-spectrum resequencing microarray for genotyping rhabdoviruses." J Virol 84(18): 9557-9574. Daszak, P., A. A. Cunningham and A. D. Hyatt (2000). "Wildlife ecology - Emerging infectious diseases of wildlife - Threats to biodiversity and human health." Science 287(5452): 443-449. Davies, T. J. and A. B. Pedersen (2008). "Phylogeny and geography predict pathogen community similarity in wild primates and humans." Proceedings of the Royal Society B-Biological Sciences 275(1643): 1695-1701. De Vienne, D. M., T. Giraud and J. A. Shykoff (2007). "When can host shifts produce congruent host and parasite phylogenies? A simulation approach." Journal of Evolutionary Biology 20(4): 1428-1438. de Vienne, D. M., M. E. Hood and T. Giraud (2009). "Phylogenetic determinants of potential host shifts in fungal pathogens." Journal of Evolutionary Biology 22(12): 2532-2541. Decaestecker, E., S. Gaba, J. A. Raeymaekers, R. Stoks, L. Van Kerckhoven, D. Ebert and L. De Meester (2007). "Host-parasite 'Red Queen' dynamics archived in pond sediment." Nature 450(7171): 870-873. Degrugillier, M. E., S. S. Degrugillier and J. J. Jackson (1991). "Nonoccluded, Cytoplasmic Virus-Particles and Rickettsia-Like Organisms in Testes and Spermathecae of Diabrotica-Virgifera." Journal of Invertebrate Pathology 57(1): 50- 58. Dezelee, S., F. Bras, D. Contamine, M. Lopez-Ferber, D. Segretain and D. Teninges (1989). "Molecular analysis of ref(2)P, a Drosophila gene implicated in sigma rhabdovirus multiplication and necessary for male fertility." EMBO J 8(11): 3437- 3446. Douady, C. J., F. Delsuc, Y. Boucher, W. F. Doolittle and E. J. P. Douzery (2003). "Comparison of Bayesian and maximum likelihood bootstrap measures of phylogenetic reliability." Molecular Biology and Evolution 20(2): 248-254. Douglas, A. E. (1989). "MYCETOCYTE SYMBIOSIS IN INSECTS." Biological Reviews of the Cambridge Philosophical Society 64(4): 409-434.

115 References Douglas, A. E. (1998). "Nutritional interactions in insect-microbial symbioses: Aphids and their symbiotic bacteria Buchnera." Annual Review of Entomology 43: 17-37. Dru, P., F. Bras, S. Dezelee, P. Gay, A. M. Petitjean, A. Pierredeneubourg, D. Teninges and D. Contamine (1993). "Unusual Variability of the Drosophila- Melanogaster Ref(2)P Protein Which Controls the Multiplication of Sigma Rhabdovirus." Genetics 133(4): 943-954. Drummond, A. J. and A. Rambaut (2007). "BEAST: Bayesian evolutionary analysis by sampling trees." Bmc Evolutionary Biology 7: 214. Duffy, S., C. L. Burch and P. E. Turner (2007). "Evolution of host specificity drives reproductive isolation among RNA viruses." Evolution 61(11): 2614-2622. Duron, O., D. Bouchon, S. Boutin, L. Bellamy, L. Q. Zhou, J. Engelstadter and G. D. Hurst (2008). "The diversity of reproductive parasites among arthropods: Wolbachia do not walk alone." Bmc Biology 6: -. Ebbert, M. A., J. J. Burkholder and J. L. Marlowe (2001). "Trypanosomatid prevalence and host habitat choice in woodland Drosophila." J Invertebr Pathol 77(1): 27-32. Edwards, R. A. and F. Rohwer (2005). "Viral metagenomics." Nat Rev Microbiol 3(6): 504-510. Elena, S. F., P. Agudelo-Romero, P. Carrasco, F. M. Codoner, S. Martin, C. Torres- Barcelo and R. Sanjuan (2008). "Experimental evolution of plant RNA viruses." Heredity 100(5): 478-483. Elena, S. F., P. Agudelo-Romero and J. Lalic (2009). "The evolution of viruses in multi-host fitness landscapes." Open Virol J 3: 1-6. Elena, S. F. and R. Sanjuan (2007). "Virus evolution: Insights from an experimental approach." Annual Review of Ecology Evolution and Systematics 38: 27-52. Engelstadter, J. and G. D. D Hurst (2006). "The dynamics of parasite incidence across host species." Evolutionary Ecology 20(6): 603-616. Engelstadter, J. and G. D. D. Hurst (2009). "The Ecology and Evolution of Microbes that Manipulate Host Reproduction." Annual Review of Ecology Evolution and Systematics 40: 127-149. Felix, R., J. Guzman and A. Arellano (1977). "CO2 sensitivity of Drosophilid flies from a location in the outskirts of Mexico city." Drosophila information service 47: 110-112. Fellowes, M. D. E., A. R. Kraaijeveld and H. C. J. Godfray (1999). "Cross-resistance following artificial selection for increased defense against parasitoids in Drosophila melanogaster." Evolution 53(3): 966-972. Felluga, B., V. Jonsson and M. R. Liljeros (1971). "Ultrastructure of new viruslike particles in Drosophila." J Invertebr Pathol 17(3): 339-346. Felsenstein, J. (1985). "Phylogenies and the Comparative Method." American Naturalist 125(1): 1-15. Fleming, J. and M. D. Summers (1991). "POLYDNAVIRUS DNA IS INTEGRATED IN THE DNA OF ITS PARASITOID WASP HOST." Proceedings of the National Academy of Sciences of the United States of America 88(21): 9770- 9774. Fleuriet, A. (1981). "Comparison of Various Physiological Traits in Flies (Drosophila-Melanogaster) of Wild Origin, Infected or Uninfected by the Hereditary Rhabdovirus Sigma." Archives of Virology 69(3-4): 261-272.

116 References Fleuriet, A. (1981). "Effect of Overwintering on the Frequency of Flies Infected by the Rhabdovirus Sigma in Experimental Populations of Drosophila-Melanogaster." Archives of Virology 69(3-4): 253-260. Fleuriet, A. (1988). "Maintenance of a Hereditary Virus - the Sigma-Virus in Populations of Its Host, Drosophila-Melanogaster." Evolutionary Biology 23: 1-30. Fleuriet, A. (1994). "Female Characteristics in the Drosophila-Melanogaster Sigma- Virus System in Natural-Populations from Languedoc (Southern France)." Archives of Virology 135(1-2): 29-42. Fleuriet, A. (1999). "Evolution of the proportions of two sigma viral types in experimental populations of Drosophila melanogaster in the absence of the allele that is restrictive of viral multiplication." Genetics 153(4): 1799-1808. Fleuriet, A. and G. Periquet (1993). "Evolution of the Drosophila-Melanogaster Sigma Virus System in Natural-Populations from Languedoc (Southern France)." Archives of Virology 129(1-4): 131-143. Fleuriet, A., G. Periquet and D. Anxolabehere (1990). "Evolution of Natural- Populations in the Drosophila-Melanogaster Sigma Virus System .1. Languedoc (Southern France)." Genetica 81(1): 21-31. Fleuriet, A. and D. Sperlich (1992). "Evolution of the Drosophila-Melanogaster- Sigma Virus System in a Natural-Population from Tubingen." Theoretical and Applied Genetics 85(2-3): 186-189. Freckleton, R. P., P. H. Harvey and M. Pagel (2002). "Phylogenetic analysis and comparative data: a test and review of evidence." Am Nat 160(6): 712-726. Frosheiser, F. I. (1974). "Alfalfa Mosaic-Virus Transmission to Seed through Alfalfa Gametes and Longevity in Alfalfa Seed." Phytopathology 64(1): 102-105. Fu, Z. F. (2005). "Genetic comparison of the rhabdoviruses from animals and plants." World of Rhabdoviruses 292: 1-24. Furio, V., A. Moya and R. Sanjuan (2005). "The cost of replication fidelity in an RNA virus." Proceedings of the National Academy of Sciences of the United States of America 102(29): 10233-10237. Gao, J. J., H. A. Watabe, T. Aotsuka, J. F. Pang and Y. P. Zhang (2007). "Molecular phylogeny of the Drosophila obscura species group, with emphasis on the Old World species." Bmc Evolutionary Biology 7. Gay, P. (1978). "Drosophila Genes Which Intervene in Multiplication of Sigma Virus." Molecular & General Genetics 159(3): 269-283. Gelman, A. (2006). "Prior distributions for variance parameters in hierarchical models." Bayesian Analysis 1(3): 515-533. Gilbert, C. and C. Feschotte (2010). "Genomic Fossils Calibrate the Long-Term Evolution of Hepadnaviruses." Plos Biology 8(9). Gilbert, G. S. and C. O. Webb (2007). "Phylogenetic signal in plant pathogen-host range." Proceedings of the National Academy of Sciences of the United States of America 104(12): 4979-4983. Gilchrist, G. W., R. B. Huey and L. Serra (2001). "Rapid evolution of wing size clines in Drosophila subobscura." Genetica 112-113: 273-286. Goddard, M. R. and A. Burt (1999). "Recurrent invasion and extinction of a selfish gene." Proc Natl Acad Sci U S A 96(24): 13880-13885. Greene, I. P., E. Wang, E. R. Deardorff, R. Milleron, E. Domingo and S. C. Weaver (2005). "Effect of alternating passage on adaptation of sindbis virus to vertebrate and invertebrate cells." J Virol 79(22): 14253-14260.

117 References Habayeb, M. S., S. K. Ekengren and D. Hultmark (2006). "Nora virus, a persistent virus in Drosophila, defines a new picorna-like virus family." Journal of General Virology 87: 3045-3051. Hadfield, J. D. (2010). "MCMC Methods for Multi-Response Generalized Linear Mixed Models: The MCMCglmm R Package." Journal of Statistical Software 33(2): 1-22. Hadfield, J. D. and S. Nakagawa (2010). "General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters." Journal of Evolutionary Biology 23(3): 494-508. Hafner, M. S. and R. D. M. Page (1995). "Molecular Phylogenies and Host-Parasite Cospeciation - Gophers and Lice as a Model System." Philosophical Transactions of the Royal Society of London Series B-Biological Sciences 349(1327): 77-83. Hahn, B. H., G. M. Shaw, K. M. De Cock and P. M. Sharp (2000). "AIDS - AIDS as a zoonosis: Scientific and public health implications." Science 287(5453): 607-614. Haine, E. R. (2008). "Symbiont-mediated protection." Proc Biol Sci 275(1633): 353- 361. Hambly, E. and C. A. Suttle (2005). "The viriosphere, diversity, and genetic exchange within phage communities." Curr Opin Microbiol 8(4): 444-450. Hasegawa, M., H. Kishino and T. A. Yano (1985). "DATING OF THE HUMAN APE SPLITTING BY A MOLECULAR CLOCK OF MITOCHONDRIAL-DNA." Journal of Molecular Evolution 22(2): 160-174. Haselkorn, T. S. (2010). "The Spiroplasma heritable bacterial endosymbiont of Drosophila." Fly (Austin) 4(1): 80-87. Havard, S., P. Eslin, G. Prevost and G. Doury (2009). "Encapsulation ability: Are all Drosophila species equally armed? An investigation in the obscura group." Canadian Journal of Zoology-Revue Canadienne De Zoologie 87(7): 635-641. Hedges, L. M., J. C. Brownlie, S. L. O'Neill and K. N. Johnson (2008). "Wolbachia and Virus Protection in Insects." Science 322(5902): 702-702. Heredia, F., E. L. S. Loreto and V. L. S. Valente (2007). "Distribution and conservation of the transposable element gypsy in Drosophilid species." Genetics and Molecular Biology 30(1): 133-138. Herforth and Westphal (1966). "Observations on the frequency of carbon dioxide sensitivity in a natural population of Drosophilid flies." Drosophila information service 41: 79. Hillis, D. M. (1996). "Inferring complex phylogenies." Nature 383(6596): 130-131. Himler, A. G., T. Adachi-Hagimori, J. E. Bergen, A. Kozuch, S. E. Kelly, B. E. Tabashnik, E. Chiel, V. E. Duckworth, T. J. Dennehy, E. Zchori-Fein and M. S. Hunter (2011). "Rapid spread of a bacterial symbiont in an invasive whitefly is driven by fitness benefits and female bias." Science 332(6026): 254-256. Hogenhout, S. A., M. G. Redinbaugh and E. D. Ammar (2003). "Plant and animal rhabdovirus host range: a bug's view." Trends in Microbiology 11(6): 264-271. Holman, L., R. P. Freckleton and R. R. Snook (2008). "What use is an infertile sperm? A comparative study of sperm-heteromorphic Drosophila." Evolution 62(2): 374-385. Holman, L. and R. R. Snook (2008). "A sterile sperm caste protects brother fertile sperm from female-mediated death in Drosophila pseudoobscura." Curr Biol 18(4): 292-296.

118 References Holmes, E. C. (2003). "Molecular clocks and the puzzle of RNA virus origins." J Virol 77(7): 3893-3897. Holmes, E. C. (2009). The evolution and emergence of RNA viruses, Oxford University Press. Holmes, E. C. and A. Rambaut (2004). "Viral evolution and the emergence of SARS coronavirus." Philos Trans R Soc Lond B Biol Sci 359(1447): 1059-1065. Hornett, E. A., S. Charlat, A. M. R. Duplouy, N. Davies, G. K. Roderick, N. Wedell and G. D. D. Hurst (2006). "Evolution of male-killer suppression in a natural population." Plos Biology 4(9): 1643-1648. Houck, M. A., J. B. Clark, K. R. Peterson and M. G. Kidwell (1991). "Possible horizontal transfer of Drosophila genes by the mite Proctolaelaps regalis." Science 253(5024): 1125-1128. Housworth, E. A., E. P. Martins and M. Lynch (2004). "The phylogenetic mixed model." American Naturalist 163(1): 84-96. Hudson, R. R., D. D. Boos and N. L. Kaplan (1992). "A Statistical Test for Detecting Geographic Subdivision." Molecular Biology and Evolution 9(1): 138-151. Hudson, R. R. and N. L. Kaplan (1985). "Statistical Properties of the Number of Recombination Events in the History of a Sample of DNA-Sequences." Genetics 111(1): 147-164. Huelsenbeck, J. P. and F. Ronquist (2001). "MRBAYES: Bayesian inference of phylogenetic trees." Bioinformatics 17(8): 754-755. Huey, R. B., B. Moreteau, J. C. Moreteau, P. Gibert, G. W. Gilchrist, A. R. Ives, T. Garland and J. R. David (2006). "Sexual size dimorphism in a Drosophila clade, the D-obscura group." Zoology 109(4): 318-330. Hull, R. (2009). Comparative plant virology, Academic Press. Hurst, G. D. D., L. D. Hurst and M. E. N. Majerus (1993). "Altering Sex-Ratios - the Games Microbes Play." Bioessays 15(10): 695-697. Hurst, G. D. D. and K. J. Hutchence (2010). "Host Defence: Getting By with a Little Help from Our Friends." Current Biology 20(18): R806-R808. Hurst, G. D. D., F. M. Jiggins and S. J. W. Robinson (2001). "What causes inefficient transmission of male-killing Wolbachia in Drosophila?" Heredity 87: 220- 226. Hurst, G. D. D. and M. E. N. Majerus (1993). "Why Do Maternally Inherited Microorganisms Kill Males." Heredity 71: 81-95. Hurst, L. D. (1990). "Parasite Diversity and the Evolution of Diploidy, Multicellularity and Anisogamy." Journal of Theoretical Biology 144(4): 429-443. Huszar, T. and J. L. Imler (2008). "Drosophila viruses and the study of antiviral host- defense." Adv Virus Res 72: 227-265. ICTVdB (2006) Dicistroviridae ICTVdB - The Universal Virus Database, version 3. Büchen-Osmond, C. (Ed), Columbia University, New York, USA ICTVdB (2006) Nodaviridae. ICTVdB - The Universal Virus Database, version 3. Büchen-Osmond, C. (Ed), Columbia University, New York, USA ICTVdB (2006) Rhabdoviridae. ICTVdB - The Universal Virus Database, version 3. Büchen-Osmond, C. (Ed), Columbia University, New York, USA ICTVdB (2006) of Viruses. ICTVdB - The Universal Virus Database, version 3. Büchen-Osmond, C. (Ed), Columbia University, New York, USA

119 References Jaenike, J. and S. J. Perlman (2002). "Ecology and evolution of host-parasite associations: mycophagous Drosophila and their parasitic nematodes." Am Nat 160 Suppl 4: S23-39. Jaenike, J., M. Polak, A. Fiskin, M. Helou and M. Minhas (2007). "Interspecific transmission of endosymbiotic Spiroplasma by mites." Biology Letters 3(1): 23-25. Jaenike, J., J. K. Stahlhut, L. M. Boelio and R. L. Unckless (2010). "Association between Wolbachia and Spiroplasma within Drosophila neotestacea: an emerging symbiotic mutualism." Molecular Ecology 19(2): 414-425. Jaenike, J., R. Unckless, S. N. Cockburn, L. M. Boelio and S. J. Perlman (2010). "Adaptation via Symbiosis: Recent Spread of a Drosophila Defensive Symbiont." Science 329(5988): 212-215. Jiggins, F. M. (2003). "Male-killing Wolbachia and mitochondrial DNA: selective sweeps, hybrid introgression and parasite population dynamics." Genetics 164(1): 5- 12. Jiggins, F. M., J. K. Bentley, M. E. Majerus and G. D. Hurst (2002). "Recent changes in phenotype and patterns of host specialization in Wolbachia bacteria." Mol Ecol 11(8): 1275-1283. Jiggins, F. M. and K. W. Kim (2005). "The evolution of antifungal peptides in Drosophila." Genetics 171(4): 1847-1859. Johnson, K. N. and P. D. Christian (1999). "Molecular characterization of Drosophila C virus isolates." J Invertebr Pathol 73(3): 248-254. Johnson, M. C., J. M. Maxwell, P. C. Loh and J. A. C. Leong (1999). "Molecular characterization of the glycoproteins from two warm water rhabdoviruses: snakehead rhabdovirus (SHRV) and rhabdovirus of penaeid shrimp (RPS)/spring viremia of carp virus (SVCV)." Virus Research 64(2): 95-106. Jokela, J., M. F. Dybdahl and C. M. Lively (2009). "The maintenance of sex, clonal dynamics, and host-parasite coevolution in a mixed population of sexual and asexual snails." Am Nat 174 Suppl 1: S43-53. Jones, K. E., N. G. Patel, M. A. Levy, A. Storeygard, D. Balk, J. L. Gittleman and P. Daszak (2008). "Global trends in emerging infectious diseases." Nature 451(7181): 990-993. Jousset, F. X. (1969). "PRELIMINARY STUDIES OF DROSOPHILA SIGMA- VIRUS PROLIFERATION IN TAXONOMICALLY DIFFERENT INSECTS." Comptes Rendus Hebdomadaires Des Seances De L Academie Des Sciences Serie D 269(11): 1035-&. Jousset, F. X. (1972). "PRESENTATION OF IOTA VIRUS VIRIONS OF DROSOPHILA-IMMIGRANS." Comptes Rendus Hebdomadaires Des Seances De L Academie Des Sciences Serie D 274(5): 749-&. Jousset, F. X. (1976). "Host Range of Drosophila-Melanogaster C Virus among Diptera and Lepidoptera." Annales De Microbiologie A127(4): 529-&. Jousset, F. X. and N. Plus (1975). "STUDY OF VERTICAL TRANSMISSION AND HORIZONTAL TRANSMISSION OF DROSOPHILA-MELANOGASTER AND DROSOPHILA-IMMIGRANS PICORNAVIRUS." Annales De Microbiologie B126(2): 231-249. Juneja, P. and B. P. Lazzaro (2009). "Providencia sneebia sp. nov. and Providencia burhodogranariea sp. nov., isolated from wild Drosophila melanogaster." Int J Syst Evol Microbiol 59(Pt 5): 1108-1111.

120 References Kapun, M., V. Nolte, T. Flatt and C. Schlotterer (2010). "Host Range and Specificity of the Drosophila C Virus." Plos One 5(8): -. Karr, T. L., W. Yang and M. E. Feder (1998). "Overcoming cytoplasmic incompatibility in Drosophila." Proc Biol Sci 265(1394): 391-395. Katzourakis, A. and R. J. Gifford (2010). "Endogenous Viral Elements in Animal Genomes." Plos Genetics 6(11): -. Kelley, L. and M. Sternberg (2009). "Protein structure prediction on the web: a case study using the Phyre server." Nature Protocols 4: 363-471. Kidwell, M. G. (1994). "The Wilhelmine E. Key 1991 Invitational Lecture. The evolutionary history of the P family of transposable elements." J Hered 85(5): 339- 346. Kitchen, A., L. A. Shackelton and E. C. Holmes (2011). "Family level phylogenies reveal modes of macroevolution in RNA viruses." Proc Natl Acad Sci U S A 108(1): 238-243. Koehncke, A., A. Telschow, J. H. Werren and P. Hammerstein (2009). "Life and Death of an Influential Passenger: Wolbachia and the Evolution of CI-Modifiers by Their Hosts." Plos One 4(2): -. Koonin, E. V., Y. I. Wolf, K. Nagasaki and V. V. Dolja (2008). "The Big Bang of picorna-like virus evolution antedates the radiation of eukaryotic supergroups." Nat Rev Microbiol 6(12): 925-939. Kuiken, T., E. C. Holmes, J. McCauley, G. F. Rimmelzwaan, C. S. Williams and B. T. Grenfell (2006). "Host species barriers to influenza virus infections." Science 312(5772): 394-397. Kuzmin, I. V., G. J. Hughes and C. E. Rupprecht (2006). "Phylogenetic relationships of seven previously unclassified viruses within the family Rhabdoviridae using partial nucleoprotein gene sequences." Journal of General Virology 87: 2323-2331. Kuzmin, I. V., I. S. Novella, R. G. Dietzgen, A. Padhi and C. E. Rupprecht (2009). "The rhabdoviruses: Biodiversity, phylogenetics, and evolution." Infection, Genetics and Evolution 9(4): 541-553. L'Heritier, P. (1957). "The hereditary virus of Drosophila." Advances in Virus Research 5: 195-245. L'Heritier, P. H. (1948). "Sensitivity to CO2 in Drosophila. A review." Heredity 2: 325-348. L'Heritier, P. H. (1970). "Drosophila viruses and their role as evolutionary factors." Evolutionary Biology 4: 185-209. L'Heritier, P. H. and G. Teissier (1937). "Une anomalie physiologique héréditaire chez la Drosophile." C.R. Acad. Sci. Paris 231: 192-194. Lahaye, X., A. Vidy, C. Pomier, L. Obiang, F. Harper, Y. Gaudin and D. Blondel (2009). "Functional characterization of Negri bodies (NBs) in rabies virus-infected cells: Evidence that NBs are sites of viral transcription and replication." J Virol 83(16): 7948-7958. Lanciotti, R. S., J. T. Roehrig, V. Deubel, J. Smith, M. Parker, K. Steele, B. Crise, K. E. Volpe, M. B. Crabtree, J. H. Scherret, R. A. Hall, J. S. MacKenzie, C. B. Cropp, B. Panigrahy, E. Ostlund, B. Schmitt, M. Malkinson, C. Banet, J. Weissman, N. Komar, H. M. Savage, W. Stone, T. McNamara and D. J. Gubler (1999). "Origin of the West Nile virus responsible for an outbreak of encephalitis in the northeastern United States." Science 286(5448): 2333-2337.

121 References Landesdevauchelle, C., F. Bras, S. Dezelee and D. Teninges (1995). "Gene 2 of the sigma rhabdovirus genome encodes the p-protein, and gene-3 encodes a protein related to the reverse-transcriptase of retroelements." Virology 213(2): 300-312. Lastra, J. and J. Esparza (1976). "Multipication of Vesicular Stomatitis Virus in the leafhopper Peregrinus maidis (Ashm.), a Vector of a plant rhabdovirus." Journal of General Virology 32: 139-142. Laurent, J. (1983). "Are the antigenic characteristics of vesicular stomatitis virus modifid during evolution in Drosophila?" Ann. Virol (Inst. Pasteur) 134(E): 465-485. Lazzaro, B. P. and T. J. Little (2009). "Immunity in a variable world." Philosophical Transactions of the Royal Society B-Biological Sciences 364(1513): 15-26. Lazzaro, B. P., T. B. Sackton and A. G. Clark (2006). "Genetic variation in Drosophila melanogaster resistance to infection: A comparison across bacteria." Genetics 174(3): 1539-1554. Lazzaro, B. P., B. K. Sceurman and A. G. Clark (2004). "Genetic basis of natural variation in D-melanogaster antibacterial immunity." Science 303(5665): 1873-1876. Lemaitre, B. and J. Hoffmann (2007). "The host defense of Drosophila melanogaster." Annual Review of Immunology 25: 697-743. Leroy, E. M., B. Kumulungui, X. Pourrut, P. Rouquet, A. Hassanin, P. Yaba, A. Delicat, J. T. Paweska, J. P. Gonzalez and R. Swanepoel (2005). "Fruit bats as reservoirs of Ebola virus." Nature 438(7068): 575-576. Leroy, E. M., P. Rouquet, P. Formenty, S. Souquiere, A. Kilbourne, J. M. Froment, M. Bermejo, S. Smit, W. Karesh, R. Swanepoel, S. R. Zaki and P. E. Rollin (2004). "Multiple Ebola virus transmission events and rapid decline of central African wildlife." Science 303(5656): 387-390. Levine, B. and V. Deretic (2007). "Unveiling the roles of autophagy in innate and adaptive immunity." Nat Rev Immunol 7(10): 767-777. Lewis, E. (1960). "A new standard food medium." Drosophila information service 34: 117-118. Li, L., X. F. Wang and G. H. Zhou (2007). "Analyses of maize embryo invasion by Sugarcane mosaic virus." Plant Science 172(1): 131-138. Librado, P. and J. Rozas (2009). "DnaSP v5: a software for comprehensive analysis of DNA polymorphism data." Bioinformatics 25(11): 1451-1452. Lips, K. R., F. Brem, R. Brenes, J. D. Reeve, R. A. Alford, J. Voyles, C. Carey, L. Livo, A. P. Pessier and J. P. Collins (2006). "Emerging infectious disease and the loss of biodiversity in a Neotropical amphibian community." Proceedings of the National Academy of Sciences of the United States of America 103(9): 3165-3170. Litman, G. W., J. P. Cannon and L. J. Dishaw (2005). "Reconstructing immune phylogeny: new perspectives." Nat Rev Immunol 5(11): 866-879. Little, T. J. (2002). "The evolutionary significance of parasitism: do parasite-driven genetic dynamics occur ex silico?" Journal of Evolutionary Biology 15(1): 1-9. Liu, W. M., Y. Y. Li, G. H. Learn, R. S. Rudicell, J. D. Robertson, B. F. Keele, J. B. N. Ndjango, C. M. Sanz, D. B. Morgan, S. Locatelli, M. K. Gonder, P. J. Kranzusch, P. D. Walsh, E. Delaporte, E. Mpoudi-Ngole, A. V. Georgiev, M. N. Muller, G. M. Shaw, M. Peeters, P. M. Sharp, J. C. Rayner and B. H. Hahn (2010). "Origin of the human malaria parasite Plasmodium falciparum in gorillas." Nature 467(7314): 420- U467. Lively, C. M. and M. F. Dybdahl (2000). "Parasite adaptation to locally common host genotypes." Nature 405(6787): 679-681.

122 References Longdon, B. and F. M. Jiggins (2010). "Paternally transmitted parasites." Current Biology 20(17): R695-R696. Longdon, B., D. J. Obbard and F. M. Jiggins (2010). "Sigma viruses from three species of Drosophila form a major new clade in the rhabdovirus phylogeny." Proceedings of the Royal Society B 277: 35-44. Longdon, B., L. Wilfert, D. J. Obbard and F. M. Jiggins (2011). "Rhabdoviruses in two species of Drosophila: vertical transmission and a recent sweep." Genetics Advance online doi: 10.1534/genetics.111.127696. Longdon, B., L. Wilfert, J. Osei-Poku, H. Cagney, D. J. Obbard and F. M. Jiggins (2011). "Host switching by a vertically-transmitted rhabdovirus in Drosophila." Biology Letters Advance online doi:10.1098/rsbl.2011.0160. Lopez-Ferber, M., A. F. Rios, G. Kuhl, M. A. Comendador and C. Louis (1997). "Infection of the gonads of the SimES strain of Drosophila simulans by the hereditary reovirus DSV." Journal of Invertebrate Pathology 70(2): 143-149. Lopez-Ferber, M., J. C. Veyrunes and L. Croizier (1989). "Drosophila S virus is a member of the Reoviridae family." J Virol 63(2): 1007-1009. Loreto, E. L. S., C. M. A. Carareto and P. Capy (2008). "Revisiting horizontal transfer of transposable elements in Drosophila." Heredity 100(6): 545-554. Lyles, D. S. and C. E. Rupprecht (2007). Rhabdoviridae. Fields Virology. D. M. Knipe and P. M. Howley: 1364-. Lynch, M. (1991). "METHODS FOR THE ANALYSIS OF COMPARATIVE DATA IN EVOLUTIONARY BIOLOGY." Evolution 45(5): 1065-1080. Markow, T. A. and P. M. O'Grady (2007). "Drosophila biology in the genomic age." Genetics 177(3): 1269-1276. McGeoch, D. J., A. Dolan and A. C. Ralph (2000). "Toward a comprehensive phylogeny for mammalian and avian herpesviruses." Journal of Virology 74(22): 10401-10406. Mead, D. G., E. W. Howerth, M. D. Murphy, E. W. Gray, R. Noblet and D. E. Stallknecht (2004). "Black fly involvement in the epidemic transmission of Vesicular stomatitis New Jersey virus (Rhabdoviridae : Vesiculovirus)." Vector-Borne and Zoonotic Diseases 4(4): 351-359. Mims, C. A. (1981). "Vertical Transmission of Viruses." Microbiological Reviews 45(2): 267-286. Mink, G. I. (1993). "Pollen-Transmitted and Seed-Transmitted Viruses and Viroids." Annual Review of Phytopathology 31: 375-402. Moran, N. A., M. A. Munson, P. Baumann and H. Ishikawa (1993). "A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts." Proceedings of the Royal Society of London Series B-Biological Sciences 253(1337): 167-171. Moreira, L. A., I. Iturbe-Ormaetxe, J. A. Jeffery, G. Lu, A. T. Pyke, L. M. Hedges, B. C. Rocha, S. Hall-Mendelin, A. Day, M. Riegler, L. E. Hugo, K. N. Johnson, B. H. Kay, E. A. McGraw, A. F. van den Hurk, P. A. Ryan and S. L. O'Neill (2009). "A Wolbachia symbiont in Aedes aegypti limits infection with dengue, Chikungunya, and Plasmodium." Cell 139(7): 1268-1278. Mueller, S., V. Gausson, N. Vodovar, S. Deddouche, L. Troxler, J. Perot, S. Pfeffer, J. A. Hoffmann, M. C. Saleh and J. L. Imler (2010). "RNAi-mediated immunity provides strong protection against the negative-strand RNA vesicular stomatitis virus in Drosophila." Proc Natl Acad Sci U S A 107(45): 19390-19395.

123 References Nezis, I. P., A. Simonsen, A. P. Sagona, K. Finley, S. Gaumer, D. Contamine, T. E. Rusten, H. Stenmark and A. Brech (2008). "Ref(2)P, the Drosophila melanogaster homologue of mammalian p62, is required for the formation of protein aggregates in adult brain." J Cell Biol 180(6): 1065-1071. Novella, I. S., D. K. Clarke, J. Quer, E. A. Duarte, C. H. Lee, S. C. Weaver, S. F. Elena, A. Moya, E. Domingo and J. J. Holland (1995). "Extreme fitness differences in mammalian and insect hosts after continuous replication of vesicular stomatitis virus in sandfly cells." J Virol 69(11): 6805-6809. O'Grady, P. and R. Desalle (2008). "Out of Hawaii: the origin and biogeography of the genus Scaptomyza (Diptera: Drosophilidae)." Biol Lett 4(2): 195-199. O'Neill, S. L., R. Giordano, A. M. Colbert, T. L. Karr and H. M. Robertson (1992). "16S rRNA phylogenetic analysis of the bacterial endosymbionts associated with cytoplasmic incompatibility in insects." Proc Natl Acad Sci U S A 89(7): 2699-2702. Obbard, D. J., K. H. Gordon, A. H. Buck and F. M. Jiggins (2009). "The evolution of RNAi as a defence against viruses and transposable elements." Philos Trans R Soc Lond B Biol Sci 364(1513): 99-115. Obbard, D. J., F. M. Jiggins, N. J. Bradshaw and T. J. Little (2011). "Recent and recurrent selective sweeps of the antiviral RNAi gene Argonaute-2 in three species of Drosophila." Mol Biol Evol 28(2): 1043-1056. Obbard, D. J., F. M. Jiggins, D. L. Halligan and T. J. Little (2006). "Natural selection drives extremely rapid evolution in antiviral RNAi genes." Current Biology 16(6): 580-585. Obbard, D. J., J. J. Welch, K. W. Kim and F. M. Jiggins (2009). "Quantifying adaptive evolution in the Drosophila immune system." Plos Genetics 5(10): e1000698. Ogden, T. H. and M. S. Rosenberg (2006). "Multiple sequence alignment accuracy and phylogenetic inference." Systematic Biology 55(2): 314-328. Opijnen, T., E. Baudry, L. Baldo, J. Bartos and J. H. Werren (2005). "Genetic variability in the three genomes of Nasonia: nuclear, mitochondrial and Wolbachia." Insect Molecular Biology 14(6): 653-663. Pagel, M. (1994). "DETECTING CORRELATED EVOLUTION ON PHYLOGENIES - A GENERAL-METHOD FOR THE COMPARATIVE- ANALYSIS OF DISCRETE CHARACTERS." Proceedings of the Royal Society of London Series B-Biological Sciences 255(1342): 37-45. Pagel, M. (1999). "Inferring the historical patterns of biological evolution." Nature 401(6756): 877-884. Pankiv, S., T. H. Clausen, T. Lamark, A. Brech, J. A. Bruun, H. Outzen, A. Overvatn, G. Bjorkoy and T. Johansen (2007). "p62/SQSTM1 binds directly to Atg8/LC3 to facilitate degradation of ubiquitinated protein aggregates by autophagy." Journal of Biological Chemistry 282(33): 24131-24145. Parrish, C. R., E. C. Holmes, D. M. Morens, E. C. Park, D. S. Burke, C. H. Calisher, C. A. Laughlin, L. J. Saif and P. Daszak (2008). "Cross-species virus transmission and the emergence of new epidemic diseases." Microbiology and Molecular Biology Reviews 72(3): 457-470. Pedersen, A. B. and A. Fenton (2007). "Emphasizing the ecology in parasite community ecology." Trends Ecol Evol 22(3): 133-139.

124 References Pepin, K. M., S. Lass, J. R. Pulliam, A. F. Read and J. O. Lloyd-Smith (2010). "Identifying genetic markers of adaptation for surveillance of viral host jumps." Nat Rev Microbiol 8(11): 802-813. Perlman, S. J. and J. Jaenike (2003). "Infection success in novel hosts: An experimental and phylogenetic study of Drosophila-parasitic nematodes." Evolution 57(3): 544-557. Pinsker, W., E. Haring, S. Hagemann and W. J. Miller (2001). "The evolutionary life history of P transposons: from horizontal invaders to domesticated neogenes." Chromosoma 110(3): 148-158. Plus, N. (1955). "Etude De La Sensibilite Hereditaire Au Co2 Chez La Drosophile .1. Multiplication Du Virus-Sigma Et Passage a La Descendance Apres Inoculation Ou Transmission Hereditaire." Annales De L Institut Pasteur 88(3): 347-363. Plus, N., G. Croizier, F. X. Jousset and J. David (1975). "Picornaviruses of laboratory and wild Drosophila melanogaster: geographical distribution and serotypic composition." Ann Microbiol (Paris) 126(1): 107-117. Plus, N., L. Gissman, J. C. Veyrunes, H. Pfister and E. Gateff (1981). "REOVIRUSES OF DROSOPHILA AND CERATITIS POPULATIONS AND OF DROSOPHILA CELL-LINES - A POSSIBLE NEW GENUS OF THE REOVIRIDAE FAMILY." Annales De Virologie 132(2): 261-&. Poch, O., B. M. Blumberg, L. Bougueleret and N. Tordo (1990). "Sequence comparison of 5 polymerases (l-proteins) of unsegmented negative-strand rna viruses - theoretical assignment of functional domains." Journal of General Virology 71: 1153-1162. Poch, O., Sauvaget, I., Delaure, M, and Noel Tordo. (1989). "Identification of four cosnerved motifs among the RDRP encoding elements." EMBO 8(12): 3867-3874. Polak, M. (2003). "Heritability of resistance against ectoparasitism in the Drosophila-Macrocheles system." J Evol Biol 16(1): 74-82. Posada, D. and K. A. Crandall (1998). "MODELTEST: testing the model of DNA substitution." Bioinformatics 14(9): 817-818. Prangishvili, D., P. Forterre and R. A. Garrett (2006). "Viruses of the Archaea: a unifying view." Nat Rev Microbiol 4(11): 837-848. Rambaut, A. and A. J. Drummond (2007). Tracer v1.4, Available from http://beast.bio.ed.ac.uk/Tracer Ramsden, C., E. C. Holmes and M. A. Charleston (2009). "Hantavirus evolution in relation to its rodent and insectivore hosts: no evidence for codivergence." Mol Biol Evol 26(1): 143-153. Raychoudhury, R., B. K. Grillenberger, J. Gadau, R. Bijlsma, L. van de Zande, J. H. Werren and L. W. Beukeboom (2010). "Phylogeography of Nasonia vitripennis (Hymenoptera) indicates a mitochondrial-Wolbachia sweep in North America." Heredity 104(3): 318-326. Robinson, D. F. and L. R. Foulds (1981). "COMPARISON OF PHYLOGENETIC TREES." Mathematical Biosciences 53(1-2): 131-147. RoelkeParker, M. E., L. Munson, C. Packer, R. Kock, S. Cleaveland, M. Carpenter, S. J. Obrien, A. Pospischil, R. HofmannLehmann, H. Lutz, G. L. M. Mwamengele, M. N. Mgasa, G. A. Machange, B. A. Summers and M. J. G. Appel (1996). "A canine distemper virus epidemic in Serengeti lions (Panthera leo)." Nature 379(6564): 441-445.

125 References Roossinck, M. J., D. Sleat and P. Palukaitis (1992). "Satellite RNAs of plant viruses: structures and biological effects." Microbiol Rev 56(2): 265-279. Rosen, L. (1980). " Carbon-dioxide sensitivity in mosquitos infected with sigma, vesicular stomatitis, and other rhabdoviruses." Science 207(4434): 989-991. Rossmann, M. G. and Y. Tao (1999). "Structural insight into insect viruses." Nat Struct Biol 6(8): 717-719. Russo, C. A., N. Takezaki and M. Nei (1995). "Molecular phylogeny and divergence times of drosophilid species." Mol Biol Evol 12(3): 391-404. Russell, J. A., B. Goldman-Huertas, et al. (2009). "Specialization and geographicisolation among Wolbachia symbionts from ants and lycaenid butterflies." Evolution 63(3): 624-640. Sackton, T. B., B. P. Lazzaro, T. A. Schlenke, J. D. Evans, D. Hultmark and A. G. Clark (2007). "Dynamic evolution of the innate immune system in Drosophila." Nat Genet 39(12): 1461-1468. Sanjuan, R., M. R. Nebot, N. Chirico, L. M. Mansky and R. Belshaw (2010). "Viral mutation rates." J Virol 84(19): 9733-9748. Sauer, C., E. Stackebrandt, J. Gadau, B. Holldobler and R. Gross (2000). "Systematic relationships and cospeciation of bacterial endosymbionts and their carpenter ant host species: proposal of the new taxon Candidatus Blochmannia gen. nov." Int J Syst Evol Microbiol 50 Pt 5: 1877-1886. Schmid-Hempel, P. (2005). "Evolutionary ecology of insect immune defenses." Annu Rev Entomol 50: 529-551. Schrankel, K. R. and F. E. Schwalm (1975). "Viruslike Particles in Spermatids of Coelopa-Frigida (Diptera)." Journal of Invertebrate Pathology 26(2): 265-268. Seecof, R. L. (1964). "Deleterious Effects on Drosophila Development Associated with Sigma Virus Infection." Virology 22(1): 142-&. Shapiro, B., A. Rambaut and A. J. Drummond (2006). "Choosing appropriate substitution models for the phylogenetic analysis of protein-coding sequences." Mol Biol Evol 23(1): 7-9. Sharp, P. M. and B. H. Hahn (2010). "The evolution of HIV-1 and the origin of AIDS." Philosophical Transactions of the Royal Society B-Biological Sciences 365(1552): 2487-2494. Shelly, S., N. Lukinova, S. Bambina, A. Berman and S. Cherry (2009). "Autophagy is an essential component of Drosophila immunity against vesicular stomatitis virus." Immunity 30(4): 588-598. Shimodaira, H. and M. Hasegawa (1999). "Multiple comparisons of log-likelihoods with applications to phylogenetic inference." Molecular Biology and Evolution 16(8): 1114-1116. Shorrocks, B. (1975). "Distribution and Abundance of Woodland Species of British Drosophila (Diptera-Drosophilidae)." Journal of Animal Ecology 44(3): 851-864. Shroyer, D. A. and L. Rosen (1983). " Extrachromosomal-inheritance of carbon- dioxide sensitivity in the mosquito culex-quinquefasciatus." Genetics 104(4): 649- 659. Snook, R. R., S. Y. Cleland, M. F. Wolfner and T. L. Karr (2000). "Offsetting effects of Wolbachia infection and heat shock on sperm production in Drosophila simulans: analyses of fecundity, fertility and accessory gland proteins." Genetics 155(1): 167- 178.

126 References Sokoloff, A. (1966). "Morphological Variation in Natural and Experimental Populations of Drosophila Pseudoobscura and Drosophila Persimilis." Evolution 20(1): 49-&. Sorensen, D. and D. Gianola (2002). Likelihood, Bayesian and MCMC Methods in Quantitative Genetics. New York, Springer-Verlag. Stahlhut, J. K., C. A. Desjardins, et al. (2010). "The mushroom habitat as an ecological arena for global exchange of Wolbachia." Mol Ecol 19(9): 1940-1952. Streicker, D. G., A. S. Turmelle, M. J. Vonhof, I. V. Kuzmin, G. F. McCracken and C. E. Rupprecht (2010). "Host Phylogeny Constrains Cross-Species Emergence and Establishment of Rabies Virus in Bats." Science 329(5992): 676-679. Sullivan, W., M. Ashburner and S. Hawley (2000). Drosophila Protocols, Cold Spring Harbor Laboratory Press. Suttle, C. (2005). "Crystal ball. The viriosphere: the greatest biological diversity on Earth and driver of global processes." Environ Microbiol 7(4): 481-482. Suttle, C. A. (2005). "Viruses in the sea." Nature 437(7057): 356-361. Switzer, W. M., M. Salemi, V. Shanmugam, F. Gao, M. E. Cong, C. Kuiken, V. Bhullar, B. E. Beer, D. Vallet, A. Gautier-Hion, Z. Tooze, F. Villinger, E. C. Holmes and W. Heneine (2005). "Ancient co-speciation of simian foamy viruses and primates." Nature 434(7031): 376-380. Swofford, D. L. (1993). "Paup - a computer-program for phylogenetic inference using maximum parsimony." Journal of General Physiology 102(6): A9-A9. Sylvester, E. and J. Richardson (1992). Aphid-borne rhabdoviruses-relationship with their aphid vectors. Advances Disease Vector Research. H. KF. New York, Springer- Verlag. 9: 313-341. Tajima, F. (1989). "Statistical-Method for Testing the Neutral Mutation Hypothesis by DNA Polymorphism." Genetics 123(3): 585-595. Tandler, B. (1972). "Viruslike Particles in Testis of Drosophila-Virilis." Journal of Invertebrate Pathology 20(2): 214-&. Teixeira, L., A. Ferreira and M. Ashburner (2008). "The Bacterial Symbiont Wolbachia Induces Resistance to RNA Viral Infections in Drosophila melanogaster." Plos Biology 6(12): 2753-2763. Teninges, D. (1968). "[Demonstration of sigma viruses in the cells of the stabilized male germinal line of Drosophila]." Arch Gesamte Virusforsch 23(4): 378-387. Teninges, D., F. Bras and S. Dezelee (1993). "Genome Organization of the Sigma Rhabdovirus - 6 Genes and a Gene Overlap." Virology 193(2): 1018-1023. Teninges, D., A. Ohanessian, C. Richardmolard and D. Contamine (1979). "Isolation and Biological Properties of Drosophila X-Virus." Journal of General Virology 42(Feb): 241-254. Tesh, R. B., N. C. Byron and K. M. Johnson (1972). "Vesicular Stomatitis Virus (Indiana Serotype): Transovarial Transmission by Phlebotomine Sandflies." Science 175(4029): 1477-1479. Thompson, J. N. (1994). The Coevolutionary Process, The University of Chicago press. Tinsley, M. C., S. Blanford and F. M. Jiggins (2006). "Genetic variation in Drosophila melanogaster pathogen susceptibility." Parasitology 132: 767-773. Tinsley, M. C. and M. E. N. Majerus (2007). "Small steps or giant leaps for male- killers? Phylogenetic constraints to male-killer host shifts." Bmc Evolutionary Biology 7.

127 References Tsai, C. W., E. A. McGraw, E. D. Ammar, R. G. Dietzgen and S. A. Hogenhout (2008). "Drosophila melanogaster mounts a unique immune response to the rhabdovirus Sigma virus." Applied and Environmental Microbiology 74(10): 3251- 3256. Turelli, M. and A. A. Hoffmann (1991). "Rapid Spread of an Inherited Incompatibility Factor in California Drosophila." Nature 353(6343): 440-442. Turner, P. E. and S. F. Elena (2000). "Cost of host radiation in an RNA virus." Genetics 156(4): 1465-1470. Vale, P. F. and T. J. Little (2010). "CRISPR-mediated phage resistance and the ghost of coevolution past." Proc Biol Sci 277(1691): 2097-2103. van der Linde, K., D. Houle, G. S. Spicer and S. J. Steppan (2010). "A supermatrix- based molecular phylogeny of the family Drosophilidae." Genetics Research 92(1): 25-38. van Rij, R. P., M. C. Saleh, B. Berry, C. Foo, A. Houk, C. Antoniewski and R. Andino (2006). "The RNA silencing endonuclease Argonaute 2 mediates specific antiviral immunity in Drosophila melanogaster." Genes & Development 20(21): 2985-2995. Vasilakis, N. and S. C. Weaver (2008). "The history and evolution of human dengue emergence." Adv Virus Res 72: 1-76. Vavre, F., F. Fleury, D. Lepetit, P. Fouillet and M. Bouletreau (1999). "Phylogenetic evidence for horizontal transmission of Wolbachia in host-parasitoid associations." Molecular Biology and Evolution 16(12): 1711-1723. Walker, P. J. and K. Kongsuwan (1999). "Deduced structural model for animal rhabdovirus glycoproteins." Journal of General Virology 80: 1211-1220. Wallis, C. M., A. L. Stone, D. J. Sherman, V. D. Damsteegt, F. E. Gildow and W. L. Schneider (2007). "Adaptation of plum pox virus to a herbaceous host (Pisum sativum) following serial passages." J Gen Virol 88(Pt 10): 2839-2845. Wang, A. L. and C. C. Wang (1991). "VIRUSES OF PARASITIC PROTOZOA." Parasitology Today 7(4): 76-80. Watterson, G. A. (1975). "NUMBER OF SEGREGATING SITES IN GENETIC MODELS WITHOUT RECOMBINATION." Theoretical Population Biology 7(2): 256-276. Wayne, M. L., D. Contamine and M. Kreitman (1996). "Molecular population genetics of ref(2)P, a locus which confers viral resistance in Drosophila." Molecular Biology and Evolution 13(1): 191-199. Webby, R. J. and R. G. Webster (2001). "Emergence of influenza A viruses." Philos Trans R Soc Lond B Biol Sci 356(1416): 1817-1828. Weeks, A. R., M. Turelli, W. R. Harcombe, K. T. Reynolds and A. A. Hoffmann (2007). "From parasite to mutualist: Rapid evolution of Wolbachia in natural populations of Drosophila." Plos Biology 5(5): 997-1005. Weinert, L. A., J. J. Welch and F. M. Jiggins (2009). "Conjugation genes are common throughout the genus Rickettsia and are transmitted horizontally." Proceedings of the Royal Society B-Biological Sciences 276(1673): 3619-3627. Weinert, L. A., J. H. Werren, A. Aebi, G. N. Stone and F. M. Jiggins (2009). "Evolution and diversity of Rickettsia bacteria." Bmc Biology 7: 6. Werren, J. H., W. Zhang and L. R. Guo (1995). "Evolution and Phylogeny of Wolbachia - Reproductive Parasites of Arthropods." Proceedings of the Royal Society of London Series B-Biological Sciences 261(1360): 55-63.

128 References Wileman, T. (2006). "Aggresomes and autophagy generate sites for virus replication." Science 312(5775): 875-878. Wilfert, L. and F. M. Jiggins (2010). "Disease association mapping in Drosophila can be replicated in the wild." Biol Lett 6(5): 666-668. Wilfert, L. and F. M. Jiggins (2010). "Host-parasite coevolution: genetic variation in a virus population and the interaction with a host gene." J Evol Biol 23(7): 1447- 1455. Williamson, A. and R. Lehman (1996). "Germ cell development in Drosophila." Annual Review Cell Developmental Biology 12: 365-391. Williamson, D. (1961). "Carbon Dioxide Sensitivity in Drosophila Affinis and Drosophila Athabasca." Genetics 46(8): 1053-&. Williamson, D. L. (1959). Carbon dioxide sensitivity in Drosophila affinis and Drosophila athabasca. Zoology, University of Nebraska. PhD: 66. Wilson, A. J. (2008). "Why h(2) does not always equal V-A/V-P?" Journal of Evolutionary Biology 21(3): 647-650. Wolf, K. W. (1997). "Virus-like particles, bacteria and microsporidia affect spindle- associated membranes in spermatocytes of Lepidoptera species." Zygote 5(1): 21-30. Woolhouse, M. E. (2002). "Population biology of emerging and re-emerging pathogens." Trends Microbiol 10(10 Suppl): S3-7. Woolhouse, M. E. and S. Gowtage-Sequeria (2005). "Host range and emerging and reemerging pathogens." Emerg Infect Dis 11(12): 1842-1847. Woolhouse, M. E., D. T. Haydon and R. Antia (2005). "Emerging pathogens: the epidemiology and evolution of species jumps." Trends Ecol Evol 20(5): 238-244. Woolhouse, M. E. J., J. P. Webster, E. Domingo, B. Charlesworth and B. R. Levin (2002). "Biological and biomedical implications of the co-evolution of pathogens and their hosts." Nature Genetics 32(4): 569-577. Wu, Q., Y. Luo, R. Lu, N. Lau, E. C. Lai, W. X. Li and S. W. Ding (2010). "Virus discovery by deep sequencing and assembly of virus-derived small silencing RNAs." Proc Natl Acad Sci U S A 107(4): 1606-1611. Wyers, F., P. Dru, B. Simonet and D. Contamine (1993). "Immunological cross- reactions and interactions between the Drosophila melanogaster ref(2)P protein and sigma rhabdovirus proteins." J Virol 67(6): 3208-3216. Wyers, F., A. M. Petitjean, P. Dru, P. Gay and D. Contamine (1995). "Localization of domains within the Drosophila Ref(2)P protein involved in the intracellular control of sigma rhabdovirus multiplication." J Virol 69(7): 4463-4470. Yampolsky, L. Y., C. T. Webb, S. A. Shabalina and A. S. Kondrashov (1999). "Rapid accumulation of a vertically transmitted parasite triggered by relaxation of natural selection among hosts." Evolutionary Ecology Research 1(5): 581-589. Zdobnov, E. M. and R. Apweiler (2001). "InterProScan - an integration platform for the signature-recognition methods in InterPro." Bioinformatics 17(9): 847-848. Zhao, G. P. (2007). "SARS molecular epidemiology: a Chinese fairy tale of controlling an emerging zoonotic disease in the genomics era." Philos Trans R Soc Lond B Biol Sci 362(1482): 1063-1081. Zhou, W., F. Rousset and S. O'Neil (1998). "Phylogeny and PCR-based classification of Wolbachia strains using wsp gene sequences." Proc Biol Sci 265(1395): 509-515.

129 Appendix/Supplementary materials Appendix/Supplementary materials

Other publications that are not part of this thesis:

Longdon B, and Jiggins FM 2010. Paternally transmitted parasites. Current biology. 20: R695-R696. http://dx.doi.org/10.1016/j.cub.2010.06.026

Wilfert L, Longdon B, Ferreira AGA, Bayer F, Jiggins FM 2011. Trypanosomatids are common and diverse parasites of Drosophila. Parasitology. 138: 858-865. http://dx.doi.org/10.1017/S0031182011000485

Longdon B, Hurst GG and Jiggins FM (in review). Male-killing Wolbachia do not protect Drosophila bifasciata against viral infection.

Below are the supplementary materials for each chapter. Additionally the links to the published journal articles at the start of each chapter provide access to all supplementary materials in electronic format.

130 Appendix/Supplementary materials Chapter 2

Table S1 Conserved degenerate PCR primers which successfully amplified the various sigma viruses.

Primer 5-3’ primer sequence Virus which the primers

successfully amplified cons F3 GGVTTGACNATGGCNGATGA DMelSV/ DAffSV cons F4 TRAGACARAARGGATGGAG DMelSV cons F7 AGRCARAARGGATGGAG DMelSV cons R4 ATRCTCCATCCYTTCTG DMelSV/ DAffSV cons R5 TGTCTYAGHCCYTCTAATCC DMelSV cons R2 TCCTTTGATGRTTRTTCCA DMelSV cons R6 TCHGCWGATTGCATNGTCTCATC DMelSV cons F YMGDCATTGGGGNCATCC DObsSV cons R1 TCATCNGCCATNGTCAABCC DObsSV cons F6 GACAATCARGTNATHTGCAC DAffSV cons R8 TCNGTCACNGGATCHGG DAffSV

131 Appendix/Supplementary materials Table S2 Amino-acid sequence identitites between the different genes in the three sigma viruses

Viruses compared

Gene DMelSV-DObsSV DMelSV-DAffSV DObsSV -DAffSV

N gene 0.22 - -

P gene 0.15 - -

3/X gene 0.14 - -

M gene 0.18 - -

G gene 0.22 - -

L gene 0.41 0.39 0.40

132 Appendix/Supplementary materials Table S3 Amino-acid sequence identitites of L genes between and within the different genera of rhabdoviruses

Taxa compared Amino acid

identity

Within sigma1 0.40

Within Vesiculoviruses2 0.53

Within Ephemeroviruses2 0.41

Within Lyssaviruses2 0.74

Within Cytorhabdoviruses2 0.23

Within Nucleorhabdoviruses2 0.20

Within Novirhabdoviruses2 0.59

Sigma-Vesiculovirus3 0.38

Sigma-Ephemerovirus3 0.35

Sigma-Lyssaviruses3 0.31

Sigma-cyto- + nucelo- + novi-rhabdoviruses3 0.14

Vesiculoviruses –Ephemeroviruses3 0.40

Vesiculoviruses –Lyssaviruses3 0.32

Ephemeroviruses- Lyssaviruses3 0.31

1Average of all three pairwise comparisons. 2 Two viruses from each genera were selected which had diverged at the base of the clade (only taxa included included in the genus by the ICTV were considered with the exception of the ephemerovirus clade where the unassigned wongabel virus was used with bovine ephemeral fever virus). 3 The mean identity of all possible pairwise comparisons between selected taxa in the two genera.

133 Appendix/Supplementary materials

Figure S1 Bayesian nucleotide tree using L gene conserved region sequence alignments. Node labels are posterior supports, labels in brackets are bootstrap supports taken from the maximum likelihood tree with 1000 bootstrap replicates. The tree is rooted with three lyssaviruses (West Caucasian bat virus, Mokola virus 86101RCA and Rabies virus).

134 Appendix/Supplementary materials Accession numbers for the sequences used in the phylogenetic analysis are as follows:

Irkut virus (EF614260); Lettuce necrotic yellows virus (NC_007642); Adelaide

River virus (U05987); Aravan virus (EF614259); Australian bat lyssavirus (a)

(AF418014); Australian bat lyssavirus (b) (AF081020); Bovine ephemeral fever virus (AF234533); Cocal virus, vesicular stomatitis (EU373657); Duvenhage virus isolate 86132SA, (EU293119); Duvenhage virus isolate 94286SA (EU293120);

European bat lyssavirus 1 isolate RV9 (EF157976); Flanders virus (AF523199);

Hirame Rhabdovirus (NC_005093); Human parainfluenza virus 1 (AF117818);

Infectious haematopoietic necrosis virus (IHNV) (X89213); Iranian maize mosaic nucleorhabdovirus (DQ186554); Isfahan virus (AJ810084); Khujand lyssavirus

(EF614261); Lagos bat virus isolate 8619NGA (EU293110); Lagos bat virus isolate

KE131 (EU259198); Maize mosaic virus (NC_005975); Mokola virus isolate

86100CAM (EU293117); Mokola virus isolate 86101RCA (EU293118); Northern cereal mosaic virus (NC_002251); Rabies virus (NC_001542); Rice yellow stunt virus (NC_003746); Snakehead rhabdovirus (NC_000903); Sonchus yellow net virus

(NC_001615); Spring viremia of carp virus (U18101); Starry flounder rhabdovirus

(AY450644); Vesicular stomatitis Indiana virus (NC_001560); Vesicular stomatitis virus, New Jersey serotype, Hazelhurst subtype (M20166); Vesicular stomatitis virus

New Jersey serotype (M29788); Viral hemorrhagic septicemia virus (NC_000855);

West Caucasian bat virus (EF614258); Wongabel virus strain CS264 (EF612701);

Chandipura virus (AJ810083); Siniperca chuatsi rhabdovirus (DQ399789); European bat lyssavirus 2 isolate 9018HOL (EU293114); European bat lyssavirus 1 isolate

8918FRA (EU293112); Taro vein chlorosis virus (NC_006942); Maize fine streak

135 Appendix/Supplementary materials virus(AY618417); Lettuce yellow mottle virus (NC_011532); Orchid fleck virus

RNA 2 (NC_009609).

136 Appendix/Supplementary materials Chapter 3

Table S1 PCR cycles. PCR were carried out using thermopol PCR reagents (New England Biolabs). A final concentration of 0.2mM MgSO4and of 0.2mM for each dNTP was used. Cyt-b Adh RpL32

940C 2min 940C 2min 940C 2min 940C 30sec 940C 30sec 940C 30sec 51.50C 1min 34x 590C 30sec 34x 620C 30sec (-10C/cycle) 10x 720C 1min 720C 1min 720C 1min 720C 5min 720C 5min 940C 30sec 520C 30sec (-10C/cycle) 25x 720C 1min

Table S2 Primers used for species identification and qRT-PCR Primer name Sequence 5’- 3’ Cyt-b F TTATGGTTGATTATTACGAA Cyt-b obscura R CTAGGTAAGGAACTGCTGATCG Cyt-b subobscura R CATTCAGGTTGAATATGTGCAGTA Adh F CGTTTACTCTGGCAGCAAGG Adh obscura R AGTGTGCCCAAATCCAGC Adh subobscura R GTCTTGGTGATGCCCGGATGC RpL32 + F GTCGGATCGTTATGCCAAGT RpL32 + R GGCGATCTCACCGCAGTA DObs RpL32 qPCR F CTTAGTTGTCGCACAAATGG DObs RpL32 qPCR R TGCGCTTGTTGGAACCGTAAC DAff RpL32 qPCR F CCAAGTTGTCGCACAAATGG DAff RpL32 qPCR R TGCGCTTGTTGGAGCCATAAC DObsSV qPCR F TGGTTTCGATGGGTTAGTGG DObsSV qPCR R ATTGGACAATGGGTCAAAGC DAffSV qPCR F GCAGATGTATTAGTCTGTCCACG DAffSV qPCR TGTGAGTCCAAACGAAAGGA

137 Appendix/Supplementary materials

Banana-malt food ingredients Amount Water 1.88l Agar 15g Malt powder 65.5g Yeast 57.5g Golden syrup 47.5ml Bananas (blended) 4 Tergosept 15ml Proprionic acid 9ml The first five ingredients were mixed together, brought to the boil, and simmered for 10mins while stirring continuously. The mixture was allowed to cool to <700C, then the tergosept and proprionic acid were added. PH was then neutralised using NaOH.

138 Appendix/Supplementary materials

139 Appendix/Supplementary materials Chapter 4

Table S1 Conserved degenerate PCR primers which successfully amplified the various sigma viruses. PCR cycle: 95°C 30sec, 62°C (-1°C per cycle) 30sec, 72°C 1min; for 10x cycles followed by; 95°C 30sec, 52°C 30sec, 72°C 1min; for a further 25x cycles. Primer 5’-3’ primer sequence Virus which the primers

successfully amplified

Con F YMGDCATTGGGGNCATCC MStaSV/DImmSV/DTriSV

SV Con F5 GGVYTRACVATGGCAGATGA DAnaSV/MStaSV

Con F7 AGRCARAARGGATGGAG MStaSV

Con R1 TCATCNGCCATNGTCAABCC MStaSV/DImmSV/DTriSV

SV Con R6 ATRCTCCAHCCYTTYTGTC DAnaSV/MStaSV

Con R8 GGCCWGADATYAARGGRTTTTGRA MStaSV

140 Appendix/Supplementary materials

Table S2 CO2 sensitive flies from which the conserved primers did not amplify a

sigma-like virus. The CO2 trait can subjective to score and it is likely some of these

species were simply in poor condition which resulted in death after the stress of CO2 exposure. Species ID Number Location Notes paralysed Drosophila pseudoobscura 2 Utah, U.S.A. Partially sensitive Muscidae: stomoxys species 1 Cambridge, UK ID uncertain Muscidae: anthomyiidae 6 Cambridge, UK ID uncertain species Drosophila busckii ~25 Edinburgh, UK Zaprionus (Drosophilidae) 2 South Africa Species ID uncertain species Drosophila hydei 1 Kent, UK Drosophila cameraria 6 Perth and Kinross, ID uncertain UK Drosophila (Scaptodrosophila) 1 Perth and Kinross, ID uncertain deflexa UK Drosophila sp (repleta group) 1 Montpellier, France Species ID uncertain Drosophila funebris 1 Montpellier, France ID uncertain Drosophila subsilvestris 2 Derby, UK Drosophila helvetica 6 Derby (3), Falmouth(2), Bristol (1): UK Drosophila subobscura 21 Bristol (1), Essex (4), Derby (13), Falmouth (3): UK Unknown Drosophila sp 1 Kenya ID uncertain Unknown Drosophila sp 6 Ithaca, U.S.A ID uncertain

141 Appendix/Supplementary materials Chapter 5

Prior specification

In a Bayesian analysis prior probability distributions have to be specified for the fixed effects and the covariance matrices Vp, Vs and Ve. Independent normal priors with zero mean and large variance (1010) were used for the fixed effects and are virtually non-informative in this context. An inverse-Wishart prior was placed on each diagonal element of Ve (i.e. the virus specific residual variances) with scale matrix and degree of belief parameter set to 0.002. This corresponds to an inverse- gamma distribution with shape and scale parameters set to 0.001 and is weakly informative when the posterior distribution does not have substantial density at values close to zero, which was the case.

Diffuse priors for fully parameterised covariance matrices such as Vp and Vs are harder to define so we used several types of prior with differing properties, in order to evaluate the sensitivity of our conclusions to alternative prior specifications. Historically inverse-Wishart priors with low degree of belief had been used as purported weakly informative prior distributions because they are conjugate to the likelihood for covariance matrices. Conceptually, having a degree of belief of nu is like observing nu effects (a priori) with known variance which is specified (V). More recently, it has been suggested that inverse-Wishart priors (which are equivalent to inverse-gamma priors when the covariance matrix is 1x1 and simply a variance) can be highly informative when the posterior distribution has some support at small values, and parameter expanded priors have been suggested as a good alternative (Gelman 2006). Parameter expansion results in F priors (or t priors for the standard deviation), and also improves mixing of the chain. We evaluated our posterior distribution under several commonly used priors: prior 1: inverse-Wishart with scale = I*0.002 and degree of belief = 2.002 (i.e V=diag(3)*(0.002/2.002), nu=2.002) which induces a marginal prior distribution on

142 Appendix/Supplementary materials the variances that is equivalent to an inverse-gamma distribution with shape and scale equal to 0.001. prior 2: a flat prior (i.e V=0, nu=0) which for 3x3 covariance matrices results in posterior modes that should coincide with the REML estimates (Sorensen and Gianola 2002). prior 3: a parameter expanded prior was used with scale = I, degree of belief = 3 and a multivariate normal prior specification for the three redundant working parameters with null mean vector and covariance I*103 (i.e V=diag(3)/3, nu=3, alpha.mu=rep(0,3), alpha.V=diag(3)* 103). This induces a scaled F-distributed marginal prior on the variances with numerator and denominator degrees of freedom equal to 1 and scale set to 101.5 (see (Gelman 2006) for details).

Posterior modes for the species-level variances were close to zero for viruses A and M and these were omitted from the model and a species-level variance estimated for virus O only. In this case the prior for the single variance was: prior 1: scale=0.002 and degree of belief =0.002 (i.e. V=1, nu=0.002) prior2: V=0, nu=-2 prior3: scale=1, degree of belief =1, and null mean and variance 103 for the working parameters (i.e. V=1, nu=1, alpha.mu=0, alpha.V=1000)

Parameter expanded models were run for 500,000 iterations with a thinning interval of 200 and a burn-in of 50,000, whereas all non parameter expanded models were run for 10 times as long due to their poorer mixing properties. This resulted in 2250 samples from the posterior for which autocorrelation between successive samples for all parameters was less than 0.01. It has been suggested that parameter expanded priors are preferable when weakly informative priors are sought (Gelman 2006) and indeed they gave point estimates consistent with the REML analysis (Figure S2) and so are those presented (they also have the lowest Deviance Information Criterion (DIC)). All analysis gave very similar results.

143 Appendix/Supplementary materials Table S1. Full list of species used; whether they harboured Wolbachia (yes or no); their rearing temperature; whether they were composed of multiple lines(yes or no); food medium reared on (b=banana (Longdon et al. 2011), l=lewis (Lewis 1960), lm=lewis with mushroom (peeled Agaricus bisporus), m=malt (recipe below), i= 4- 24 instant Drosophila medium Carolina (Burlington, North Carolina, U.S.A), im= instant with mushroom), and mean wing length. All species are in the genus Drosophila, with the exceptions of; Scaptomyza pallida, Hirtodrosophila duncani, Zaprionous badyi and Scaptodrosophila. lebanonensis and Scaptodrosophila.stonei. Species name Wolbachia Rearing Multiple Food Mean wing Temp lines length (mm) (°C) D.tristis y 18 n lm 2.53 D.busckii n 18 y i 1.71 D.melanogaster y 18 y l 2.12 D.subobscura n 18 y lm 2.32 D.simulans n 18 y l 1.84 D.ananassae y 25 n m 1.68 D.takahashii n 18 n l 1.90 D.pseudotakahashii y 18 n l 1.92 D.affinis n 18 y b 2.25 D.persimilis n 18 n m 2.35 D.pseudoobscura n 18 n b 2.25 D.miranda n 18 n b 2.61 D.buzzatii n 18 n b 1.97 D.nigromelanica n 18 y b 2.20 D.bifasciata n 18 n m 2.34 H.duncani y 25 n m 2.03 D.orena n 18 n m 1.81 D.borealis n 18 n b 2.37 D.paramelanica n 18 n b 2.52 D.guanche n 18 n m 2.17 D.immigrans n 18 y m 2.60

144 Appendix/Supplementary materials Z.badyi n 18 n m 2.18 S.lebanonensis n 18 n m 2.10 D.erecta n 18 n m 1.63 S.stonei y 25 n m 2.13 D.ambigua n 18 n m 2.36 D.algonquin n 18 n m 2.36 D.hydei n 18 n m 2.50 S.pallida n 18 n i 2.10 D.phalerata n 18 n im 2.57 D.tenebrosa n 18 n im 2.23 D.sechellia n 18 y l 1.66 D.nebulosa n 18 n l 1.93 D.santomea n 18 n l 1.75 D.lummei n 18 n b 2.81 D.lacicola n 18 n m 3.06 D.flavomontana n 18 n b 2.90 D.novamexicana n 18 y b 2.38 D.littoralis n 18 n b 2.60 D.americana n 18 n m 2.26 D.virilis n 18 y b 2.54 D.obscura n 18 y lm 2.42 D.mojavensis n 18 y m 1.90 D.lini n 18 n m 1.65 D.ohnishii n 18 n m 1.76 D.mauritiana n 18 n m 1.66 D.montana n 18 y m 2.71 D.teissieri y 18 n m 1.91 D.saltans n 25 n m 1.66 D.willistoni n 25 n m 1.69 D.yakuba n 18 y m 1.74

145 Appendix/Supplementary materials Table S2. RpL32 primers for sequencing. Initially the RpL32 seq F and R pair were used. However, if these failed, then combinations of the remaining primers were used. DNA was extracted using a Chelex-Proteinase K extraction (Longdon et al. 2011) and PCRs were carried out using a touchdown PCR cycle (95°C 30sec, 62°C (-1°C per cycle) 30sec, 72°C 1min; for 10x cycles followed by; 95°C 30sec, 52°C 30sec, 72°C 1min; for a further 25x cycles). In cases where the initial PCRs did not work, the PCR was repeated on cDNA (see below). Following PCR, unincorporated primers and dNTPs were removed using exonuclease I and shrimp alkaline phosphatase, and the products were then sequenced in both directions using BigDye v3.1 (Applied Biosystems) and using a ABI capillary sequencer (Gene Pool facility, University of Edinburgh). The sequence chromatograms were inspected by eye to confirm the validity of all variants within and between species and assembled using Sequencher (v4.9). Primer name Sequence 5’-3’ RpL32 seq F ACAGGCCCAAGATCGTGAAGAAGC RpL32 seq R CTCTTGAGAACGCAGGCGACC RpL32 seq F1 AGACACTGGCGCWGTAAT RpL32 seq F2 YAACTATMAAATTSCAGCTCC RpL32 seq F3 AATGACSATTCGCCCAGCRTACMGG RpL32 seq R1 TGCGCTKGTTGGADCCRTAACC RpL32 seq R2 CGGTTCTGCATVARCARVACC

Table S3. qRT-PCR primers. Drosophila RpL32 primers were designed to match the homologous sequence in each species and crossed an intron-exon boundary so will only amplify mRNA. The intron location (located bases 457:518 in D. melanogaster accession: Y13939) was confirmed in a subset of 7 species (D. melanogaster, D. obscura, D. affinis, D. paramelanica, D. ambigua, D. algonquin and Scaptomyza pallida). Sigma virus primers crossed gene boundaries so as to only amplify genomes and not mRNA. We were unable to sequence RpL32 for D. busckii. However, we found that the most closely related species in this study (Z. badyi)

146 Appendix/Supplementary materials primers worked successfully in this species, with a suitable efficiency, and the PCR product was confirmed to be RpL32 by sequencing. qRT-PCR primer name and Sequence 5’-3’ location/species DMelSV F (L gene-5’ trailer junction) TTCAATTTTGTACGCGGAATC DMelSV R (L gene-5’ trailer junction) TGATCAAACCGCTAGCTTCA DAffSV F (L gene-5’ trailer junction) GCAGATGTATTAGTCTGTCCACG DAffSV R (L gene-5’ trailer junction) TGTGAGTCCAAACGAAAGGA DObsSV F (N-P gene junction) TGGTTTCGATGGGTTAGTGG DObsSV R (N-P gene junction) ATTGGACAATGGGTCAAAGC RpL32 qRT-PCR F (D. melanogaster) TGCTAAGCTGTCGCACAAATGG RpL32 qRT-PCR R (D. melanogaster) TGCGCTTGTTCGATCCGTAAC

Table S4. Drosophila gene sequencing primers for creating the phylogeny. PCRs were carried out using a touchdown PCR cycle (see table S2) of 62-52°C for COII and 28s, and 58-48°C for COI, Adh and Amyrel, then sequenced as described above (table S2). Primer Sequence 5’-3’ COI seq F ACAAATCAYAARGATATTGGAAC COI seq R TADCTRTGTTCAGCDGG COI internal sequencing only F TTTTGGNCAYCCWGAAGT Adh seq F GGYATTGGHYTSGACACCAG Adh seq R GARTCCCAGTGCTKGGTCCA COII (provided by Greg Spicer) seq F ATGGCAGATTAGTGCAATGG COII (provided by Greg Spicer) seq R GTTTAAGAGACCAGTACTTG 28S rDNA seq F AGTTCAGCACTAAGTCAC 28S rDNA seq R TTAGACTCCTTGGTCCGTG Amyrel seq F CAGCACAAYCCHCANTGGTG Amyrel seq R TGATGCCRTAKGGRWAGGCCA

147 Appendix/Supplementary materials

Fly food recipes

Malt food recipe: 1000ml distilled water, 10g agar, 60g semolina, 20g yeast, 80g powdered malt extract, brought to the boil and simmered for 15 mins, cooled for 20mins and then 14ml nipagin (10%) and 5ml proprionic acid added.

Agar food for ageing post injection: 1000ml water, 20g agar, 84g sugar-boiled, cooled then add 7mls tergosept.

MCMCglmm syntax (R script) for priors and main model prior1<-list(G=list(G1=list(V=diag(3)*(0.002/2.002),n=2.002), G2=list(V=diag(3)*(0.002/2.002),n=2.002)), R=list(V=diag(3),nu=0.002)) prior2<-list(G=list(G1=list(V=diag(3)*1e-2,nu=1e-2), G2=list(V=diag(3)*1e- 2,nu=1e-2)), R=list(V=diag(3)*1e-2,n=1e-2)) prior3<-list(G=list(G1=list(V=diag(3),n=3, alpha.mu=rep(0,3), alpha.V=diag(3)*1000), G2=list(V=diag(3),n=3, alpha.mu=rep(0,3), alpha.V=diag(3)*1000)), R=list(V=diag(3),n=0.002))

#Alterations to priors so only species level variance for DObsSV (coded as “O”) prior1$G$G2<-list(V=1, nu=0.002) prior2$G$G2<-list(V=1e-8, nu=-2) prior3$G$G2<-list(V=1, nu=1, alpha.mu=0, alpha.V=1000) model <-MCMCglmm(change_in_viral_titre~distance*virus, random=~us(virus):animal+us(at.level(virus, "O")):fly_species,

148 Appendix/Supplementary materials rcov=~idh(virus):units, data=dataset, prior=prior3, pedigree=tree, nitt=500000, thin=200, burnin=5000)

Figure S1. A pilot study was used to measure the change in viral titre at fixed time points post-injection (0,1,3,5,10 days). Viral titre is measured by qRT-PCR using the ΔΔCt method with values presented being relative to the inoculating dose. A large decrease in titre was found immediately after injection, with viral titre beginning to increasing again around 3-5 days post injection. The different coloured lines represent the different host species injected.

149 Appendix/Supplementary materials

Figure S2. Model estimates of distance effects for each virus (DAffSV is black, DMelSV is red, DObsSV is blue) with the different lines representing the posterior distribution estimated using the different priors (the solid line = prior 1 (inverse wishart), the dotted line = prior 2 (flat) and the dashed line = prior 3 (parameter expanded). Vertical lines are estimates of the distance effect from the ASREML analysis for each virus.

150