The Evolution and Comparative Genomics of the Reproductive Manipulator Cardinium hertigii

Item Type text; Electronic Dissertation

Authors Stouthamer, Corinne Marie

Publisher The University of Arizona.

Rights Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction or presentation (such as public display or performance) of protected items is prohibited except with permission of the author.

Download date 29/09/2021 13:28:29

Link to Item http://hdl.handle.net/10150/626748 THE EVOLUTION AND COMPARATIVE GENOMICS OF THE REPRODUCTIVE

MANIPULATOR CARDINIUM HERTIGII.

by

Corinne Marie Stouthamer

______

Copyright © Corinne Marie Stouthamer

A Dissertation Submitted to the Faculty of the

GRADUATE INTERDISCIPLINARY PROGRAM IN ENTOMOLOGY AND INSECT SCIENCE

In Partial Fulfillment of the Requirements

For the Degree of

DOCTOR OF PHILOSOPHY

In the Graduate College

THE UNIVERSITY OF ARIZONA

2018

THE UNIVERSITY OF ARIZONA GRADUATE COLLEGE

STATEMENT BY AUTHOR

This dissertation has been submitted in partial fulfillment of the requirements for an advanced degree at the University of Arizona and is deposited in the University Library to be made available to borrowers under rules of the Library.

Brief quotations from this dissertation are allowable without special permission, provided that an accurate acknowledgement of the source is made. Requests for permission for extended quotation from or reproduction of this manuscript in whole or in part may be granted by the head of the major department or the Dean of the Graduate College when in his or her judgment the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.

SIGNED: Corinne Marie Stouthamer

ACKNOWLEDGMENTS

If I were to thank everyone who has helped me during my PhD, I would have to spend another entire dissertation-length of time to write it, so this is my best attempt at a surely incomplete approximation of thanks.

First, I would like to start by thanking the Hunter lab graduate students in their various incarnations over the years. My first year, I was welcomed by Joe, who kindly spent a lot of time initiating me to the EIS student community and talking about science and being a graduate student. Bodil has been a steady force for kindness and organization in my life, which helped me think more clearly and keep my desk cleaner than it would otherwise be. Liz and Jimmy have both been wonderful friends, officemates, and D&D players, and I have been so lucky to know that I could always count on their help in the lab. Jessica and Matt, the most recent additions to the lab, have been so inspiring in their enthusiasm, work ethic, talent, and creativity. The unofficial Hunter lab graduate student, Tim, has been a fast friend, and I have always enjoyed his company, support, and science talks in the office.

More broadly, I would like to thank the EIS student community. I feel so fortunate to have been able to spend so much time around such wonderful, intelligent, and caring people. In particular, I must extend my most heartfelt thanks to Cristina. Not only has Cristina been there for me the entire time I’ve known her, she always graciously invited the ladies of the EIS program into her well decorated house for Ladies Night. I would also like to thank the Moore Lab: Reilly, Alan, Garrett, and Angela. It’s always a happy time in the Moore Lab, and I thank Garrett in particular for helping me with and commiserating with me about phylogenies.

I would also like to thank my advisor, Molly, who over these years, which contained happy things like being roommates in Austria and not so happy things like my long recovery from a car accident, has become more like my third parent. Molly’s boundless optimism and great editorship made this dissertation possible and it is in large part because of her that I can keep working in Marley, just across the hallway from the lab. On a similar note, I also thank Suzanne. She keeps the lab running and somehow always has the patience and time to help when needed.

My committee has been enormously helpful. Wendy Moore greatly helped in understanding both phylogenies and Mesquite- a powerful, but stubborn program. Noah Whiteman, although he moved to Berkeley partway through my PhD, was hugely influential in my written question, which might stand as the only piece of writing that I’ve ever reread without cringing. Yves Carriere always has been encouraging, friendly, and has great suggestions during committee meetings. Jeremiah Hackett has been a great addition to the committee, and has very thoughtful suggestions about the Cardinium genomes and phylogeny. Diana has very graciously stepped in for Yves absence, and has been a wonderful colleague and friend to both Pedro and me during our PhDs.

I would also like to thank my parents, Richard and Carol Stouthamer. They have helped Pedro and I through a lot of moves in Tucson, and kind enough to host us when I needed to finish writing my PhD. I have always been very fortunate to be able to talk about my dissertation in extreme detail with my father.

Most importantly, I thank my husband, Pedro, who has supported me throughout my entire PhD, and has made it possible for me to pursue my dreams after graduation. I am beyond fortunate to have him as a partner in raising our little daughter, Helena.

During my PhD I was supported by NSF for three years and also supported by travel and research grants from the Center for Insect Sciences (CIS, University of Arizona), and the Graduate and Professional Student Council (GPSC, University of Arizona).

DEDICATION

This dissertation is dedicated to my favorite endo- and ecto- parasite, Helena Sophia Stouthamer Rodrigues. Her smiles light up my days and she is cuter than a wasp eating honey.

TABLE OF CONTENTS

ABSTRACT...... 8

INTRODUCTION...... 9

CHAPTER 1...... 15

CHAPTER 2...... 18

CHAPTER 3...... 21

REFERENCES...... 24

APPENDIX A: Development of a multi-locus sequence typing system helps reveal the evolution of Cardinium hertigii, a reproductive manipulator symbiont of insects...... 35

APPENDIX B: Comparative genomics of Cardinium reveals new candidates for genes underlying Cardinium-induced reproductive manipulations...... 74

APPENDIX C: Enrichment of low-density symbionts from minute insects...... 125

ABSTRACT

Many insects and other have symbiotic microorganisms that may influence key facets of their biology. Cardinium hertigii is an intracellular bacterial symbiont, (phylum

Bacteroidetes) of arthropods and . This versatile symbiont has been shown to cause three of four reproductive manipulations of their hosts known to be caused by symbionts: parthenogenesis induction (PI), where genetic males are converted into genetic females; feminization, where genetic males become functional females; and cytoplasmic incompatibility (CI), the symbiont-induced death of offspring from matings of infected males and uninfected females. Here, I explored the evolution of this symbiont and its reproductive manipulations, and found that closely related Cardinium strains have a tendency to associate with closely related hosts and the reproductive manipulations do not display a clear phylogenetic signal. To further understand the possible genes underlying these reproductive manipulations, I sequenced four Cardinium genomes and compared these with the two genomes analyzed in the literature. In these comparisons, I found that, although closely related Cardinium strains tend to reside in closely related hosts, there is no evidence for a suite of genes associated with host specificity, as few differences separate two strains residing in different host orders, suggesting that ecological opportunity for horizontal transmission may be more limiting to Cardinium than genomic capability. I additionally identify some genes that may be associated with the

Cardinium’s ability to induce PI and CI in its wasp host. Overall, this dissertation has led to a better understanding of Cardinium and its effects on its hosts.

INTRODUCTION

For many organisms, key life history traits, such as fecundity and survival, are intimately entwined with the microbes that they carry. Microbes line the guts of many animals and, in many instances, affect nutrient absorption and host immune defenses (Ley et al., 2008; Engel and

Moran, 2013). Many arthropods also carry intracellular symbionts, which are distinct from gut symbionts because they live inside of host cells and are primarily vertically inherited (Moran et al., 2008). These intracellular symbionts exhibit short and long evolutionary associations with their hosts, and their interactions span the broad spectrum of symbiosis, ranging from beneficial, to commensal, to parasitic (Moran et al., 2008).

Intracellular symbionts are generally classified into two groups: primary symbionts, which are essential for host viability and reproduction, and secondary symbionts, which are not essential for host survival (Moran et al., 2008). Primary symbionts often provide their host with a nutritional benefit, as in the case of Buchnera, which synthesizes essential amino acids that are missing from the aphid host’s diet (Douglas, 1998). Facultative, or secondary, symbionts can have a variety of effects on their hosts. Some may provide context-dependent benefits to their hosts, such as aphid defense from (Oliver et al., 2003), aphid color change (Tsuchida et al., 2010), and a preservation of the leaf-miner’s habitat via leaf greening (Kaiser et al., 2010).

Parasitic reproductive manipulations occur when the symbiont manipulates its host reproduction to maximize its own transmission (Turelli, 1994; O’Neill et al., 1997). Because these intracellular symbionts are maternally inherited via the egg cytoplasm, the primary way in which reproductive manipulators increase their frequency in a host population is by increasing the daughter production of infected females. There are four ways by which endosymbionts have been shown to manipulate host reproduction: male killing, parthenogenesis, feminization, and cytoplasmic incompatibility (“CI”; reviewed in Werren 1997). In male killing, the endosymbiont kills embryonic male progeny of infected females, thereby increasing the fitness of the infected daughters, presumably through the release of resources. Symbiont-induced parthenogenesis occurs when the symbiont turns genetic males turn into genetic females during embryogenesis.

In feminization, infected genetic males are converted into functional females. Finally, in cytoplasmic incompatibility, infected females produce both male and female offspring. The infected daughters can mate with any males while uninfected females cannot produce offspring when they mate with infected males (Engelstädter and Hurst, 2009).

Symbionts causing reproductive manipulations can influence the evolutionary trajectory of their host organisms. For example, in the parasitic wasp, Trichogramma pretiosum, the invasion of a

PI symbiont caused a female-biased sex ratio, which generated the conditions for a nuclear mutation which decreased sexual reproduction (the “virginity mutation”) to spread, as virgin females produce only males, which have a higher fitness in strongly female biased populations.

The fixation of this mutation leads to a permanently asexual population of wasps (Russell and

Stouthamer, 2011). Cytoplasmic incompatibility, which functions by producing mating incompatibilities between infected and uninfected individuals (Werren, 1997), has been shown to reinforce reproductive isolation in populations (Shoemaker et al., 1999; Gebiola, Kelly, et al.,

2016) and contribute to speciation. Globally, a prominent reproductive manipulator, , is associated with an overall decrease in silent mitochondrial polymorphism, and an increase in non-synonymous substitution rates in mitochondria, perhaps driving cyto-nuclear compensatory evolution (Cariou et al., 2017).

There are many different that affect their host’s reproduction, but the most prevalent is the alpha-proteobacterium Wolbachia pipientis. Different strains of Wolbachia can perform all four of the aforementioned reproductive manipulations and this bacterium is found in approximately 40% of all arthropods (Zug and Hammerstein, 2014). In addition to being a reproductive manipulator, Wolbachia also spans the rest of the symbiosis spectrum, by being an obligate mutualist for some insects (Hosokawa et al., 2010) and nematodes (Fenn and Blaxter,

2004), a facultative mutualist for others (Brownlie et al., 2009; Kaiser et al., 2010), and a strict reproductive manipulator for others. Bacteria from many different bacterial phyla have evolved similar reproductive manipulations. These include Rickettsia, which can induce male killing

(Werren et al., 1994) and parthenogenesis (Hagimori et al., 2006); Arsenophonus, which can induce male-killing in Nasonia wasps (Skinner, 1985); Spiroplasma, some strains of which cause male killing as well (Hurst et al., 1999); and finally, Cardinium, which has been found to cause three of the reproductive manipulations, all but male killing (Weeks et al., 2001; Zchori-Fein et al., 2001; Hunter et al., 2003).

Candidatus Cardinium hertigii, a member of , infects approximately 7-9% of arthropods (Weeks et al., 2003; Zchori-Fein and Perlman, 2004; Russell et al., 2012), and is characterized both by its striking ultrastructure of parallel arrays of subcellular tube-like structures, and its 16S ribosomal subunit DNA sequence. This symbiont was relatively recently discovered in 1996 (Kurtti et al., 1996), ascribed as the causative agent of several reproductive manipulations in 2001 (Weeks et al., 2001; Zchori-Fein et al., 2001), and named and characterized later in 2004 (Zchori-Fein et al., 2004). Cardinium became known as a prominent reproductive manipulator alongside Wolbachia (Duron, Bouchon, et al., 2008), especially after it became apparent that Cardinium was the only bacterium outside the Alphaproteobacteria to induce CI (Hunter et al., 2003; Takano et al., 2017).

The diversity of Cardinium is not as well-characterized as is that of Wolbachia. Cardinium has been found in a small number of insect orders: Hymenoptera (Zchori-Fein and Perlman, 2004),

Hemiptera (Provencher et al., 2005), Diptera (Nakamura et al., 2009) and Thysanoptera (Nguyen et al., 2017), but the center of this genus’ diversity appears to lies in non-insect arthropods, particularly Arachnida, including Acari, Araneae, and Opiliones (Gotoh et al., 2007; Duron,

Hurst, et al., 2008; Wu and Hoy, 2012). Molecular phylogenetic studies of the genus have used sequence 16S rDNA alone (Chang et al., 2010) or 16S rDNA and Gyrase B sequence (Nakamura et al., 2009; Russell et al., 2012) and more genetic resources are needed for better strain differentiation and phylogenetic resolution. Also, only two Cardinium genomes have been published (Penz et al., 2012; Santos-Garcia et al., 2014) and the genetic bases for the reproductive manipulations Cardinium is able to induce are not known, although comparative genomics of the published CI strain, cEper1, shows that very few eukaryotic interacting proteins have shared homology with Wolbachia (Penz et al., 2012). The current leading candidates genes for Wolbachia CI, cifA and cifB (Beckmann et al., 2017; LePage et al., 2017) are missing in the cEper1 genome, further supporting an independent origin of CI. The similarities in lifestyle and induced effects between these unrelated symbionts generates an powerful comparative system to further investigate how these reproductive manipulations have evolved and the possible functional and genetic mechanisms that underlie these manipulations.

In my first chapter, I investigated the evolution of Cardinium by developing genetic resources for strain designation and phylogenetic analysis. I developed primers for four different loci that would amplify sequence from a diverse set of samples of Cardinium sent from all around the world and in many insect and arthropod hosts. Through an iterative process of primer design and sequencing, DNA sequence was garnered for the full panel of samples. The primers developed serve as the first multi locus sequence typing (MLST) scheme for Cardinium. Sequences from these loci aided in delineating different strains of Cardinium and showed their evolutionary relationships, and the primers will allow better characterization of newly discovered Cardinium strains in the future. The resulting phylogenetic trees are the first step towards understanding the evolution of this diverse genus and the reproductive manipulations it is able to induce.

In my second chapter, I sequenced, assembled and analyzed four draft genomes of Cardinium residing in Encarsia wasps. Two of these Cardinium genomes are associated with parthenogenesis induction (PI) and at least one of the other two is associated with cytoplasmic incompatibility (CI). I compared these genomes with the two sequenced genomes already in the literature, one that causes CI in Encarsia (cEper1: Penz et al. 2012) and one that is apparently asymptomatic in the whitefly Bemisia tabaci (cBtQ1: Santos-Garcia et al. 2014) both to learn more about Cardinium’s intracellular lifestyle and to uncover candidate genes for Cardinium- induced CI and PI.

In my third chapter, I addressed a problem that stands in the way of efficient sequencing of symbionts within small arthropods: getting enough bacterial DNA and at concentrations rich enough to generate high coverage of bacterial genomes. I revised the DNA extraction protocol from Penz et al. (2012) and successfully developed a protocol for enriched, high molecular weight DNA of symbionts. The DNA generated from this protocol can be used for both short read sequencing technology (e.g. Illumina HiSeq or MiSeq) as well as for long read sequencing technology, such as PacBio.

CHAPTER 1. Development of a multi-locus sequence typing system helps reveal the evolution of Cardinium hertigii, a reproductive manipulator symbiont of insects

Appendix A

Cardinium hertigii (Bacteroidetes) is an intracellular, bacterial symbiont of many arthropods.

This symbiont has the ability to induce several reproductive manipulations in its hosts (Duron,

Bouchon, et al., 2008) thereby fundamentally altering the host’s ecology and evolution (Moran et al., 2008). In the first chapter I developed genetic tools for a better understanding of the evolution of Cardinium as a secondary symbiont of arthropods and as a reproductive manipulator.

I designed four general primer sets for housekeeping genes that were relatively conserved relative to the sister taxon of Cardinium, Amoebophilus asiaticus, to generate a multi locus sequence typing (MLST) system for Cardinium. The primers were based on Cardinium sequence from a broad array of host taxa, sent to us from all over the world. These primers amplify 450-

700 basepairs of each of the four Cardinium genes. The sequences generated with these primer sets were used to better identify and delineate Cardinium strains and to infer relationships between Cardinium strains. From the phylogeny generated in this paper, I found clear support for a clade of Cardinium in Opiliones (Leiobunum) as another major group of Cardinium, as was previously suggested with a 16S phylogeny (Chang et al., 2010). Further, I found that the reproductive manipulations Cardinium induces are fairly phylogenetically labile– the scattered pattern of reproductive manipulations across the tree, as well as closely related strains displaying different induced phenotypes indicated that perhaps a mobile genetic element may spread these effector genes in Cardinium, as it has been suggested in Wolbachia. There was also some support for the hypothesis that Cardinium tends to either form clades in closely related arthropod hosts, or host-parasitoid associated clades, posing the question of whether Cardinium is limited by ecological opportunity to switch hosts or by strict adaptation to a specific host. The development of primers for four genes across the diversity of Cardinium was designed to form the basis of a Multi Locus Sequence Typing (MLST) scheme for adoption by Cardinium workers to further our understanding of the evolution of this diverse genus in future studies.

I received help from Suzanne Kelly; she sent a number of samples for Sanger sequencing that I had prepared with PCR. I received gene sequences of the Heterodera glycines Cardinium from

S. Bekal, in the laboratory of K. N. Lambert. I received Cardinium samples from researchers all over the world: H. Noda, T. Kurtti, R. Stouthamer, M. Hoy, J. Werren, M. Riegler, and H.

Breeuwer. Martha Hunter helped with writing, editing, and experimental design.

CHAPTER 2: Comparative genomics of Cardinium reveals new candidates for genes underlying

Cardinium-induced reproductive manipulations.

Appendix B

To further understand the evolution, genomic architecture, and reproductive manipulations of

Cardinium, we sequenced four Cardinium genomes residing in Encarsia wasps and analyzed them with two genomes that were already sequenced in the literature. The strains chosen were in the same clade as the sequenced strains, and resided in Encarsia species in laboratory culture. Of the newly sequenced genomes, two strains are associated with parthenogenesis, where

Cardinium turns genetic males into genetic and functional females, and at least one of two co- infecting the same host species is associated with causing cytoplasmic incompatibility, a failure to produce offspring between infected males and uninfected females. The goal of this study was to compare closely related genomes with different phenotypes, and to build on a transcriptomic study of the CI-causing cEper1 Cardinium gene expression in males and females (Mann et al., unpublished), which proposed some candidate CI genes and eukaryotic interacting genes. In addition, this study allowed us to examine in more detail the patterns uncovered from the first chapter, specifically a hypothesis for the malleability of reproductive manipulations in

Cardinium and whether the strict host association pattern of Cardinium is more due to ecological opportunity or genomic capability.

We compared candidate Cardinium CI genes from the sequenced cEper1 with the two other possible CI genomes. One gene of interest was a gene with a ubiquitin ligase domain that was highly expressed in the Mann et al. (unpublished) Cardinium CI transcriptome study. This gene was particularly interesting because a gene recently associated with Wolbachia CI, was also purported to target the host ubiquitination system. We found, however, that this gene was not present in either of the two other potential CI genomes, but some other genes that were highly expressed in the transcriptome study as well as other eukaryotic interacting genes were common to the other potential CI genomes. Interestingly, the potential CI genomes were the only strains to share a complete biotin synthesis cassette, perhaps suggesting a role of this cofactor in reproductive manipulation. These genes could be subject to future functional tests of their involvement in CI. For PI, we uncovered a number of potential candidates that were more highly conserved in the two PI strains than one would expect based on their evolutionary relatedness.

These included several hypothetical proteins as well as one ankyrin domain-containing protein.

We also found that between the strains that doubly infect the wasp Encarsia inaron (IT), the plasmids are more similar than we would expect from relatedness alone. This result may indicate the potential for plasmid recombination between strains that co-infect the same host.

Additionally, we found that two strains infecting hosts in different orders of insects (whiteflies in the Hemiptera and parasitic wasps in the Hymenoptera) were highly similar, dispensing with the hypothesis that large suites of genes are needed to specifically adapt to one host type or another.

Suzanne Kelly reared the wasps for the initial whole wasp extractions of this project, and

Stephan Schmitz-Esser and Evelyne Mann performed the whole wasp extractions for the HiSeq sequencing. I received bioinformatics advice from Dan Deblasio, Pedro Rodrigues, Piotr

Łukasik, Dario Copetti, and the UA HPC Consulting team. Stephan Schmitz-Esser assisted greatly with the advice on the annotation of these genomes and on planning the paper format.

Martha Hunter helped with writing, editing, and experimental design.

CHAPTER 3. Enrichment of low-density symbionts from minute insects

Appendix C

After the initial sequencing of whole wasps and their Cardinium symbionts, we noticed a strange pattern of many duplicated genes in the Encarsia inaron (IT) sample. I separated the duplicated genes into the “high coverage” and “low coverage” sets, and we had two hypotheses to explain these results: 1. This wasp is doubly infected with strains of Cardinium present at differing densities. 2. A nearly complete Cardinium genome is integrated into the host genome of this wasp. To distinguish between these hypotheses and to obtain more Cardinium sequence data for better genome assembly, I greatly modified a symbiont DNA extraction protocol from Penz et al.

(2012) to obtain high molecular weight, symbiont-enriched DNA for long and short read sequencing applications. The protocol I designed increased the proportion of bacterial DNA by

100 fold. Additionally, template using this extraction protocol was sequenced with PacBio to accumulate higher concentrations of Cardinium DNA, which aided in scaffolding our short read data. The enrichment method uncovered the double infection of E. inaron (IT), the first double infection of Cardinium reported so far to my knowledge, and generated enough coverage of these two co-infecting genomes, as well as the cEper2 genome, to confidently report their genic contents.

Both Suzanne Kelly and Matthew Doremus helped with certain critical steps when several extractions needed to be done in quick succession and with a portion of the wasp rearing. Martha

Hunter helped with writing, editing, and experimental design. CONCLUSION

Infections of facultative, maternally inherited, intracellular ("secondary") symbionts, like

Cardinium hertigii, are prevalent among arthropods (Zchori-Fein and Perlman, 2004; Russell et al., 2012; Zug and Hammerstein, 2012), and their sex ratio or fitness distorting effects can have profound ecological and evolutionary consequences for their host populations. Knowledge of secondary symbionts and their effects on their hosts is critical to clearly understanding their hosts’ biology as well as to better understand host-microbial interactions generally. Cardinium is known to cause reproductive manipulations in some insect hosts, but in others the phenotype is unknown. Cardinium infection is widespread particularly in arachnid orders Acari, Araneae, and

Opiliones, and very little is known about the effects of Cardinium in these hosts, just as little is known about the evolution of this diverse genus as well as the genes responsible for the induction of reproductive manipulations.

In my dissertation, I generated genetic tools to uncover more of the diversity of Cardinium and put the diversity into an evolutionary framework. The phylogeny revealed a number of resolved clades, including an additional group associated with Opiliones hosts. The primers for the first

Cardinium MLST will enable Cardinium researchers to better distinguish and characterize newly discovered Cardinium strains. Comparing the evolution of Cardinium with that of another prominent reproductive manipulator, Wolbachia, generated some hypotheses to further investigate with a deeper look into the genomes of a clade of Cardinium that resides in both

Encarsia wasps and their whitefly hosts, Bemisia tabaci. The comparative genome analysis uncovered the potential role of the plasmid of Cardinium as a possible agent for exchanging

DNA between strains. Analyses of shared genes revealed many candidate genes for two of the reproductive manipulations Cardinium is able to induce in this genus of parasitic wasp. The analyses also downgraded the likelihood that other candidates, unique to the first sequenced CI strain cEper1, were important in causing Cardinium-induced CI.

Lastly, a method I developed for enriching symbiont DNA in the minute Encarsia wasps proved critical for differentiating two co-infecting Cardinium strains in a single wasp, E. inaron (IT), and for providing sufficient high quality Cardinium DNA for long-read PacBio sequencing. This method builds on simple, existing extraction methods of filtration and centrifugation to favor bacterial DNA over eukaryotic DNA and could be used on a number of a genomes of symbionts that have been neglected to date because of the size of their hosts, for example mites, lice, thrips, and biting midges.

Together, the chapters in this dissertation have the potential to make it easier to study the diverse and intriguing symbiont lineage Cardinium, and to attract new researchers to study this symbiont. The MLST study provides the symbiont community with tools to assess the broader diversity of Cardinium as well as to better characterize individual strains. The genome chapter sets the stage for more targeted work on Encarsia wasps on the candidate Cardinium reproductive manipulation genes found in my genome analyses, as well as for parallel genomic and genetic studies on strains in other host lineages. Lastly, the new extraction protocol will allow researchers to more efficiently extract and concentrate symbiont DNA from small, non- model organisms for genome sequencing.

REFERENCES

Ayala-Castro C, Saini A, Outten FW (2008). Fe-S Cluster Assembly Pathways in

Bacteria. Microbiol Mol Biol Rev 72: 110–125.

Beckmann JF, Ronau JA, Hochstrasser M (2017). A Wolbachia deubiquitylating enzyme

induces cytoplasmic incompatibility. Nat Microbiol 2: 17007.

Bordenstein SR, Bordenstein SR (2016). Eukaryotic association module in phage WO

genomes from Wolbachia. Nat Commun 7: 13155.

Brown AMV, Wasala SK, Howe DK, Peetz AB, Zasada IA, Denver DR (2016). Genomic

evidence for plant-parasitic nematodes as the earliest Wolbachia hosts. Sci Rep 6:

34955.

Brownlie JC, Cass BN, Riegler M, Witsenburg JJ, Iturbe-Ormaetxe I, McGraw E a, et al.

(2009). Evidence for metabolic provisioning by a common invertebrate

endosymbiont, Wolbachia pipientis, during periods of nutritional stress. PLoS

Pathog 5: e1000368.

Burke GR, Moran NA (2011). Massive genomic decay in Serratia symbiotica, a recently

evolved symbiont of aphids. Genome Biol Evol 3: 195–208.

Cariou M, Duret L, Charlat S (2017). The global impact of Wolbachia on mitochondrial

diversity and evolution. bioRxiv.

Caspi-Fluger A, Inbar M, Mozes-Daube N, Mouton L, Hunter MS, Zchori-Fein E (2011).

Rickettsia ‘In’ and ‘Out’: two different localization patterns of a bacterial symbiont

in the same insect species. PLoS One 6: e21096.

Chang J, Masters A, Avery A, Werren JH (2010). A divergent Cardinium found in daddy

long-legs (Arachnida: Opiliones). J Invertebr Pathol 105: 220–227.

23 Clark MA, Moran NA, Baumann P, Wernegreen JJ (2000). Cospeciation between

bacterial endosymbionts (Buchnera) and a recent radiation of aphids (Uroleucon)

and pitfalls of testing for phylogenetic congruence. Evolution 54: 517–525.

Dedeine F, Bouletreau M, Vavre F (2005). Wolbachia requirement for oogenesis:

occurrence within the genus Asobara (Hymenoptera, Braconidae) and evidence for

intraspecific variation in A. tabida. Heredity 95: 394–400.

Dedeine F, Vavre F, DeWayne Shoemaker D, Boulétreau M, Elena S (2004). Intra-

individual coexistence of a Wolbachia strain required for host oogenesis with two

strains inducing cytoplasmic incompatibility in the wasp Asobara tabida. Evolution

58: 2167–2174.

Douglas AE (1998). Nutritional interactions in insect-microbial symbioses: aphids and

their symbiotic bacteria Buchnera. Annu Rev Entomol 43: 17–37.

Duron O, Bouchon D, Boutin S, Bellamy L, Zhou L, Engelstädter J, et al. (2008). The

diversity of reproductive parasites among arthropods: Wolbachia do not walk alone.

BMC Biol 6: 27.

Duron O, Hurst GDD, Hornett E a, Josling JA, Engelstädter J (2008). High incidence of

the maternally inherited bacterium Cardinium in spiders. Mol Ecol 17: 1427–37.

Edgar RC (2004). MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res 32: 1792–1797.

Edlund A, Ek K, Breitholtz M, Gorokhova E (2012). Antibiotic-induced change of

bacterial communities associated with the copepod Nitocra spinipes. PLoS One 7:

e33107.

Engel P, Moran NA (2013). The gut microbiota of insects – diversity in structure and

24 function. FEMS Microbiol Rev 37: 699–735.

Engelstädter J, Hurst GDD (2009). The ecology and evolution of microbes that

manipulate host reproduction. Annu Rev Ecol Evol Syst 40: 127–149.

Fang YW, Liu LY, Zhang HL, Jiang DF, Chu D (2014). Competitive ability and fitness

differences between two introduced populations of the invasive whitefly Bemisia

tabaci Q in China. PLoS One 9.

Fayet O, Ziegelhoffer T, Georgopoulos C (1989). The groES and groEL heat shock gene

products of Escherichia coli are essential for bacterial growth at all temperatures. J

Bacteriol 171: 1379–1385.

Fenn K, Blaxter M (2004). Are filarial Wolbachia obligate mutualist

symbionts? Trends Ecol Evol 19: 163–6.

Ferri E, Bain O, Barbuto M, Martin C, Lo N, Uni S, et al. (2011). New insights into the

evolution of Wolbachia infections in filarial nematodes inferred from a large range

of screened species. PLoS One 6: e20843.

Gebiola M, Kelly SE, Hammerstein P, Giorgini M, Hunter MS (2016). ‘Darwin’s

corollary’ and cytoplasmic incompatibility induced by Cardinium may contribute to

speciation in Encarsia wasps (Hymenoptera: Aphelinidae). Evolution 70: 2447–

2458.

Gebiola M, White JA, Cass BN, Kozuch A, Harris LR, Kelly SE, et al. (2016). Cryptic

diversity, reproductive isolation and cytoplasmic incompatibility in a classic

biological control success story. Biol J Linn Soc 117: 217–230.

Gotoh T, Noda H, Hong X-Y (2003). Wolbachia distribution and cytoplasmic

incompatibility based on a survey of 42 spider mite species (Acari: Tetranychidae)

25 in Japan. Heredity 91: 208–216.

Gotoh T, Noda H, Ito S (2007). Cardinium symbionts cause cytoplasmic incompatibility

in spider mites. Heredity 98: 13–20.

Groot TVM, Breeuwer JAJ (2006). Cardinium symbionts induce haploid thelytoky in

most clones of three closely related Brevipalpus species. Exp Appl Acarol 39: 257–

71.

Hagimori T, Abe Y, Date S, Miura K (2006). The first finding of a Rickettsia bacterium

associated with parthenogenesis induction among insects. Curr Microbiol 52: 97–

101.

Hoffmann AA, Clancy D, Duncan J (1996). Naturally-ocurring Wolbachia infection in

Drosophila simulans that does not cause cytoplasmic incompatibility. Heredity : 1–

8.

Hosokawa T, Koga R, Kikuchi Y, Meng X-Y, Fukatsu T (2010). Wolbachia as a

bacteriocyte-associated nutritional mutualist. Proc Natl Acad Sci USA 107: 769–74.

Huigens ME, de Almeida RP, Boons PAH, Luck RF, Stouthamer R (2004). Natural

interspecific and intraspecific horizontal transfer of parthenogenesis–inducing

Wolbachia in Trichogramma wasps. Proc R Soc London Ser B Biol Sci 271: 509–

515.

Hunter MS, Perlman SJ, Kelly SE (2003). A bacterial symbiont in the Bacteroidetes

induces cytoplasmic incompatibility in the parasitoid wasp Encarsia pergandiella.

Proc R Soc Lond B 270: 2185–90.

Hurst GDD, von der Schulenburg JHG, Majerus TMO, Bertrand D, Zakharov IA,

Baungaard J, et al. (1999). Invasion of one insect species, Adalia bipunctata, by two

26 different male-killing bacteria. Insect Mol Biol 8: 133–139.

Jiggins FM, Bentley JK, Majerus MEN, Hurst GDD (2002). Recent changes in phenotype

and patterns of host specialization in Wolbachia bacteria. Mol Ecol 11: 1275–1283.

Kaiser W, Huguet E, Casas J, Commin C, Giron D (2010). Plant green-island phenotype

induced by leaf-miners is mediated by bacterial symbionts. Proc Biol Sci 277: 2311–

9.

Koressaar T, Remm M (2007). Enhancements and modifications of primer design

program Primer3. Bioinformatics 23: 1289–1291.

Kurtti TJ, Munderloh UG, Andreadis TG, Magnarelli L a, Mather TN (1996). Tick cell

culture isolation of an intracellular prokaryote from the tick Ixodes scapularis. J

Invertebr Pathol 67: 318–21.

LePage DP, Metcalf JA, Bordenstein SR, On J, Perlmutter JI, Shropshire JD, et al.

(2017). Prophage WO genes recapitulate and enhance Wolbachia-induced

cytoplasmic incompatibility. Nature 543: 243–247.

Lewis SE, Rice A, Hurst GDD, Baylis M (2014). First detection of endosymbiotic

bacteria in biting midges Culicoides pulicaris and Culicoides punctatus, important

Palaearctic vectors of bluetongue virus. Med Vet Entomol 28: 453–456.

Ley RE, Lozupone CA, Hamady M, Knight R, Gordon JI (2008). Worlds within worlds:

evolution of the vertebrate gut microbiota. Nat Rev Microbiol 6.

Maddison W, Maddison D (2010). Mesquite 2. Manual: 1–258.

Maiden MC, Bygraves JA, Feil E, Morelli G, Russell JE, Urwin R, et al. (1998).

Multilocus sequence typing: a portable approach to the identification of clones

within populations of pathogenic microorganisms. Proc Natl Acad Sci USA 95:

27 3140–5.

Manzano-Marín A, Lamelas A, Moya A, Latorre A (2012). Comparative genomics of

Serratia spp.: two paths towards endosymbiotic life. PLoS One 7: e47274.

McFall-Ngai M, Hadfield MG, Bosch TCG, Carey H V, Domazet-Lošo T, Douglas AE,

et al. (2013). Animals in a bacterial world, a new imperative for the life sciences.

Proc Natl Acad Sci 110: 3229–3236.

Messing J (1983). New M13 Vectors for Cloning. Methods Enzymol 101: 20–78.

Moran NA, McCutcheon JP, Nakabachi A (2008). Genomics and evolution of heritable

bacterial symbionts. Annu Rev Genet 42: 165–90.

Nakamura Y, Kawai S, Yukuhiro F, Ito S, Gotoh T, Kisimoto R, et al. (2009). Prevalence

of Cardinium bacteria in planthoppers and spider mites and taxonomic revision of

‘Candidatus Cardinium hertigii’ based on detection of a new Cardinium group from

biting midges. Appl Environ Microbiol 75: 6757–63.

Nakamura Y, Yukuhiro F, Matsumura M, Noda H (2012). Cytoplasmic incompatibility

involving Cardinium and Wolbachia in the white-backed planthopper Sogatella

furcifera (Hemiptera: Delphacidae). Appl Entomol Zool 47: 273–283.

Nguyen DT, Morrow JL, Spooner-Hart RN, Riegler M (2017). Independent cytoplasmic

incompatibility induced by Cardinium and Wolbachia maintains endosymbiont

coinfections in haplodiploid thrips populations. Evolution 71: 995–1008.

Nikoh N, Hosokawa T, Moriyama M, Oshima K, Hattori M, Fukatsu T (2014).

Evolutionary origin of insect-Wolbachia nutritional mutualism. Proc Natl Acad Sci

USA 111.

Noel GR, Atibalentja N (2006). ‘Candidatus Paenicardinium endonii’, an endosymbiont

28 of the plant-parasitic nematode Heterodera glycines (Nemata: Tylenchida), affiliated

to the phylum Bacteroidetes. Int J Syst Evol Microbiol 56: 1697–702.

Nováková E, Hypša V, Moran NA (2009). Arsenophonus, an emerging clade of

intracellular symbionts with a broad host distribution. BMC Microbiol 9: 143.

O’Neill SL, Hoffmann AA, Werren JH (1997). Influential passengers: inherited

microorganisms and arthropod reproduction (SL O’Neill, AA Hoffmann, and JH

Werren, Eds.). Oxford University Press: Oxford, UK.

Oliver KM, Russell JA, Moran NA, Hunter MS (2003). Facultative bacterial symbionts in

aphids confer resistance to parasitic wasps. Proc Natl Acad Sci USA 100: 1803–

1807.

Partridge SR (2011). Analysis of antibiotic resistance regions in Gram-negative bacteria.

FEMS Microbiol Rev 35: 820–855.

Penz T, Schmitz-Esser S, Kelly SE, Cass BN, Müller A, Woyke T, et al. (2012).

Comparative genomics suggests an independent origin of cytoplasmic

incompatibility in Cardinium hertigii. PLoS Genet 8: e1003012.

Perlman SJ, Kelly SE, Zchori-Fein E, Hunter MS (2006). Cytoplasmic incompatibility

and multiple symbiont infection in the ash whitefly parasitoid, Encarsia inaron. Biol

Control 39: 474–480.

Perlman SJ, Magnus SA, Copley CR (2010). Pervasive associations between Cybaeus

spiders and the bacterial symbiont Cardinium. J Invertebr Pathol 103: 150–5.

Posada D (2008). jModelTest: Phylogenetic model averaging. Mol Biol Evol 25: 1253–

1256.

Provencher LM, Morse GE, Weeks AR, Benjamin B. Normark (2005). Parthenogenesis

29 in the Aspidiotus nerii complex (Hemiptera:Diaspididae): a single origin of a

worldwide, polyphagous lineage associated with Cardinium bacteria. Ann Entomol

Soc Am 98: 629–635.

Reece RJ, Maxwell A (1991). DNA gyrase: structure and function. Crit Rev Biochem Mol

Biol 26: 335–375.

Ronquist F, Huelsenbeck JP (2003). MrBayes 3: Bayesian phylogenetic inference under

mixed models. Bioinformatics 19.

Roush RT, Hoy MA (1981). Genetic improvement of _Metaseiulus occidentalis_:

selection with methomyl, dimethoate, and carbaryl and genetic analysis of carbaryl

resistance. J Econ Entomol 74: 138–141.

Russell JA, Funaro CF, Giraldo YM, Goldman-Huertas B, Suh D, Kronauer DJC, et al.

(2012). A veritable menagerie of heritable bacteria from ants, butterflies, and

beyond: broad molecular surveys and a systematic review. PLoS One 7: e51027.

Russell JE, Stouthamer R (2011). The genetics and evolution of obligate reproductive

parasitism in Trichogramma pretiosum infected with parthenogenesis-inducing

Wolbachia. Heredity 106: 58–67.

Santos-Garcia D, Rollat-Farnier PA, Beitia F, Zchori-Fein E, Vavre F, Mouton L, et al.

(2014). The genome of Cardinium cBtQ1 provides insights into genome reduction,

symbiont motility, and its settlement in Bemisia tabaci. Genome Biol Evol 6: 1013–

30.

Shoemaker DD, Katju V, Jaenike J (1999). Wolbachia and the evolution of reproductive

isolation between Drosophila recens and Drosophila subquinaria. Evolution 53:

1157–1164.

30 Shoji S, Walker SE, Fredrick K (2009). Ribosomal translocation: One step closer to the

molecular mechanism. ACS Chem Biol 4: 93–107.

Skinner SWAY (1985). Son-Killer: A third extrachromosomal factor affecting the sex

ratio in the parasitoid wasp, Nasonia (=Mormoniella) vitripennis. Genetics 109:

745–759.

Stokes HW, Gillings MR (2011). Gene flow, mobile genetic elements and the recruitment

of antibiotic resistance genes into Gram-negative pathogens. FEMS Microbiol Rev

35: 790–819.

Stouthamer R, Breeuwer JA, Hurst GD (1999). Wolbachia pipientis: microbial

manipulator of arthropod reproduction. Annu Rev Microbiol 53: 71–102.

Stouthamer R, Luck RF (1991). Transition from Bisexual to Unisexual Cultures in

Encarsia-Perniciosi (Hymenoptera, Aphelinidae) - New Data and A

Reinterpretation. Ann Entomol Soc Am 84: 150–157.

Takano S, Tuda M, Takasu K, Furuya N, Imamura Y, Kim S, et al. (2017). Unique clade

of alphaproteobacterial endosymbionts induces complete cytoplasmic

incompatibility in the coconut beetle. Proc Natl Acad Sci 114: 6110–6115.

Tsuchida T, Koga R, Horikawa M, Tsunoda T, Maoka T, Matsumoto S, et al. (2010).

Symbiotic bacterium modifies aphid body color. Science (80- ) 330: 1102–4.

Turelli M (1994). Evolution of incompatibility-inducing microbes and their hosts.

Evolution 48: 1500–1513.

Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, et al. (2012).

Primer3 — new capabilities and interfaces. 40: 1–12.

Weeks a R, Marec F, Breeuwer JA (2001). A mite species that consists entirely of

31 haploid females. Science 292: 2479–82.

Weeks AR, Velten R, Stouthamer R (2003). Incidence of a new sex-ratio-distorting

endosymbiotic bacterium among arthropods. Proc Biol Sci 270: 1857–65.

Werren JH (1997). Biology of Wolbachia. Annu Rev Entomol 42: 587–609.

Werren JH, Hurst GD, Zhang W, Breeuwer JA, Stouthamer R, Majerus ME (1994).

Rickettsial relative associated with male killing in the ladybird beetle (Adalia

bipunctata). J Bacteriol 176: 388–394.

White JA, Kelly SE, Perlman SJ, Hunter MS (2009). Cytoplasmic incompatibility in the

parasitic wasp Encarsia inaron: disentangling the roles of Cardinium and Wolbachia

symbionts. Heredity 102: 483–489.

Wu K, Hoy MA (2012). Cardinium is associated with reproductive incompatibility in the

predatory mite Metaseiulus occidentalis (Acari: Phytoseiidae ). J Invertebr Pathol

110: 359–365.

Zchori-Fein E, Gottlieb Y, Kelly SE, Brown JK, Wilson JM, Karr TL, et al. (2001). A

newly discovered bacterium associated with parthenogenesis and a change in host

selection behavior in parasitoid wasps. Proc Natl Acad Sci USA 98: 12555–60.

Zchori-Fein E, Perlman SJ (2004). Distribution of the bacterial symbiont Cardinium in

arthropods. Mol Ecol 13: 2009–2016.

Zchori-Fein E, Perlman SJ, Kelly SE, Katzir N, Hunter MS (2004). Characterization of a

’Bacteroidetes’ symbiont in Encarsia wasps (Hymenoptera: Aphelinidae): proposal

of ‘Candidatus Cardinium hertigii’. Int J Syst Evol Microbiol 54: 961–968.

Zhang X, Zhao D, Hong X (2012). Cardinium—the leading factor of cytoplasmic

incompatibility in the planthopper Sogatella furcifera doubly infected with

32 Wolbachia and Cardinium. Environ Entomol 41: 833–840.

Zug R, Hammerstein P (2012). Still a host of hosts for Wolbachia: analysis of recent data

suggests that 40% of terrestrial arthropod species are infected. PLoS One 7: e38544.

Zug R, Hammerstein P (2014). Bad guys turned nice? A critical assessment of Wolbachia

mutualisms in arthropod hosts. Biol Rev Camb Philos Soc 49.

33 APPENDIX A

Development of a multi-locus sequence typing system helps reveal the evolution of

Cardinium hertigii, a reproductive manipulator symbiont of insects

Corinne M Stouthamer, Suzanne Kelly, Sadia Bekal, Kris N Lambert, Evelyne Mann,

Stephan Schmitz-Esser, Martha S Hunter

34 ABSTRACT

Cardinium is an intracellular bacterial symbiont in the phylum Bacteroidetes that is found in many different species of arthropods and some nematodes. This symbiont is known to be able to induce three reproductive phenotypes, including cytoplasmic incompatibility.

Placing individual strains of Cardinium within a larger evolutionary context has been challenging because only two, relatively slowly evolving genes, 16S rDNA and Gyrase

B, have been used to generate phylogenetic trees, and consequently, the relationship of different strains has been elucidated in only its roughest form. We developed a Multi

Locus Sequence Typing (MLST) system that provides researchers with three new genes in addition to Gyrase B for inferring phylogenies and delineating strains for Cardinium.

From our Cardinium phylogeny, we confirmed the presence of a new group D, a

Cardinium clade that resides in harvestmen (Opiliones). Many Cardinium clades appear to display a high degree of host affinity, while some show evidence of host shifts to phylogenetically distant hosts, likely associated with ecological opportunity. Like the unrelated reproductive manipulator Wolbachia, the Cardinium phylogeny also shows no clear phylogenetic signal associated with particular reproductive manipulations.

35 1. INTRODUCTION

The life histories and evolution of many multicellular organisms are intimately entwined with the microbes they carry (McFall-Ngai et al., 2013). A large number of arthropods carry maternally inherited, intracellular bacterial symbionts that can affect their host’s reproductive outcomes in both detrimental and beneficial ways (O’Neill et al., 1997;

Moran et al., 2008). These symbionts come from various bacterial phyla, but are categorized based on their associations with their hosts. Primary (or obligate) symbionts complement their hosts’ diet with essential amino acids or other limiting nutrients, are often housed in specialized structures, and are essential to their host’s reproduction

(reviewed in Moran et al., 2008). Secondary (or facultative) symbionts, though largely unnecessary for successful host reproduction, can provide conditional benefits to their host, have no measurable effect, or manipulate their host’s reproduction in ways that increase the spread of the symbiont (Werren, 1997; Stouthamer et al., 1999; Zug and

Hammerstein, 2014).

Symbiont phylogenies may offer clues to the relationship between the symbionts and their hosts. For instance, primary symbionts, such as Buchnera in their aphid hosts, display congruent phylogenies (Clark et al., 2000), indicating the long evolutionary history and cospeciation of these groups. Secondary symbionts generally have shorter term associations with their hosts and may occur at intermediate frequencies within the host population (Moran et al., 2008). The evolutionary phylogenies of secondary symbionts generally display many host switches and are non-congruent with the host phylogenies (e.g. Nováková et al. 2009). Genera of bacteria commonly thought of as

36 secondary symbionts may also include lineages of primary symbionts in their midst, as with Serratia symbiotica in aphids (Burke and Moran, 2011; Manzano-Marín et al.,

2012). Even the best known secondary symbiont, Wolbachia, a notorious host switcher, contains a clade of symbionts that display congruent evolution and co-cladogensis in their obligatory symbiosis with nematodes (Fenn and Blaxter, 2004; Ferri et al., 2011) as well as a lineage that is required for B-vitamin production in bedbugs (Nikoh et al., 2014).

These patterns show that different strains within one group of secondary symbionts can differ dramatically in their relationships with their hosts.

While transitions from secondary to obligate symbiosis may be apparent in phylogenies, as shown by host and symbiont phylogenetic congruence, subtler facets of secondary symbiont life histories may also be elucidated by a well-resolved phylogeny. Horizontal transmission of secondary symbionts between hosts is key to the secondary symbiont lifestyle, yet these transmission events are rarely captured in experiments (see exceptions in Huigens et al. (2004) and Caspi-Fluger et al. (2011)), and are likely to happen infrequently in nature. Phylogenies are currently the most powerful tools we have to describe these host switches. Well resolved phylogenies may also elucidate co- cladogenesis over a short evolutionary time scale, which can occur when a reproductive manipulator in essence “hijacks” a key reproductive function of their host, creating host- symbiont dependency (Dedeine et al., 2004, 2005; Zug and Hammerstein, 2014). In this paper, we explore evolution of the secondary symbiont of arthropods, Cardinium hertigii

(Bacteroidetes), and address questions concerning horizontal transmission and the evolution of reproductive manipulations with a well-resolved phylogeny.

37

Cardinium hertigii, a member of Bacteroidetes, infects approximately 7-9% of arthropods

(Weeks et al., 2003; Zchori-Fein et al., 2004; Russell et al., 2012) as well as at least one lineage of the plant parasitic nematode, Heterodera glycines (Noel and Atibalentja,

2006). Although it infects many insects, particularly members of Hymenoptera and

Hemiptera, much of the diversity of this genus as described so far appears to lie in arachnids, such as mites, spiders, and harvestmen (Nakamura et al., 2009; Chang et al.,

2010; Russell et al., 2012). Although the phenotype of Cardinium in many hosts is unknown, it has been shown to manipulate host reproduction in insects and mites, and rivals Wolbachia in its versatility. Strains of Cardinium induce at least three reproductive manipulations: parthenogenesis, feminization, and cytoplasmic incompatibility (CI). In symbiont-induced parthenogenesis, genetic males turn into genetic females during embryogenesis. Parthenogenesis has been associated with Cardinium infection in several parasitoid wasps in the genus Encarsia (Zchori-Fein et al., 2001, 2004) and with the oleander scale, Aspidiotus nerii (Provencher et al., 2005). In feminization, as has been shown in Brevipalpus mites, Cardinium causes infected genetic males to be converted into functional females (Groot and Breeuwer, 2006). Finally, Cardinium is able to induce cytoplasmic incompatibility in several wasps, mites, and planthoppers (Perlman et al.,

2006; Gotoh et al., 2007; White et al., 2009; Nakamura et al., 2012; Wu and Hoy, 2012;

Zhang et al., 2012), where infected females produce both male and female offspring, but uninfected females mated with infected males produce few or no offspring (in diploid systems) or few or no daughters (in haplodiploid systems). Of all reproductive manipulators, so far only Cardinium, Wolbachia, and a recently discovered Alpha-

38 Proteobacterium have been found to induce CI, although genomic evidence of the

Cardinium strain cEper1, found in the parasitic wasp Encarsia suzannae, suggest that at least Wolbachia and Cardinium independently evolved this trait (Penz et al., 2012;

Takano et al., 2017). In addition to the reproductive manipulations, Cardinium has been shown to affect other host fitness traits as well. In the planthopper Sogatella furcifera,

Cardinium infection is associated with faster nymphal developmental times (Zhang et al.,

2012) and in the parasitoid wasp Encarsia inaron, Cardinium infection is associated with increased longevity of female wasps (White et al. 2009).

Despite the diverse impacts Cardinium can have on key aspects of its host’s survival and reproduction, few resources have been devoted towards developing better genetic tools for assessing the evolutionary history of this genus, leaving open some intriguing questions about the symbiont’s evolution and ecological interactions with its hosts. Some of the enduring mysteries involving secondary symbionts, and Cardinium in particular, are how these reproductive manipulations evolved. For example, are the genes coding for these manipulations largely horizontally transmitted between strains or do they evolve independently, perhaps repeatedly, within lineages? Additionally, Cardinium horizontal transmission rate at a genus-wide level is poorly understood; with poorly resolved phylogenies, it is not clear whether Cardinium displays the same low level of host affinity as most other secondary symbionts, or whether the shorter list of host taxa with which it is associated than, for example, the cosmopolitan Wolbachia, is indicative of fewer host switches among host lineages. We present four sets of primers from single locus housekeeping genes that each amplify 450-700bp of DNA in order to more fully

39 resolve the evolutionary relationships of the divergent Cardinium strains. By providing genetic tools for the community of Cardinium researchers to use to diagnose Cardinium and discriminate among as yet uncharacterized strains, the study provides a framework for future phylogenetic studies of this versatile symbiont.

40 2. METHODS

2.1 Gene selection

Four genes with the highest amino acid identity between the sister group to Cardinium,

Amoebophilus asiaticus, and the sequenced Cardinium strain, cEper1, were chosen to develop a Multi Locus Sequence Typing system (MLST) with other strains (Penz et al.

2012). We did not attempt to choose genes that are evenly spaced around the Cardinium chromosome. While, in more conserved lineages, linkage among loci is often avoided by choosing MLST genes that are evenly spaced (Maiden et al., 1998), in Cardinium there is little shared synteny, even between the two related sequenced genomes, cBtQ1 and cEper1 (Santos-Garcia et al., 2014). In addition to making even spacing of chosen genes unworkable across the genus, the low level of synteny suggests frequent gene rearrangements in this lineage, and a low probability of linkage among loci. The genes selected for this study were: Elongation Factor G, a protein responsible for coordinating the movement of tRNA and mRNA during polypeptide elongation (Shoji et al., 2009); gyrase B, a topoisomerase that unwinds DNA during DNA replication (Reece and

Maxwell, 1991); Iron Sulfur Cluster Assembly Protein (SufB), a protein involved in generating Fe-S complexes mainly involved in electron transfer (Ayala-Castro et al.,

2008); and the Heat shock protein, GroEL, a chaperone protein essential in stress-related responses (Fayet et al., 1989).

2.2 DNA extractions

Arthropods with confirmed Cardinium infections and DNA samples were received from cooperators around the world (Table 2). From Japan (H. Noda), we received planthopper,

41 mite, and biting midge DNA, extracted as described in Nakamura et al. (2009).

Cardinium from the Ixodes cell line ISE6 (T. Kurrti) was processed by shearing the cells and filtering them through a 1.5 um syringe, then extracting the lysate with 3 µl of 20 mg/ml proteinase K and 50 µl of water with 10% w/v chelex beads (White et al., 2009).

Cybeus spiders (S. Perlman) were extracted using Qiagen DNeasy extraction kits. All other samples of alcohol-preserved specimen were also extracted using the proteinase K and chelex beads protocol described.

2.3 Primer Design, PCR, and Sequencing

Primers were iteratively designed as sequenced products from strains were added to sequence alignments. Initially, general primers were designed based on the only two sequenced (and closely related) Cardinium strains (cEper1, cBtQ1) and the sister taxon to

Cardinium, Amoebophilus asiaticus 52a. These initial primers were designed using cEper1 as the reference strand in Primer3 (Koressaar and Remm, 2007; Untergasser et al., 2012) with ambiguities based on the other strains added manually. Not all strains successfully amplified using these initial primers, particularly strains divergent with respect to cEper1 and cBtQ1, such as those in the biting midges, Culicoides spp. In these instances, strain-specific primers were designed once a small segment of the genes was sequenced. These strain-specific primers were then used in conjunction with the initial degenerate primers to obtain more sequence. When more than three bacterial strains were used for primer design, areas of conservation were manually detected and these potential primer regions were checked for hairpins and tendency to form primer dimers in Primer3 against every strain. All primers were selected by minimizing the number of ambiguities

42 and maximizing the number of conserved base pairs in the 3’ primer region, and M13 tags were added to the primers for ease of sequencing (Messing, 1983).

Although the melting temperature varied depending on the primer pair (Table 1), PCR conditions were generally as follows: 15uL reaction volume with New England Biolabs buffer and Taq at 1X concentration, 5mM dNTPs, 0.76 mM MgCl2, 1.1 uM primers with

2uL of DNA. If it was a mite extraction, 4uL of DNA was added. The initial melting temperature was 94oC for 2 minutes; this was followed by 40 cycles of 94oC for 45s, the annealing temperature (Table 1) for 45s, and extension at 68oC for 45s. The final extension was at 68 oC for 7 minutes.

2.4 Phylogenetic analysis

DNA sequences were quality controlled and aligned using CLC Main Workbench 6

(Qiagen) and MUSCLE (Edgar, 2004). jModeltest was used to select the optimum model of evolution based on the Akaike information criterion (Posada, 2008). Bayesian trees were constructed in MrBayes with one million MCMC generations and sampled every

1000 generations (Ronquist and Huelsenbeck, 2003). Maximum likelihood trees were constructed using RaxML with 1000 rapid bootstraps. Both Bayesian and ML methods used the GTR+I+G model of nucleotide evolution with a total of 2145bp from Gyrase B

(gyrB), translation elongation factor G (EF-G), Iron Sulfur cluster assembly protein

(sufB), and heat shock protein (groEL) for each taxon, partitioned by gene and codon position. Phylogenetic tree figures were generated in Mesquite (Maddison and Maddison,

2010)

43 3. RESULTS

3.1 MLST primers

The MLST primers (Table 1) were tested against most of the arthropods in our set of taxa, including members from groups A (the largest arthropod group), C (biting midges in the Culicoides group), and D (Opiliones group). All primers amplified products for

Cardinium residing in Opiliones and Culicoides spp. The EF-G primers worked on all samples, the SufB and GyrB primers worked on most samples in group A and all in group C and E. For the GroEL primers, two sets of forwards were used (Table 1), depending on which amplified better, but only sequences from the inner forward primer

(groel_346F) were used for the phylogenies.

3.2 Phylogenetic trees

The phylogeny of concatenated loci supports the monophyly of Cardinium as a genus.

Groups A, B, C, and E are each supported as monophyletic groups, although the support for group A is increased when group B (the long branch for the Cardinium in the plant parasitic nematode, Heterodera glycines) is omitted (Figure 2). The individual gene trees are not completely topologically congruent (Figures 3-6), although they also become more congruent when group B is removed (Figures 7-10).

44 4. DISCUSSION

This study aimed to better understand the evolution of the diverse arthropod symbiont

Cardinium, and provide genetic tools to better identify individual strains within this group. Phylogenies based on sequences derived from four loci across a representative set of Cardinium strains show a greater resolution of Cardinium clades in this diverse genus than single gene trees using more slowly evolving DNA such as 16S rDNA.

Using 16S rDNA and gyrase B, Nakamura et al. (2009) grouped Cardinium into three groups: A, which contains Cardinium strains infecting insects, mites, and other arthropods, B, which contains the Cardinium strain infecting the plant parasitic nematode, Heterodera glycines; and C, which contains Cardinium infecting biting midges in the genus Culicoides. These groups are supported in the current study using the concatenated sequence of four loci. Chang et al. (2010) suggested that the Cardinium found in in the harvestmen clade (Leiobunum spp., Opiliones) might be its own group based on a phylogeny constructed using a partial 16S rDNA sequence. However, because

16S rDNA displays a relatively slow rate of evolution, the support for this idea was limited. Based on the current phylogeny using the concatenated loci, there is more robust support for Cardinium found in the Leiobunum Opiliones to be a separate clade.

Following the convention of Nakamura et al. (2009), this clade is designated group E, with clade D reserved for Cardinium in water fleas (Edlund et al., 2012).

The tree of four concatenated loci with our full set of taxa was unable to resolve the relationships among clades within group A (Figure 1). This appears in part to be due to

45 the Cardinium in Heterodera glycines residing on a long branch, which confuses the placement of this strain in the phylogeny; when this taxon is removed, the group A relationships are better resolved (Figure 2). This is also shown by the single gene trees

(Figures 3-6); the four gene trees display three different placements of the major groups of Cardinium. However, when the nematode strain cHgly1 is removed (Figures 7-10), the gene trees become much more congruent, and clades become better resolved. To more fully resolve these clades, it is likely that more taxa must be sampled in groups A and B, particularly from the Cardinium residing in mites, spiders, and nematodes.

The monophyly of the group of Cardinium found in the oleander scale, Aspidiotus nerii

(Diaspididae), Encarsia parasitic wasps, and whiteflies is supported. Species of Encarsia that harbor these Cardinium parasitize either whiteflies (E. hispida, E. suzannae, E. tabacivora, E. inaron (IT and US)) or diaspidid scale insects (E. perniciosi). The placement of these Encarsia Cardinium strains with those from scale insects (A. nerii) and whiteflies (A. floccus, B. tabaci) suggests that horizontal transmission events between host and parasitoids, and perhaps within parasitoids, have occurred. The directionality of these events cannot be discerned from this phylogeny without deeper sampling of both hosts and parasitoids. The Cardinium within this clade is the clearest example of closely related Cardinium strains residing in distantly related hosts, in contrast to the previously observed pattern of closely related Cardinium strains residing in closely related hosts

(Zchori-Fein et al., 2004; Nakamura et al., 2009; Perlman et al., 2010; Lewis et al.,

2014).

46 Interestingly, some patterns appear at least superficially similar between Cardinium and

Wolbachia. Wolbachia strains have been found in both animal and plant parasitic nematodes, and recently the plant parasitic nematode has been shown to harbor an early diverging strain of Wolbachia situated on a long branch (Brown et al., 2016). The

Cardinium found in the one plant-parasitic nematode sequenced to date also resides on a long branch, but does not appear to be an early diverging lineage, as it is sister to the mite and arthropod Cardinium, group A. The reproductive manipulations that Cardinium is able to induce also share a superficial pattern with Wolbachia. Strains that cause the same reproductive manipulations do not clearly form one monophyletic clade, except perhaps in the case of the mite strains causing feminization, but this might change when further examples of feminizing Cardinium are discovered. Additionally, closely related

Cardinium strains do not necessarily cause the same reproductive manipulations, as exemplified by the sister strains cEper1, which causes CI and the PI strain, cEper2

(Zchori-Fein et al., 2001; Hunter et al., 2003). Similarly cEsug1, which causes CI, and cTpue1, which does not cause CI or PI, are sister taxa (Gotoh et al., 2007). This pattern also occurs in Wolbachia; closely related Wolbachia strains in Acraea butterflies have shown multiple transitions between sex ratio distorting and CI-inducing Wolbachia strains (Jiggins et al., 2002). Additionally, in Drosophila, wMel, causing CI, and wAu, having no phenotype, are also very closely related (Hoffmann et al., 1996). These similar patterns between Wolbachia and Cardinium trees are not necessarily expected; recently, it has been suggested that the horizontal transfer of the CI phenotype may be linked to the

Wolbachia’s WO phage, which can cross-infect Wolbachia strains (Bordenstein and

Bordenstein, 2016; LePage et al., 2017). So far, sequenced genomes of Cardinium do not

47 show the presence of phage DNA. Unlike Wolbachia, however, many Cardinium strains do harbor plasmids (Penz et al., 2012; Santos-Garcia et al., 2014), which could perhaps serve a similar function in horizontal transmission of reproductive manipulation genes

(Partridge, 2011; Stokes and Gillings, 2011).

48 5. CONCLUSION

Cardinium evolution appears to be driven by both ecological opportunity and host specialization. Cardinium has frequently switched between parasitoids and their hosts, even though they are physiologically quite different, causing these strains to form a clade.

In contrast, the Cardinium in Cybaeus spiders, Culicoides spp, and Leobinium spp appear to be quite specialized to particular host lineages, without distantly related hosts breaking up these clades. Similar to Wolbachia, the relatedness of Cardinium strains does not necessarily predict their reproductive phenotypes. Overall, the new genetic tools proposed in this study allow for clearer strain delimitation and a more detailed picture of the evolution of Cardinium, one that will keep unfolding the more the MLST primers are used to characterize strains and add taxa to the Cardinium phylogeny.

49 6. TABLES AND FIGURES

Table 1. MLST primers and their suggested melting temperatures for PCR. § This primer

is an alternative to the other groEL forward primer.

Primer name Primer sequence Tm Gene Amplified MLST (C) length nucleotide fragment (bp) range of size (bp) gene (bp) gyrb_859F ATGCAYGTMACBGGDTTTARAAG 50 1950 859-1637 736 gyrb_1637R TARAGTGGRGGRGARGCAAT groel_346F VTHAARCGBGGBATWGACAA 52 1638 346-842 476 groel_287F§ CNCARKCTATWTTYRYVCATGG groel_842R TTGGBGAYAGAAGRAARGCNATG sufb_806F CTACNGTDCARAATTGGTATCC 50 1443 806-1289 451 sufb_1289R ADYTGRTCYKCRCTRATTTT

EF_1689R AAABCCYTTYTGAATIGCTGG 52 2142 1689- 482 EF_1162F GCNGTRGTIGGITTTAARGARATTA 1162

50 Table 2. Collection localities of Cardinium strains and their associated reproductive phenotypes.

Host organism Cardinium Collection information Reproductive phenotype Strain Aleurothrixus Israel unknown floccosus cAflo1 Encarsia cEper1 Texas, USA CI (Hunter et al., 2003) suzannae Encarsia hispida cEhis1 San Diego, USA PI (Zchori-Fein et al., 2004) Encarsia cEper2 Brazil PI association (Zchori-Fein tabacivora et al., 2001) Encarsia inaron cEina1 Italy CI? (One or both of dual (IT), high infection strains causes CI; density strain (Gebiola, White, et al., 2016) (HIT) Encarsia inaron, cEina3 Italy CI? (One or both of dual low density infection strains causes CI; strain (LIT) (Gebiola, White, et al., 2016) Encarsia inaron cEina2 USA no CI, no PI (White et al., (USA) 2009) Aspidiotus nerii cAner1 University of California, associated with Riverside culture parthenogenetic host (Provencher et al., 2005) Bemisia tabaci, cBtQ1 Valencia, Spain no CI, no PI (Fang et al., Q1 species 2014) Ixodes cIsca1 Nantucket Island unknown scapularis cell (Massachusetts), USA line Indozuriel cIdan1 Japan unknown dantur Sogatella cSfur1 Japan CI (Nakamura et al., 2009) furcifera Eotetranychus cEsug1 Taiwan CI (Gotoh et al., 2007) suginamensis Oligonychus cOcof1 Japan unknown coffeae Oligonychus cOgot1 Japan unknown gotohi Panonychus cPmor1 Japan CI (Gotoh et al., 2003) mori Tetranychus cTpue1 Japan no CI, no PI (Gotoh et al., pueraricola 2003) Culicoides cCara1 Kagoshima Pref. or unknown

51 arakawae Okinawa Pref., Japan Culicoides cCohm1 Kagoshima Pref, Japan unknown ohmorii Culicoides cCper1 Yonaguni Isl., Okinawa unknown peregrinus Pref. Japan Cybaeus eutypus cCeut Vancouver Island, Canada unknown Cybaeus signifer cCsig1 Vancouver Island, Canada unknown Cybaeus cCcha1 Northern California, USA unknown chauliodus Cybaeus cCsom1 Northern California, USA unknown somesbar Cybaeus cCsan1 North central California, unknown sanbruno USA Cybaeus cCmor1 British Columbia, Canada unknown morosus Cybaeus hesper cChes1 North central California, unknown USA Cybaeus cCmul1 Oregon, USA unknown multnoma Cybaeus cCpen1 North central California, unknown penedentatus USA Culicoides cCimi1 Unknown unknown imicola Metaseiulus cMocc1 Washington and Oregon, CI (Roush and Hoy, 1981) occidentalis USA Leiobunum sp 1 cLsp2 Georgetown Island, unknown Maine, USA Leiobunum sp 2 cLsp3 N. Monmouth, Maine, unknown USA Leiobunum cLsp1 Ellison Park, Monroe unknown County, New York, USA Brevipalpus cBcal1 Minas Gerais, Brazil feminization (Groot and californicus Breeuwer, 2006) Brevipalpus cBpho1 Minas Gerais, Brazil feminization (Groot and phoenicus Breeuwer, 2006) Macrosteles cMque1 unknown quadrilineatus Encarsia cEper3 Tijuana River Valley Park, associated with perniciosi San Diego, USA parthenogenetic host (Stouthamer and Luck, 1991) Heterodera cHgly1 In culture, University of unknown glycines Illinois at Urbana- Champaign. Pezothrips cPkel1 Australia CI (Nguyen et al., 2017) kellyanus

52 Figure 1. Bayesian phylogeny with both Maximum likelihood and posterior probability of all Cardinium strains from this study using

concatenated loci: gyrB, sufB, EF-G, and groEL. Cardinium strains are labeled by the host taxon species name and colored by the host

taxon order or sub-class.

A.Amoebophilus#asiaHcus# asiaticus C.#imicola#C. imicola */100" C.#ohmorii#C. ohmorii C.#arakawae#C. arakawae C.#peregrinus#C. peregrinus Leiobunum#sp."3# */100"Leiobunum sp..3 */U" LeiobunumLeiobunum#sp.# sp. Leiobunum#sp.Leiobunum sp..2"2# H.H.#glycines# glycines P.#kellyanus#P. kellyanus */64" P.#mori#P. mori M.#occidentalis#M. occidentalis .87/71" I.I.#dantur# dantur */100" S.S.#furcifera# furcifera O.#gotohi#O. gotohi */64" */99" T.#peuraricola#T. pueraricola */100" E.#suginamensis#E. suginamensis .73/37" B.#californicus#B. californicus */100" B.#phoenicus#B. phoenicus */62" O.O.#coffeae# coffeae .96/58" I.I.#scapularis# scapularis */100" .66/53" M.M.#quadrilineatus# quadrilineatus A.A.#nerii## nerii E.E.#hispida# hispida */99" */99" A.A.#floccosus# floccosus */100" E.E.#inaron# inaron (US)# */100" B.#tabaci#B. tabaci .88/67" E.E.#inaron# inaron (HIT)# E.#inaron#(LIT)# .91/63" E. inaron (LIT) E.E.#perniciosi# perniciosi */96" E.E.#suzannae# suzannae */100" Legend" E.E.#tabacivora# tabacivora C.C.#chauliodus# chauliodus C.C.#somesbar# somesbar Acari":"pink" Cytoplasmic"Incompa@bility"(CI)" C.C.#penedentatus# penedentatus Maybe"causes"CI" Diptera:"Yellow" C.C.#hesper# hesper */99" Opiliones":"Green" Does"not"cause"CI" C.C.#sanbruno# sanbruno C.#signifer# Nematode":"Purple" Feminiza@on" C. signifer Thysanoptera":"Gray" C.C.#signifer# signifer (male)(male)# Parthenogenesis" Hemiptera":"Blue" C.C.#eutypus# eutypus C.#multnoma# Araneae":"Light"Blue" Associated"with"a"parthenogene@c"host" C. multnoma C.C.#morosus# morosus *"="over".99"Bayes"support" Bayes/Maximum"likelihood"support"

0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 Edited, based on con 50 majrule

53 Figure 2. Bayesian phylogeny with both Maximum likelihood and posterior probability of all Cardinium strains from this study except that from Heterodera glycines using concatenated loci: gyrB, sufB, EF-G, and groEL. Cardinium strains are labeled by the host taxon species name and colored by the host taxon order or sub phylum.

A.Amoebophilus#asiaEcus# asiaticus C.#imicola#C. imicola */100! C.#ohmorii#C. ohmorii C.#arakawae#C. arakawae C.C.#peregrinus# peregrinus Leiobunum#sp.Leiobunum sp..3!3# */100! */2! LeiobunumLeiobunum#sp.# sp. Leiobunum#sp.Leiobunum sp..2!2# O.#gotohi#O. gotohi */99! T.#peuraricola# */100!T. pueraricola E.E.#suginamensis# suginamensis */69! B.#californicus#B. californicus */100! B.#phoenicus# */65! B. phoenicus O.O.#coffeae# coffeae .95/54! I.#scapularis#I. scapularis */100!M.#quadrilineatus# */76! M. quadrilineatus A.A.#nerii## nerii E.#hispida#E. hispida */99! */99! A.#floccosus#A. floccosus */100! E.#inaron#E. inaron(US) (US)# */100! B.#tabaci#B. tabaci .64/63! E.#inaron#E. inaron(HIT) (HIT)# E.#inaron#(LIT)# .89/23! .69/60!E. inaron (LIT) E.#perniciosi#E. perniciosi */97! E.#suzannae#E. suzannae */100! E.#tabacivora#E. tabacivora P.P.#mori# mori Legend! P.P.#kellyanus# kellyanus M.#occidentalis#M. occidentalis Cytoplasmic!IncompaPbility!(CI)! */83! I.I.#dantur# dantur Acari!:!pink! */100! Diptera:!Yellow! Maybe!causes!CI! S.#furcifera#S. furcifera C.C.#chauliodus# chauliodus Opiliones!:!Green! Does!not!cause!CI! .84/55! C.C.#somesbar# somesbar Nematode!:!Purple! FeminizaPon! Thysanoptera!:!Gray! C.C.#penedentatus# penedentatus Parthenogenesis! C.#hesper# Hemiptera!:!Blue! C. hesper .*/100!C.#sanbruno# Araneae!:!Light!Blue! Associated!with!a!parthenogenePc!host! C. sanbruno C.C.#signifer# signifer *!=!over!.99!Bayes!support! C.#signifer#(male)# Bayes/Maximum!likelihood!support! C. signifer (male) C.C.#eutypus# eutypus C.C.#multnoma# multnoma C.C.#morosus# morosus Edited, based on con 50 majrule

0.6 0.5 0.4 0.3 0.2 0.1 0.0

54 Figure 3. Bayesian single gene tree of 482 bp of Translation Elongation Factor G (EF-G) of all strains in this study.

A.Amoebophilus#asiaGcus# asiaticus C.C.#imicola# imicola *" C.#arakawae#C. arakawae C.#ohmorii#C. ohmorii C.#peregrinus#C. peregrinus *" H.H.#glycines# glycines LeiobunumLeiobunum#sp. sp..3"3# *" LeiobunumLeiobunum#sp.# sp. .76" LeiobunumLeiobunum#sp. sp..2"2# B.#californicus#B. californicus *" B.#phoenicus#B. phoenicus .60" .91" O.#coffeae#O. coffeae .92" I. I.#scapularis#scapularis *" M.#quadrilineatus#M. quadrilineatus O.#gotohi#O. gotohi *" *" T.T.#peuraricola# pueraricola *" E.E.#suginamensis# suginamensis M.M.#occidentalis# occidentalis .98" P.P.#kellyanus# kellyanus .89" .82" I.I.#dantur# dantur *" S.S.#furcifera# furcifera P.#mori#P. mori .70" A.A.#nerii## nerii *" *" E.E.#hispida# hispida E.#inaron#E. inaron (US)(US)# *" *" B.B.#tabaci# tabaci E.E.#inaron# inaron (HIT)(HIT)# *" E.E.#suzannae# suzannae *" Legend" .92" E.#tabacivora#E. tabacivora *" E.#inaron#(LIT)# .55" E. inaron (LIT) Acari":"pink" E.#perniciosi#E. perniciosi Diptera:"Yellow" C.C.#hesper# hesper Opiliones":"Green" C.C.#penedentatus# penedentatus Nematode":"Purple" C.C.#sanbruno# sanbruno Thysanoptera":"Gray" *"C.C.#chauliodus# chauliodus Hemiptera":"Blue" C.C.#somesbar# somesbar Araneae":"Light"Blue" C.#signifer#C. signifer C.#signifer#C. signifer(male) (male)# *"="over".99"Bayes"support" Bayes/Maximum"likelihood"support" C.C.#eutypus# eutypus C.C.#multnoma# multnoma C.C.#morosus# morosus Modified, based on con 50 majrule 1.9 1.8 1.7 1.6 1.5 1.4 1.3 1.2 1.1 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0

55 Figure 4. Bayesian single gene tree of 452 bp of Iron Sulfur Cluster Assembly Protein B (sufB) of all strains in this study.

A.Amoebophilus#asia>cus# asiaticus C.#arakawae#C. arakawae C.#imicola#C. imicola *" C.#ohmorii#C. ohmorii C.#peregrinus#C. peregrinus Leiobunum#sp.#Leiobunum sp. *" *" Leiobunum#sp.Leiobunum sp..3"3# T.#peuraricola#T. pueraricola *" E.#suginamensis# *" E. suginamensis M.M.#occidentalis# occidentalis P.#kellyanus# .83" P. kellyanus .79" H.#glycines#H. glycines I.I.#dantur# dantur *" S.S.#furcifera# furcifera P.#mori# .89" P. mori .68" O.#coffeae#O. coffeae .83" B.#californicus#B. californicus B.#phoenicus# .60" B. phoenicus I.I.#scapularis# scapularis *" M.M.#quadrilineatus# quadrilineatus C.C.#eutypus# eutypus C.C.#multnoma# multnoma C.C.#penedentatus# penedentatus C.C.#signifer# signifer *" C.#somesbar#C. somesbar C.C.#morosus# morosus Legend" C.C.#hesper# hesper C.#sanbruno#C. sanbruno Acari":"pink" A.A.#nerii## nerii Diptera:"Yellow" *" E.E.#hispida# hispida E.#inaron#(LIT)# Opiliones":"Green" *" E. inaron (LIT) Nematode":"Purple" E.#perniciosi#E. perniciosi .69" Thysanoptera":"Gray" E.#suzannae#E. suzannae *" Hemiptera":"Blue" .96" E.#tabacivora#E. tabacivora Araneae":"Light"Blue" B.#tabaci#B. tabaci *"="over".99"Bayes"support" *" E.#inaron#E. inaron(HIT) (HIT)# Bayes/Maximum"likelihood"support" E.E.#inaron# inaron (US)#

Modified, based on con 50 majrule0.2 0.1 0.0

56 Figure 5. Bayesian single gene tree of 476 bp of gene coding for heat shock protein GroEL of all strains in this study.

A.Amoebophilus#asiaDcus# asiaticusv" H.H.#glycines# glycinesv" C.#ohmorii#C. ohmorii *" v" C.C.#imicola# imicola .94" v" *" Leiobunum#sp.Leiobunumv" sp..3"3# *" LeiobunumLeiobunum#sp.# sp. LeiobunumLeiobunum#sp. sp..2"2# .97" P.P.#mori# mori I. I.#scapularis#scapularis *" M.M.#quadrilineatus# quadrilineatus .98" T.T.#peuraricola# pueraricola *" E.E.#suginamensis# suginamensis .83" A.A.#nerii## nerii E.#hispida# .55" *" E. hispida E.E.#perniciosi# perniciosi *" *" E.E.#suzannae# suzannae *" E.#tabacivora# .90" E. tabacivora E.E.#inaron# inaron (LIT)(LIT)# .69" *" B.B.#tabaci# tabaci *" E.E.#inaron# inaron (HIT)(HIT)# *" E.E.#inaron# inaron (US)# P.P.#kellyanus# kellyanus M.M.#occidentalis# occidentalis .69" .68" B.B.#phoenicus# phoenicus .65" I.I.#dantur# dantur Legend" *" S.#furcifera# .94" S. furcifera C.#chauliodus# Acari":"pink" C. chauliodus C.#somesbar# Diptera:"Yellow" C. somesbar Opiliones":"Green" *" C.#penedentatus#C. penedentatus Nematode":"Purple" C.#hesper#C. hesper Thysanoptera":"Gray" C.#sanbruno#C. sanbruno Hemiptera":"Blue" C.C.#signifer# signifer Araneae":"Light"Blue" C.C.#multnoma# multnoma *"="over".99"Bayes"support" C.#eutypus#C. eutypus Bayes/Maximum"likelihood"support" C.#morosus#C. morosus

0.2 Modified, based on con 50 majrule 0.1 0.0

57 Figure 6. Bayesian single gene tree of 736 bp of Gyrase B (gyrB) of all strains in this study.

A.Amoebophilus#asiaEcus# asiaticus H.H.#glycines# glycines M.#occidentalis#M. occidentalis I.I.#dantur# dantur *" S.S.#furcifera# furcifera *" Leiobunum#sp.Leiobunum sp..3"3# *" Leiobunum#sp.#Leiobunum sp. Leiobunum#sp."2# .86" Leiobunum sp..2 C.C.#imicola# imicola .71" *" C.#ohmorii#C. ohmorii C.C.#arakawae# arakawae C.#peregrinus#C. peregrinus C.#hesper#C. hesper C.#penedentatus#C. penedentatus C.#sanbruno#C. sanbruno .*" C.#chauliodus#C. chauliodus .55" C.#somesbar#C. somesbar C.#eutypus#C. eutypus C.#multnoma#C. multnoma C.#signifer#C. signifer C.#morosus#C. morosus P.P.#mori# mori *" O.#gotohi#O. gotohi *" T.T.#peuraricola# pueraricola *" E.E.#suginamensis# suginamensis .66" P.P.#kellyanus# kellyanus .84" Legend" I.#scapularis#I. scapularis .83" B.B.#californicus# californicus *" Acari":"pink" .62" B.B.#phoenicus# phoenicus Diptera:"Yellow" A.A.#nerii## nerii Opiliones":"Green" .92" E.#inaron#E. inaron(LIT) (LIT)# Nematode":"Purple" * E.E.#perniciosi# perniciosi Thysanoptera":"Gray" .83" E.E.#suzannae# suzannae * Hemiptera":"Blue" .55" E.E.#tabacivora# tabacivora Araneae":"Light"Blue" E.#hispida#E. hispida *"="over".99"Bayes"support" E.#inaron#E. inaron(US) (US)# Bayes/Maximum"likelihood"support" B.#tabaci#B. tabaci E.#inaron#E. inaron(HIT) (HIT)# Modified, based on con 50 majrule 0.3 0.2 0.1 0.0

58 Figure 7. Bayesian single gene tree of 482 bp of Translation Elongation Factor G (EF-G) of all strains in this study absent the

Cardinium found in Heterodera glycines.

Amoebophilus#asiaFcus#A. asiaticus C.#imicola#C. imicola * C.#arakawae#C. arakawae C.C.#ohmorii# ohmorii C.#peregrinus#C. peregrinus Leiobunum#sp.Leiobunum sp.3!3# * * Leiobunum#sp.#Leiobunum sp. Leiobunum#sp.Leiobunum sp.2!2# B.B.#californicus# californicus * .96! B.B.#phoenicus# phoenicus * O.#coffeae#O. coffeae .97! I. scapularisI.#scapularis# * M.#quadrilineatus#M. quadrilineatus O.#gotohi#O. gotohi * * *T.#peuraricola#T. pueraricola E.#suginamensis#E. suginamensis M.#occidentalis#M. occidentalis * P.#kellyanus#P. kellyanus .88! .83! I.I.#dantur# dantur S.#furcifera#S. furcifera P.#mori#P. mori .69! A.A.#nerii## nerii * * E.E.#hispida# hispida E.E.#inaron# inaron (US)(US)# * * B.#tabaci#B. tabaci E.#inaron#E. inaron (HIT)# Legend! * E.E.#suzannae# suzannae .90! * E.#tabacivora#E. tabacivora * Acari!:!pink! E.#inaron#E. inaron (LIT)(LIT)# Diptera:!Yellow! E.#perniciosi#E. perniciosi Opiliones!:!Green! C.C.#hesper# hesper C.C.#penedentatus# penedentatus Nematode!:!Purple! C.#sanbruno#C. sanbruno Thysanoptera!:!Gray! *C.#chauliodus#C. chauliodus Hemiptera!:!Blue! C.C.#somesbar# somesbar Araneae!:!Light!Blue! C.#signifer#C. signifer C. signifer (male) *!=!over!.99!Bayes!support! C.#signifer#(male)# Bayes/Maximum!likelihood!support! C.#eutypus#C. eutypus C.#multnoma#C. multnoma C.#morosus#C. morosus

Modified, based on con 50 majrule 1.8 1.7 1.6 1.5 1.4 1.3 1.2 1.1 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0

59 Figure 8. Bayesian single gene tree of 452 bp of Iron Sulfur Cluster Assembly Protein B (sufB) of all strains in this study absent the

Cardinium found in Heterodera glycines.

Amoebophilus#asiaGcus#A. asiaticus C.C.#arakawae# arakawae C. imicola *! C.#imicola# C.C.#ohmorii# ohmorii C.C.#peregrinus# peregrinus LeiobunumLeiobunum#sp.# sp. *! *! LeiobunumLeiobunum#sp. sp.3!3# T.#peuraricola#T. pueraricola *! *! E.E.#suginamensis# suginamensis M.#occidentalis#M. occidentalis A.A.#nerii## nerii .87! *! E.E.#hispida# hispida *! E.E.#inaron# inaron (LIT)(LIT)# .66! E.E.#perniciosi# perniciosi .92! .50! E.E.#suzannae# suzannae *! .95! E.E.#tabacivora# tabacivora B.B.#tabaci# tabaci *! E.E.#inaron# inaron (HIT)(HIT)# .97! E.E.#inaron# inaron (US)(US)# P.P.#kellyanus# kellyanus I.I.#dantur# dantur *! .57! S.S.#furcifera# furcifera P.P.#mori# mori Legend! .66! O.O.#coffeae# coffeae .89! B.B.#californicus# californicus Acari!:!pink! *! .55! B.B.#phoenicus# phoenicus Diptera:!Yellow! .57! I.I.#scapularis# scapularis *! Opiliones!:!Green! M.#quadrilineatus#M. quadrilineatus Nematode!:!Purple! C.C.#eutypus# eutypus Thysanoptera!:!Gray! C.C.#multnoma# multnoma Hemiptera!:!Blue! C.C.#penedentatus# penedentatus Araneae!:!Light!Blue! C. signifer *! C.#signifer# *!=!over!.99!Bayes!support! C.C.#somesbar# somesbar Bayes/Maximum!likelihood!support! C.C.#morosus# morosus C.C.#hesper# hesper C.C.#sanbruno# sanbruno

Edited, based on con 50 majrule 0.2 0.1 0.0

60 Figure 9. Bayesian single gene tree of 476 bp of gene coding for heat shock protein GroEL of all strains in this study absent the

Cardinium found in Heterodera glycines.

Amoebophilus#asiaCcus#A. asiaticus C.#ohmorii#C. ohmorii *! C.#imicola#C. imicola Leiobunum#sp.Leiobunum sp.3!3# * *! Leiobunum#sp.#Leiobunum sp. Leiobunum#sp.Leiobunum sp.2!2# P.#mori#P. mori .56! I.#scapularis#I. scapularis *! M.#quadrilineatus#M. quadrilineatus .87! T.#peuraricola#T. pueraricola *! E.E.#suginamensis# suginamensis .86! A.#nerii##A. nerii E.#hispida#E. hispida .54! *! E.E.#perniciosi# perniciosi *! *! E.E.#suzannae# suzannae *! .88! E.#tabacivora#E. tabacivora .71! E.#inaron#E. inaron (LIT)# *! B.#tabaci#B. tabaci *! E.#inaron#E. inaron (HIT)# *! E.#inaron#E. inaron (US)(US)# P.P.#kellyanus# kellyanus Legend! M.#occidentalis#M. occidentalis .69! .64! B.B.#phoenicus# phoenicus .70! Acari!:!pink! I.I.#dantur# dantur *! Diptera:!Yellow! S.S.#furcifera# furcifera .90! Opiliones!:!Green! C.#chauliodus#C. chauliodus *! Nematode!:!Purple! C.#somesbar#C. somesbar Thysanoptera!:!Gray! *! C.#penedentatus#C. penedentatus Hemiptera!:!Blue! .97!C.C.#hesper# hesper Araneae!:!Light!Blue! C.#sanbruno#C. sanbruno *!=!over!.99!Bayes!support! C.#signifer#C. signifer Bayes/Maximum!likelihood!support! .95! C.#multnoma#C. multnoma .78! C.#eutypus#C. eutypus .75! C.#morosus#C. morosus Edited, based on con 50 majrule 0.1 0.0

61 Figure 10. Bayesian single gene tree of 736 bp of Gyrase B (gyrB) of all strains in this study absent the Cardinium found in

Heterodera glycines.

Amoebophilus#asiaAcus#A. asiaticus C.C.#imicola# imicola *! C.C.#ohmorii# ohmorii C.C.#arakawae# arakawae C.C.#peregrinus# peregrinus Leiobunum#sp.Leiobunum sp.3!3# *!Leiobunum#sp.#Leiobunum sp. *! LeiobunumLeiobunum#sp. sp.2!2# P.P.#mori# mori *! O.O.#gotohi# gotohi *! T.T.#peuraricola# pueraricola *! E.E.#suginamensis# suginamensis .51! I.I.#scapularis# scapularis .85! .71! B.B.#californicus# californicus *! B. phoenicus .65! B.#phoenicus# A.A.#nerii## nerii .74! E.E.#inaron# inaron (LIT)(LIT)# E.E.#perniciosi# perniciosi *! .91! E.E.#suzannae# suzannae *! .75! .65! E.E.#tabacivora# tabacivora E.E.#hispida# hispida *! E.E.#inaron# inaron (US)(US)# *! B.B.#tabaci# tabaci Legend! E.E.#inaron# inaron (HIT)(HIT)# M.#occidentalis#M. occidentalis Acari!:!pink! .91! I.I.#dantur# dantur Diptera:!Yellow! S.S.#furcifera# furcifera Opiliones!:!Green! C.C.#hesper# hesper Nematode!:!Purple! C.C.#penedentatus# penedentatus Thysanoptera!:!Gray! C.C.#sanbruno# sanbruno Hemiptera!:!Blue! *! C.C.#chauliodus# chauliodus Araneae!:!Light!Blue! C.C.#somesbar# somesbar *!=!over!.99!Bayes!support! C.C.#eutypus# eutypus Bayes/Maximum!likelihood!support! C.C.#multnoma# multnoma C.C.#signifer# signifer C.C.#morosus# morosus Edited, based on con 50 majrule 1.5 1.4 1.3 1.2 1.1 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0

62 7. REFERENCES

Ayala-Castro C, Saini A, Outten FW (2008). Fe-S Cluster Assembly Pathways in Bacteria.

Microbiol Mol Biol Rev 72: 110–125.

Beckmann JF, Ronau JA, Hochstrasser M (2017). A Wolbachia deubiquitylating enzyme induces

cytoplasmic incompatibility. Nat Microbiol 2: 17007.

Bordenstein SR, Bordenstein SR (2016). Eukaryotic association module in phage WO genomes

from Wolbachia. Nat Commun 7: 13155.

Brown AMV, Wasala SK, Howe DK, Peetz AB, Zasada IA, Denver DR (2016). Genomic

evidence for plant-parasitic nematodes as the earliest Wolbachia hosts. Sci Rep 6: 34955.

Brownlie JC, Cass BN, Riegler M, Witsenburg JJ, Iturbe-Ormaetxe I, McGraw E a, et al. (2009).

Evidence for metabolic provisioning by a common invertebrate endosymbiont, Wolbachia

pipientis, during periods of nutritional stress. PLoS Pathog 5: e1000368.

Burke GR, Moran NA (2011). Massive genomic decay in Serratia symbiotica, a recently evolved

symbiont of aphids. Genome Biol Evol 3: 195–208.

Cariou M, Duret L, Charlat S (2017). The global impact of Wolbachia on mitochondrial diversity

and evolution. bioRxiv.

Caspi-Fluger A, Inbar M, Mozes-Daube N, Mouton L, Hunter MS, Zchori-Fein E (2011).

Rickettsia ‘In’ and ‘Out’: two different localization patterns of a bacterial symbiont in the

same insect species. PLoS One 6: e21096.

Chang J, Masters A, Avery A, Werren JH (2010). A divergent Cardinium found in daddy long-

legs (Arachnida: Opiliones). J Invertebr Pathol 105: 220–227.

Clark MA, Moran NA, Baumann P, Wernegreen JJ (2000). Cospeciation between bacterial

endosymbionts (Buchnera) and a recent radiation of aphids (Uroleucon) and pitfalls of

63 testing for phylogenetic congruence. Evolution 54: 517–525.

Dedeine F, Bouletreau M, Vavre F (2005). Wolbachia requirement for oogenesis: occurrence

within the genus Asobara (Hymenoptera, Braconidae) and evidence for intraspecific

variation in A. tabida . Heredity 95: 394–400.

Dedeine F, Vavre F, DeWayne Shoemaker D, Boulétreau M, Elena S (2004). Intra-individual

coexistence of a Wolbachia strain required for host oogenesis with two strains inducing

cytoplasmic incompatibility in the wasp Asobara tabida. Evolution 58: 2167–2174.

Douglas AE (1998). Nutritional interactions in insect-microbial symbioses: aphids and their

symbiotic bacteria Buchnera. Annu Rev Entomol 43: 17–37.

Duron O, Bouchon D, Boutin S, Bellamy L, Zhou L, Engelstädter J, et al. (2008). The diversity

of reproductive parasites among arthropods: Wolbachia do not walk alone. BMC Biol 6: 27.

Duron O, Hurst GDD, Hornett E a, Josling JA, Engelstädter J (2008). High incidence of the

maternally inherited bacterium Cardinium in spiders. Mol Ecol 17: 1427–37.

Edgar RC (2004). MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res 32: 1792–1797.

Edlund A, Ek K, Breitholtz M, Gorokhova E (2012). Antibiotic-induced change of bacterial

communities associated with the copepod Nitocra spinipes. PLoS One 7: e33107.

Engel P, Moran NA (2013). The gut microbiota of insects – diversity in structure and function.

FEMS Microbiol Rev 37: 699–735.

Engelstädter J, Hurst GDD (2009). The ecology and evolution of microbes that manipulate host

reproduction. Annu Rev Ecol Evol Syst 40: 127–149.

Fang YW, Liu LY, Zhang HL, Jiang DF, Chu D (2014). Competitive ability and fitness

differences between two introduced populations of the invasive whitefly Bemisia tabaci Q

64 in China. PLoS One 9.

Fayet O, Ziegelhoffer T, Georgopoulos C (1989). The groES and groEL heat shock gene

products of Escherichia coli are essential for bacterial growth at all temperatures. J

Bacteriol 171: 1379–1385.

Fenn K, Blaxter M (2004). Are filarial nematode Wolbachia obligate mutualist symbionts?

Trends Ecol Evol 19: 163–6.

Ferri E, Bain O, Barbuto M, Martin C, Lo N, Uni S, et al. (2011). New insights into the

evolution of Wolbachia infections in filarial nematodes inferred from a large range of

screened species. PLoS One 6: e20843.

Gebiola M, Kelly SE, Hammerstein P, Giorgini M, Hunter MS (2016). ‘Darwin’s corollary’ and

cytoplasmic incompatibility induced by Cardinium may contribute to speciation in Encarsia

wasps (Hymenoptera: Aphelinidae). Evolution 70: 2447–2458.

Gebiola M, White JA, Cass BN, Kozuch A, Harris LR, Kelly SE, et al. (2016). Cryptic diversity,

reproductive isolation and cytoplasmic incompatibility in a classic biological control

success story. Biol J Linn Soc 117: 217–230.

Gotoh T, Noda H, Hong X-Y (2003). Wolbachia distribution and cytoplasmic incompatibility

based on a survey of 42 spider mite species (Acari: Tetranychidae) in Japan. Heredity 91:

208–216.

Gotoh T, Noda H, Ito S (2007). Cardinium symbionts cause cytoplasmic incompatibility in

spider mites. Heredity 98: 13–20.

Groot TVM, Breeuwer JAJ (2006). Cardinium symbionts induce haploid thelytoky in most

clones of three closely related Brevipalpus species. Exp Appl Acarol 39: 257–71.

Hagimori T, Abe Y, Date S, Miura K (2006). The first finding of a Rickettsia bacterium

65 associated with parthenogenesis induction among insects. Curr Microbiol 52: 97–101.

Hoffmann AA, Clancy D, Duncan J (1996). Naturally-ocurring Wolbachia infection in

Drosophila simulans that does not cause cytoplasmic incompatibility. Heredity : 1–8.

Hosokawa T, Koga R, Kikuchi Y, Meng X-Y, Fukatsu T (2010). Wolbachia as a bacteriocyte-

associated nutritional mutualist. Proc Natl Acad Sci USA 107: 769–74.

Huigens ME, de Almeida RP, Boons PAH, Luck RF, Stouthamer R (2004). Natural interspecific

and intraspecific horizontal transfer of parthenogenesis–inducing Wolbachia in

Trichogramma wasps. Proc R Soc London Ser B Biol Sci 271: 509–515.

Hunter MS, Perlman SJ, Kelly SE (2003). A bacterial symbiont in the Bacteroidetes induces

cytoplasmic incompatibility in the parasitoid wasp Encarsia pergandiella. Proc R Soc Lond

B 270: 2185–90.

Hurst GDD, von der Schulenburg JHG, Majerus TMO, Bertrand D, Zakharov IA, Baungaard J,

et al. (1999). Invasion of one insect species, Adalia bipunctata, by two different male-

killing bacteria. Insect Mol Biol 8: 133–139.

Jiggins FM, Bentley JK, Majerus MEN, Hurst GDD (2002). Recent changes in phenotype and

patterns of host specialization in Wolbachia bacteria. Mol Ecol 11: 1275–1283.

Kaiser W, Huguet E, Casas J, Commin C, Giron D (2010). Plant green-island phenotype induced

by leaf-miners is mediated by bacterial symbionts. Proc Biol Sci 277: 2311–9.

Koressaar T, Remm M (2007). Enhancements and modifications of primer design program

Primer3. Bioinformatics 23: 1289–1291.

Kurtti TJ, Munderloh UG, Andreadis TG, Magnarelli L a, Mather TN (1996). Tick cell culture

isolation of an intracellular prokaryote from the tick Ixodes scapularis. J Invertebr Pathol

67: 318–21.

66 LePage DP, Metcalf JA, Bordenstein SR, On J, Perlmutter JI, Shropshire JD, et al. (2017).

Prophage WO genes recapitulate and enhance Wolbachia-induced cytoplasmic

incompatibility. Nature 543: 243–247.

Lewis SE, Rice A, Hurst GDD, Baylis M (2014). First detection of endosymbiotic bacteria in

biting midges Culicoides pulicaris and Culicoides punctatus, important Palaearctic vectors

of bluetongue virus. Med Vet Entomol 28: 453–456.

Ley RE, Lozupone CA, Hamady M, Knight R, Gordon JI (2008). Worlds within worlds:

evolution of the vertebrate gut microbiota. Nat Rev Microbiol 6.

Maddison W, Maddison D (2010). Mesquite 2. Manual: 1–258.

Maiden MC, Bygraves JA, Feil E, Morelli G, Russell JE, Urwin R, et al. (1998). Multilocus

sequence typing: a portable approach to the identification of clones within populations of

pathogenic microorganisms. Proc Natl Acad Sci USA 95: 3140–5.

Manzano-Marín A, Lamelas A, Moya A, Latorre A (2012). Comparative genomics of Serratia

spp.: two paths towards endosymbiotic life. PLoS One 7: e47274.

McFall-Ngai M, Hadfield MG, Bosch TCG, Carey H V, Domazet-Lošo T, Douglas AE, et al.

(2013). Animals in a bacterial world, a new imperative for the life sciences. Proc Natl Acad

Sci 110: 3229–3236.

Messing J (1983). New M13 Vectors for Cloning. Methods Enzymol 101: 20–78.

Moran NA, McCutcheon JP, Nakabachi A (2008). Genomics and evolution of heritable bacterial

symbionts. Annu Rev Genet 42: 165–90.

Nakamura Y, Kawai S, Yukuhiro F, Ito S, Gotoh T, Kisimoto R, et al. (2009). Prevalence of

Cardinium bacteria in planthoppers and spider mites and taxonomic revision of ‘Candidatus

Cardinium hertigii’ based on detection of a new Cardinium group from biting midges. Appl

67 Environ Microbiol 75: 6757–63.

Nakamura Y, Yukuhiro F, Matsumura M, Noda H (2012). Cytoplasmic incompatibility

involving Cardinium and Wolbachia in the white-backed planthopper Sogatella furcifera

(Hemiptera: Delphacidae). Appl Entomol Zool 47: 273–283.

Nguyen DT, Morrow JL, Spooner-Hart RN, Riegler M (2017). Independent cytoplasmic

incompatibility induced by Cardinium and Wolbachia maintains endosymbiont coinfections

in haplodiploid thrips populations. Evolution 71: 995–1008.

Nikoh N, Hosokawa T, Moriyama M, Oshima K, Hattori M, Fukatsu T (2014). Evolutionary

origin of insect-Wolbachia nutritional mutualism. Proc Natl Acad Sci USA 111.

Noel GR, Atibalentja N (2006). ‘Candidatus Paenicardinium endonii’, an endosymbiont of the

plant-parasitic nematode Heterodera glycines (Nemata: Tylenchida), affiliated to the

phylum Bacteroidetes. Int J Syst Evol Microbiol 56: 1697–702.

Nováková E, Hypša V, Moran NA (2009). Arsenophonus, an emerging clade of intracellular

symbionts with a broad host distribution. BMC Microbiol 9: 143.

O’Neill SL, Hoffmann AA, Werren JH (1997). Influential passengers: inherited microorganisms

and arthropod reproduction (SL O’Neill, AA Hoffmann, and JH Werren, Eds.). Oxford

University Press: Oxford, UK.

Oliver KM, Russell JA, Moran NA, Hunter MS (2003). Facultative bacterial symbionts in aphids

confer resistance to parasitic wasps. Proc Natl Acad Sci USA 100: 1803–1807.

Partridge SR (2011). Analysis of antibiotic resistance regions in Gram-negative bacteria. FEMS

Microbiol Rev 35: 820–855.

Penz T, Schmitz-Esser S, Kelly SE, Cass BN, Müller A, Woyke T, et al. (2012). Comparative

genomics suggests an independent origin of cytoplasmic incompatibility in Cardinium

68 hertigii. PLoS Genet 8: e1003012.

Perlman SJ, Kelly SE, Zchori-Fein E, Hunter MS (2006). Cytoplasmic incompatibility and

multiple symbiont infection in the ash whitefly parasitoid, Encarsia inaron. Biol Control

39: 474–480.

Perlman SJ, Magnus SA, Copley CR (2010). Pervasive associations between Cybaeus spiders

and the bacterial symbiont Cardinium. J Invertebr Pathol 103: 150–5.

Posada D (2008). jModelTest: Phylogenetic model averaging. Mol Biol Evol 25: 1253–1256.

Provencher LM, Morse GE, Weeks AR, Benjamin B. Normark (2005). Parthenogenesis in the

Aspidiotus nerii complex (Hemiptera:Diaspididae): a single origin of a worldwide,

polyphagous lineage associated with Cardinium bacteria. Ann Entomol Soc Am 98: 629–

635.

Reece RJ, Maxwell A (1991). DNA gyrase: structure and function. Crit Rev Biochem Mol Biol

26: 335–375.

Ronquist F, Huelsenbeck JP (2003). MrBayes 3: Bayesian phylogenetic inference under mixed

models. Bioinformatics 19.

Roush RT, Hoy MA (1981). Genetic improvement of _Metaseiulus occidentalis_: selection with

methomyl, dimethoate, and carbaryl and genetic analysis of carbaryl resistance. J Econ

Entomol 74: 138–141.

Russell JA, Funaro CF, Giraldo YM, Goldman-Huertas B, Suh D, Kronauer DJC, et al. (2012).

A veritable menagerie of heritable bacteria from ants, butterflies, and beyond: broad

molecular surveys and a systematic review. PLoS One 7: e51027.

Russell JE, Stouthamer R (2011). The genetics and evolution of obligate reproductive parasitism

in Trichogramma pretiosum infected with parthenogenesis-inducing Wolbachia. Heredity

69 106: 58–67.

Santos-Garcia D, Rollat-Farnier P-A, Beitia F, Zchori-Fein E, Vavre F, Mouton L, et al. (2014).

The genome of Cardinium cBtQ1 provides insights into genome reduction, symbiont

motility, and its settlement in Bemisia tabaci. Genome Biol Evol 6: 1013–30.

Shoemaker DD, Katju V, Jaenike J (1999). Wolbachia and the evolution of reproductive

isolation between Drosophila recens and Drosophila subquinaria. Evolution 53: 1157–

1164.

Shoji S, Walker SE, Fredrick K (2009). Ribosomal translocation: One step closer to the

molecular mechanism. ACS Chem Biol 4: 93–107.

Skinner SWAY (1985). Son-Killer: A third extrachromosomal factor affecting the sex ratio in

the parasitoid wasp, Nasonia (=Mormoniella) vitripennis. Genetics 109: 745–759.

Stokes HW, Gillings MR (2011). Gene flow, mobile genetic elements and the recruitment of

antibiotic resistance genes into Gram-negative pathogens. FEMS Microbiol Rev 35: 790–

819.

Stouthamer R, Breeuwer JA, Hurst GD (1999). Wolbachia pipientis: microbial manipulator of

arthropod reproduction. Annu Rev Microbiol 53: 71–102.

Stouthamer R, Luck RF (1991). Transition from Bisexual to Unisexual Cultures in Encarsia-

Perniciosi (Hymenoptera, Aphelinidae) - New Data and A Reinterpretation. Ann Entomol

Soc Am 84: 150–157.

Takano S, Tuda M, Takasu K, Furuya N, Imamura Y, Kim S, et al. (2017). Unique clade of

alphaproteobacterial endosymbionts induces complete cytoplasmic incompatibility in the

coconut beetle. Proc Natl Acad Sci 114: 6110–6115.

Tsuchida T, Koga R, Horikawa M, Tsunoda T, Maoka T, Matsumoto S, et al. (2010). Symbiotic

70 bacterium modifies aphid body color. Science (80- ) 330: 1102–4.

Turelli M (1994). Evolution of incompatibility-inducing microbes and their hosts. Evolution 48:

1500–1513.

Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, et al. (2012). Primer3

— new capabilities and interfaces. 40: 1–12.

Weeks a R, Marec F, Breeuwer JA (2001). A mite species that consists entirely of haploid

females. Science 292: 2479–82.

Weeks AR, Velten R, Stouthamer R (2003). Incidence of a new sex-ratio-distorting

endosymbiotic bacterium among arthropods. Proc Biol Sci 270: 1857–65.

Werren JH (1997). Biology of Wolbachia. Annu Rev Entomol 42: 587–609.

Werren JH, Hurst GD, Zhang W, Breeuwer JA, Stouthamer R, Majerus ME (1994). Rickettsial

relative associated with male killing in the ladybird beetle (Adalia bipunctata). J Bacteriol

176: 388–394.

White JA, Kelly SE, Perlman SJ, Hunter MS (2009). Cytoplasmic incompatibility in the parasitic

wasp Encarsia inaron: disentangling the roles of Cardinium and Wolbachia symbionts.

Heredity 102: 483–489.

Wu K, Hoy MA (2012). Cardinium is associated with reproductive incompatibility in the

predatory mite Metaseiulus occidentalis (Acari: Phytoseiidae ). J Invertebr Pathol 110:

359–365.

Zchori-Fein E, Gottlieb Y, Kelly SE, Brown JK, Wilson JM, Karr TL, et al. (2001). A newly

discovered bacterium associated with parthenogenesis and a change in host selection

behavior in parasitoid wasps. Proc Natl Acad Sci USA 98: 12555–60.

Zchori-Fein E, Perlman SJ (2004). Distribution of the bacterial symbiont Cardinium in

71 arthropods. Mol Ecol 13: 2009–2016.

Zchori-Fein E, Perlman SJ, Kelly SE, Katzir N, Hunter MS (2004). Characterization of a

’Bacteroidetes’ symbiont in Encarsia wasps (Hymenoptera: Aphelinidae): proposal of

‘Candidatus Cardinium hertigii’. Int J Syst Evol Microbiol 54: 961–968.

Zhang X, Zhao D, Hong X (2012). Cardinium—the leading factor of cytoplasmic

incompatibility in the planthopper Sogatella furcifera doubly infected with Wolbachia and

Cardinium. Environ Entomol 41: 833–840.

Zug R, Hammerstein P (2012). Still a host of hosts for Wolbachia: analysis of recent data

suggests that 40% of terrestrial arthropod species are infected. PLoS One 7: e38544.

Zug R, Hammerstein P (2014). Bad guys turned nice? A critical assessment of Wolbachia

mutualisms in arthropod hosts. Biol Rev Camb Philos Soc 49.

72 APPENDIX B

Comparative genomics of Cardinium reveals new candidates for genes underlying Cardinium- induced reproductive manipulations.

Corinne M Stouthamer, Evelyne Mann, Dario Copetti, Suzanne Kelly, Stephan Schmitz-Esser, and Martha S Hunter.

73 ABSTRACT

Some maternally inherited bacterial symbionts are able to alter their insect hosts’ reproduction in order to promote their own spread. Cardinium hertigii is a reproductive manipulator that, like the better-known and distantly related Wolbachia pipientis, is able to induce parthenogenesis, feminize genetic males into functional females, and induce cytoplasmic incompatibility (CI). CI causes a reproductive failure between symbiont-infected males and uninfected females. The symbiont genes responsible for CI in Wolbachia are becoming better understood through proteomics, comparative genomics, and transgenics, but current cytological and genomic evidence indicates that Cardinium and Wolbachia have independently evolved this set of reproductive manipulations. In this study, the genomes of four Cardinium strains that infect parasitic wasps in the genus Encarsia were compared along with two published Cardinium genomes. These six related strains induce different reproductive manipulations in their hosts and include two parthenogenesis-inducing (PI) strains, at least two CI strains, and one non- manipulating strain. The candidate genes identified in this analysis offer insights into the genetic underpinnings of CI and PI in Cardinium. Further, insights into the molecular mechanism of CI

Cardinium may inform the search for function in CI Wolbachia and vice versa, yielding a better general understanding of how two unrelated lineages have evolved convergent phenotypes.

74 1. INTRODUCTION

Many arthropods are infected or associated with heritable, intracellular microbes that can influence their biology (Moran et al., 2008). Some are essential to their host’s survival and reproduction by providing essential nutrients missing in their hosts’ diets (e.g. Douglas 1998;

Brownlie et al. 2009; Hosokawa et al. 2010). Others may cause context-dependent benefits in their host by, for example, providing defense against pathogens or parasites (e.g. Oliver et al.

2003; Scarborough et al. 2005; Łukasik et al. 2013). Others may not be beneficial at all; some microbes manipulate their host reproduction, often to the detriment of that host’s fitness, to further their own spread through the host population (Engelstädter and Hurst, 2009).

Reproductive manipulators are microbes, mostly bacteria, that skew their host’s sex ratio to benefit their own transmission. As most of these symbionts are transmitted maternally through the egg cytoplasm, these symbionts generally distort their host population’s sex ratio towards a female bias (Turelli, 1994; O’Neill et al., 1997). There are four ways by which symbionts have been shown to manipulate host reproduction, but relevant to the current study are symbiont- induced parthenogenesis (PI), where genetic males are turned into genetic females during embryogenesis, and, cytoplasmic incompatibility (CI), where infected females produce both male and female offspring, but uninfected females, when mated with infected males, produce fewer or no daughters (Werren, 1997; Engelstädter and Hurst, 2009). Reproductive manipulators can sometimes permanently change their host’s populations (e.g. selection against sexual reproduction in populations of PI-infected individuals, Russell and Stouthamer 2011) and are therefore likely important drivers of host evolution. Wolbachia is the most prevalent reproductive manipulator, but other bacteria have been shown to induce similar phenotypes in

75 their hosts. These include Rickettsia (Werren et al., 1994; Hagimori et al., 2006), Arsenophonus

(Skinner, 1985), Spiroplasma (Hurst et al., 1999), a new alpha-proteobacterium (Takano et al.,

2017), and Cardinium (Weeks et al., 2001; Zchori-Fein et al., 2001; Hunter et al., 2003), the focus here.

For many of these bacteria, the genes responsible for these induced phenotypes are unknown, but in Wolbachia, the genes for CI are starting to be better understood through a combination of proteomics, comparative genomics, and transgenics (Beckmann et al., 2017; LePage et al., 2017;

Lindsey et al., 2017). In the case of Wolbachia, comparative genomics was used both to help identify these genes as well as to further explore the distribution and function of these potential

CI-inducing genes (Lindsey et al., 2017). Comparing many genomes is a powerful tool for analysis of these host-manipulating genes.

Cardinium hertigii (Bacteroidetes), although unrelated to Wolbachia (Alpha-proteobacteria), is able to induce three of the four reproductive manipulations (Zchori-Fein and Perlman, 2004); cytoplasmic incompatibility (CI), parthenogenesis induction (PI), and feminization, where genetic males develop into functional females. The independent evolution of these host manipulations by Cardinium allows us to more deeply understand these phenotypes by identifying the underlying genes that may be functionally equivalent to those responsible for the phenotypes Wolbachia induces. Studying Cardinium adds a layer of complexity, as symbionts in this group with known reproductive manipulation phenotypes reside in small, non-model arthropods, and like some other secondary symbionts the relatively small genomes of these symbionts (~1 MB) have mobile elements and repetitive DNA that make assembly challenging.

76

The four Cardinium genomes sequenced in the current study were all isolated from species of

Encarsia that are parasitic wasps of whiteflies. Encarsia are minute (~1 mm) parasitoid wasps in the family Aphelinidae. The newly sequenced Cardinium are in the same clade as the other two

Cardinium genomes sequenced. In Encarsia inaron from Italy, Cardinium infection is associated with cytoplasmic incompatibility (Gebiola, Kelly, et al., 2016). Until this study, it was thought that this wasp was infected with a single strain of Cardinium, but we found it to be infected with two strains of Cardinium present in different densities. These two strains were designated cEina2 and cEina3, and it is yet unknown whether one or both of these strains induce CI (Gebiola,

White, et al., 2016). One of these strains, cEina2, is closely related to the sequenced strain of

Cardinium from the sweetpotato whitefly Bemisia tabaci (Santos-Garcia et al., 2014) that does not cause CI nor PI (Fang et al., 2014). The other Cardinium strain known to cause CI (Hunter et al., 2003) is cEper1 in Encarsia suzannae (= E. pergandiella, Gebiola et al. 2016), the first

Cardinium to be sequenced (Penz et al., 2012). Interestingly, in the wasp most closely related to

E. suzannae, Encarsia tabacivora (= E. pergandiella, Gebiola et al. 2016), a sister strain of

Cardinium, cEper2, is associated with parthenogenesis (Zchori-Fein et al., 2001) and was the third strain sequenced here. Cardinium has been shown to cause parthenogenesis in Encarsia hispida (Zchori-Fein et al., 2004), and the strain from this wasp, cEhis1, was the fourth in the current study. The distinction between parthenogenesis association and parthenogenesis induction is drawn because males are produced after antibiotic curing in E. hispida, showing reversal of parthenogenesis (Zchori-Fein et al., 2004), but no males and only very few, sterile offspring are produced after curing of E. tabacivora (Zchori-Fein et al., 2001). While PI is the

77 most likely phenotype of cEper2, we cannot be sure that there is not another cause of parthenogenesis and antibiotic-induced sterility in this wasp.

78 2. METHODS

2.1 Encarsia cultures

To isolate Cardinium, three different Encarsia species were reared on whiteflies (Bemisia tabaci) on cowpea (Vigna unguiculata) at 27°C, ambient humidity. Encarsia tabacivora, collected in Brazil (Zchori-Fein et al., 2001) is infected with cEper2, associated with parthenogenesis. A species of E. inaron, originally collected in Portici, Italy, and provisionally known as E. inaron (IT) to distinguish it from the sister species E. inaron now established in the

USA (White et al., 2009; Gebiola, Kelly, et al., 2016) was, as mentioned, doubly infected with cEina2 and cEina3, one or both of which cause CI. Cardinium has been shown to induce parthenogenesis (PI) in Encarsia hispida (Zchori-Fein et al., 2004). This species is infected with cEhis1 and was originally collected in San Diego, CA, USA. In our cultures, when the Encarsia wasps reached early pupation, the leaves were harvested and placed in a ventilated jar with honey and water to allow adult emergence. They were then collected in tubes of hundreds of wasps for DNA extraction.

2.2 Genomic DNA extraction and sequencing

Three sequencing approaches were employed for the wasp and bacterium metagenomes. For whole genome shotgun sequencing using Hiseq 2500 Illumina, approximately 1000 wasps (~19 mg), 1-3 days old, were frozen and the DNA subsequently extracted in a single column of the

DNEasy kit (Qiagen) following the manufacturer’s protocol. The library preparation and sequencing was performed at Vienna Biocenter Core Facilities VBCF NGS Unit

(www.vbcf.ac.at) using the Hiseq 2500 sequencer, and the sequence length was 150 bp paired end reads with 500-1000 bp inserts. For PacBio sequencing and Illumina Miseq sequencing, the

79 three different Encarsia species were extracted following the protocol described in Stouthamer et al. (MS in preparation) that enriched for Cardinium DNA. PacBio sequencing was performed at the University of Arizona Genomics Institute while library preparation and Miseq sequencing was performed at the New York Medical College (New York City) with an Illumina Nextera XT

Miseq platform, and 300bp paired-end reads with 400-600bp insert size.

2.3 Genome assemblies

The metagenomic HiSeq 2500 and Miseq Illumina data were cleaned using FastQC (-l 50 -q 20 -

-qual-mean 20). The HiSeq 2500 reads were assembled in ABySS 1.5.0 with the kmer=64 for cEper2, and kmer=51 for cEhis1 and the joint cEina2 and cEina3 assembly. Following the

Kumar et al. (2013) “Blobology” pipeline, bacterial and eukaryotic contigs were binned based on the GC content, BLAST identification, and coverage with Bowtie2 (version 2.3.2, settings: -- very-fast-local -k 1 -t -p 12). The reads were extracted from bacterial and non-annotated contigs below 50% GC, and reassembled in ABySS with the same kmers as the initial assemblies. Next, the “Blobology” pipeline was once again applied, but this time only the reads from bacterial contigs were extracted. PacBio reads were binned based solely on their BLAST identification to eukaryotes or bacteria. MiSeq sequencing was performed for cEper2, cEina2 and cEina3. HiSeq,

MiSeq, and PacBio libraries were used to assemble the cEper2 genome using SPAdes with kmer=99. The cEhis1 strain was assembled twice in SPAdes with kmer=123; for one assembly, the corrected PacBio reads were used in the fastq form, and in the other run, the raw reads of insert were used. The two assemblies were then combined in Minimus2 (Sommer et al., 2007).

Since Encarsia inaron (IT) was doubly infected with two strains of Cardinium, the assemblies of

PacBio, Miseq, and Hiseq data for cEina2 and cEina3 were done jointly in SPAdes using a kmer

80 size of 123bp. The HiSeq blobplot indicated that the two Cardinium genomes were present at different densities inside their wasp host, with cEina2 at higher density than cEina3, therefore

HiSeq reads were mapped to this assembly (Bowtie2 version 2.3.2 default settings, paired reads) and the coverage of the Hiseq data was used to separate the contigs into the two separate strains

(cEina2 average cov: 71.68; cEina3 average coverage value: 16.3). Assemblies were assessed using BLASTp for the presence of three plasmid genes identified (Santos-Garcia et al., 2014): virE (virulence-associated E family protein), pre (plasmid recombination enzyme), and (traC, putative conjugal transfer protein traC). Likely plasmid contigs were identified by assessing gene content (i.e. presence of plasmid genes), whether the GC content was lower than the majority of the contigs, and whether the contigs showed higher than expected coverage.

2.4 Genome annotation and analysis

The bacterial genomes were automatically annotated in RAST (Rapid Annotation using

Subsystem Technology) (Aziz et al., 2008). BLASTp and BLASTn (cutoff E-10), Swiss-Prot

(Bairoch and Apweiler, 2000), InterPro (Finn et al., 2017), and Phobius (Käll et al., 2004) were used to further verify the automatic annotation and analyze the predicted proteins. Full genome alignment figures were generated using MAUVE (Darling et al., 2004). The phylogeny of

Cardinium strains was inferred with RaXML Blackbox Maximum likelihood, and automatic halting of bootstraps (Stamatakis, 2014) on a subset of genes used in Lang et al. (2013): ribosomal proteins (LSU L7, L11p, L21p, L28p, L36p; SSU S3p, S7p, S15p) as well as full length genes Translation elongation factor G (EF-G), GTP-binding protein (LepA) and SSU rRNA (EC 2.1.1.182). The individual gene tree of traC was inferred using the same RaXML settings as above with midpoint rooting. The core Cardinium genome was determined through

81 protein BLASTp search via the RAST platform using cEper1 as the reference genome. Only proteins with 40% base amino acid identity and above in all strains were considered to be present.

2.5 Ankyrin and eukaryotic-interacting gene analyses

Ankyrin repeat domains (“ANKs”) are common in eukaryotes, eukaryote-associated bacteria and archea such as pathogens and symbionts (Jernigan and Bordenstein, 2014). Ankyrins act as a platform to mediate protein-protein interactions (Mosavi et al., 2004), and are likely to play an important role in bacterial-host interactions (Duron et al., 2007; Jernigan and Bordenstein, 2014).

The domains themselves share a baseline amino acid identity of 30-40%; therefore, here they were analyzed separately for their presence and homology in all the sequenced strains of

Cardinium using BLASTp with more stringent amino acid identity cutoffs (60%) and sequence alignments. To understand whether the ANKs are involved in host specialization or reproductive manipulations, we looked specifically for ANKs whose amino acid identity grouped them by host or reproductive manipulation. A Venn diagram was created using InterActiVenn (Heberle et al., 2015) to represent orthologous sets of ankyrins present or absent in each of the six strains.

Mann et al. (MS in review) compared the gene expression of cEper1 to find differentially expressed genes of cEper1 infecting males and females. Additionally, these authors identified cEper1 genes that were highly expressed in both wasp sexes, and potential eukaryotic interacting proteins based on their functional domains. Here, the presence and degree of amino acid identity of the Mann et al. set of genes in the current sequenced strains was determined with BLASTp.

Tables were prepared in CLC Workbench (Qiagen).

82 3. RESULTS

3.1 Cardinium genome assemblies, characteristics, and relatedness

The strains from this study are in draft genome status, with 13-48 contigs in each genome; the large amounts of repetitive DNA in these genomes made assembly challenging. The cEper2 assembly is in 48 contigs (in total 886,860 bp, 36.6% GC; Table 1). The cEhis1 chromosome is in 13 contigs, for a total of 957,289 bp (36% GC) and the cEhis1 plasmid is in two contigs at

65,578 bp (31.88%GC; Table 1). The cEina2 chromosome is in 28 contigs, totaling 910,533 bp

(36.7% GC), with a plasmid in four contigs (81,694 bp, 31.32% GC; Table 1). The cEina3 chromosome has 42 contigs (994,734 bp, 36% GC) and 5 plasmid contigs totaling 74,421bp

(31.94% GC; Table1). Although the cEper2, cEina2, and cEina3 assemblies consist of many contigs, we believe the genic content is complete because of high coverage values for these assemblies (Tables S1, S2, S3, and S4). All strains except cEper2 appear to have a plasmid; cEhis1, cEina2, and cEina3 all have the indicator genes (virE, pre, and traC). cEper2 lacks both pre and traC, but has a pseudogene similar to virE which contains a premature stop codon, rendering this gene likely non-functional.

Phylogenetic analysis and average nucleotide identity lend deeper insight into the relatedness among the Cardinium strains of this study. The chromosomes of cEina2 and cBtQ1 share >99% average nucleotide identity, and similarly, cEper1 and cEper2 share >97% average nucleotide identity. The average pairwise identity between chromosomes of all the other strains ranges from

90-92.7% (Table 2). The phylogeny of all strains based on some core ribosomal proteins, EF-G, and LepA shows that cEper1, cEper2 and cEina3 form one clade (100 bootstrap support) with cEper1 and cEper2 forming a sister clade to cEina3 (Figure 1). Strains cEhis1, cEina2, and

83 cBtQ1 form a second clade (100 bootstrap support), with cEhis1 being sister to cEina2 and cBtQ1 (Figure 1). The average pairwise identity among the plasmids is more variable; their average nucleotide identities range from 85.16% to 94.31%, and the plasmids in general did not align well (8.8% - 49%; Table 3). A phylogeny of the plasmid gene traC is incongruent with the phylogeny based on ribosomal proteins and shows that cEina2, cEina3, and cBtQ1 form a sister clade to cEhis1 and cEper1. The pairwise amino acid identities reflect these relationships (Figure

2).

3.2 Cardinium metabolism, biosynthetic capability, and secretory system

The core Cardinium genome shows that metabolic capability of these strains is reduced. Genes remain for the breakdown of amino acids, the pay-off phase of glycolysis, and fatty acid biosynthesis (Table S5). Additionally, 31 LSU ribosomal proteins and 22 SSU ribosomal proteins are encoded in the core genome, as well as at least 40 transport proteins, which include

ABC transporters, a C4-dicarboxylate transporter, nucleotide transporters, and several

Na+/proline symporters. All strains have recA, a gene necessary for homologous recombination in bacteria, and all strains encode a large number of mobile element proteins. Every strain has the genes to produce peptidoglycan from UDP-N-acetylglucosamine and lipoate from octanoyl-[acp]

(Table S5). Although all strains have retained some of the genes for biotin synthesis from pimelate, just three of the four strains, cEper1, cEina2 and cEina3, have the complete biotin synthesis cassette (Table 4). BioC in cEina2 is truncated, but it still retains the putative S- adenosyl-L-methionine-dependent methyltransferase domain (Interpro: IPR029063). All strains encode the complete set of proteins for a type VI secretion system, formerly known as the Anti-

84 feeding prophage-like (“afp”) secretion system, in which hollow contractile tubes excrete factors outside of the cell (Penz et al., 2012; Böck et al., 2017) (Table 5).

3.3 Bilateral comparisons of closely related Cardinium strains

Among the six Cardinium genomes now sequenced, two pairs of genomes are closely related

(cEper1 and cEper2, and cEina2 and cBtQ1), as indicated by their average nucleotide identity

(97% and 99%, respectively). One pair includes a strain with a CI phenotype (cEper1) and one that is associated with parthenogenesis in its host (cEper2), while cBtQ1 does not induce CI nor parthenogenesis (Fang, 2014), and cEina2 may cause CI (Gebiola et al. 2016) or cause no reproductive phenotype. The genome structure of these strains appears to be quite fluid; even when the contigs were aligned to produce maximum synteny, blocks of genes are still rearranged between the genomes in each related pair of strains (Figures 3 and 4). To better understand the differences in genic content of the two sets of genomes, we found genes that are unique, truncated, or of low identity between the pairs. For both comparisons, the majority of the unique protein-coding sequences were small hypothetical proteins, mobile element proteins, or ankyrin- domain containing genes. However, there were a few long (> 200 amino acids) hypothetical protein coding genes as well as those with annotated functions that appeared to be unique to one of the strains in the pair. The presence of these genes was then assessed in the other strains of

Cardinium.

The Cardinium strain cEina2 encodes 119 genes that are absent in its related cBtQ1, but out of these, only 33 genes are not mobile element-related nor are hypothetical proteins smaller than

200 amino acids (Table S6). Strain cEina2 has three genes that are absent in cBtQ1, but

85 conserved in all the other Cardinium strains that reside in wasps: two hypothetical proteins with no known functional domains, and a Heat shock protease. Strain cEina2 also encodes two identical small heat shock proteins with IbpA functional domains (Table S7), missing in cBtQ1 and all other strains except cEper1.

Strain cBtQ1 encodes 274 genes absent in cEina2, but only 32 of these are not mobile element- related nor small (<200 AA) hypothetical proteins (Table S8). Among the unique proteins that cBtQ1 encodes that have no homologs in any of the other strains are six hypothetical proteins, a long protein containing an RHS domain, a domain sometimes implicated in insecticidal activity and intracellular competition between bacteria (Koskiniemi et al., 2013; Chen et al., 2014), an

ANK domain, and a Quaternary ammonium compound-resistance protein (sugE), whose closest relative in the NCBI Nucleotide database resides on a plasmid of Rickettsia raoultii (88% nucleotide identity) (Table S9).

The sequenced Cardinium strain cEper1 encodes 130 genes absent in cEper2, 47 of which are not mobile elements nor short hypothetical proteins, and eight genes that are truncated relative to the same genes in cEper2 (Table S10 and S11). Strain cEper2 broadly lacks most of the plasmid proteins of cEper1, as well as three of the biotin synthesis proteins, and six ankyrin repeat domain-containing genes (Table S10). Interestingly, most of the genes truncated in cEper2 relative to cEper1 are complete in all the other strains, except GTP pyrophosphokinase (EC

2.7.6.5), which lacks conservation in most strains, and a conserved protein with a weak D- galactarate dehydratase/altronate hydrolase domain, which is shared only with cBtQ1 and cEina3. Of the 47 genes, only two hypothetical proteins are entirely unique to cEper1 among all

86 other Cardinium strains. Three cEper1 genes are uniquely shared with cEina3; one hypothetical protein with low amino acid identity, a patatin family protein, and a threonine efflux protein.

And, as mentioned previously, cEper1 and cEina2 uniquely encode a small heat shock protein

(85.16% AA ID) (Table S12).

Strain cEper2 has fewer unique genes overall (99), with only 22 unique coding genes that are not mobile elements or small hypothetical proteins (Table S13), and eight genes that are interrupted relative to their orthologs in cEper1 (Table S14). Half of the 22 unique genes are ankyrin repeat containing genes, and two genes, Lanthionine biosynthesis protein (LanM) and Streptococcal hemagglutinin protein, have no homologs in the other strains. Notably, one hypothetical protein encoding a putative coiled coil is highly similar (98.68 %AA ID) to a hypothetical protein of identical length in cEhis1, the other PI-associated strain. The only other strain that contains this gene is cEina3, but here AA identity is lower (76.35% AA ID). Additionally, another cEper2 hypothetical protein absent in cEper1 is present in cEhis1 (72.69% AA ID) and cBtQ1 (67.98%

AA ID), and these proteins encode putative signal peptides for excretion. Eight genes were truncated in cEper1 relative to cEper2. Half of these were also ankyrin domain containing genes and all but one share little homology with other strains (Table S15).

3.4 Ankyrin repeat domain-containing proteins

The total number of ankyrin domain-containing genes (ANKs) per genome ranges from 19 in cEper1 to 35 in cEina3 (Figure 5). Whether all of these genes are functional is not yet known; particularly in cEina3, at least three ANKs appeared to have a premature stop codon separating the gene into two parts. Therefore, these numbers likely overestimate the prevalence of

87 functional ANKs in these genomes. Seven sets of homologous ANKs were shared between all strains in this study. Orthologs of the cEper1 ANK CAHE_0095 had very conserved AA ID (90-

100%) that generally followed the expected pattern based on the evolutionary relationships of the strains (Figure 6A), and orthologs of CAHE_0632 were less conserved amongst the other strains

(~76-100 %AA ID) but still followed the expected identity pattern based on their evolutionary relatedness (Figure 6B). The other sets of orthologous ANKs were variable both in length and

AA ID (Figure 6C-E). Orthologs of CAHE_0834 had low AA ID between cEina2 and cBtQ1

(Figure 6F), but this was the result of a deletion in cBtQ1, which truncated the protein. The aligned part of the protein was >99% identical. Orthologs of CAHE_0435 had a higher pairwise

AA ID between the two PI strains, cEper2 and cEhis1, than expected based on the relatedness of these strains (Figure 6G). Strains cEina2 and cBtQ1 share eight ANKs unique to those two strains, and have only four and five unique ANKs respectively. Strains cEper1 and cEper2 share only three ANKs with just each other and have seven and nine unique ANKs respectively.

Interestingly, the two parthenogenesis-associated strains, cEper2 and cEhis1, also share one

ANK, absent in all the other strains (Figure 5).

3.5 Differentially expressed, highly expressed, and eukaryotic-interacting genes

Genes that are highly and/or differentially expressed in cEper1 that were not housekeeping genes tended to have low amino acid identity with other strains sequenced in this study (Table 6). One of the highly expressed gene upregulated in males, a putative ubiquitin ligase, CAHE_p0026, contains an ankyrin domain and a RING domain. This gene does not appear to have any orthologs in the other strains of Cardinium, although the ankyrin region shares limited identity with other ankyrin-repeat domain containing genes (Table 6). In cEper1, CAHE_0757 is the

88 second most highly expressed gene in Cardinium in both males and females (Mann et al., unpublished). Despite the fairly low pairwise amino acid identity among the orthologs of this gene, all strains have a conserved signal peptide, and the orthologs in the two PI strains, cEhis1 and cEper2 share the highest pairwise AA identity with one another (72.95%; Figure 7A). The highly expressed CAHE_0267 in cEper1 encodes a putative MutS domain, which interacts with

DNA. The MutS domain is conserved in all the other strains of Cardinium in this study, but only orthologs in the PI strains have a predicted signal peptide for excretion and share 89.26% pairwise AA ID (Figure 7B). Hypothetical protein CAHE_0390 ranked in the top 10 of highly expressed genes in cEper1 (Mann et al., unpublished). Orthologs of this gene in strains cEina3 and cEper1 share a high AA identity (77.58%) and all strains encode a predicted signal peptide

(Figure 7C). The highly expressed CAHE_0662 contains an inhibitor of an apoptosis-promoting

Bax1 domain. This domain is conserved in orthologs of this gene in the other strains and the highest pairwise AA ID (90.24%) is shared by the two PI-associated strains, cEhis1 and cEper2, but only cEina2 has a predicted signal peptide (Figure 7D). Among other highly expressed genes in cEper1 are CAHE_0050, for which cEper1 and its ortholog in cEhis1 share the highest pairwise AA identity (Figure 7E) and CAHE_0406, a hypothetical protein for which the length as well as amino acid identity varies amongst the orthologs in the other strains (Figure 7F). Also, an operon containing CAHE_0676 (Figure 7G), a cold shock protein, and CAHE_0677 (Figure

7H), which encodes a putative DEAD BOX RNA helicase, is highly conserved (>90% AA ID) across all the orthologs in the strains in this study.

Several other genes expressed in cEper1 with eukaryotic interacting domains were conserved in the genomes sequenced here. For instance, cEper1 encodes three tetratricopeptide repeat (TPR)

89 domain-containing genes: CAHE_0312, CAHE_0450, and CAHE_0452. Each of these is fairly conserved across Cardinium strains, though CAHE_0312 is most conserved, followed by

CAHE_0450, and then CAHE_0452 (Figure 8A, B, C). CAHE_0028, a ubiquitin protease in cEper1, has orthologs in all strains of Cardinium, and all have predicted signal peptides for excretion, but the gene in cEper1 has a very low amino acid identity to the other strains’ ubiquitin proteases (Figure 8D). CAHE_0010 bears a WH2 actin-binding motif only in cEper1 and in cEhis1, and its orthologs in cEper1, cBtQ1, and cEina2 encode a predicted signal peptide

(Figure 8E). CAHE_0017 in cEper1 and all the other strains encodes a second gene with a DNA mismatch repair MutS domain, and all of the orthologs also encode a signal peptide (Figure 8F).

Lastly, CAHE_0286 is a patatin protein, and is shared only by cEina3, although the protein in cEina3 is truncated.

90 4. DISCUSSION

4.1 The host-associated, intracellular lifestyle of Cardinium

The Cardinium genomes in this study have been shaped by their intracellular, secondary symbiont lifestyle; they encode a large number of transporters as the basis of their metabolism and have retained a limited set of biosynthetic capabilities. Cardinium is likely able to undergo homologous recombination; like many other secondary symbionts, these Cardinium strains encode recA, as well as putative ruvA, ruvB, and yggF to resolve Holliday junctions. This capability, along with the presence of a plasmid in most strains, and a plethora of mobile genetic elements may explain why these closely related genomes have little shared synteny. The strains of Cardinium in this study were able to produce lipoate, an essential cofactor in some multi- protein complexes, and have genes for peptidoglycan synthesis for cell wall production as well as fatty acids. Not all strains are able to produce biotin and the role of biotin synthesis in this symbiont is still unclear. Biotin is an essential cofactor for carboxylase enzymes, some of which are involved in amino acid catabolism and fatty acid biosynthesis (Gravel and Narang, 2005).

Most eukaryotes are able to obtain sufficient amounts of biotin from their diet, unless they subsist on highly specialized, biotin-deficient sources such as blood or plant sap (Lamelas et al.,

2011; Nikoh et al., 2014). In contrast, several strains of Cardinium do not produce biotin, so it does not appear to be required for the bacterium. In addition, Encarsia wasps that harbor

Cardinium capable of biotin synthesis are not negatively affected when cured of their Cardinium symbiont (Gebiola et al. In press).

In general, it appears that closely related Cardinium reside in closely related hosts (Stouthamer et al. MS in prep, Nakamura et al. 2009). However, here we found two very closely related

91 Cardinium strains, cEina2 and cBtQ1 (>99% average nucleotide identity) that reside in hosts from different insect orders (Hymenoptera and Hemiptera). There is no evidence of a suite of genes that encode for host order; only one gene in cEina2 is shared by, or more similar in all the other wasp strains than in cBtQ1 (Table S7). Additionally, cEper1 and cEper2 have roughly the same number of genes unique to each strain, and they reside in closely related hosts that until recently were thought to be the same species (Gebiola, Monti, et al., 2017). Taken together, this suggests that large numbers of host taxon-specific genes are not necessary for host specialization in Cardinium.

Ankyrin repeat-containing proteins are involved in many basic eukaryotic cellular processes such as cell division, transcriptional regulation, and development (Bork, 1993; Mosavi et al., 2004;

Sedgwick and Smerdon, 2017). Ankyrin repeat domain-containing proteins are prevalent in eukaryotes as well as eukaryotic-associated bacteria and archeans (Jernigan and Bordenstein,

2014). In Wolbachia, ANKs have been found to be excreted in strains infecting Brugia malayi

(Foster et al., 2005). All of the Cardinium genomes are enriched for ANKs, but their numbers vary widely even within this clade of Cardinium. These genes also evolve quickly, possibly due to intra-genic recombination events (Siozios et al., 2013). Only seven ANKs clearly shared orthologs amongst all the Cardinium strains in this study. Between the two most closely related strains in this dataset that reside in very different hosts, cEina2 in Encarsia inaron (IT) and cBtQ1 in Bemisia tabaci, the number of unique ANKs for each strains were low (four and five, respectively), and out of the 24 ANKs they share, only four ANKs were conserved with less than

90% AA ID. Conversely, the closely related strains residing in sister species of Encarsia, cEper1 and cEper2, have seven and nine unique ANKs, respectively, and of the ten ANKs they share,

92 four have less than 90% AA ID. If these genes were solely involved in host interaction, one would expect more divergence between cEina2 and cBtQ1, as these strains reside in such different hosts, and less divergence between cEper1 and cEper2. Perhaps certain ANKs have outsized effects in particular hosts. For example, cBtQ1 contains a large Ankyrin repeat domain with an RHS domain, a domain associated with insecticidal activity and intracellular toxicity in some bacteria (Koskiniemi et al., 2013; Chen et al., 2014). The sequencing of more genomes both within this clade and more distantly related clades might help uncover a clearer correlation between certain ANKs and their role in the intracellular lifestyle of Cardinium.

4.2 The role of the Cardinium plasmids in gene transfer and reproductive manipulations

Plasmid gene composition varies greatly among the Cardinium strains in this study, even between the very closely related strains cEina2 and cBtQ1 (Table 3, Figures 2a, 2b).

Interestingly, the plasmid average nucleotide identity (94.31%) was higher for cEina2 and cEina3, which co-infect the same wasp, than one would expect based on their chromosomal average nucleotide identity (90.9%; Table 2 and 3). Additionally, the phylogeny and nucleotide identity of a conserved plasmid gene, traC, show that traC in cEina2 and cEina3 are closely related (Figures 2a, 2b). The similarity of plasmids and an essential plasmid gene in less closely- related, but co-infecting Cardinium strains may indicate that these plasmids are easily exchanged. It is possible that these large mobile genetic elements, which might be exchanged among strains and may undergo homologous recombination with bacterial chromosomal DNA

(Fournier et al., 2006; Partridge, 2011; Stokes and Gillings, 2011), serve an analogous function to the WO phage in Wolbachia, which has recently been implicated in the spread of CI- associated cifA and cifB genes (Lindsey et al., 2017). No phage is known in Cardinium (Penz et

93 al., 2012; Santos-Garcia et al., 2014). The PI-associated strain, cEper2, is the first Cardinium strain identified so far that appears to lack a plasmid. In Wolbachia, some PI-inducing strains also may lack the WO phage; the PI-inducing strains, wTpre, does not harbor active WO phages

(Gavotte et al., 2007; Lindsey et al., 2016), while the PI-inducing strain wUni does (Klasson et al., 2008). Here, the PI-inducing strain, cEhis1, does appear to have a plasmid.

4.3 CI candidate genes- three CI scenarios

Cytoplasmic incompatibility (CI) occurs in two wasps with sequenced Cardinium: E. suzannae, which harbors cEper1, and E. inaron (IT), which is co-infected with cEina2 and cEina3. Because this double infection was discovered during this sequencing project, it is unclear which strain in

E. inaron actually causes CI, or whether both might cause this phenotype. Therefore, there were three scenarios under which we searched for possible CI candidate genes. We looked for genes present or of high conservation in cEper1 and both cEina2 and cEina3 strains, and of lower conservation or absent in other strains; genes present or of high conservation in cEper1 and cEina2, and not their closest relatives; and genes present or of high conservation in cEper1 and cEina3. All genomes were analyzed for the presence of the Wolbachia CI-candidates cifA and cifB using BLASTp, but these were absent in all of these Cardinium CI genomes, further supporting the hypothesis that Cardinium and Wolbachia have independently evolved reproductive manipulations.

Operating under the assumption that, in addition to cEper1, both cEina2 and cEina3 cause CI, very few CI-effector candidates stand out. Other than one ankyrin shared by all of these strains and cBtQ1, there were four long hypothetical proteins shared between these strains, but they are

94 not conserved; their AA ID ranges from approximately 25-38%. One interesting pattern lies in the biosynthetic capability of these three strains; the two potential CI-inducing strains, cEina2 and cEina3, appear able to produce biotin, and in cEper1, this pathway is encoded and also expressed (Mann et al., unpublished), but in all other strains this pathway is incomplete. How biotin would be involved in CI is unclear. In Wolbachia, abnormal histone protein deposition in male pronuclei is hypothesized to lead to a delay in entering the first mitosis in the embryo, and eventually the production of chromatin bridges, indicative of the failure of chromosomal replication, and the eventual embryonic death due to CI (Konev et al., 2007; Landmann et al.,

2009). In Cardinium, although nothing is known about histone deposition, similar chromatin bridges have been observed in association with CI (Gebiola, Giorgini, et al., 2017). The presence of biotinylated histones in eukaryotes is a fairly recent discovery (Kothapalli et al., 2005), and their function has not been fully characterized, although low levels of biotinylated histone proteins in adult Drosophila have been associated with a difference in gene expression involved in stress and longevity (Camporeale et al., 2006). Interestingly, in Drosophila, biotin deficient diets are associated with a decrease in biotinylated histones, but no decrease in biotinylated carboxylases, thus suggesting that there is a hierarchy of function for biotin in the eukaryotic cell

(Smith et al., 2007). Therefore, if CI were in some way dependent on biotinylated histones, an excess of biotin could be necessary to induce the phenotype, which could explain why all three potential CI-strains in this scenario have retained a complete biotin synthesis cassette.

Under the scenario that cEina2 alone causes CI, and cEina3 does not, cEper1 and cEina2 share one small heat shock protein absent in the other strains. Small heat shock proteins protect bacteria from oxidative or heat stress by forming large complexes around the proteins susceptible

95 to damage (Kitagawa et al., 2002). It is possible that these proteins might be exported from the

Cardinium cell, but unlikely that these are specifically effectors of CI because many bacteria encode these proteins (Haslbeck et al., 2005). One other potential CI effector under this scenario is CAHE_0757, a highly expressed protein in cEper1 encoded with a putative signal peptide.

This gene in cEper1 is most similar to cEina2 (43.42% AA ID), while sharing less AA ID with other strains (~27-36%). Notably also, this is one gene where cEina2 and its closest relative, cBtQ1, are quite different (25.93% AA ID), an unusual occurrence for these closely related genomes. Aside from the predicted putative signal protein, there are no other annotated domains in this protein, nor any close relatives in NCBI.

Under the scenario where, in addition to cEper1, only cEina3 causes CI, cEper3 and cEper1 share some effector proteins of interest. Among the shared homologs that are absent in other strains are a putative threonine efflux protein and a putative patatin protein. Efflux proteins exchange solutes and amino acids across a membrane, and their potential role in CI is dubious, as

Cardinium encodes many transporter proteins. The putative patatin protein is shared with some

CI Wolbachia, and the patatin protein is complete in cEper1. However, this protein is truncated in cEina3, and one active site is missing in this strain. Therefore, if this patatin gene is a CI- effector, it must not rely on the second active site in this protein. Finally, a very highly expressed hypothetical protein in cEper1 (CAHE_0390) contains homologs in all strains, but the highest

AA ID is shared with cEina3 (77.58%). This protein encodes a predicted signal peptide for excretion in all strains, and therefore might interact with the host.

4.4 Parthenogenesis-inducing candidate genes of interest

96 Parthenogenesis-inducing symbionts have an additional ecological hurdle to overcome in heteronomous hosts such as Encarsia, where males develop as obligate hyperparasitoids on other wasps developing within the whitefly (Hunter 1999, Hunter & Woolley 2001). For successful development of a wasp with a PI-symbiont, the male egg which becomes a female egg must be laid in the appropriate host for female development and needs to survive long enough in that host for the symbiont to duplicate the egg’s chromosomes and cause it to become female (Hunter,

1999; Hunter and Woolley, 2001). Strains cEper2 and cEhis1 are associated with parthenogenesis induction (PI) in their Encarsia hosts. These genomes are not very closely related; their average chromosomal nucleotide identity is 91.97% with 77.71% of cEhis1 aligned to 84.39% of the cEper2 genome. Despite this, our genomic analysis uncovered a number of putative eukaryotic interacting factors shared by these strains that could potentially be involved in parthenogenesis induction. In cEper1, CAHE_0662, the inhibitor of apoptosis-promoting

Bax1 domain is highly expressed. All strains have homologs of this protein, but only cEhis1 and cEper2 have homologs encoding a putative signal peptide for secretion outside the cell and they strikingly share the highest AA ID (90.24%), while the other pairwise AA IDs range from 60.08-

87.5% (Figure 7D). Perhaps for parthenogenesis-inducing symbionts specifically residing in heteronomous wasps, the inhibition of apoptosis could be important to prevent incipient male eggs from dying in the wrong host (Hunter, 1999; Hunter and Woolley, 2001). Another gene that is highly expressed in cEper1 and for which cEper2 and cEhis1 homologs share the highest amino acid identity (72.95%) is CAHE_0757. This hypothetical protein with a predicted signal peptide for excretion varies greatly among the strains. As mentioned previously, there are no homologs in NCBI for this gene, nor any predicted domains. Lastly, cEper2 and cEhis1 share an

97 ankyrin domain-containing protein that is absent in all other strains of Cardinium, and have one ankyrin that shares a relatively higher homology between these strains compared to the others.

98 5. CONCLUSION

The genomes of these closely related strains of Cardinium illuminate the lifestyle of this intracellular symbiont in different hosts. The genomes show highly reduced biosynthetic capabilities but an expanded repertoire of transporters for importing essential nutrients and cofactors from their hosts. Comparative genomics of highly expressed, differentially expressed, and putative eukaryotic interacting factors in these strains show a number of genes to be tested for their potential involvement in reproductive manipulations. This study provides the foundation for future functional studies, which will be able to further illuminate the complex interactions of

Cardinium with its hosts.

99 6. TABLES AND FIGURES

Table 1. Genome assembly information.

Genome Chromosome Chromosome N50 L50 Number Number of Number Number Plasmid Number Plasmid name size (bp) GC content (bp) of coding of of size of coding GC (%) (%) contigs sequences tRNAs plasmid (bp) sequences on and contigs on chromosome rRNAs plasmid cEper1 887,130 36.6 887130 1 1 838 40 1 57,800 61 31.4 cEper2 886,860 36.6 50467 6 48 822 38 - - - - cBtQ1 1,012,588 36.1 611945 1 11 1007 38 1 52,050 39 31.9 cEhis1 957,289 36 491520 1 13 848 38 2 65,578 38 31.88 cEina2 910,533 36.7 65396 3 28 863 38 4 81,694 75 31.32 cEina3 994,734 36 48593 7 42 894 37 5 74,421 47 31.94

100 Table 2: Average pairwise nucleotide identity (%) of Cardinium chromosomes, calculated with MUMmer (percentage of genomes aligned in brackets)

cBtQ1 cEina3 cEina2 cEhis1 cEper1 cEper2 cBtQ1 * 91.01 [84.37] 99.36 [92.53] 92.75 [75.92] 91.21 [79.55] 91.33 [82.04] cEina3 91.00 [78.06] * 90.98 [77.62] 90.91 [73.22] 91.29 [76.52] 91.44 [76.59] cEina2 99.37 [91.92] 90.97 [84.67] * 92.72 [83.41] 91.25 [81.37] 91.35 [80.94] cEhis1 92.74 [77.51] 90.92 [76.50] 92.72 [77.43] * 92.06 [78.74] 91.97 [77.71] cEper1 91.20 [82.44] 91.31 [84.08] 91.26 [83.37] 92.08 [85.62] * 97.50 [89.98] cEper2 91.34 [82.00] 91.45 [83.58] 91.34 [82.80] 91.97 [84.39] 97.49 [90.55] *

101 Table 3: Average pairwise nucleotide identity (%) of Cardinium plasmids calculated with MUMmer (percentage of genomes aligned in brackets)

cEper1 cBtQ1 cEina3 cEina2 cEhis1 cEper1 * 87.34 [13.82] 87.33 [16.23] 85.16 [31.58] 90.35 [38.70] cBtQ1 87.34 [10.39] * 93.12 [43.91] 93.54 [45.18] 84.45 [23.62] cEina3 86.76 [8.80] 93.09 [32.94] * 94.31 [27.97] 85.86 [20.03] cEina2 85.16 [26.00] 93.54 [42.62] 94.31 [31.25] * 84.88 [24.66] cEhis1 90.35 [30.94] 84.46 [21.16] 85.85 [27.33] 84.88 [24.25] *

102 Table 4. Biotin synthesis pathway genes for all Cardinium strains using genes in cEper1 genome as the reference.

Length % ID Length % ID Length % ID Length % ID Length % ID Length Function in to in to in to in to in to in cEper1 cEper2 cEper2 cBtQ1 cBtQ1 cEhis1 cEhis1 cEina2 cEina2 cEina3 cEina3

321 0 0 0 0 91.56 321 92.5 321 91.56 321 Biotin synthase (EC 2.8.1.6) 8-amino-7-oxononanoate synthase (EC 379 0 0 92.68 42 92.86 379 90.74 379 93.65 379 2.3.1.47) 221 0 0 88.07 219 81.01 341 87.61 219 88.07 219 Biotin synthesis protein BioH 246 0 0 85.71 246 0 0 73.96 173 88.16 246 Biotin synthesis protein BioC 206 97.45 158 88.29 206 0 0 88.29 206 86.83 206 Dethiobiotin synthetase (EC 6.3.3.3)

103 Table 5. Conservation and length of genes encoding the Type VI Secretion system between all strains in this study using cEper1 genes as the reference.

Length in % ID to Length in % ID to Length in % ID to Length in % ID to Length in % ID to Length in cEper1 cEper2 cEper2 (bp) cEina3 cEina3 cEhis1 cEhis1 cBtQ1 cBtQ1 (bp) cEina2 cEina2 (bp) (bp) (bp) (bp) CAHE_0036 1703 99.19 1705 96.75 1597 96.34 1595 96.75 1713 95.93 1713 CAHE_0037 247 97.57 247 93.24 247 93.06 247 93.5 247 93.76 247 CAHE_0118 1155 97.36 1257 82.94 1257 83.26 1257 83.47 1154 79.65 1257 CAHE_0409 669 99.1 669 94.45 670 94.77 670 94.93 630 95.67 630 CAHE_0456 187 100 187 95.68 185 96.24 187 95.68 187 94.59 187 CAHE_0457 288 98.26 288 87.11 288 90.24 288 87.46 288 92.33 288 CAHE_0458 496 99.18 487 97.99 487 98.16 436 97.94 449 97.33 487 CAHE_0459 155 98.7 155 94.81 155 93.51 155 94.81 155 94.16 155 CAHE_0460 152 98.68 152 93.38 152 94.7 152 92.72 152 93.38 152 CAHE_0461 150 100 150 89.19 150 90.98 134 93.29 75 82.55 150 CAHE_0462 58 100 63 84.48 60 84.21 58 77.19 58 84.21 58 CAHE_0463 437 97.71 437 84.32 395 84.48 394 84.01 441 80 395 CAHE_0760 828 98.91 828 90.56 826 91.52 828 90.68 826 90.92 826 CAHE_0761 135 100 135 95.7 135 91.79 134 94.62 94 94.03 94 CAHE_0762 103 100 103 95.1 103 97.06 103 96.08 103 99.02 103 CAHE_0763 597 99.33 597 95.64 597 94.63 612 95.76 604 95.3 567

104 Table 6. Highly expressed and differentially expressed genes in cEper1 compared to homologs in other strains.

Protein Protein Protein Protein Protein Expression length length length length length from Mann in in in in in et cEper2 % ID cEina3 % ID cBtQ1 % ID cEina2 % ID cEhis1 al.,unpubli % ID to as % of to as % of to as % of to as % of to as % of CAHE tag shed cEper2 cEper1 cEina3 cEper1 cBtQ1 cEper1 cEina2 cEper1 cEhis1 cEper1 Function Inner membrane protein translocase component YidC, long CAHE_0102 DE 98.57 92.85 88.69 100.83 86.34 98.50 85.61 93.51 87.50 93.34 form LSU ribosomal protein CAHE_0130 DE 96.61 100.00 91.53 100.00 91.53 100.00 91.53 100.00 91.53 100.00 L30p (L7e) SSU ribosomal protein CAHE_0131 DE 99.42 100.00 98.25 100.00 98.25 100.00 98.24 99.42 98.83 100.00 S5p (S2e) LSU ribosomal protein CAHE_0132 DE 97.67 100.00 94.87 90.77 95.73 90.77 95.73 90.77 94.02 90.77 L18p (L5e) Aspartyl-tRNA synthetase (EC - DE 98.07 100.00 91.85 100.00 90.99 100.00 91.20 100.00 88.20 100.00 6.1.1.12) Dipeptide transport system permease protein DppC (TC CAHE_0242 DE 98.25 100.00 88.81 100.00 88.46 100.00 88.81 100.00 91.26 100.00 3.A.1.5.2) LSU ribosomal protein CAHE_0335 DE 98.70 100.00 95.22 100.00 94.78 100.00 94.78 100.00 93.04 100.00 L1p (L10Ae) SSU ribosomal protein CAHE_0475 DE 98.45 100.00 95.35 100.00 93.80 100.00 93.80 100.00 93.80 100.00 S9p (S16e) CAHE_0544 DE 0.00 0.00 75.00 62.07 0.00 0.00 hypothetical protein Transcription CAHE_0565 DE 98.72 100.00 98.72 100.00 98.72 100.00 98.72 100.00 98.08 100.00 elongation factor GreA CAHE_0678 DE 0.00 37.39 100.87 68.28 65.94 68.12 100.44 0.00 hypothetical protein ankyrin repeat protein CAHE_p002 with RING domain and 6 DE 27.76 57.32 29.29 56.79 47.62 41.45 42.11 27.51 41.39 100.00 ubiquitin ligase CAHE_p002 7 DE 0.00 0.00 65.12 145.45 0.00 0.00 hypothetical protein

105 highly DNA-directed RNA expressed, polymerase beta CAHE_0338 DE 98.91 100.00 97.34 100.00 96.79 100.00 96.79 100.00 96.40 100.00 subunit (EC 2.7.7.6) highly DNA-directed RNA expressed, polymerase beta' CAHE_0339 DE 99.51 100.00 97.55 99.86 96.71 100.00 96.85 100.00 96.64 100.00 subunit (EC 2.7.7.6) Polyribonucleotide highly nucleotidyltransferase CAHE_0112 expressed 97.17 101.56 94.76 101.56 96.46 101.56 96.32 101.56 94.76 101.56 (EC 2.7.7.8) highly Chaperone protein CAHE_0016 expressed 96.29 97.18 95.14 100.00 95.77 100.00 95.92 100.00 93.26 100.00 DnaK highly Outer membrane CAHE_0182 expressed 96.04 101.48 85.15 100.00 78.22 100.00 78.22 100.00 63.86 100.00 protein H precursor Heat shock protein 60 highly family chaperone CAHE_0254 expressed 99.27 100.00 99.08 100.00 98.17 100.00 98.72 100.00 98.17 100.00 GroEL Chromosome highly (plasmid) partitioning CAHE_0315 expressed 100.00 100.00 98.07 100.00 97.67 100.00 97.28 100.00 99.23 100.00 protein ParA highly Translation elongation CAHE_0330 expressed 97.73 100.00 98.74 100.00 99.49 100.00 99.49 100.00 98.23 100.00 factor Tu highly Sodium Solute CAHE_0352 expressed 88.76 43.41 66.47 99.16 72.89 77.87 81.68 99.58 81.47 83.11 Symporter highly CAHE_0390 expressed 74.38 99.78 77.25 99.55 65.25 99.10 71.46 101.80 72.30 97.98 hypothetical protein highly DNA repair protein CAHE_0394 expressed 94.26 96.33 92.08 94.69 89.34 96.33 89.34 96.33 89.75 96.33 RadC highly CAHE_0406 expressed 90.00 98.36 64.29 145.90 63.33 145.90 56.82 119.67 58.62 245.90 hypothetical protein Signal recognition particle receptor highly protein FtsY (=alpha CAHE_0417 expressed 99.06 100.00 90.94 100.00 93.30 66.36 91.25 102.80 94.26 66.36 subunit) (TC 3.A.5.1.1) highly Phage tail sheath CAHE_0458 expressed 99.18 98.19 97.33 98.19 97.99 90.52 97.94 98.19 98.16 87.90 protein FI highly CAHE_0050 expressed 64.14 98.80 68.87 102.39 67.34 98.80 63.24 100.40 78.08 102.79 hypothetical protein highly CAHE_0536 expressed 99.19 100.00 95.25 100.00 96.88 100.00 97.11 100.00 96.88 100.00 ClpB protein

106 highly SSU ribosomal protein CAHE_0586 expressed 99.15 100.00 96.95 100.00 95.43 100.00 95.26 100.00 94.59 100.00 S1p highly Cold-shock DEAD- CAHE_0677 expressed 97.07 100.00 96.53 100.00 96.75 98.40 97.07 100.00 96.80 103.72 box protein A highly Translation elongation CAHE_0069 expressed 98.32 99.86 97.62 100.00 96.63 100.00 96.63 100.00 96.21 100.00 factor G highly CAHE_0757 expressed 27.99 99.12 36.61 101.47 28.16 98.39 42.23 99.56 28.61 99.41 hypothetical protein highly Sodium solute CAHE_0796 expressed 95.43 100.78 78.23 100.26 78.51 79.91 77.56 80.00 75.71 101.81 symporter highly CAHE_p004 expressed 0.00 38.10 106.39 0.00 37.44 129.07 0.00 hypothetical protein highly CAHE_p006 expressed 96.83 125.26 96.97 67.37 58.57 102.11 94.22 103.86 100.00 15.44 Mobile element protein

107 Figure 1: Maximum likelihood phylogeny based on Translation elongation factor G (EF-G), GTP-binding protein (lepA) SSU rRNA (EC 2.1.1.182), and ribosomal proteins: LSU L7, L11p, L21p, L28p, L36p, and SSU S3p, S7p, S15p. The reproductive manipulations are denoted on the phylogeny.

LIT cEina3&

100

cEper2&ceper2

100

cEper1&ceper1

cEina2&HIT Cytoplasmic,Incompa/bility,(CI),

100 May,cause,CI,

cBtQ1&CHV Does,not,cause,CI,nor,PI, 100

Parthenogenesis,

Associated,with,a,parthenogene/c,host, hisp cEhis1& 0.04

108 Figure 2a. The phylogeny of plasmid protein traC orthologs in all Cardinium strains in this study.

30.peg.851cEper1&

100

85.peg.560cEhis1&

65.peg.651cEina3&

100

31.peg.102cBtQ1&

85

53.peg.687cEina2& 0.03

109 Figure 2b. (a) The pairwise nucleotide identity among plasmid gene traC orthologs in all Cardinium strains in this study.

cEper1& cEhis1& cEina3& cEina2& cBtQ1&

110

1 Figure 3. Whole genome alignment showing the maximum amount of shared synteny between cBtQ1 (top) and its nearest relative cEina2 (bottom). Red lines represent edges of contigs in alignments.

111 Figure 4. Whole genome alignment showing maximum amount of shared synteny between cEper1 (top) and its nearest relative cEper2 (bottom). Red lines represent edges of contigs in alignments.

112 Figure 5. Venn diagram of ankyrin domain-containing genes in all six sequenced strains of Cardinium.

113 Figure 6. Tables of pairwise amino acid identities (top) and number of differences (bottom) between orthologs of ankyrin-repeat containing genes. Strains are named based on the genes in cEper1: A CAHE_0095, B CAHE_0632, C CAHE_0588, D CAHE_0670, E CAHE_0205, F CAHE_0834, G CAHE_0435.

A" B"

cEper1" cEper1"

cEper2" cEper2"

cEina3" cEina3" cEhis1" cEhis1"

cEina2" cEina2" cBtQ1" cBtQ1"

C" D" cEper1" cEper1" cEper2" cEper2" cEina3" cEina3" cEhis1" cEhis1" cEina2" cEina2" cBtQ1" cBtQ1"

E" F" cEper1" cEper1" cEper2" cEper2" cEina3" cEina3" cEhis1" cEhis1" cEina2" cEina2" cBtQ1" cBtQ1"

G"

cEper1"

cEper2"

cEina3"

cEhis1"

cEina2"

cBtQ1"

114

1 1

1 1

1 1

1 Figure 7: Tables of pairwise amino acid identities (top) and number of differences (bottom) between orthologs of highly and differentially expressed genes in strain cEper1. Strain names with stars (*) next to them are predicted to encode a signal peptide for secretion outside the bacterial cell. A CAHE_ 0757, B CAHE_0267, C CAHE_0390, D CAHE_0662, E CAHE_0050, F CAHE_0406, G CAHE_0676, and H CAHE_0677.

A& B&

*cEper1& cEper1&

*cEper2& *cEper2&

*cEina3& cEina3&

*cEhis1& *cEhis1&

*cEina2& cEina2&

*cBtQ1& cBtQ1&

C& D&

*cEper1& cEper1&

*cEper2& cEper2&

*cEina3& cEina3&

*cEhis1& cEhis1&

*cEina2& *cEina2&

*cBtQ1& cBtQ1&

E& F&

*cEper1& *cEper1&

*cEper2& *cEper2&

*cEina3& *cEina3&

*cEhis1& cEhis1&

*cEina2& cEina2&

*cBtQ1& *cBtQ1&

G& H&

cEper1& cEper1&

cEper2& cEper2&

cEina3& cEina3&

cEhis1& cEhis1&

cEina2& cEina2&

cBtQ1& cBtQ1&

115

1 1

1 1

1 1

1 1 Figure 8. Tables of pairwise amino acid identities (top) and number of differences (bottom) between orthologs of expressed, putative eukaryotic-interacting proteins identified in strain cEper1. Strain names with stars (*) next to them are predicted to encode a signal peptide for secretion outside the bacterial cell: A CAHE_0312, B CAHE_0450, C CAHE_0452, D CAHE_0028, E CAHE_0010, and F CAHE_0017.

A" B" cEper1" cEper1" cEper2" cEper2" cEina3" cEina3" cEhis1" cEhis1" cEina2" cEina2" cBtQ1" cBtQ1"

C" D"

*cEper1" cEper1" *cEper2" cEper2" *cEina3" cEina3" *cEhis1" cEhis1" *cEina2" cEina2" *cBtQ1" cBtQ1"

E" F"

*cEper1" *cEper1" cEper2" *cEper2" cEina3" *cEina3" cEhis1" *cEhis1"

*cEina2" *cEina2"

*cBtQ1" *cBtQ1"

116

1 1

1 1

1 1 7. REFERENCES

Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, et al. (2008). The RAST

Server: Rapid Annotations using Subsystems Technology. BMC Genomics 9: 75.

Bairoch A, Apweiler R (2000). The SWISS-PROT protein sequence database and its supplement

TrEMBL in 2000. Nucleic Acids Res 28: 45–48.

Beckmann JF, Ronau JA, Hochstrasser M (2017). A Wolbachia deubiquitylating enzyme induces

cytoplasmic incompatibility. Nat Microbiol 2: 17007.

Böck D, Medeiros JM, Tsao H-F, Penz T, Weiss GL, Aistleitner K, et al. (2017). In situ

architecture, function, and evolution of a contractile injection system. Science (80- ) 357:

713–717.

Bork P (1993). Hundreds of ankyrin-like repeats in functionally diverse proteins: mobile

modules that cross phyla horizontally? Proteins Struct Funct Bioinforma 17: 363–374.

Camporeale G, Giordano E, Rendina R, Zempleni J, Eissenberg JC (2006). Drosophila

melanogaster holocarboxylase synthetase is a chromosomal protein required for normal

histone biotinylation, gene transcription patterns, lifespan, and heat tolerance. J Nutr 136:

2735–2742.

Chen W-J, Hsieh F-C, Hsu F-C, Tasy Y-F, Liu J-R, Shih M-C (2014). Characterization of an

insecticidal toxin and pathogenicity of Pseudomonas taiwanensis against insects. PLOS

Pathog 10: e1004288.

Darling ACE, Mau B, Blattner FR, Perna NT (2004). Mauve: multiple alignment of conserved

genomic sequence with rearrangements. Genome Res 14: 1394–1403.

Douglas AE (1998). Nutritional interactions in insect-microbial symbioses: aphids and their

symbiotic bacteria Buchnera. Annu Rev Entomol 43: 17–37.

117 Duron O, Boureux A, Echaubard P, Berthomieu A, Berticat C, Fort P, et al. (2007). Variability

and expression of ankyrin domain genes in Wolbachia variants infecting the mosquito Culex

pipiens. J Bacteriol 189: 4442–8.

Engelstädter J, Hurst GDD (2009). The ecology and evolution of microbes that manipulate host

reproduction. Annu Rev Ecol Evol Syst 40: 127–149.

Fang YW, Liu LY, Zhang HL, Jiang DF, Chu D (2014). Competitive ability and fitness

differences between two introduced populations of the invasive whitefly Bemisia tabaci Q

in China. PLoS One 9.

Feldhaar H, Straka J, Krischke M, Berthold K, Stoll S, Mueller MJ, et al. (2007). Nutritional

upgrading for omnivorous carpenter ants by the endosymbiont Blochmannia. BMC Biol 5:

48.

Finn RD, Attwood TK, Babbitt PC, Bateman A, Bork P, Bridge AJ, et al. (2017). InterPro in

2017-beyond protein family and domain annotations. Nucleic Acids Res 45: D190–D199.

Foster J, Ganatra M, Kamal I, Ware J, Makarova K, Ivanova N, et al. (2005). The Wolbachia

genome of Brugia malayi: endosymbiont evolution within a human pathogenic nematode.

PLOS Biol 3: e121.

Fournier P-E, Vallenet D, Barbe V, Audic S, Ogata H, Poirel L, et al. (2006). Comparative

genomics of multidrug resistance in Acinetobacter baumannii. PLOS Genet 2: e7.

Gavotte L, Henri H, Stouthamer R, Charif D, Charlat S, Boulétreau M, et al. (2007). A survey of

the bacteriophage WO in the endosymbiotic bacteria Wolbachia. Mol Biol Evol 24: 427–

435.

Gebiola M, Giorgini M, Kelly SE, Doremus MR, Ferree PM, Hunter MS (2017). Cytological

analysis of cytoplasmic incompatibility induced by <em>Cardinium</em>

118 suggests convergent evolution with its distant cousin <em>Wolbachia</em>

Proc R Soc B Biol Sci 284.

Gebiola M, Kelly SE, Hammerstein P, Giorgini M, Hunter MS (2016). ‘Darwin’s corollary’ and

cytoplasmic incompatibility induced by Cardinium may contribute to speciation in Encarsia

wasps (Hymenoptera: Aphelinidae). Evolution 70: 2447–2458.

Gebiola M, Monti MM, Johnson RC, Woolley JB, Hunger MS, Giorgini M, et al. (2017). A

revision of the Encarsia pergandiella species complex (Hymenoptera: Aphelinidae) shows

cryptic diversity in parasitoids of whitefly pests. Syst Entomol 42: 31–59.

Gebiola M, White JA, Cass BN, Kozuch A, Harris LR, Kelly SE, et al. (2016). Cryptic diversity,

reproductive isolation and cytoplasmic incompatibility in a classic biological control

success story. Biol J Linn Soc 117: 217–230.

Gravel RA, Narang MA (2005). Molecular genetics of biotin metabolism: old vitamin, new

science. J Nutr Biochem 16: 428–431.

Hagimori T, Abe Y, Date S, Miura K (2006). The first finding of a Rickettsia bacterium

associated with parthenogenesis induction among insects. Curr Microbiol 52: 97–101.

Haslbeck M, Franzmann T, Weinfurtner D, Buchner J (2005). Some like it hot: the structure and

function of small heat-shock proteins. Nat Struct Mol Biol 12: 842–846.

Heberle H, Meirelles GV, da Silva FR, Telles GP, Minghim R (2015). InteractiVenn: a web-

based tool for the analysis of sets through Venn diagrams. BMC Bioinformatics 16: 169.

Hosokawa T, Koga R, Kikuchi Y, Meng X, Fukatsu T (2010). Wolbachia as a bacteriocyte-

associated nutritional mutualist. Proc Natl Acad Sci USA 107: 769–74.

Hunter (1999). The influence of parthenogenesis-inducing Wolbachia on the oviposition

behaviour and sex-specific developmental requirements of autoparasitoid wasps. J Evol Biol

119 12: 735–741.

Hunter MS, Perlman SJ, Kelly SE (2003). A bacterial symbiont in the Bacteroidetes induces

cytoplasmic incompatibility in the parasitoid wasp Encarsia pergandiella. Proc R Soc Lond

B 270: 2185–90.

Hunter MS, Woolley JB (2001). Evolution and behavioral ecology of heteronomous aphelinid

parasitoids. Annu Rev Entomol 46: 251–290.

Hurst GDD, von der Schulenburg JHG, Majerus TMO, Bertrand D, Zakharov IA, Baungaard J,

et al. (1999). Invasion of one insect species, Adalia bipunctata, by two different male-

killing bacteria. Insect Mol Biol 8: 133–139.

Jernigan KK, Bordenstein SR (2014). Ankyrin domains across the Tree of Life (A Weightman,

Ed.). PeerJ 2: e264.

Käll L, Krogh A, Sonnhammer ELL (2004). A combined transmembrane topology and signal

peptide prediction method. J Mol Biol 338: 1027–1036.

Kitagawa M, Miyakawa M, Matsumura Y, Tsuchido T (2002). Escherichia coli small heat shock

proteins, IbpA and IbpB, protect enzymes from inactivation by heat and oxidants. Eur J

Biochem 269: 2907–2917.

Klasson L, Walker T, Sebaihia M, Sanders MJ, Quail MA, Lord A, et al. (2008). Genome

evolution of Wolbachia strain wPip from the Culex pipiens group. Mol Biol Evol 25: 1877–

1887.

Konev AY, Tribus M, Park SY, Podhraski V, Lim CY, Emelyanov A V, et al. (2007). CHD1

motor protein is required for deposition of histone variant H3.3 into chromatin in vivo.

Science (80- ) 317: 1087–1090.

Koskiniemi S, Lamoureux JG, Nikolakakis KC, t’Kint de Roodenbeke C, Kaplan MD, Low DA,

120 et al. (2013). Rhs proteins from diverse bacteria mediate intercellular competition. Proc

Natl Acad Sci 110: 7032–7037.

Kothapalli N, Camporeale G, Kueh A, Chew YC, Oommen AM, Griffin JB, et al. (2005).

Biological functions of biotinylated histones. J Nutr Biochem 16: 446–448.

Kumar S, Jones M, Koutsovoulos G, Clarke M, Blaxter M (2013). Blobology: exploring raw

genome data for contaminants, symbionts and parasites using taxon-annotated GC-coverage

plots. Front Genet 4: 237.

Lamelas A, Gosalbes MJ, Manzano-Marín A, Peretó J, Moya A, Latorre A (2011). Serratia

symbiotica from the aphid Cinara cedri: a missing link from facultative to obligate insect

endosymbiont. PLOS Genet 7: e1002357.

Landmann F, Orsi G a, Loppin B, Sullivan W (2009). Wolbachia-mediated cytoplasmic

incompatibility is associated with impaired histone deposition in the male pronucleus. PLoS

Pathog 5: e1000343.

Lang JM, Darling AE, Eisen JA (2013). Phylogeny of bacterial and archaeal genomes using

conserved genes: supertrees and supermatrices. PLoS One 8.

LePage DP, Metcalf JA, Bordenstein SR, On J, Perlmutter JI, Shropshire JD, et al. (2017).

Prophage WO genes recapitulate and enhance Wolbachia-induced cytoplasmic

incompatibility. Nature 543: 243–247.

Lindsey ARI, Rice DW, Bordenstein SR, Brooks AW, Bordenstein SR, Newton ILG (2017).

Evolutionary genetics of cytoplasmic incompatibility genes cifA and cifB in prophage WO

of Wolbachia. bioRxiv.

Lindsey ARI, Werren JH, Richards S, Stouthamer R (2016). Comparative genomics of a

parthenogenesis-inducing Wolbachia symbiont. G3 Genes|Genomes|Genetics 6: 2113–

121 2123.

Łukasik P, Guo H, van Asch M, Ferrari J, Godfray HCJ (2013). Protection against a fungal

pathogen conferred by the aphid facultative endosymbionts Rickettsia and Spiroplasma is

expressed in multiple host genotypes and species and is not influenced by co-infection with

another symbiont. J Evol Biol 26: 2654–2661.

Moran NA, McCutcheon JP, Nakabachi A (2008). Genomics and evolution of heritable bacterial

symbionts. Annu Rev Genet 42: 165–90.

Mosavi LK, Cammett TJ, Desrosiers DC, Peng Z (2004). The ankyrin repeat as molecular

architecture for protein recognition. Protein Sci: 1435–1448.

Nakamura Y, Kawai S, Yukuhiro F, Ito S, Gotoh T, Kisimoto R, et al. (2009). Prevalence of

Cardinium bacteria in planthoppers and spider mites and taxonomic revision of ‘Candidatus

Cardinium hertigii’ based on detection of a new Cardinium group from biting midges. Appl

Environ Microbiol 75: 6757–63.

Nikoh N, Hosokawa T, Moriyama M, Oshima K, Hattori M, Fukatsu T (2014). Evolutionary

origin of insect-Wolbachia nutritional mutualism. Proc Natl Acad Sci USA 111.

O’Neill SL, Hoffmann AA, Werren JH (1997). Influential passengers: inherited microorganisms

and arthropod reproduction (SL O’Neill, AA Hoffmann, and JH Werren, Eds.). Oxford

University Press: Oxford, UK.

Oliver KM, Russell JA, Moran NA, Hunter MS (2003). Facultative bacterial symbionts in aphids

confer resistance to parasitic wasps. Proc Natl Acad Sci USA 100: 1803–1807.

Partridge SR (2011). Analysis of antibiotic resistance regions in Gram-negative bacteria. FEMS

Microbiol Rev 35: 820–855.

Penz T, Schmitz-Esser S, Kelly SE, Cass BN, Müller A, Woyke T, et al. (2012). Comparative

122 genomics suggests an independent origin of cytoplasmic incompatibility in Cardinium

hertigii. PLoS Genet 8: e1003012.

Russell JE, Stouthamer R (2011). The genetics and evolution of obligate reproductive parasitism

in Trichogramma pretiosum infected with parthenogenesis-inducing Wolbachia. Heredity

106: 58–67.

Santos-Garcia D, Rollat-Farnier P-A, Beitia F, Zchori-Fein E, Vavre F, Mouton L, et al. (2014).

The genome of Cardinium cBtQ1 provides insights into genome reduction, symbiont

motility, and its settlement in Bemisia tabaci. Genome Biol Evol 6: 1013–30.

Scarborough CL, Ferrari J, Godfray HCJ (2005). Aphid protected from pathogen by

endosymbiont. Science (80- ) 310: 1781 LP-1781.

Sedgwick SG, Smerdon SJ (2017). The ankyrin repeat: a diversity of interactions on a common

structural framework. Trends Biochem Sci 24: 311–316.

Siozios S, Ioannidis P, Klasson L, Andersson SGE, Braig HR, Bourtzis K (2013). The diversity

and evolution of Wolbachia ankyrin repeat domain genes. PLoS One 8: e55390.

Skinner SWAY (1985). Son-Killer: A third extrachromosomal factor affecting the sex ratio in

the parasitoid wasp, Nasonia (=Mormoniella) vitripennis. Genetics 109: 745–759.

Smith EM, Hoi JT, Eissenberg JC, Shoemaker JD, Neckameyer WS, Ilvarsonn AM, et al.

(2007). Feeding Drosophila a biotin-deficient diet for multiple generations increases stress

resistance and lifespan and alters gene expression and histone biotinylation patterns. J Nutr

137: 2006–2012.

Sommer DD, Delcher AL, Salzberg SL, Pop M (2007). Minimus: a fast, lightweight genome

assembler. BMC Bioinformatics 8: 64.

Stamatakis A (2014). RAxML version 8: A tool for phylogenetic analysis and post-analysis of

123 large phylogenies. Bioinformatics 30: 1312–1313.

Stokes HW, Gillings MR (2011). Gene flow, mobile genetic elements and the recruitment of

antibiotic resistance genes into Gram-negative pathogens. FEMS Microbiol Rev 35: 790–

819.

Takano S, Tuda M, Takasu K, Furuya N, Imamura Y, Kim S, et al. (2017). Unique clade of

alphaproteobacterial endosymbionts induces complete cytoplasmic incompatibility in the

coconut beetle. Proc Natl Acad Sci 114: 6110–6115.

Turelli M (1994). Evolution of incompatibility-inducing microbes and their hosts. Evolution 48:

1500–1513.

Weeks a R, Marec F, Breeuwer JA (2001). A mite species that consists entirely of haploid

females. Science 292: 2479–82.

Werren JH (1997). Biology of Wolbachia. Annu Rev Entomol 42: 587–609.

Werren JH, Hurst GD, Zhang W, Breeuwer JA, Stouthamer R, Majerus ME (1994). Rickettsial

relative associated with male killing in the ladybird beetle (Adalia bipunctata). J Bacteriol

176: 388–394.

White JA, Kelly SE, Perlman SJ, Hunter MS (2009). Cytoplasmic incompatibility in the parasitic

wasp Encarsia inaron: disentangling the roles of Cardinium and Wolbachia symbionts.

Heredity 102: 483–489.

Zchori-Fein E, Gottlieb Y, Kelly SE, Brown JK, Wilson JM, Karr TL, et al. (2001). A newly

discovered bacterium associated with parthenogenesis and a change in host selection

behavior in parasitoid wasps. Proc Natl Acad Sci USA 98: 12555–60.

Zchori-Fein E, Perlman SJ (2004). Distribution of the bacterial symbiont Cardinium in

arthropods. Mol Ecol 13: 2009–2016.

124 Zchori-Fein E, Perlman SJ, Kelly SE, Katzir N, Hunter MS (2004). Characterization of a

’Bacteroidetes’ symbiont in Encarsia wasps (Hymenoptera: Aphelinidae): proposal of

‘Candidatus Cardinium hertigii’. Int J Syst Evol Microbiol 54: 961–968.

125 APPENDIX C

Enrichment of low-density symbionts from minute insects

Corinne M Stouthamer, Suzanne Kelly, Martha S Hunter

126 ABSTRACT

Symbioses between bacteria and insects are often associated with changes in important biological traits which can significantly affect host fitness. To a large extent, studies of these interactions have been based on physiological changes or induced phenotypes in the host, and the mechanisms by which symbionts interact with their hosts are only recently becoming better understood. Learning about symbionts has been challenging due to difficulties such as obtaining enough high quality genomic material for high throughput sequencing technology, especially for symbionts present in low titers, and in difficult to rear non-model hosts. Here we introduce a new method that substantially increases yield of bacterial DNA in minute arthropod hosts, while requiring relatively less starting material compared to previous published methods.

127 1. INTRODUCTION

Intracellular, maternally inherited symbionts can influence their arthropod hosts’ biology in various important ways. Obligate symbionts usually provide key nutrients missing in their hosts’ diet (Moran et al., 2008) while facultative symbionts may provide conditional benefits such as defense against parasitoids (e.g Oliver et al. 2003; Xie et al. 2014), pathogens (e.g. Scarborough et al. 2005; Łukasik et al. 2013), or heat shock protection (e.g. Montllor et al. 2002). Facultative symbionts might also manipulate host reproduction through increasing the daughter production or fitness of infected females relative to their uninfected counterparts. Several different lineages of bacteria have evolved the ability to manipulate the daughter production of their hosts, including those in the genera Wolbachia, Spiroplasma, Rickettsia, and Cardinium (reviewed in

Engelstädter and Hurst 2009). Intracellular symbionts in general are found in the hemolymph and other insect tissues and most are uncultivable outside of their hosts (Moran et al., 2008).

Although there is considerable interest in how these bacteria interact with their hosts, possible manipulations are limited relative to what can be achieved with a cultivable microbe. In this context, understanding the genomic capabilities of the symbiont can provide particular insight, but extraction of sufficient high quality bacterial DNA for sequencing may pose considerable technical challenges.

There are several obstacles to overcome when sequencing the genomes of intracellular symbionts. First, there is often a high amount of contaminating host DNA, as eukaryotic genomes are much larger than bacterial genomes. Second is the issue of obtaining enough bacterial DNA for sequencing, particularly from hosts that are very small or hard to rear, because the absolute amount of bacterial DNA per host is at least somewhat proportional to host body

128 size. Improved bioinformatics and less expensive short read technology now make it possible to obtain high quality draft genomes from samples of mixed host and symbiont DNA (e.g.

Koutsovoulos et al. 2016; Brown et al. 2016), but long read, low throughput technologies, such as PacBio sequencing, require a relatively large amount of high molecular weight DNA and are more effecient with greater concentrations of symbiont DNA. In fact, both low and high throughput technologies are more effective when a higher proportion of the input is the target

DNA, in this case symbiont DNA. This study introduces a protocol designed to enrich symbiont

DNA in insect samples, particularly for hosts that are small and/or harbor symbionts at relatively low densities. We tested our protocol in minute (~1 mm) Encarsia wasps, which harbor the facultative symbiont Cardinium.

Cardinium hertigii is a maternally inherited, intracellular symbiont of nematodes and arthropods, estimated to infect 9% of arthropods (Russell et al., 2012). Much like the very distantly related

Wolbachia, Cardinium can cause several reproductive manipulations in its host, including parthenogenesis (e.g. Zchori-Fein et al. 2004; Provencher et al. 2005), feminization (e.g. Weeks et al. 2001), and cytoplasmic incompatibility (e.g. Hunter et al. 2003). Cardinium infects several minute arthropods (≤1mm long), including many species of mites, Culicoides biting midges, thrips, and parasitic Encarsia wasps (Zchori-Fein and Perlman, 2004; Lewis et al., 2014; Nguyen et al., 2017). Encarsia sp. are small (~1mm, ~18 µg) and whole wasps harbor Cardinium at a low density; Cardinium genomes (~1MB) are at roughly an equal ratio with host genomes (200-

400MB) (Perlman et al., 2014). This makes both 1) separating bacterial and host reads difficult, as coverage does not clearly separate the two organisms, and 2) long read, low throughput technology impractical because the host DNA reads overwhelm those of the symbiont DNA.

129 Because 1,000 adult Encarsia wasps are, in weight, equal to roughly 12 adult female Drosophila melanogaster (Katz and Young 1975, Mann et al. unpublished), most laboratories do not have the capabilities to raise enough wasps to follow previously published extraction protocols that start with 2000-5000 adult Drosophila (Iturbe-Ormaetxe et al., 2011), equivalent to approximately 167, 000-417, 000 Encarsia wasps.

The following protocol is based on the Penz et al. (2012) extraction protocol, which itself was modified from Braig et al. (1998). It starts with roughly the same inputs as the Penz et al. (2012) protocol, is no more labor intensive, and produces a higher yield of symbiont-enriched DNA of a quality and length appropriate for long and short read libraries. Although this protocol was developed for minute Encarsia wasps and their Cardinium endosymbionts, we anticipate that it can be used for hosts of any size, but will be particularly useful for other minute hosts with similar low-density symbionts.

130 2. METHODS

2.1 DNA extraction method

Approximately 1000 wasps of each of Encarsia hispida, Encarsia inaron (Italy), and Encarsia tabacivora were used as the starting material. The wasps were homogenized using a tight fitting

(0.025 – 0.076 mm) 1 ml Dounce homogenizer in 800 µl of Buffer A (35 mM Tris HCl, 250 mM sucrose, 250 mM EDTA, 25 mM KCl, 10 mM MgCl2). The homogenate was then transferred to a 1.5 ml Eppendorf tube and the Dounce receptacle was rinsed with 400 µl of remaining filtered

Buffer A, which was also added to the Eppendorf tube. The 1.5 mL tube with homogenate incubated for one hour at 4°C, inverted every 10 minutes, and was then centrifuged at 600 x g at

4°C for 10 minutes. Next, the supernatant was loaded into a sterile 5 mL Luer-Lok syringe (BD) attached to a 13 mm diameter filter cassette holder (Swinnnex filter holder, Millipore) with a 0.8 to 8 µm pore size glass fiber prefilter (Millipore) on top of a strong protein-binding, mixed cellulose ester membrane (Millipore) with a 5 µm pore-size, and pushed slowly through. These steps remove most of the larger cellular fractions of the eukaryotic cells, while allowing the bacterial cells to pass through the filter. The resulting filtrate was then centrifuged to pellet the bacterial cells at 14,100 x g for 15 minutes at 4°C. The supernatant was removed and the pellet was resuspended in 150 µl lysis buffer (0.5% (w/v) SDS, (200 mM Tris, 25 mM EDTA, 250 mM

NaCl, and 1.3 mg/mL Rnase A) and incubated for 30 minutes shaking at 250 RPM at 37°C. To the resultant lysate, 150 µl of buffered phenol and 150 µl of chloroform were added, and the tube was inverted by hand for 10 minutes. Subsequently, the sample was centrifuged at 12,000 x g at room temperature for 10 minutes. The organic (lower) phase was then removed and 150 µl of chloroform and 100 µl of nuclease free water were added. The contents were then inverted by hand for 5 minutes and the sample was centrifuged at 12,000 x g at room temperature for 10

131 minutes. The supernatant was placed in a new tube, and, to precipitate DNA, 45 µl 5M NaCl was added, then 1000 µl of EtOH. The mixture was left in the freezer overnight, then centrifuged at

12,000 x g for 15 minutes. The pellet was washed twice with 500 µl of 70% ethanol, dried, and suspended in TE buffer with 0.5M EDTA.

2.2 DNA extraction with variation in incubation times

Earlier extractions with a version of this protocol omitted the initial 4°C incubation step of the raw wasp homogenate in Buffer A. To test whether this step affects the symbiont to host ratio,

DNA fragment size, or DNA yield, one aliquot of ~8000 homogenized wasps was split into three groups and extracted with varying initial incubation times at 4°C in Buffer A: no incubation, 1 hour incubation, 1.75 hour incubation. Fragment analysis (Bioanalyzer 2100 with High

Sensitivity DNA Kit (Agilent)), qPCR (Bio-Rad CFX Connect Real Time System; Maxima

SYBR Green qPCR mix (ThermoFisher Scientific)), and DNA quantification (Qubit 3.0) were performed on the resulting DNA.

2.3 Relative quantification of Cardinium : wasp genome copies with qPCR

The ratio of Cardinium to host genome copies was measured using qPCR and the delta-delta Ct method (Schmittgen and Livak, 2008). Quantitative PCR was performed using Maxima SYBR

Green/ROX qPCR Master Mix (2X) (ThermoFisher Scientific) with primers targeting the single- copy EF-1 Alpha gene (EF_F: AGATGCACCACGAAGCC and EF_R:

CCTTGGGTGGGTTGTTCTT) for all of the wasp species and primers targeting the Cardinium gyrB gene (gyrb737F: AAGTTATTGTAGCCGCTCAAG and gyrb911R:

GCAGTACCACCAGCAGAG) (Perlman et al., 2014) for the Cardinium strains.

132

2.4 Short and long read sequencing sample extraction and analysis.

Three sequencing technologies were chosen for this study: short reads were generated by either the Ilumina HiSeq platform (paired-end, 2x150 bp, insert size 500-1000bp) or by the Ilumina

MiSeq platform (paired-end, 2x300 bp, insert size 500-600bp), and long reads were generated by

PacBio. The non-enriched HiSeq samples were extracted from whole wasps using a DNeasy extraction kit (Qiagen). The samples destined for MiSeq sequencing were extracted using the enrichment protocol of the current study with a one-hour incubation step at 4°C in Buffer A. The samples for PacBio sequencing were extracted using the protocol of the current study without the incubation step at 4°C in Buffer A. For the MiSeq and HiSeq reads, the percentages of reads that were Cardinium were determined by mapping reads to reference Cardinium genomes

(Stouthamer et al., unpublished) in Bowtie2 (version 2.3.2 default settings, paired reads)

(Langmead et al., 2013). To determine the number of Cardinium reads in the PacBio samples, a custom BLAST database was made of each of the Cardinium genomes and the filtered subreads were queried (e-value = 0) against these databases (Camacho et al., 2009). The subreads were condensed into the reads of insert (Rhoads and Au, 2015), and these were used to calculate the percent of total reads that matched Cardinium. Encarsia inaron (Italy) is infected with two distinct strains of Cardinium (cEina2 and cEina3). The above analyses were the same for this wasp/symbionts combination; the different Cardinium genomes were concatenated for the mapping and BLAST identification.

133 3. RESULTS AND DISCUSSION

3.1 New extraction method outcomes

The extraction technique presented here is based on the extraction protocol reported in Penz et al. (2012) with several key differences. In general, the Cardinium purification and filtering steps were similar but the centrifuge speeds used in the current study were higher to better pellet

Cardinium cells. The lysis buffer also differed, as well as the extraction technique– phenol chloroform rather than CTAB. Additionally, no DNase step was included in this protocol because it appeared to reduce DNA yield while not significantly increasing the ratio of symbiont to host DNA (data not shown), probably because the DNase digested host and Cardinium DNA equally. These extraction methods differed in outcomes as well; Penz et al. (2012) started with

8000 wasps and the extraction yielded 2 ng of purified DNA, while this protocol with roughly a third of the starting number of wasps yielded about 1 µg of symbiont-enriched DNA (Table 1).

3.2 Variation in incubation time affects symbiont to host ratio

Including the 4°C wasp homogenate incubation step in Buffer A greatly enriched the symbiont

DNA. One hour of incubation led to a doubling of symbiont DNA relative to host DNA, and a

1.75 hour incubation step led to almost a tripling of symbiont DNA relative to host DNA, likely because it gave the wasp host cells more time to lyse. Interestingly, the increase in symbiont:host ratio was most pronounced in samples that were not overly homogenized, such that some thoraces and most legs remained still intact. In other extractions, overly homogenized samples

(few legs intact, no thoraces intact), the one hour incubation step in Buffer A at 4°C did not increase the ratio of symbiont: host DNA. This is likely because homogenizing a sample too much causes host cells to lyse and may disrupt the nucleus before it can be filtered out by the 5

134 µm filter. Incubating the sample in Buffer A for one hour also did not reduce the average fragment size, but the 1.75 hour incubation step did result in a smaller average fragment size

(Table 1).

3.3 Long and short read sequencing

The sequencing of the enriched samples shows a significant improvement in the percentage of

Cardinium reads over the whole wasp sample (Table 2). Alhough the comparison is not perfectly equivalent because the sequencing platforms differ in chemistries and library preparation protocols, the increase in Cardinium reads corresponds well to the qPCR results reported in the same table, suggesting the enrichment protocol underlies the differences.

135 4. CONCLUSION

Minute arthropods and their symbionts are prevalent in nature, and although the variety of their associations has long been known from microscopic studies (e.g. Buchner 1965), genomic studies are now helping to understand the many facets of their interactions with their hosts

(Chaston and Douglas, 2012). At the same time, the potential of declining sequencing costs raises the possibility of greater genomic sampling depth, e.g. phylogenomics and population genomic approaches (e.g. Brown et al. 2014). Although sequencing costs are decreasing, the volume of sequencing necessary to overcome the large amounts of host DNA in typical extraction protocols can quickly make the sequencing costs for such approaches prohibitive. For symbionts of small arthropods and/or those for which rearing is impractical there is also a need to maximize the amount of symbiont DNA in a small starting sample. The enrichment protocol detailed here increases the density of symbiont DNA relative to the host DNA while maintaining

DNA integrity for long read sequencing, and we hope will expand study of the symbionts of the smallest, non-model arthropods.

136 5. TABLES

Table 1: Extractions of E.inaron with varying incubation times. The Cardinium: host ratios were derived with qPCR and calculated using the delta-delta Ct method. The average fragment size was estimated with the BioAnalyzer 2100. The total nanograms were based on DNA quantification using Qubit 3.0.

Sample treatment Cardinium : host Average fragment Total ng of DNA cell ratio size in sample No incubation 34.64 : 1 8595 bp 1753 1 hour incubation 70.22 : 1 8733 bp 1154.5 1.75 hour 110.81 : 1 5958 bp 707.5 incubation

Table 2: Percent of Cardinium reads in enriched and non-enriched samples from different

Encarsia hosts. In each table cell the two figures correspond to the percent of reads identified as

Cardinium in each sample, followed by the ratio of Cardinium gyrB to wasp EF 1 alpha derived from qPCR. The ratios of the original HiSeq non-enriched samples were not calculated, but instead the original extraction protocol was performed on whole extractions of wasps of similar ages and measured with qPCR at a later time.

Cardinium strain(s) Illumina Hiseq 2500, MiSeq, 300 PE: PacBio: enriched 150 PE: sample not enriched sample sample enriched cEina2 and cEina3 (double 0.75%, 1.49 : 1 64.22%, 154 : 1 41.16% 26:1 infection in E. inaron (IT)) 0.14%, 0.11 : 1 N/A 64.42% 9.22 :1 cEhis1 (in E. hispida) 0.17% 1.08 : 1 40.63%, 118 : 1 N/A cEper2 (in E. tabacivora)

137 6. REFERENCES

Braig HR, Zhou W, Dobson SL, O’Neill SL (1998). Cloning and characterization of a gene

encoding the major surface protein of the bacterial endosymbiont Wolbachia pipientis. J

Bacteriol 180: 2373–2378.

Brown AMV V, Huynh LY, Bolender CM, Nelson KG, McCutcheon JP (2014). Population

genomics of a symbiont in the early stages of a pest invasion. Mol Ecol 23: 1516–1530.

Brown AMV V, Wasala SK, Howe DK, Peetz AB, Zasada IA, Denver DR (2016). Genomic

evidence for plant-parasitic nematodes as the earliest Wolbachia hosts. 6: 34955.

Buchner P (1965). Endosymbiosis of animals with plant microorganisms. John Wiley: New

York, NY.

Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, et al. (2009). BLAST

plus: architecture and applications. BMC Bioinformatics 10: 1.

Chaston J, Douglas AE (2012). Making the most of ‘omics’ for symbiosis research. Biol Bull

223: 21–29.

Engelstädter J, Hurst GDD (2009). The ecology and evolution of microbes that manipulate host

reproduction. Annu Rev Ecol Evol Syst 40: 127–149.

Hunter MS, Perlman SJ, Kelly SE (2003). A bacterial symbiont in the Bacteroidetes induces

cytoplasmic incompatibility in the parasitoid wasp Encarsia pergandiella. Proc R Soc Lond

B 270: 2185–90.

Iturbe-Ormaetxe I, Woolfit M, Rancès E, Duplouy A, O’Neill SL (2011). A simple protocol to

obtain highly pure Wolbachia endosymbiont DNA for genome sequencing. J Microbiol

Methods 84: 134–136.

Katz AJ, Young SSY (1975). Selection for high adult body weight in Drosophila populations

138 with different structures. Genetics 81: 163 LP-175.

Koutsovoulos G, Kumar S, Laetsch DR, Stevens L, Daub J, Conlon C, et al. (2016). No evidence

for extensive horizontal gene transfer in the genome of the tardigrade Hypsibius dujardini.

Proc Natl Acad Sci 113: 5053–5058.

Langmead B, Salzberg SL, Langmead (2013). Bowtie2. Nat Methods 9: 357–359.

Lewis SE, Rice A, Hurst GDD, Baylis M (2014). First detection of endosymbiotic bacteria in

biting midges Culicoides pulicaris and Culicoides punctatus, important Palaearctic vectors

of bluetongue virus. Med Vet Entomol 28: 453–456.

Łukasik P, Guo H, van Asch M, Ferrari J, Godfray HCJ (2013). Protection against a fungal

pathogen conferred by the aphid facultative endosymbionts Rickettsia and Spiroplasma is

expressed in multiple host genotypes and species and is not influenced by co-infection with

another symbiont. J Evol Biol 26: 2654–2661.

Montllor CB, Maxmen A, Purcell AH (2002). Facultative bacterial endosymbionts benefit pea

aphids Acyrthosiphon pisum under heat stress. Ecol Entomol 27: 189–195.

Moran NA, McCutcheon JP, Nakabachi A (2008). Genomics and evolution of heritable bacterial

symbionts. Annu Rev Genet 42: 165–90.

Nguyen DT, Morrow JL, Spooner-Hart RN, Riegler M (2017). Independent cytoplasmic

incompatibility induced by Cardinium and Wolbachia maintains endosymbiont coinfections

in haplodiploid thrips populations. Evolution 71: 995–1008.

Oliver KM, Russell JA, Moran NA, Hunter MS (2003). Facultative bacterial symbionts in aphids

confer resistance to parasitic wasps. Proc Natl Acad Sci USA 100: 1803–1807.

Penz T, Schmitz-Esser S, Kelly SE, Cass BN, Müller A, Woyke T, et al. (2012). Comparative

genomics suggests an independent origin of cytoplasmic incompatibility in Cardinium

139 hertigii. PLoS Genet 8: e1003012.

Perlman SJ, Dowdy NJ, Harris LR, Khalid M, Kelly SE, Hunter MS (2014). Factors affecting the

strength of Cardinium-induced cytoplasmic incompatibility in the parasitic wasp Encarsia

pergandiella (Hymenoptera: Aphelinidae). Microb Ecol 67: 671–8.

Provencher LM, Morse GE, Weeks AR, Benjamin B. Normark (2005). Parthenogenesis in the

Aspidiotus nerii complex (Hemiptera:Diaspididae): a single origin of a worldwide,

polyphagous lineage associated with Cardinium bacteria. Ann Entomol Soc Am 98: 629–

635.

Rhoads A, Au KF (2015). PacBio Sequencing and Its Applications. Genomics, Proteomics

Bioinformatics 13: 278–289.

Russell JA, Funaro CF, Giraldo YM, Goldman-Huertas B, Suh D, Kronauer DJC, et al. (2012).

A veritable menagerie of heritable bacteria from ants, butterflies, and beyond: broad

molecular surveys and a systematic review. PLoS One 7: e51027.

Scarborough CL, Ferrari J, Godfray HCJ (2005). Aphid protected from pathogen by

endosymbiont. Science 310: 1781 LP-1781.

Weeks AR, Marec F, Breeuwer JA (2001). A mite species that consists entirely of haploid

females. Science 292: 2479–82.

Xie J, Butler S, Sanchez G, Mateos M (2014). Male killing Spiroplasma protects Drosophila

melanogaster against two parasitoid wasps. Heredity 112: 399–408.

Zchori-Fein E, Perlman SJ (2004). Distribution of the bacterial symbiont Cardinium in

arthropods. Mol Ecol 13: 2009–2016.

Zchori-Fein E, Perlman SJ, Kelly SE, Katzir N, Hunter MS (2004). Characterization of a

’Bacteroidetes’ symbiont in Encarsia wasps (Hymenoptera: Aphelinidae): proposal of

140 ‘Candidatus Cardinium hertigii’. Int J Syst Evol Microbiol 54: 961–968.

141