<<

Ecological and Evolutionary Factors Shaping -Bacterial Symbioses: Insights from & Gut Symbionts

by

Bryan Paul Brown

Environment Duke University

Date:______Approved:

______Jennifer Wernegreen, Supervisor

______Dana Hunt

______John Rawls

______Lawrence David

Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of the Environment in the Graduate School of Duke University

2017

i

v

ABSTRACT

Ecological and Evolutionary Factors Shaping Animal-Bacterial Symbioses: Insights from Insects & Gut Symbionts

by

Bryan Paul Brown

Environment Duke University

Date:______Approved:

______Jennifer Wernegreen, Supervisor

______Dana Hunt

______John Rawls

______Lawrence David

An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of the Environment in the Graduate School of Duke University

2017

i

v

Copyright by Bryan Paul Brown 2017

Abstract

Animal bacterial symbioses are pervasive and underlie the success of many groups. Here, I study ecological and evolutionary factors that shape interactions between a host and gut associates. In this dissertation, I interrogate interactions between the carpenter (Camponotus) and its associated gut to ask the following questions: What are the resident microbiota of the Camponotine gastrointestinal tract?

How does persistent gut association affect rates of molecular in gut symbionts? How are transmitted between social hosts? How does gut community composition and structure vary across host development? What evolutionary factors facilitate adaptation to the gut? How do the of gut associates respond to selective pressures associated with persistent gut habitation? I use a combination of next generation sequencing, anaerobic isolate culturing, computational modeling, and comparative genomics to illustrate evolutionary consequences of persistent host association on the genomes of gut associates. In chapter one, I characterize the gut community of C. chromaiodes and describe two novel lineages in the

Acetobacteraceae (AAB). I demonstrate rapid evolutionary rates, deleterious evolution at 16S rRNA, and deep divergence of a monophyletic clade of ant associated AAB. In chapter two, I design a novel molecular tool to prevent amplification of nontarget DNA in 16S based community screens. I then use this tool to characterize the gut microbiota of

iv

C. chromaiodes across several developmental stages and incipient colonies. I argue that highly similar bacterial profiles between a colony queen and offspring are indicative of reliable vertical of gut . In chapter three, I isolate and culture two strains of AAB gut associates from C. chromaiodes, as well as an associate in the

Lactobacillaceae, and perform whole sequencing. I use comparative genomic analyses to delineate patterns of genomic erosion and rampant on AAB gut isolates that lead to genomes with mosaic metabolic pathways.

Taken together, this dissertation establishes a new model system for assessing evolutionary consequences of symbioses with gut bacteria. These results provide novel insights into the repercussions of bacterial adaptation to a host gut tract. They establish a foundation to interrogate questions unique to persistent extracellular gut symbionts.

Finally, they delineate distinct forces shaping the functional capacity of symbiont genomes: gene loss through reductive evolution and gene acquisition via horizontal transfer from diverse community members.

v

Contents

Abstract ...... iv

List of Tables...... x

List of Figures ...... xi

Personal Acknowledgements ...... xiii

Professional Acknowledgements ...... xvii

Introduction to the document ...... 1

1. Deep divergence and rapid evolutionary rates in gut-associated Acetobacteraceae of ...... 7

1.1 Introduction ...... 7

1.2 Results ...... 12

1.2.1 The gaster microbiota of C. chromaiodes is largely dominated by two Acetobacteraceae (AAB) OTUs ...... 12

1.2.2 AAB are localized to the Camponotus gut tract ...... 16

1.2.3 Strain-level variation occurs within AAB OTUs, and screens across host species and geographic regions...... 17

1.2.4 Camponotus-associated AABs join other ant associates in a deep, well- supported clade ...... 19

1.2.5 The ant-associated AAB clade evolves several-times faster than other Acetobacteraceae ...... 22

1.2.6 Camponotus AABs show increased AT content and reduced RNA stability, comparable to known socially transmitted bacteria ...... 23

1.3 Discussion ...... 26

1.3.1 Monophyly of ant-associated AAB...... 27

vi

1.3.2 Rapid evolution and destabilization at 16S rDNA...... 30

1.3.3 Potential transmission routes and caste variation...... 34

1.4 Methods ...... 38

1.4.1 Ant collection ...... 38

1.4.2 Sample preparation for microbiota characterization: surface sterilization, genomic DNA (gDNA) extraction, and Blochmannia depletion ...... 38

1.4.3 16S rDNA amplicon pyrosequencing using Ion PGM ...... 40

1.4.4 Microbial community analysis ...... 42

1.4.5 Within-OTU polymorphism ...... 42

1.4.6 Assessment of surface sterilization techniques ...... 43

1.4.7 Localization of AABs to the C. chromaiodes gut using diagnostic PCR...... 44

1.4.8 Surface sterilization, dissection, and gDNA extraction for PCR screen of AABs ...... 45

1.4.9 Screening of additional Camponotus species and geographic regions ...... 46

1.4.10 Phylogenetic analysis of AAB 16S rDNA sequences ...... 46

1.4.11 Nucleotide substitution rate estimation ...... 49

1.4.12 Ribosomal RNA folding energy estimation ...... 50

2. Social interactions foster reliable bacterial transmission in Camponotine ants ...... 51

2.1 Introduction ...... 51

2.2 Results ...... 58

2.2.1 depletion and sequencing depth of gut associated community ...... 58

2.2.2 Bacterial taxonomic profiles of incipient colonies ...... 58

vii

2.2.3 Inter-community distance is strongly shaped by colony identity and developmental stage ...... 60

2.2.4 Community complexity varies significantly across host development, but not across host colonies...... 64

2.2.5 The microbiota of P1 pupae displays increased consistency across colonies and contains several environmental microbes ...... 67

2.3 Discussion ...... 67

2.3.1 Bacterial communities shared between queens and progeny are consistent with vertical inheritance ...... 68

2.3.2 Contributions from environmental sources cause shifts in community complexity ...... 69

2.3.3 How do developmental stage and colony identity shape bacterial communities? ...... 70

2.3.4 How does bacterial community structure vary across host development? ...... 71

2.4 Methods ...... 74

2.4.1 Ant collection ...... 74

2.4.2 Sample preparation for microbiota characterization: surface sterilization, genomic DNA (gDNA) extraction, and endosymbiont depletion ...... 75

2.4.3 16S rDNA amplicon library construction and sequencing ...... 77

2.4.4 Microbial community analysis ...... 79

3. Genomic erosion and rampant horizontal gene transfer in gut-associated Acetobacteraceae ...... 81

3.1 Introduction ...... 81

3.2 Results ...... 84

3.2.1 Isolate genome characteristics ...... 84

viii

3.2.2 Metabolic pathway reconstruction ...... 88

3.2.3 Comparative bacterial genome analysis ...... 91

3.2.4 Horizontal gene transfer and functional profiles of gut associates ...... 92

3.2.5 Phylogenomic inference ...... 97

3.3 Discussion ...... 98

3.4 Methods ...... 106

3.4.1 Ant collection and sample preparation ...... 106

3.4.2 gDNA extraction and PCR screens ...... 107

3.4.3 Bacterial isolate culturing ...... 108

3.4.4 Bacterial isolate sequencing, assembly, and analysis ...... 109

3.4.5 Phylogenomic inference ...... 110

3.4.6 Determination of horizontally transferred genes ...... 110

4. Discussion and future directions ...... 111

Appendix A: supplemental information for chapter 1 ...... 116

Estimation of empirical error rate for Ion PGM amplicon sequencing...... 116

Methods: ...... 116

Results:...... 117

Appendix B: supplemental information for chapter 2 ...... 127

Appendix C: supplemental information for chapter 3 ...... 135

References ...... 139

Biography ...... 158

ix

List of Tables

Table 1: Abundance (read count) of Bacterial OTUs detected in Camponotus chromaiodes...... 15

Table 2: Comparison of molecular clock models and relative substitution rates of 16S rDNA...... 22

Table 3: Genome characteristics of sequenced strains ...... 86

Table 4: Empirical estimation of error rates for E. coli amplicon library...... 119

Table 5: Sampling structure of collected colonies ...... 119

Table 6: Distribution of AAB across ant samples...... 121

Table 7: 16S rRNA stability and AT content, across bacterial symbionts with distinct transmission modes ...... 124

Table 8: Primers used in V4-V5 amplicon sequencing...... 125

Table 9: Genbank accession numbers of all taxa used in phylogenetic analyses...... 126

Table 10: Sample structure and design ...... 134

Table 11: Core OTUs present across all P1 pupae ...... 134

Table 12: Genome characteristics of taxa used in phylogenomic trees ...... 138

x

List of Figures

Figure 1: Bayesian phylogeny of Acetobacteraceae 16S rDNA...... 21

Figure 2: Maximum likelihood phylogeny of Acetobacteraceae 16S rDNA...... 24

Figure 3: Relationship between AT content and normalized predicted free energy of 16S rRNA ...... 25

Figure 4: Relative abundance of phylogenetic classes distributed across colonies and developmental stages...... 59

Figure 5: OTUs and inter-community distance from a colony queen tracked across the dataset...... 61

Figure 6: Alpha diversity estimates and proportion of shared OTUs between developmental stages and castes...... 63

Figure 7: Constrained analysis of Principle Coordinates (CAP) ordination and inter community distance comparisons ...... 66

Figure 8: Chromosome maps of each strain sequenced...... 87

Figure 9: Abundance profiles of functions annotated by IMG networks...... 89

Figure 10: Venn diagram of shared genes between isolates BBS1 and BBS3 ...... 92

Figure 11: Genomic content of isolates by phylogenetic distribution of genes and COG functional categories...... 93

Figure 12: Horizontally transferred genes by phylogenetic origin and COG functional category...... 94

Figure 13: PFam function across AAB and LAB...... 96

Figure 14: Phylogenomic analysis of AAB and LAB isolates from diverse sources...... 98

Figure 15: Bootstrap consensus tree based on maximum likelihood analysis of Acetobacteraceae 16S rDNA ...... 122

Figure 16: Bayesian phylogeny of Acetobacteraceae 16S rDNA...... 123

xi

Figure 17: Faith’s PD across all colonies sampled...... 127

Figure 18: CAP ordination of Unifrac distance constrained by colony identity...... 128

Figure 19: CAP ordination of JSD constrained by colony identity...... 129

Figure 20: Unconstrained Principle Coordinate Analysis (PCoA) ordination on JSD .... 130

Figure 21: Unconstrained Principle Coordinate Analysis (PCoA) ordination on Unifrac distance...... 131

Figure 22: Supervised OTU heatmap of all samples...... 132

Figure 23: Jaccard distance between queens and colonies ...... 133

Figure 24: Glycolysis in BBS1 and BBS3 ...... 135

Figure 25: The pentose phosphate and Entner Duodoroff pathways in BBS1 and BBS3 136

Figure 26: The TCA cycle in BBS1 and BBS3...... 137

xii

Personal Acknowledgements

As with any great endeavor (I leave the definition of “great” very open to interpretation), there is a cohort of folks working behind the scenes to facilitate its successful completion. The pursuit of a doctoral degree is no different; rather, it may almost be more applicable.

I would be remiss if I did not acknowledge the Nicholas School of the

Environment and all of the wonderful folks therein who have made the administrative and logistical aspects of graduate school downright easy (dare I say pleasant?). Meg

Stephens was a fantastic DGSA and greeted me with a lovely smile and attitude from day one. Joel Meyer has been an incredibly receptive and supportive DGS and his willingness to share his time and resources are remarkable. I must also acknowledge the invaluable help of Nicholas Devos and Graham Alexander Jr, both of whom hustled hard to get critical samples sequenced for my third chapter before an impending grant expiration.

I would also like to extend a special thanks to my various lab mates over the years. Yongliang Fan is an incredibly skilled molecular and field biologist and provided valuable insights and skillsets in both fields. He set a standard that shaped my molecular techniques throughout graduate school. I would also like to thank Laura

Williams, who took a nearly constant barrage of pestering about simple bash scripting and never broke her smile. Finally, I would like to acknowledge the two undergraduates

xiii

that I worked with, Carlton Adams and Eric Song, for being good sports about field and lab work and providing lots of laughs and fond memories. I must also extend a special thanks to Heather Durand for helping me with various microbiological techniques and sharing reagents, broken tables, puppy love from Rococo, and good laughs.

During my time at Duke, I was surrounded by a diverse and fantastic group of folks who made invaluable contributions to my scientific development and my growth as a person. Liad Holtzman has been a constant source of support and laughs and I greatly admire his authenticity. Nicholas Polizzi took me in for the first few weeks of my time in Durham and was a great climbing partner and Left Hand Milk Stout slayer at the

Federal. Alexandros Iliopoulos was a great friend and roommate and has contributed to my personal growth in more ways than I can fit into this section. Jean-Hubert (Jub) and

Anne-Laure Schmitt Olivier always provided much needed comic relief. Additionally, I would like to thank the far-too-many-folks-to-list that ranged from climbing partners to collaborators that always had time for a cup of tea or beer. Those breaks in the grind made a difference that I can’t possibly convey.

I owe an incredible thank you to the members of my committee for providing insightful comments and guidance during the entire process. I thank John Rawls for providing great advice on conceptual approaches to analyzing host-microbe interactions. Dana Hunt is incredibly brilliant and has prodded me to consider aspects governing microbial interactions that I would have otherwise not considered. Rob Dunn,

xiv

first and foremost, is a boundless trove of excitement an enthusiasm and I appreciate his energy and insights incredibly. Rob is also a brilliant entomologist and microbiologist and has facilitated my development in both fields to a remarkable extent. Finally, I owe a massive thank you to Jen Wernegreen, the scale of which I could never convey. Her open-door policy, patience, and genuine sense of caring have meant the world to me during this process. At the inevitable low points that one encounters during graduate school, Jen always managed to provide encouragement, enthusiasm, and a positive outlook. Her mentoring style, brilliance, sense of balance, and genuine care for my development has been invaluable. Nearly the entirety of my scientific development is due to her thoughtful guidance, and for that I cannot express enough gratitude.

Finally, I would like to thank my family, genetic and extended. My parents have always believed in me and encouraged me to follow my passions and dreams, from absconding to New Zealand on a whim to applying to Duke. My mother is a beacon of positivity and always manages to provide a smile and a laugh. My father has been my

#1 fan from day one and has always pushed me to realize my dreams and do it with everything that I have. My stepmother, Christine Brown, took me in as one of her own and was always available to provide a sense of balance and perspective. My stepfather,

Mark Osterhage, has a heart of gold and sense of humor that can lighten any situation.

All of my siblings have been integral to my personal development and, though we don’t get to spend enough time together, seeing them is one of the best parts of being home.

xv

I’d like to give an extra nod to my sister, Rachel, who’s been my confidant and friend since before I can remember. I would also like to extend a special thanks to Alexandria

Hannay for her endless support and for following me up mountains, across boulders, and off bridges.

xvi

Professional Acknowledgements

Chapter 1

I am thankful to Yongliang Fan for assistance with North Carolina collection efforts. I would also like to thank Olivier Fedrigo for his assistance with amplicon sequencing, and an anonymous reviewer for insightful comments. This work was supported by the National Science Foundation Graduate Research Fellowship under

Grant No. NSF 1106401, and by the Nicholas School of the Environment, to BPB. This research was also supported by grants to JJW from NSF (MCB-1103113), NIH

(R01GM062626) and the Nicholas School of the Environment.

Chapter 2

I am grateful to Eric Song for assistance in collection of samples used here. I also acknowledge Heather Durand for her assistance in amplicon library screening and quantitation. I am further grateful to Karen Abramson for her assistance in amplicon library sequencing and correspondence. This work was supported by the National

Science Foundation Graduate Research Fellowship under Grant No. NSF 1106401 to

BPB, and by the Nicholas School of the Environment, to BPB and JJW.

Chapter 3

I am grateful to Heather Durand and Rachael Bloom for their assistance in microbiological culturing methods and accommodating my needs in the anaerobic

xvii

chamber. I am grateful for the contributions of Nicholas Devos and Graham Alexander

Jr for their assistance with genome sequencing and assembly. I also thank Tom Milledge for his relentless assistance with the DSCR. This work was supported by the National

Science Foundation Graduate Research Fellowship under Grant No. NSF 1106401 to

BPB, and by the Nicholas School of the Environment, to BPB and JJW.

xviii

Introduction to the document

Symbiotic interactions have had profound effects on the success of all life on

Earth. Symbioses vary in nature, from parasitic to commensal and mutualistic, and occur across all kingdoms of life. In this dissertation, I focus on various types of symbioses that occur between and bacteria. Animals and bacterial associates engage in a range of relationships that shape the ecological interactions and evolutionary trajectories of both parties [1]. These relationships range from facultative to obligate and span a diverse range of physiological and evolutionary repercussions [2-5]. Obligate symbioses are generally mutualistic in nature, function to benefit both parties, and are typically integral to the development and reproduction of both partners [4]. These mutualisms are typified by binary partnerships between a host and bacterial associate, whereby the symbiotic bacterium is restricted to an endogenous host niche, has adopted an intracellular lifestyle, and has evolved to be dependent on host mechanisms for transmission and other essential functions [4]. In contrast to obligate symbioses, facultative associations vary widely in nature and can include parasitic [6], commensal

[7], and mutualistic [8] interactions, with the distribution and transmission of facultative associates being much more sporadic [4]. Facultative associations may occur in intra- or extracellular settings and, while the evolutionary consequences of such associations are diverse [4], facultative associates are typically self reliant for reproduction and transmission. In this dissertation, I examine the ecological factors and evolutionary

1

pressures driving animal-bacterial symbioses. In chapter 1, I characterize a novel group of gut associated AAB inhabiting ants. In chapter 2, I delineate ecological factors structuring the transmission and development of gut communities in a social host.

In chapter 3, I characterize the genomic consequences of persistent habitation of the gut environment.

Insects engage in a diverse repertoire of relationships with bacterial associates [4,

5, 9-12]. Several insect orders engage in ancient partnerships with obligate bacterial [13]. These symbioses share millions of years of coevolutionary history

[14, 15] and often result in metabolic interdependence [16] between host and symbiont.

Typically, obligate endosymbionts live intracellularly in specialized host cells or organs and are transmitted vertically by specialized host mechanisms [17]. Aside from physiological shifts, the evolutionary consequences of such associations have profound effects on mutualist genomes [18]. Persistent hallmarks of adaptive evolution on the genomes of bacterial mutualists include global patterns of genomic erosion and gene loss [18], AT nucleotide bias [19, 20], and accelerated evolutionary rates at 16S rRNA [9] and protein coding genes [21]. Recently, the study of extracellular gut associates of insects has uncovered similar trends as those seen in intracellular mutualists. Across various hosts, several studies have delineated parallel patterns of genomic reduction

[22], shifts in functional capacity [23], and co-cladogenesis with the host species [10].

2

Ants are a remarkably successful group of social insects that host an expansive range of symbiotic bacteria. Here, I focus on symbiotic interactions between the , Camponotus chromaiodes, and associated bacterial partners. In addition to the their ancient [14] with the intracellular endosymbiont, Blochmannia [13],

Camponotine ants sporadically host the facultative endosymbiont, Wolbachia [24], and persistently associate with specific members of their gut consortium [9, 25-27]. Recently, social interactions have been implicated as mechanisms for the transmission of persistent gut associates [5, 23, 28, 29]. Socially transmitted gut microbiota serve as interesting models for investigating community-level dynamics and selective pressures associated with persistent host association, as members of these communities undergo evolutionary pressures associated with adaptation to an endogenous host niche, as above, but also enjoy ample opportunity to interact and exchange genetic material with myriad other bacteria.

In this dissertation, I use marker gene sequencing, computational modeling, microbiological techniques, and comparative genomics to delineate the following questions: What are the characteristic taxa inhabiting the gut of C. chromaiodes? How do gut associated communities vary between hosts? What are the evolutionary repercussions of persistent host association on gut bacteria? How are gut associated bacterial communities transmitted between interacting hosts? How do these

3

communities respond to temporal variation across host development? What are the genomic consequences of adaptation to the gut tract?

In chapter 1, I characterize the bacterial communities associated with an abundant ant species, C. chromaiodes. While many bacteria showed sporadic distributions, some taxa were abundant and persistent within and across ant colonies.

Specially, two AAB operational taxonomic units (OTUs; referred to as AAB1 and AAB2) were abundant and widespread across host samples. I performed dissection experiments to confirm localization of AAB1 and AAB2 in the C. chromaiodes gut tract. Further, I found that multiple Camponotus species across varied geographical regions possess close relatives of the Acetobacteraceae OTUs detected in C. chromaiodes. Phylogenetic analysis revealed that AAB1 and AAB2 join other ant associates in a monophyletic clade. This clade consists of AAB from three ant tribes, including a third, basal lineage associated with Attine ants. This ant-specific AAB clade exhibits a significant acceleration of substitution rates at the 16S rDNA gene and elevated AT content. Substitutions along

16S rRNA in AAB1 and AAB2 resulted in ~10 % reduction in the predicted rRNA stability.

In chapter 2, I investigate the fidelity of transfer associated with social interactions by surveying thirteen incipient colonies of C, chromaiodes, known to house intracellular mutualists and the persistent extracellular gut associates I outline in chapter

1. I present evidence that extracellular gut associates of C. chromaiodes are likely

4

transmitted vertically from a mother colony to a young queen and, subsequently, to her offspring via social interactions. Using 16S rDNA sequencing and a de novo designed peptide nucleic acid (PNA) clamp to block amplification of intracellular Wolbachia, I deeply characterize bacterial communities of incipient queens and workers, as well as across several larval and pupal developmental stages, and delineate dynamic shifts in community complexity and composition that structure gut microbiota. These results demonstrate that similar compositional signals from queens to progeny are the primary factor shaping gut community profiles, while shifts in community complexity, via acquisition of environmental microbes during host development, drive bacterial community structure to a lesser degree. Cumulatively, these results demonstrate that social interactions likely facilitate reliable vertical transmission of extracellular bacterial associates, which is a significant driver of adaptation between animals and gut microbiota.

In chapter 3, I culture and characterize the genomes of gut associated AAB and

Lactobacillaceae (LAB) in C. chromaiodes. These results delineate significant genomic erosion and genomic divergence in AAB. I elucidate rampant horizontal gene transfer in these associates and analyze the functional and taxonomic profiles of horizontally transferred genes. These data suggest that horizontally acquired genes are predominantly gammaproteobacterial in origin, with approximately 30% of those originating in the Rhizobiales. Functionally, genes associated with energy production

5

and conversion were far more abundant than any other class. Taken together, these results demonstrate how rampant horizontal transfer of genetic material between gut associates and reductive evolution lead to mosaic metabolic pathways and genomes. I perform phylogenomic analyses on distinct isolates of the gut associate, AAB2, and report deep divergence and monophyly echoing trends from 16S rRNA analysis. Finally,

I show significant elevation of lysine exporters in the genomes of both strains and propose that L-lysine supplementation by AAB2 isolates may serve as a potential functional benefit for the host. I also uncover surprisingly high genomic divergence in gut associated LAB of C. chromaiodes, which otherwise does not show characterics of gut association. Cumulatively, these results illustrate dynamic patterns of genetic acquisition and erosion in response to persistent habitation of the gastrointestinal tract.

6

1. Deep divergence and rapid evolutionary rates in gut- associated Acetobacteraceae of ants

1.1 Introduction

Associations between gut bacteria and their hosts have had profound impacts on the evolutionary trajectories of both partners [1]. These interactions have facilitated the success of many animal groups by providing key benefits to the host, such as contributions to host nutrition in termites [30], immune defense in moths [31], kin recognition in Drosophila [32], and toxin production in the antlion [33]. While fitness benefits for bacterial mutualists remain less obvious, adaptation to the gut environment, in some cases of a particular host species, has significant impacts on the evolution of gut associates.

Among bacteria, inhabiting an animal gut can have a range of genomic consequences. These can include general adaptations that favor colonization of the gut environment and long-term signatures of coevolution with a specific host. In murine models, increased bacterial rates have been demonstrated to facilitate colonization of the gut niche by expediting adaptation [34]. Likewise, the genomes of acetic acid bacteria, a family of known insect symbionts [35], encode cytochrome bo3 ubiquinol oxidase, which may facilitate survival across both normoxic and micro-oxic conditions encountered in the gut ecosystem [36-38]. Genomic studies of bee associates have reported syntrophic networks among bacterial mutualists and the presence of many genes that may facilitate gut colonization and cell-cell interactions [23]. These 7

traits, as well as vertical transmission, may mediate host specificity [23]. Furthermore, genetic characteristics of persistent gut associates can include AT-biased nucleotide composition, long-term accelerated molecular evolution, and reduced ; these trajectories in certain gut associates are similar to, albeit less extreme than, patterns observed in many obligate intracellular symbionts of insects [4, 5, 10, 22].

In light of the patterns above, gut-associated bacteria of insects represent intriguing candidates to explore genomic consequences of symbiotic transitions. Insect gut microbiota are less diverse than those found in vertebrate species with adaptive immune systems [4, 30, 39-42]. Within certain insect hosts, both individual bacterial taxa and whole communities have been reported to maintain a high level of host fidelity, resulting in relationships of remarkable stability over evolutionary timescales [10, 22, 23,

39, 43-45]. Stable associations between insect hosts and extracellular gut associates have been described across several host orders and may offer valuable insights into genomic consequences of inhabiting the gut niche [5].

These genomic consequences are profoundly influenced by transmission mode of a given . Broadly, extracellular gut associates of insects fall into three distinct transmission modes, which is a major determinant shaping symbiont genome evolution.

These include environmental acquisition, social transmission, or specialized maternal transmission. Examples of each transmission type, respectively, include environmentally acquired Burkholderia symbionts of bean bugs that localize to cavities

8

along the midgut (environmental acquisition) [11]; hindgut microbiota of transmitted via trophallaxis and coprophagy (social transmission) [46]; midgut associates of stinkbugs transmitted via egg-smearing, symbiont capsules, or nutrient- rich jelly secretions (specialized maternal transmission) [10, 22].

Transmission mode may impact symbiont genome evolution for two main reasons. First, while all gut associates likely have some requirement to survive outside of their host niche [5], the portion of the lifecycle spent outside of hosts may vary with transmission mode. For instance, environmentally acquired symbionts may spend a considerable portion of their lifecycle outside of hosts; by contrast, strictly maternally transmitted microbiota are expected to spend nearly all of their lifecycle associated with hosts and thus become more specialized to the host niche, potentially leading to gene loss. Socially transmitted microbiota may comprise an intermediate point along this spectrum, showing moderate levels of specialization to the host niche, though tempered by interactions with other bacteria or infrequent host switching.

Second, distinct transmission modes are expected to generate varying levels of host-symbiont stability. Environmentally acquired microbiota are expected to show least stability, maternally transmitted microbiota the most, and socially transmitted microbiota an intermediate level. In agreement with this prediction, current studies suggest that host-symbiont phylogenetic congruence, while strongest under strict maternal transmission, is also significant under social transmission, suggesting that

9

sociality may promote vertical transmission [5, 44]. In this sense, socially transmitted microbiota may represent a transition between free-living existence and a stably inherited, host-reliant symbiotic lifestyle.

Recent studies of gut microbiota suggest that distinct transmission modes indeed affect trajectories of symbiotic evolution. Genome evolution under the most extreme symbiotic lifestyle of strict maternal transmission and obligate intracellularity provides a useful point of reference. It is well known that hallmarks of long-term intracellularity include severe genome reduction, accelerated substitution rates, and (often) AT biased nucleotide composition [4]; these traits are likely explained by a combination of relaxed selective constraint, accelerated mutation rates, and strong due to transmission-related bottlenecks in obligate endosymbionts [47]. Recent studies have demonstrated that extracellular, strictly maternally transmitted stinkbug symbionts show strikingly similar patterns of reductive genome evolution, suggesting that transmission mode -- rather than an intracellular existence per se -- is a major contributing factor [10, 22]. Such extreme reductive genome evolution may be constrained in most extracellular gut microbiota (albeit to varying degrees), due to metabolic requirements of free-living survival and occasional host-switching [5]. In agreement with that prediction, recent genomic studies have found that some gut microbiota show modest trends toward AT bias and genome reduction, especially among socially transmitted gut associates [5]. Additional studies of gut microbiota

10

across transmission modes are needed in order to understand how transmission affects bacterial lifecycles and resulting consequences on symbiont genome evolution.

In this study, we survey the bacterial communities associated with an abundant eusocial insect, the red carpenter ant, Camponotus chromaiodes. We characterize a novel group of Acetobacteraceae, or acetic acid bacteria (AAB), that dominate the C. chromaiodes gut community. We further demonstrate that these AAB occur in some additional Camponotus species screened here, and group phylogenetically with AABs detected in prior studies of other ants. The ant-AAB association represents a useful model for assessing the evolutionary consequences of persistent host association on gut microbes. AAB are known to colonize gut-associated niches across numerous insect species [35, 48, 49], putatively aiding in a range of faculties from metabolite digestion to protection from . AAB often thrive in the slightly acidic (pH range of 4.5-6.5), micro-oxic gut environment, where they establish tight associations with the gut epithelium, provision various polysaccharides contributing to biofilm formation, and help to structure the gut community by decreasing pH and excluding pathogens [35].

Through a broader phylogenetic analysis of these Camponotus AAB and other

AAB lineages, we present the first evidence of a novel, monophyletic, and deeply divergent clade of ant-associated gut microbiota positioned in the Acetobacteraceae. We found that the 16S rDNA gene of these ant-associated AAB shows accelerated nucleotide substitution rates, elevated AT content, and decreased predicted stability of the rRNA

11

secondary structure. These features resemble patterns found in coevolved, socially- transmitted gut associates of bumblebees [29]. Such patterns have not been found in other well studied ant associates, such as Opitutales taxa often associated with Cephalotes hosts [45, 50, 51], as we demonstrate here. These results suggest that AAB may form a persistent association with ants, with significant impacts on the evolutionary trajectory of these bacteria.

1.2 Results

1.2.1 The gaster microbiota of C. chromaiodes is largely dominated by two Acetobacteraceae (AAB) OTUs

To characterize the microbiota of Camponotus chromaiodes, sequencing of the V4 and V5 regions of 16S rDNA was performed using the Ion PGM instrument. We estimated the empirical error rate of PGM amplicon sequencing by including E. coli

DH10B in library construction and Ion PGM sequencing. Results supporting a low empirical error rate for length-filtered data are provided in Table 4 and Appendix A.

Amplicon sequencing provided a snapshot of bacterial groups associated with C. chromaiodes. Table 1 shows the number of reads and number of OTUs per sample, for those OTUs comprising > 1% of total reads. Samples included three replicates of 3-5 pooled minor workers, from each of six ant colonies (Table 5). For four of these colonies, we included the colony queen.

12

All pyrosequencing datasets were dominated by the intracellular bacteria

Blochmannia and Wolbachia, both known to associate with Camponotus [52, 53]. In a previous analysis of 16S rDNA amplicons, Blochmannia typically constituted 95-98% of reads, far outnumbering even Wolbachia (unpublished data). In this study, an initial step of digesting amplicons with a Blochmannia-specific restriction enzyme (PacI) reduced the abundance of Blochmannia reads substantially, to 7.5%. We suspect these remaining

Blochmannia sequences were due to incomplete PacI digestion of Blochmannia 16S rDNA.

Since the Wolbachia 16S rDNA sequence does not contain a PacI restriction site, it was not depleted by the digestion and remains the dominant OTU (92% of all reads). Neither of the intracellular symbionts are thought to be members of the extracellular gut microbiota: Wolbachia display weak tropism for the gut but do not inhabit the gut lumen

[54, 55]; Blochmannia are housed in specialized host cells intercalated among the gut epithelium and ovaries [17]..

Because the vast majority of sequence reads matched the endosymbionts

Blochmannia or Wolbachia, our sampling of other bacterial groups had low coverage

(median of 27 non-endosymbiont reads per sample; Table 1). Even so, several striking patterns emerged when analyzing the non-endosymbiont OTUs. At the level of ant colony, and were the only stable phyla, consistently present across one or more samples per colony. Accounting for most of the widespread distribution of Actinobacteria was a Nocardia sp. distributed across 35% of minor

13

workers and all four queen samples; the total representation of Nocardia amounted to

4.6% of non-endosymbiont reads. Within the Proteobacteria, the class

Alphaproteobacteria was dominant in most samples, comprising 75-100% of

Proteobacteria in nearly all samples (Table 1). This trend is largely explained by the high abundance of two distinct OTUs in the family Acetobacteraceae. At least one of these OTUs was detected in 20/22 (91%) of samples. They dominate the non- endosymbiont OTUs detected, comprising 79.8% of that bacterial community (AAB1:

52.8%, AAB2: 27%; Table 1).

14

Table 1: Abundance (read count) of Bacterial OTUs detected in Camponotus chromaiodes. Read counts are based on 16S amplicon sequencing. "-" indicates zero reads detected. Endosymbiont reads (Blochmannia and Wolbachia) have been removed from the read counts shown. Only OTUs comprising > 1% of total reads are shown. Each worker sample represents a pool of 3 - 5 surface-sterilized whole gasters. Colonies are demarcated by a dashed line. The total % contribution of each OTU (right-most column) reflects the total read count of that OTU, among the 1,557 non-Blochmannia, non-Wolbachia reads for all 22 ant samples, after excluding OTUs <1%.

15

1.2.2 AAB are localized to the Camponotus gut tract

Because we chose to use whole gasters (rather than dissected gut tracts) to characterize the gut microbiota via pyrosequencing, we assessed the efficacy of our sterilization protocol (#1) and another standard protocol for removing exogenous bacterial cells or DNA that could adhere to the cuticle (#2). We also assessed the relative bacterial load expected for unsterilized specimens. After extracting gDNA from dissected tergite pools (portions of the gaster cuticle), we screened for the presence of bacterial 16S rDNA via PCR with universal bacterial primers (9F and 1046R) [56].

Neither sterilization approach (#1 or #2; see Methods) yielded detectable amounts of gDNA via absorbance measurement at 260nm, though both samples yielded positive, albeit very weak, PCR amplification Conversely, unsterilized tergite segments yielded detectable levels of gDNA for both assays; detectable gDNA via absorbance at 260nm suggests much higher amounts of bacterial DNA in the unsterilized samples, suggesting that both sterilization techniques were effective in removing bacterial DNA. However, positive amplification of 16S rDNA from sterilized tergite samples suggests that the source of bacterial DNA was due to incomplete removal of bacterial cells or DNA during sterilization, or an artifact acquired during DNA extraction, as extraction kits are not sterile [57].

In order to determine the host tissue in which AAB reside, we performed PCR on various samples with AAB-specific primers. We found that the tergite pools described

16

above, regardless of sterility, did not yield detectable amplification of either AAB1 or

AAB2. This rules out the possibility that AAB1 and AAB2 detected here reflect bacterial

DNA on the cuticle or in the DNA extraction kit (as the kit was used to prepare tergite

DNA). Conversely, AAB1 and AAB2 were successfully amplified from dissected whole gut tracts (fore-, mid-, and hindgut combined) from individual worker ants, collected the same day and surface sterilized following approach #2. Sequences of resulting PCR products were identical to sequences detected through pyrosequencing. Thus, in our description of the C. chromaiodes microbiota below, we are confident that AAB1 and

AAB2 can occur in the host gut. However, the other OTUs detected through amplicon pyrosequencing (Table 1) could conceivably occur in the gut, elsewhere in the gaster, or on the cuticle. Three lines of evidence argue against these OTUs being contaminants of the DNA extraction kit: (i) their presence varied across biological samples (all of which were extracted with the same kit; (ii) nearly all OTUs belong to genera that are not among the bacterial groups that are known kit contaminants [57]; (iii) related OTUs in many samples have been detected in the gut of other ant species (see Discussion).

1.2.3 Strain-level variation occurs within AAB OTUs, and screens across host species and geographic regions.

We analyzed the within each AAB OTU at the V4-V5 regions from data generated by Ion PGM sequencing. Within the AAB2 OTU,, we detected two nucleotide polymorphisms in the V4-V5 of AAB2 Each had an overall frequency of

17

3.95% in the read dataset, far above for the frequency expected for sequencing errors

(average substitution error frequency of 0.34%; Table 4). Due to the relatively short length of the V4-V5 amplicon (~400 bp), we did not include the AAB2 variant in phylogenetic analyses of near full-length sequences. PCR-based screens of additional

Camponotus species from distinct geographic locations were conducted to assess the prevalence of the ant-AAB relationship. Our screens, confirmed via Sanger sequencing, demonstrated that AAB1 was present in C. pennsylvanicus and C. castaneus from the northeastern US (Massachusetts) and C. castaneus from North Carolina (Table 6). AAB2 was found only in C. chromaiodes from colonies that originated in North Carolina (Table

6).

For select samples in this cross-species screen, we amplified and (Sanger) sequenced a longer region of the 16S rDNA gene, using a combination of OTU-specific and general bacterial primers. AAB1 from one isolate of Camponotus castaneus (854.4;

Table 6) showed two polymorphisms in the V2 region, when compared to the AAB1 sequence from C. chromaiodes. One polymorphism reflected a clear nucleotide transition, and the second reflected a mixed base signal, indicating a G/A sequence polymorphism within the C. castaneus individual sampled. At this position, chromatograms from both sequencing orientations showed a greater fluorescent intensity for G, as compared to A

(the consensus base in the C. chromaiodes sequence). This AAB1 variant from C. castaneus is included in the phylogeny presented (Figure 1).

18

1.2.4 Camponotus-associated AABs join other ant associates in a deep, well-supported clade

Phylogenetic analyses of 16S rDNA sequences showed that AAB1 and AAB2 group together, along with other Acetobacteraceae OTUs sampled from ant hosts or ant nests. This group of ant-associated Acetobacteraceae form a deeply divergent, monophyletic clade that is well supported based on Bayesian analysis (1.0 posterior probability; Figure 1) and maximum likelihood analysis (100% bootstrap support, Figure

15 ).

Based on current sequence data available from the Ribosomal Database Project

(RDP), [58] this clade apparently lacks any non-ant associates. That is, among the RDP

16S rRNA database (>3 million sequences), including environmental isolates and plant and animal associates, we did not detect any non-ant associated sequences that were putative congeneric bacteria (>95% similar) with the ant AAB. By comparing near full length 16S rDNA representative sequences to the entire RDP database, we found that

Camponotus gut isolates were the only sequences with an aligned similarity score greater than 92.6% to AAB1, or greater than 94.4% for AAB1. For the Attine AAB lineage, the closest non-ant relative had an aligned similarity score of 92%. To determine if any sequences in RDP would conflict with the monophyly of the ant-associated AAB clade, we chose 60 sequences that showed the highest sequence similarity to AAB1, AAB2, and/or the Attine-associated AAB via SeqMatch, as well as select sequences that we found to be close relatives in the Bayesian and ML analyses, here. We subjected these

19

sequences to alignment and simple distance-based tree reconstruction, within RDP.

Only ant-associated bacteria grouped with each of the three sequences above. The RDP tree supported the grouping of AAB1 and AAB2 together. While the analysis could not resolve the position of Attine AAB, the RDP results do not conflict with the monophyly of the three lineages demonstrated by the more extensive Bayesian and ML analyses in this study.

As expected of a single-gene phylogeny, some relationships in the Bayesian and

ML analyses were not well resolved. For instance, while this ant-associated AAB clade was extremely well supported, the identity of its sister clade was not well resolved.

Albeit with weak support (posterior probability of 0.61), Bayesian analysis suggests the sister group includes Alpha-2.2 symbionts described across various bee taxa [59], as well as other bee associates (Figure 1). ML analysis was unable to resolve this and other relationships (Figure 15).

To ensure that we could accurately assess relative substitution rates (below), we used an alignment of fewer taxa to generate a robust maximum likelihood tree, as a guide for substitution rate analysis (Figure 2). Phylogenetic reconstruction was based on an alignment of 1,333 unambiguous nucleotide positions of 16S rDNA. Most nodes were well supported, and all relationships were identical between maximum likelihood

(Figure 2) and Bayesian (Figure 16) approaches.

20

Figure 1: Bayesian phylogeny of Acetobacteraceae 16S rDNA. Bayesian inference was based on the GTR + Γ + PInv model of nucleotide substitution and analyses run for 10,000,000 generations. Values at nodes reflect posterior probabilities after a 25% burn-in. Taxa are colored based on lifestyle and host group: monophyletic ant-specific AAB clade (green); gut associates of various insects (orange); free-living taxa (black). The host name is listed after the taxon ID. 16S rDNA sequences of AAB1 from C. chromaiodes and C. castaneus (both included in the tree shown) differ by two nucleotides.Sequences of AAB1 from C. chromaiodes and C. pennsylvancus were identical over the region sequenced (E. coli coordinates 828-1046). Roseomonas terrae is the outgroup.

21

Table 2: Comparison of molecular clock models and relative substitution rates of 16S rDNA. All likelihood values were calculated using PAML4. The tree shown in Figure 2 was used to calculate relative rates and the likelihood scores for the global clock model and both local clock models. The null hypothesis of each comparison was a global clock (uniform rates) model. Clade IDs correspond to those listed in Figure 2. All rates are relative to the rest of the tree, without the inclusion of the outgroup taxon, Roseomonas terrae. Abbreviations: DF, degrees of freedom; LR, likelihood ratio.

1.2.5 The ant-associated AAB clade evolves several-times faster than other Acetobacteraceae

We tested several molecular clock scenarios to determine the likelihood of distinct rates across this phylogeny (Table 2). Allowing a distinct rate for the ant- associated AAB clade (Clade B in Figure 2, Table 2) resulted in a significant improvement in likelihood score, when compared the null hypothesis of a global clock

(uniform rate) (p<10-30). The same analysis also estimated a 7.6-fold rate increase in

Clade B, compared to the single rate inferred for all other lineages. This clear result argues for a statistically significant, several-fold rate increase in the ant-associated AAB clade, compared to a single rate calculated for all other lineages.

To explore rate variation within the ant-associated AAB clade, we compared an additional multi-rate model to the null of a global clock. In this case, the local clock model defined three distinct rates: a distinct rate for both of Clades A and B (i.e., all

22

lineages within Clade B that are not part of subclade A), and a third rate for all other lineages (Figure 2, Table 2). This three-rate model was also significantly better than the uniform rate, and supports rate acceleration throughout the ant-associated clade.

Specifically, under this model, Clade A was found to have a 8.9-fold rate increase, and

(the remainder of) Clade B a 6.1-fold increase. Despite increased resolution, this model is only marginally more likely than the simpler, two-rate model. Because within-clade rate heterogeneity can bias inferences of local rate estimation [60], we cannot say whether it is significantly better than the simpler two-rate model.

1.2.6 Camponotus AABs show increased AT content and reduced RNA stability, comparable to known socially transmitted bacteria

Normalized estimates of thermodynamic stability (Figure 3; Table 7) suggested that 16S ribosomal rRNA of both AAB OTUs were less stable than all free-living microbes analyzed. The predicted stability of AAB1 and AAB2 is comparable to extracellular gut associates of other social insects, such as Snodgrassella alvi and

Gilliamella apicola of bees (Figure 3; Table 7). These and other extracellular gut associates had higher stability than the intracellular or strictly maternally transmitted bacteria analyzed here. These values are based on the region of 16S rDNA corresponding to E. coli coordinates 16-1477. Notably, predicted stability of 16S rRNA molecules appears to be negatively correlated with AU content (Figure 3).

23

Figure 2: Maximum likelihood phylogeny of Acetobacteraceae 16S rDNA Alignment of 1,333 unambiguous nucleotide positions was performed with SSU- align. The tree was reconstructed using a maximum likelihood approximation with a GTR + Γ + PInv model of nucleotide substitution. Node support was generated from 1,000 bootstrap resamplings. Taxa are colored based on lifestyle and host group: monophyletic ant-specific AAB clade (green); gut symbionts of various insects (orange); free living taxa (black). Roseomonas terrae is the outgroup. Letters at select nodes correspond to nodes used for relative rate analysis, as listed in Table 2.

24

Figure 3: Relationship between AT content and normalized predicted free energy of 16S rRNA, showing the grouping of sequences by bacterial transmission mode. Free energy estimation was calculated using the Turner 2004 model. Analysis was based on the region corresponding to E. coli coordinates 16-1477. Colors indicate bacterial lifestyle. Among extracellular gut associates, shapes indicate transmission mode (when known). Note that bacterial associates group by transmission mode, rather than lifestyle.

25

1.3 Discussion

Several lines of evidence support the hypothesis that Acetobacteraceae is a frequent gut associate in Camponotus species and perhaps other ant groups. First, 16S rDNA amplicon sequencing of local C. chromaiodes microbiota showed that at least one of two Acetobacteraceae OTUs (AAB1 or AAB2) occurred in most host samples analyzed and were present in both workers and mated queens. Second, although some of our samples were based on whole ant gasters, we confirmed through analysis of dissected samples that AAB1 and AAB2 indeed occur in the gut tract of C. chromaiodes. Third, using AAB-specific PCR primers, we detected AAB1 in additional Camponotus species and regions: C. castaneus from North Carolina, and C. castaneus and C. pennsylvanicus from Massachusetts. Fourth, our phylogenetic analysis documented that AAB1 and

AAB2 belong to a deeply diverging, monophyletic clade that includes associates of diverse ant species (detailed below) and, based on current data, is restricted to ant associates (Figure 1; Clade B in Figure 2). This phylogenetic pattern suggests that the clade may be specialized for association with ants. Fifth, at the 16S rDNA gene, molecular evolution of the ant-associated AAB clade exhibited trends that resemble stably inherited symbionts: significantly accelerated rates of evolution, AT-biased sequence composition, and (perhaps a result of AT bias) destabilization of predicted rRNA stability. Combined, these data suggest the association of this AAB clade with ants may be specialized and quite old.

26

1.3.1 Monophyly of ant-associated AAB.

In more depth, the ant-associated AAB clade contains two lineages that include

OTUs (AAB1 and AAB2) that we amplified from C. chromaiodes and (in the case of

AAB1) from C. castaneus and C. pennsylvanicus samples. The detection of these two lineages in association with ants is not new. Previous studies have identified ant associates that are closely related to AAB1, including a gut associate of C. japonicus [61] and a bacterial associate of Formica exsecta detected in a transcriptome analysis [26].

Previous work has also detected bacteria that are closely related to AAB2, including a dominant taxon in the crop of lab-reared Camponotus fragilis [25], a gut associate of C. japonicus [GenBank: KM974908.1], and another F. exsecta associate [26]. In addition, unpublished sequences from gut tissue of adult Cataglyphis bicolor [GenBank:KF419351] and Cataglyphis fortis [GenBank:KF419347] group robustly (posterior probability =1.0, unpublished data) with AAB2. These two sequences were not included in the final trees presented here, due to their relatively short sequence lengths. In sum, while several studies have documented AABs in association with ants, often verifying their location in the gut tract, to our knowledge this is the first demonstration that various ant associated AABs form a monophyletic group and may represent a specific association with ants. In light of these diverse associations with various ant groups, and the known role of Acetobacteraceae as gut symbionts in other insects [12, 35, 49], we propose that

27

these two AAB lineages may be beneficial gut-associated symbionts of Camponotus, and potentially of the subfamily more generally.

Within the ant-specific AAB clade, a third, basal lineage consists of bacteria associated with Attine ants. These samples include bacteria found within microbial assemblages of refuse dumps of Panamanian attine ants [62], and a related OTU from an unidentified source of Brazilian Atta laevigata

[GenBank: KF248847]. In addition, this group includes an associate of the ant

Megalomyrmex staudingeri, a social parasite of several Attine genera [63] (posterior probability = 1.0, unpublished data). (Sequence was not included in the tree presented, due to relatively short length.) While further sampling is needed, the intriguing position of this third group as a basal lineage to known Formicinae associates above (AAB1, AAB2, and relatives) raises the possibility that an association between this clade and ants is ancient.

Detection of a related AAB in attine refuse dumps is consistent with this bacteria being part of the gut microbiota, as dumps contain the bodies of dead workers [64] and are likely to reflect a mixed composition of microbial communities from various sources.

While the context of the Atta-AAB association is unclear at this time, our combined analyses of AAB1 and AAB2 are consistent with trends observed in persistent gut associates rather than incidentally acquired from the environment, e.g., via forage fed

28

upon by ants. Moreover, AAB1 or AAB2 have not been detected in previous screens of ant gardens or soil chambers [65]. Further arguing against incidental environmental acquisition, we found that among the numerous 16S rRNA sequences available in the

RDP database, including those most similar to the ant-associated AAB, no non-ant associated bacteria disrupt the monophyly of the clade described above. If members of this AAB group exists among environmental isolates, in association with non-ant animals, or in association with plants, they have not yet been discovered to our knowledge.

As expected of a single-gene phylogeny, many relationships in the

Acetobacteraceae tree were difficult to resolve, including the sister lineage to the ant- specific clade. Bayesian inference supports (albeit weakly) that the sister lineage includes the Alpha-2.2 bee associates [12, 59], as well as associates of other (the carpenter bee, Xylocopa californica, and the ant Formica occulta). Certainly, further phylogenetic analyses that include additional genes are needed in order to better resolve this relationship. If evidence for a close relationship between bee and ant associates is verified, it is possible that this bacterial lineage shares pre-adaptive traits fostering colonization of hymenopteran guts [37].

Interestingly, associations between Acetobacteraceae and ants apparently have occurred multiple times independently. That is, the ant-specific AAB clade detailed above does not contain all AAB detected in ants. A comprehensive survey of ant

29

microbiota [66] identified Asaia bogorensis and an unspecified OTU that were associated with Tetraponera and Formica ants, respectively (Figure 1). Whether these associates represent transient interactions or stable interactions is unclear. Phylogenetic independence among bacterial associations is not limited to ants. Consistent interactions between Drosophila and phylogenetically distinct gut bacteria within the

Acetobacteraceae (Acetobacter pomorum and Gluconobacter morbifer) [67] have been implicated in specific interactions with host health and disease states [49]. Interestingly, both of these associates occur on relatively long branches, but show a predicted rRNA stability comparable to that of E. coli, and the rates of molecular evolution at 16S rDNA is similar to that of free living bacteria (Figure 3, Table 7). In the case of these and other taxa on long branches, additional sampling of related OTUs will help to elucidate whether the long branches are due to shifts in patterns of molecular evolution or an artifact of incomplete sampling.

1.3.2 Rapid evolution and destabilization at 16S rDNA.

In light of the above evidence that the ant-specific AAB clade may represent a long-term association with ants, we tested whether patterns of molecular evolution at the 16S rDNA gene resemble those found in stably inherited bacterial symbionts. We found that patterns of sequence evolution in this group resemble known socially or maternally transmitted gut associates, moreso than environmentally acquired associates

30

or free-living bacteria. These patterns include accelerated rates of substitution, AT bias, and reduced stability of 16S rRNA secondary structure, detailed below. The magnitude of rate acceleration surprisingly resembled those detected for some intracellular symbionts.

Significant acceleration in rates of sequence evolution is well established among the most extremely stable symbionts: long-term, maternally transmitted, intracellular mutualists of insects [4]. Numerical estimates of the extent of this acceleration may depend on the gene analyzed and taxa compared, but studies consistently support several-fold rate increases. For example, estimates of the absolute fold-change in substitution rate at 16S rDNA include a ~1.5 to 2.6-fold acceleration in [68], and up to a 15 to 29-fold acceleration in Blochmannia [14]. Likewise, a significant, ~6.5-fold rate acceleration was detected in gut bacteria of stinkbugs, an extracellular associate that shows extremely high fidelity of maternal transmission and patterns of reductive genome evolution that resemble those of long-term intracellular associates [10]. This observed rate acceleration may reflect a combination of processes that differentially affect stably transmitted symbionts, such as relaxed selection, increased mutation rate

(e.g., if DNA repair genes are lost), and reduced efficacy of purifying selection due to a reduction in Ne. These evolutionary mechanisms have been distinguished in some symbiont groups through analyses of several genes [47]; however, they are impossible to distinguish based on 16S rDNA data alone.

31

Our results indicate that the clade containing AAB1 and AAB2 may experience an acceleration of sequence evolution that is comparable to the examples above. At 16S rDNA, the ant-specific clade is evolving at a significantly faster rate than other

Acetobacteraceae included in the analysis. That is, a model allowing a faster rate for this clade represents a significant improvement over the null model of uniform rates across the tree. While the fold-increase in rate depends on the taxa included, we estimate that

16S rDNA gene is experiencing ~7.6-fold acceleration in substitution rates, as compared to free living and other insect-associated AAB included (Figure 2; Table 2).

In addition, AT bias characterizes most long-term intracellular associates, and more recently has been documented in extracellular gut associates with maternal and social transmission [10, 69]. Like rate acceleration and often coupled with it, increased

AT bias may reflect a combination of increased mutational pressure, or a reduction in the strength or efficacy of purifying selection against GC->AT changes. We detected atypically high AU content of 16S rRNA for AAB1 and AAB2 (~47% AU), compared to other host-associated and environmental bacteria (Figure 3; Table 7).

Similar to reports on obligate endosymbionts of insects [70], we found a notable decrease in 16S rRNA stability for AAB1 and AAB2 (Figure 3). Normalized folding free energy of 16S rRNA (spanning E. coli coordinates 14-1445) is ~9-11% lower in both AAB

OTUs than those of free-living AAB. By contrast, across the same region, various species of the obligate endosymbiont, Blochmannia, are ~12-19% less stable than E. coli. Unlike

32

previous studies [70], we did not distinguish whether base substitutions are phylogenetically independent; nor can we rule out the possibility that the reduced stability simply reflects the observed increase in AU content of this molecule.

Interestingly, the stability difference between AAB1 and AAB2 is large (~6% less stable in AAB1), despite a modest shift in AT content (0.1%). This shift in stability across similar AT content suggests factors other than base composition contribute to destabilization. Comparative genomic data could shed light on the mechanisms driving the overall destabilization of AAB1 and AAB2, and the notable difference between them.

The AT content of 16S rDNA from Attine-associated AAB was slightly elevated, but its structural stability was comparable to free-living bacteria or environmentally acquired symbionts.

Although evidence based on one gene is limited, this combined pattern of rate acceleration, increased AT content, and reduced rRNA stability in ant-specific AAB1 and

AAB2 suggests that their molecular evolution may resemble known socially and maternally transmitted symbionts. A similar, though more extreme trajectory typifies symbionts with strict maternal transmission, including most obligate endosymbionts.

These parallels suggest that socially transmitted bacteria may experience some of the evolutionary forces that shape long-term endosymbionts, albeit to a lesser degree. Such forces may include intensified mutation rates via relaxed selection (though constrained by the requirement for survival outside of hosts [5]) and possibly fixation of deleterious

33

changes by genetic drift (though mitigated by host-switching and potential recombination with other members of the microbiota [23]). Further studies will shed light on the extent to which transmission mode contributes to this evolutionary trajectory and the underlying mechanisms involved.

Within each AAB OTU, we detected polymorphisms at 16S rDNA. Variation within a gut associated OTU has been demonstrated in extracellular gut associates of bees [71], and may predict strain-level variation in gene content and ecological role.

Specifically, within AAB2, we detected two polymorphic alleles in the V4-V5 region.

Likewise, within AAB1, we detected polymorphisms across host species, although our limited sample does not allow us to assess host specificity of these variants. Further studies are needed to explore whether this strain-level variation arises from niche specialization or reflects neutral changes, e.g., under an elevated substitution rate.

1.3.3 Potential transmission routes and caste variation.

Although clarifying the transmission routes of AAB1 and AAB2 will require experimental data, the patterns above suggest the possibility of social transmission via nutritional or other interactions among ant nestmates. As a eusocial host, intimate contact within colonies creates opportunities for transfer of gut microbiota among kin

(e.g., between queen and her offspring, and among workers) via trophallaxis. Such nutritional exchanges may transfer gut microbes and promote stability of the gut

34

community [44, 72, 73]. Anal trophyllaxis (i.e. feeding on anal secretions) likely fosters microbiota transmission in the few ant groups that perform this behavior (e.g.,

Cephalotes) [45, 66] and may contribute to the phylogenetic stability of the microbiota in

Cephalotes [45]. In most ants, including the tribe Camponotini trophallaxis is mouth-to- mouth (stomodeal trophyllaxis), involving the regurgitation of crop contents. This social feeding transmits important nutritional resources, and might also transmit key gut microbes.

On a community scale, the microbiota of C. chromaiodes had low taxonomic diversity, and many taxa occurred sporadically. However, some taxa were identified in several samples, and relatives have been documented in additional ant species. The two AABs were, by far, the most abundant within most samples and widespread across samples. In addition, a Nocardia sp. was detected repeatedly, within and across host colonies. Interestingly, two OTUs with 100% and 98% sequence identity to the Nocardia sp. have also been detected in the guts of C. japonicus [74] and the camponotine Polyrhachis robsoni [75], respectively, though the broader prevalence of this relationship is unknown.

Microbial communities from colony queens were similar to those of workers; with one exception (799.Q; Table 1), queens possessed one AAB OTU or the other, but not both. The AAB OTU detected in the colony queen was not necessarily the

35

numerically dominant AAB in workers from the same colony. The Nocardia OTU described above was also consistently recovered across colony queens. The processes generating differences between queen and worker microbiota are uncertain, but may include transfer of gut bacteria among workers from different nests, distinct ecological roles of queens and workers within the colony, and shifts in the gut community as queens age. The colonies samples were large, well-established nests, meaning that resident queens could be several years old.

Exploration of transmission routes and caste variation will benefit from a deeper sampling of the gut community. While our efforts to deplete the abundance of 16S rDNA amplicons from Blochmannia were successful, Wolbachia contributed significantly to our sequencing effort. In the context of gut community surveys, the alimentary tract is relatively large among the Formicidae, and likely amenable to microdissection of luminal contents for targeted characterization of those communities, as has been performed for the characterization of termite luminal microbiota [30]. A combined approach utilizing these and other [76] methods will facilitate deeper exploration of gut communities among insects that also house abundant endosymbionts near the gut.

In sum, insect hosts continue to offer fertile ground to explore evolutionary processes shaping host-microbe interactions, including gut-associated symbioses.

Exploration of diverse host systems will help to uncover fundamental principles shaping these relationships, including the influence of transmission mode on symbiont

36

evolution. While our results point to distinct patterns of sequence evolution in the ant- specific Acetobacteraceae clade, additional studies are needed to clarify the evolutionary mechanisms driving these observed patterns and to explore potential genome-wide impacts. In addition, experimental work is required to test the hypothesis of social transmission and to understand the potential functions and host fitness effects of this bacterial group. The data presented here suggest that Acetobacteraceae symbionts represent abundant and dominant members of the gut community of Camponotus chromaiodes. The two most abundant members, AAB1 and AAB2, belong to a novel and deeply divergent, monophyletic clade of ant-associated bacteria. Relatives of these taxa were recovered from two additional Camponotus species, and are known from prior studies to occur in additional ant groups. Members of this ant-specific AAB clade display patterns of molecular evolution at the 16S rDNA gene that resemble patterns found in socially transmitted gut symbionts of bees and, in a more extreme form, cospeciating, maternally transmitted gut associates of stinkbugs. Specifically, these features include elevated AT content, a ~7.6-fold increase in substitution rates at 16S rDNA, and a notable decrease in the predicted stability of 16S rRNA. Cumulatively, these results are consistent with the hypothesis that this AAB clade may represent a long-term gut associate of ants.

37

1.4 Methods

1.4.1 Ant collection

Ants were collected for two purposes: to characterize the microbiota of local C. chromaiodes colonies, and to screen Camponotus of various species and geographic regions for AABs detected. For the microbiota characterization, we sampled several minor workers and, when available, the mated colony queen from each of six local C. chromaiodes colonies. All specimens were immediately placed in 100% ethanol and transported to Duke University upon collection. Collection information is available in

Table 5. For the targeted AAB screen across Camponotus species and geographic regions, we used ants that were collected from 2008-15. This collection information is available in Table 6. For all samples, vouchers are available upon request.

1.4.2 Sample preparation for microbiota characterization: surface sterilization, genomic DNA (gDNA) extraction, and Blochmannia depletion

Microbiota characterization of local C. chromaiodes was based on surface- sterilized gasters. All samples were surface-sterilized in 100% ethanol and washed 4-6 times in sterile 50mL vials filled with DEPC water. Surface sterility was confirmed by negative PCR on the final wash bath. We call this protocol surface sterilization approach #1.

38

gDNA was extracted from 3-5 pooled, surface-sterilized worker gasters per sample, or the single gaster of the colony queen, using the Qiagen (Hilden, Germany)

DNEasy kit according to a standard protocol (Table 5). We chose to use whole, surface- sterilized gaster samples rather than dissected gut tissue, as dissection may increase contamination. Because surface-sterilized gasters were used, we cannot conclude that all bacterial taxa detected were gut-associated (our later use of dissected gut samples, described below, confirmed that the AAB detected in this characterization resided in the gut).

Blochmannia, the long-term intracellular endosymbiont of Camponotus and related genera, far outnumbers extracellular gut associates. Thus, in order to deplete

Blochmannia 16S rDNA abundance, 500ng of gDNA per sample were treated with NEB

(Ipswich, Massachusetts, USA) PacI enzyme according to the manufacturer’s protocol.

The Blochmannia chromaiodes 16S rDNA gene contains a unique PacI site from coordinates 124-131. The site is rare in other bacteria, occurring in just 0.38%

(sporadically distributed) of sequences in the RDP 16S database. This rarity makes it extremely unlikely that 16S rDNA of other gut-associated taxa would contain this restriction site. To ensure the specificity of our approach, we amplified 16S rDNA with universal Bacterial primers from undigested ant gDNA, digested the resulting PCR product with PacI, and visualized the digested products on an Agilent Bioanalyzer. The products of digestion included two bands corresponding to the expected fragment sizes

39

for Blochmannia. We then gel-purified and sequenced the digested amplicon using an

ABI 3730xl (Life Technologies, Carlsbad, CA, USA). The sequences of these digested fragments showed clean Blochmannia sequences, indicating that 16S rDNA of other bacterial species were not digested by PacI. In order to reduce the contribution of

Blochmannia to 16S rDNA amplicon pools, PacI-digested gDNA was used as template in the initial PCR for library preparation, below.

1.4.3 16S rDNA amplicon pyrosequencing using Ion PGM

Selection of region: We targeted the V4-V5 region of 16S rDNA (E. coli coordinates

515-926) to maximize the strengths of Ion PGM (Life Technologies, Carlsbad, CA, USA) sequencing. First, the V4-V5 regions are flanked by conserved sequences that have been validated as universal PCR primer sites across Bacteria [77, 78] (5’ primer coordinates

515 and 926; Table 8). Second, the length of this region is well suited for the Sequencing

400 kit (410bp empirical modal read length). Third, by sequencing both the V4 and V5 hypervariable regions, we were able to achieve robust taxonomic resolution at the genus level.

Initial PCR reaction: In order to minimize barcode-induced amplification bias during library preparation, we performed a two-step PCR approach [79]. From each

PacI-digested gDNA sample, we first amplified a ~918 bp region of 16S rDNA that spans the PacI restriction site in Blochmannia. PCR reactions were performed in triplicate and

40

contained 5.05uL nuclease-free water, 1uL 10x High Fidelity buffer (Life Technologies),

0.25uL 10mM dNTP mix, 0.5uL 50mM MgCl2, 0.8uL forward primer 9F, 0.8uL reverse primer mix 926R, 0.1uL Platinum Taq DNA Polymerase High Fidelity (Life

Technologies), and 1.5uL of PacI-digested gDNA. Reactions were held at 95°C for 2 min to denature the DNA, with 25 cycles of amplification proceeding at 95°C for 20s, 50°C for 30s, and 72°C for 30s, we added a final extension phase at 72°C for 5 min, then cooled at 4°C. Resulting amplicons were then pooled, cleaned using the Qiagen PCR

Purification kit following the manufacturer’s instructions, and digested with PacI following the same protocol as above (for gDNA), to further deplete any undigested 16S copies from Blochmannia. All samples were cleaned and run on a 1% E-gel (Life

Technologies) for 13 minutes. The target (undigested) bands were excised, cleaned, and used as template for Ion PGM libraries below.

Ion PGM library construction and sequencing: 10-cycle PCRs were set up similarly to above but using barcoded fusion primers to amplify the V4 and V5 hypervariable regions from the 25 cycle amplicons described above. All primers used are detailed in

Table 8. All libraries were quantified using a Qubit 2.0 Fluorometer (Life Technologies), then pooled in equimolar concentrations before being sequenced with an Ion 314 Chip

Kit v2, using the Sequencing 400 Kit on the Ion PGM. Sequencing was performed by the

Genome Sequencing and Analysis Core Resource at Duke University according to the manufacturer’s protocol for Ion Amplicon Sequencing. This approach generated 658,420

41

raw reads, of which 337,364 passed our quality and length filters and were used in downstream analyses.

1.4.4 Microbial community analysis

Raw sequencing reads were processed using QIIME 1.8 [80] and UPARSE [81].

All reads were trimmed of sequencing adaptors, barcodes, and forward and reverse primers. Based on our estimates of empirical error rates from an E. coli DH10B control library, found that length-based thresholds (universal trimming of reads to 360 bp, and discarding any reads <360 bp) followed by run-specific expected error filtering [81]) proved to be the most effective way to reduce sequencing errors (Table 4). Expected error filtering parameters were based upon retaining as many reads as possible while still successfully calling a single OTU for the E. coli control dataset. The UPARSE pipeline [81] was used to group operational taxonomic units (OTUs) at a 97% similarity threshold and to remove chimeric sequences. was assigned to UPARSE- identified OTUs using the RDP Classifier against Greengenes database 13.8 [58, 82].

Relative abundance estimates were generated during data analysis with QIIME.

1.4.5 Within-OTU polymorphism

To calculate the genetic diversity within an OTU, we used an approach similar to that for error calculation (Table 4). As there is no reference sequence for either OTU, we

42

used the representative sequence for that OTU, which was the most abundant sequence.

Using the OTU representative sequence as the reference, we aligned all Ion PGM reads to the reference sequence and performed variant calculations as described above.

Because indel errors rates can vary widely with pyrosequencing data, we were only able to assess polymorphism frequency for nucleotide substitutions. We characterized a polymorphism as genuine only if its frequency deviated from the empirical substitution error rate by an order of magnitude. Thus, our estimates of OTU heterogeneity are likely conservative.

1.4.6 Assessment of surface sterilization techniques

To assess the efficacy of the surface sterilization of ant gasters, we sterilized pooled samples of tergites and analyzed samples by PCR to test exogenous bacterial cells or bacterial DNA that could adhere to the cuticle. From each of five minor workers, two to three tergites were dissected under microscope and pooled. We generated three such samples of 13-15 tergites each. Each pool was processed exactly as the experimental samples, but then treated with one of two sterilization regimes. Under sterilization approach #1 (identical to that used above to prepare samples for Ion Torrent sequencing), tergites were vortexed and incubated for >5 minutes in 100% ethanol, then washed in sterile water until the final wash bath did not yield a positive signal from

PCR with universal bacterial primers (as above). Under approach #2, tergites were

43

dipped into 95% ethanol, soaked for one min in 5% bleach, and then rinsed in sterilized water. We also included a sterilization-free sample for comparison. This sample was simply rinsed with sterile PBS. After treating each sample accordingly, we performed gDNA extractions on the tergites exactly as we did for the experimental samples. We quantified the amount of resulting gDNA using a Nanodrop ND-1000 (Thermo Fisher

Scientific, MA, USA). To screen for the presence of bacterial DNA specifically, we performed PCR on gDNA extractions using universal Bacterial primers. These PCR reactions contained 4.2uL nuclease-free water, 4uL 5PRIME HotMaster mix (Hilden,

Germany), 0.5uL 25mM MgCl2, 0.2uL forward primer 9F, 0.2uL reverse primer 1046R, and 1uL of gDNA extraction product. Reactions were held at 94°C for 2 min to denature the DNA, with 30 cycles of amplification proceeding at 94°C for 20s, 50°C for 30s, and

72°C for 30s, we added a final extension phase at 72°C for 5 min, then cooled at 4°C.

Resulting amplicons were visualized on a 1% agarose gel.

1.4.7 Localization of AABs to the C. chromaiodes gut using diagnostic PCR

We developed a specific PCR screen for the presence of each AAB OTU that was detected through the microbiota characterization above. For this screen, we designed primers specific to each AAB OTU (AAB1 and AAB2). The 5’ primer coordinate for

AAB1 corresponds to E. coli 16S position 828 (5’ TGGATGTCGGAGATTATGTCTTC 3’), which falls within hypervariable region 5. The corresponding 5’ coordinate for the AAB2

44

probe lies at position 628 (5’ GAAACTGCATTCAAGACGTGTAG 3’) and falls within hypervariable region 4. PCR reactions contained 10uL nuclease-free water, 10uL

5PRIME HotMaster mix (Hilden, Germany), 1uL DMSO, 1uL 25mM MgCl2, 0.5uL of the

AAB-specific forward primer, 0.5uL reverse primer 1046R, and 2uL of gDNA extraction product. Reaction conditions were as described above, but with a 54’C annealing temperature for 828F and 58’C for 628F.

We used this PCR screen to verify that AABs of local C. chromaiodes samples are located in the gut tract. After applying surface sterilization approach #2 (below) to whole ants, we dissected the entire gut tract (fore-, mid-, and hindgut tissue combined) of minor workers from multiple local colonies of C. chromaiodes under sterile conditions, extracted gDNA using the manufacturer’s protocol, and screened those samples as well as the tergite samples (described above) for the presence of each AAB

OTU.

1.4.8 Surface sterilization, dissection, and gDNA extraction for PCR screen of AABs

Building on the microbiota characterization above, we screened dissected guts of additional Camponotus samples. Ants were surface sterilized by rinsing in 95% ethanol, soaked for one minute in 5% bleach, and given a final rinse in sterilized water (approach

#2 above). The entire gut system was dissected out under sterile dissection conditions and gDNA extracted from the dissected gut sample.

45

1.4.9 Screening of additional Camponotus species and geographic regions

We used the described AAB primers to screen Camponotus ants of different species and various locations in the eastern and southwestern United States (Table 6).

Samples were of two types: vouchers that were preserved in 70% EtOH for up to seven years, and newly collected samples. For preserved voucher specimen, we used whole gasters that were surface sterilized under approach #2 (ethanol, bleach, and water rinses). For newly collected samples, within the same day of field collection, we surface sterilized whole ants using approach #2 and dissected the entire gut tract under sterile conditions. For both sample types, only one ant was used per sample. gDNA was prepared using the Qiagen (Hilden, Germany) DNEasy kit. PCR was performed as described above for the AAB-specific primers. All samples that yielded positive amplification with the specific primers described above were sequenced in both directions on an ABI 3730 xl DNA Analyzer (Thermo Fisher Scientific, MA, USA) by the

Duke Genomic and Computational Biology Shared Resource. Sequence assembly was performed with Phrap [83]. Called bases were only accepted if they had a Q score > 20

(99.9% correct base call).

1.4.10 Phylogenetic analysis of AAB 16S rDNA sequences

Generation of near full length 16S rDNA sequences: We generated near full length

16S rDNA sequences (spanning E. coli coordinates 9-1507) for each AAB OTU using a

46

combination of AAB-specific primers (above) and flanking universal primers. For AAB1,

16S sequences were generated for Camponotus chromaiodes (ID:793.2) and Camponotus castaneus (ID:854). For AAB2, 16S sequences were generated from Camponotus chromaiodes (799.2). To selectively amplify AAB1, we used forward primer 16S 828F (5’

TGGATGTCGGAGATTATGTCTTC ‘3) and universal reverse primer 16S 1507R (5’

TACCTTGTTACGACTTCACCCCAG ‘3). To amplify AAB2, we used 16S 628F (5’

GAAACTGCATTCAAGACGTGTAG ‘3) and the same reverse primer, 16S 1507R. In order to generate the remainder of the 16S rDNA sequence, we used universal forward primer 16S 9F (5’ GAGTTTGATCCTGGCTCA ‘3) and the reverse complement of the specific primers listed above. All products were sequenced on an ABI 3730xl (Life

Technologies, Carlsbad, CA, USA).

Alignments: Bacterial 16S sequences were chosen from genera within the

Acetobacteraceae that have host-associated species, as annotated by RDP [58]. Sequence alignment was performed using SSU-ALIGN [84], a secondary structure-aware program based on the INFERNAL aligner [85]. Prior to phylogenetic analysis, all sequences were trimmed to a uniform length and regions were masked if the posterior probability of the nucleotide alignment fell below 0.95. Additionally, we masked all unaligned columns and removed any columns where more than 50% of the sequences contained gaps. The final alignment spanned 1,248 positions of 16S rDNA across 53 taxa. This alignment was used to infer the broader tree of relationships (Figure 1). Sequence divergences

47

calculated from this alignment are likely conservative estimates of divergence, due to the trimming parameters described above.

Because relative rate analysis requires a well resolved tree, we generated a second Acetobacteraceae 16S rDNA alignment with fewer (36) taxa. Due to the decreased number of taxa, more alignment positions were unambiguous and thus could be included. This 36-taxa alignment includes 1,333 nucleotides, was generated with the same alignment parameters as described above. This alignment was used to generate the rate analysis guide tree (Figure 2), as detailed below.

Phylogenetic inference: For both alignments above, maximum likelihood and

Bayesian approaches were used for phylogenetic inference. Maximum likelihood based inference employed RAxML [86], using a General Time Reversible (GTR) model with a

Gamma (Γ) distribution of rate heterogeneity [87] and proportion of invariant site estimation (PInv). This model was chosen because jModelTest 2 [88] returned a nearly identical Bayesian information criterion (BIC) score to the top-scoring (Tamura-Nei) model and GTR models generally yield slightly better likelihood scores [89]. Bootstrap values were determined from 1000 replications.

MrBayes [90] was used for Bayesian phylogenetic inference, using a GTR + Γ +

PInv model of nucleotide substitution, 10,000,000 generations, 4 chains, 1,000 generation sampling interval, and a 0.25 burn-in fraction.. Genbank accession numbers for all taxa used in phylogenetic analyses are listed in Table 9.

48

Search of Ribosomal Database Project: Near full length sequences from isolates of

AAB1 and AAB2 (E.coli coordinates 9-1507) were aligned using RDP’s secondary- structure aware INFERNAL aligner. Using the RDP SeqMatch tool, aligned sequences were queried against the entire database of 16S rRNA sequences, regardless of length, strain, or isolation source (3,224,600 sequences). The naïve Bayesian classifier was used for taxonomic assignment and trained on the latest training set (RDP Release 11, Update

4; training set 14). The latest search was performed on 15 March 2016. The RDP

TreeBuilder tool was used to confirm that the most similar non-ant associated bacteria did not break up the monophyly of ant-associated AAB.

1.4.11 Nucleotide substitution rate estimation

We used the baseml package within PAML4 [91] to estimate the nucleotide substitution rate of each AAB OTU at 16S rDNA. We used a GTR + Γ model of nucleotide substitution, as determined by jModelTest 2, based on the Bayesian information criterion [88]. The maximum likelihood tree of the 1,333 bp alignment

(Figure 2) was used as the guide in rate estimation. We chose to use this tree because it included enough taxa (n=36) to robustly analyze rates across major lineages, and was well resolved (bootstrap value > 70) at most nodes.

We assessed the likelihood of various clock models under the GTR substitution model described above. To evaluate each clock scenario, we used PAML4 to calculate

49

the likelihood score for that scenario (two or three distinct substitution rates) and used the likelihood ratio test (LRT) to calculate the p-value of that test. A more complex model (increased number of substitution rates) was accepted if it offered a significant improvement over the null hypothesis (a uniform rate).

1.4.12 Ribosomal RNA folding energy estimation

We predicted the secondary structure and associated free energy estimation for each lineage listed in Table 7 using RNAfold [92]. We analyzed the near full length 16S rRNA sequence (E. coli coordinates 16-1477) of each OTU and constrained the alignment to the secondary structure of the 16S rRNA molecule. Due to expansion/deletion of various regions of 16S rRNA, direct comparisons of absolute free energy calculations were impossible. Therefore, we length-normalized free energy calculations to allow for cross-phyla comparisons. We used the Turner 2004 model [93] to determine RNA folding energies. This algorithm calculates both the partition function and base pairing probability matrix, in addition to the minimum free energy structure. We allowed dangling end energies on both sides of a helix to stabilize. All free energy calculations were assumed at a temperature of 37°C.

50

2. Social interactions foster reliable bacterial transmission in Camponotine ants

2.1 Introduction

Evolutionary transitions in mutualisms are shaped by myriad host and microbial factors. Often, these are mediated by differential drivers that aim to maximize fitness benefits at both the host and bacterial level [94, 95]. On the host side, these can include bacterial niche partitioning [96], resource manipulation [97-99], and control of transmission [43, 100]. Mechanisms of bacterial transmission, in particular, have recently been shown to correlate tightly with evolutionary shifts in bacterial genomes and lifestyles [5, 9], with specialized, high-fidelity modalities acting as conduits for long- term, coevolved mutualists [4, 10, 101]. This is somewhat intuitive, as adaptation to any niche, host-associated or otherwise, is dependent upon persistent interactions over evolutionary timescales [4, 23, 102].

Mechanisms facilitating bacterial transmission can vary in both specificity and capacity across hosts. Insects and associated bacterial mutualists represent the most studied of such systems and utilize a diverse range of mechanisms to transmit a wide variety of associates [4, 5, 43]. These include the relatively well studied cases of ancient mutualisms with long-term intracellular bacteria, typified by binary host-microbe partnerships utilizing high-fidelity, specialized transmission modalities [4]. At the other end of the spectrum, insects also host far more promiscuous relationships with a variety

51

of extracellular gut associates from various environmental sources [103, 104]. Distinct examples include and acquisition of environmental Burkholderia isolates from soil communities by the bean bug, Riptortus pedestris [105], and vertical transmission of coevolved isolates via specialized symbiont capsules [101] or jelly-like secretions [22] in the Pentatomidae. Aside from these extremes, recent studies have suggested that social behavior – intimate interactions between social nest mates – may foster vertical transmission of heterogeneous gut consortia from a mother colony to incipient queens and her progeny [23, 44, 45]. Further sources of bacterial exchange likely include horizontal sharing of bacteria between unrelated hosts, stochastic interactions with environmental microbes, and mixed modalities combining several of the previously mentioned interactions.

Social behavior fosters ample opportunities for reliable bacterial over evolutionary timescales. Due to their intimate nature, social interactions such as trophallaxis, the transfer of food, bacteria, and other excretions via the mouth

(stomodeal) or anus (proctodeal), have been hypothesized as conduits for beneficial extracellular gut associates [5, 43, 45, 106, 107]. This is supported by evidence from social bees and certain ant tribes, which possess simple yet specialized communities [12, 45] in their alimentary tracts thought to be transmitted via social interactions [28]. Indeed, two dominant operational taxonomic units (OTUs) within bee gut communities, Snodgrassella

52

and Gilliamella, are highly specialized to their host species and share a long coevolutionary history [23, 29]. Additionally, microbiota of turtle ants are largely dominated by ant-specific OTUs within the Opitutales, which have likely codiversified with their associated host species [45, 51]. Furthermore, the gut microbiota of camponotine ants have been studied in both lab-raised [25] and field-collected contexts

[9, 61, 66], with select OTUs present across broad geographic scales and settings.

Specifically, two OTUs within the Acetobacteraceae (AAB1 and AAB2) have been demonstrated to display hallmarks of adaptation to the host and potentially transmitted via social interactions.

To assess how sociality affects transmission of gut microbiota in natural settings, we surveyed the bacterial communities of incipient queens and associated workers and brood of the carpenter ant, Camponotus chromaiodes, across thirteen field-collected incipient colonies spanning several developmental stages. We selected C. chromaiodes for myriad reasons. First, camponotine queens are typically monogynous (one queen per colony) and, upon leaving a mother colony, found and nurse the first generation of worker offspring until workers can assume husbandry duties. This mechanism of foundation and establishment of an incipient colony by a monogynous queen provides a singular and reliable conduit for bacterial transmission. Second, several species of

Camponotus maintain persistent relationships with select members of their gut

53

community. These are the previously described OTUs, AAB1 and AAB2, members of a deeply divergent, monophyletic clade of Acetobacteraceae bacteria that display distinct signatures of stable host association [9], including elevated evolutionary rates, AT nucleotide bias, and significant destabilization of the 16S rRNA. Remarkably distinct, in this regard, from environmental isolates, both AAB1 and AAB2 show similar characteristics to coevolved gut associates of social bees [23] and likely represent an interesting midpoint in the evolutionary transition of bacteria from free-living to obligately intracellular lifestyles. Third, colonies of camponotine ants share a ‘social stomach’, or the communal reservoir of nutrients, antimicrobial peptides, and bacteria that are shared among colony members via stomodeal (as opposed to proctodeal) trophallaxis [108, 109]. Cumulatively, these factors implicate, but do not elucidate, the utility of social interactions in fostering vertical transmission of gut associates.

While social interactions have been hypothesized as facilitators of host-microbe coadaptation [23, 28], several confounding factors still abound. Here, we measure the following variables to infer transmission modality of C. chromaiodes gut microbiota, with predictions to follow. These are: (i) the fidelity of bacterial transmission, as measured via the extent of shared OTUs between a colony queen and progeny, (ii) contributions from environmental sources shape the profile of resident communities acquired via social transmission (iii) resilience of these communities are to temporal and physiological

54

variations during host development and, finally, (iv) to what extent communities of adult workers resemble those of founding queens and may persist in light of natural temporal variation.

Structuring of bacterial communities often occurs via two distinct yet intertwining mechanisms: shifts in community composition and shifts in community complexity, along with mixed combinatorial effects. Accordingly, we shape our predictions and inference of transmission modality as follows: In scenarios where vertical transmission of bacterial associates occurs between a colony queen and progeny, we expect that (i) bacterial community dissimilarity between individuals within a host colony is less than dissimilarity between individuals from distinct colonies, (ii) compositional differences drive dissimilarity between colonies, rather than shifts in complexity, and (iii) shifts in bacterial community complexity between host colonies is minor. These predictions are based on the precedent that, in scenarios where vertical transmission of bacterial associates represents the main conduit for bacterial acquisition and exchange, we expect a compositional signal from a founding queen to override random shifts in complexity between colonies, which should be minimal.

If a shared microbiota is vertically inherited from a colony queen, intra-colony distance should be significantly lower than inter-colony distance and community complexity should be relatively static across colonies. A relatively high proportion of

55

shared unique OTUs within a colony should generate a strong, distinct compositional signal that should override minor stochastic shifts in complexity from environmental acquisition. We note that, because camponotine ants maintain persistent associations with select members of their gut community [9, 25, 61], shared OTUs between colonies may persist at a moderate level, though significantly lower than the level of shared

OTUs within a colony. In any social host, as environmental acquisition of microbes accumulates over development, we expect that a shared microbiota should develop due to common acquisition of soil or environmental taxa. In Camponotine ants, this accumulation of shared environmental OTUs should be highest during the P1 pupal phase [110], the developmental stage just prior to the shedding of the meconium and luminal bacterial associates. After shedding of the meconium and reinoculation by the colony queen via trophallaxis, P2 pupae should possess a gut community more similar to that of the colony queen than P1 pupae, and these communities should be readily distinguishable via shifts in community complexity.

In scenarios where horizontal transmission from environmental sources represents the dominant form of bacterial exchange, stochastic shifts in intra-community diversity should override any compositional signal. In such scenarios, we would expect inter- and intra- colony distance to be similar, and complexity across colonies to be dynamic and structured via stochastic acquisition of environmental bacteria. The

56

proportion of OTUs shared within a colony should be approximately equal to those shared between colonies. Furthermore, shifts in community complexity, from taxa acquired over development, should override any colony-specific signals and developmental stage should represent the strongest correlate with inter-community distance.

Our results delineate the capacity of social interactions to facilitate transmission of extracellular gut associates in C. chromaiodes. We characterize the effects of temporal variation and host development on socially transmitted bacterial consortia, and demonstrate how these forces shape the taxonomic profile of associated bacterial communities. We illustrate how shifts in the complexity of these communities varies across host development, from early larval stages through pupal eclosion and development into nanitic workers. We propose a model of vertical transmission of gut bacteria from founding queens to offspring. We delineate how social interactions contribute to the transmission of essential gut bacteria, despite significant contributions from environmental sources and physiological factors. Our results demonstrate that social interactions represent stable conduits for bacterial transmission and must shape predictions of evolutionary consequences on the genomes of bacterial gut associates.

57

2.2 Results

2.2.1 Endosymbiont depletion and sequencing depth of gut associated community

After quality and length filtering, we retained 8,894,175 sequence reads associated with 109 samples (Table 10) from 15 C. chromaiodes colonies (per sample mean: 78453; median: 76681), though these numbers shifted after removing control samples and filtering out endosymbiont taxa (eg. Blochmannia and Wolbachia; per sample mean: 17677; median: 8841). Previously, restriction endonuclease digestion of bacterial community gDNA has been shown to be effective in reducing representation of

Blochmannia in 16S profiles [9], resulting in increased sequencing depth of Wolbachia and gut associates. Our use of PNA clamps was effective in preventing amplification of

Wolbachia 16S rDNA and further increasing sequencing depth of gut associated taxa.

Without the use of Wolbachia-specific PNA clamps, Wolbachia reads represent up to 99% of 16S profiles, though in our dataset, PNA clamping reduced Wolbachia abundance to below 10% in any given sample.

2.2.2 Bacterial taxonomic profiles of incipient colonies

Taxonomic profiles generated from sequencing the V4-5 hypervariable regions of

16S rDNA varied across queens and workers, colonies, and developmental stages, but were consistently represented by members of , Proteobacteria, and

Planctomycetes (Figure 4). Though this is the first study to survey the bacterial

58

communities of camponotine ants across host developmental stages and among incipient colonies, we detected OTUs that have previously been associated with the guts of minor workers of various camponotine ant species [9, 25, 61, 66, 74]. We detected the persistently abundant AAB2 OTU described in Brown and Wernegreen [9] across few samples (n=7), though we did not detect the AAB1 OTU described in said study.

812 821 831 832 833 834 836

1.00

0.75 Class c__ c__[Methylacidiphilae] c__[Pedosphaerae] 0.50 c__[Saprospirae] c__4C0d−2 c__Acidimicrobiia c__Acidobacteria−6 0.25 c__Acidobacteriia c__Actinobacteria c__Alphaproteobacteria c__Bacilli

e 0.00 c__Bacteroidia c n

Q L L P P m

a c__Betaproteobacteria

1 2

1 2

u

i d

n

838 839 845 847 848 849 e

o n

e c__Clostridia

r u

n b c__Cytophagia

A 1.00 c__DA052 c__Gammaproteobacteria c__koll11

0.75 c__Ktedonobacteria c__OP11−3 c__Phycisphaerae c__Planctomycetia c__SC3 0.50 c__SJA−4 c__Solibacteres c__Sphingobacteriia c__Thermoleophilia 0.25

0.00

Q L L P P m Q L L P P m Q L L P P m Q L L P P m Q L L P P m Q L L P P m

1 2 1 2 1 2 1 2 1 2 1 2

1 2 1 2 1 2 1 2 1 2 1 2

u u u u u u

i i i i i i

n n n n n n

e e e e e e

o o o o o o

e e e e e e

r r r r r r

n n n n n n

Figure 4: Relative abundance of phylogenetic classes distributed across colonies and developmental stages. Numbers at the top of taxonomic profiles indicate colony identity. Gaps indicate missing data for that stage/caste in that colony.

59

2.2.3 Inter-community distance is strongly shaped by colony identity and developmental stage

C. chromaiodes colony mates housed gut communities significantly more similar to each other than conspecifics from separate colonies (Figure 5). Within a host colony, the mean Jaccard dissimilarity was 0.62, as compared to 0.77 between colonies (Wilcox

P<0.001; Figure 7C). This indicates that, across the dataset, 38% of OTUs detected within a host colony were shared among all members of that colony, whereas distinct colonies shared just 23% of the OTU dataset. These values are similar, yet represent a slightly larger shared proportion of taxa, to those described in cephalotine ants [45]. Unifrac distance, which incorporates phylogenetic distance across a designated tree, showed similar trends (Within colony: 0.57; Between colony: 0.63; P<0.001; Figure 7D).

60

0.8

Distance 0.6 Jaccard Unifrac

Phylum p__Acidobacteria p__Actinobacteria

0.4 p__Bacteroidetes p__Firmicutes p__Planctomycetes p__Proteobacteria

0.2

0

8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8 8

1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

2 2 2 2 2 1 1 1 1 1 1 1 1 2 2 2 2 3 3 3 4 4 4 4 4 6 6 6 6 8 8 8 8 8 9 9 5 5 5 7 7 7 7 7 8 8 8 8 8 9 9 9 9

......

L m P P Q L L P Q L L P Q L P P Q L L Q L L P P Q L L P Q L L P P Q m Q L P Q L L P P Q L L P P Q L P P Q

2 1 2 1 2 2 1 2 1 2 1 2 1 2 2 1 2 1 2 2

1 2 1 1 1 2 1 2 2 1 2 2 1 2 1 2 1 2

u u u u u u u u u u u u u

i i

n n

e e e e e e e e e e e e e

o o

e e e e e e e e e e e e e

r r n n n n n n n n n n n n n

Figure 5: OTUs and inter-community distance from a colony queen tracked across the dataset. Individual OTUs associated with the queen from colony 812 are plotted against all samples and colored by phylum. Unifrac distance between queen 812 and the indicated sample is drawn with a dashed black line and Jaccard dissimilarity is plotted with a solid black line. Vertical dotted lines demarcate colonies are simply drawn as a visual aid.

Host colony identity also had a strongly significant effect on shaping distances between bacterial communities across multiple distance measures (Jaccard:

PERMANOVA P <0.001, PERMDISP P=0.09; Unifrac: PERMANOVA P <0.001,

PERMDISP P=0.06; Figure 7A, Figure 21). Furthermore, using both phylogenetically

61

informed and traditional distance metrics, within-colony dissimilarity was significantly less than between colony distance, across all colonies (Figure 7A & Figure 7B). Host developmental stage had a significant effect on inter-community distance as well, though to a lesser degree than colony identity (Jaccard: PERMANOVA P <0.001,

PERMDISP P=0.06; Unifrac: PERMANOVA P <0.001, PERMDISP P=0.02).

To determine how strongly these two factors shaped inter-community distance, we performed several Constrained Analyses of Principle Coordinates (CAP) [111] on our dataset. CAP ordination is conceptually similar to Redundancy Analysis (RDA) [112], though it is valid on a variety of distance measures. We performed CAP on various distance metrics to determine the contribution of colony identity and host development in shaping bacterial community structure between samples. CAP on Unifrac distance demonstrated that ~10% of variation in β-diversity across the dataset (Figure 18) was shaped by host colony identity, with the inclusion of host developmental stage increasing the described variation to 16.5% (Figure 7B). A likelihood ratio test (LRT) between these two models determined a significant improvement in explanatory power with the inclusion of both factors (P<0.001, F=5.1). We also performed CAP using Jensen-

Shannon distance (JSD), a non-phylogenetically informed, finite metric useful for comparing community composition [113]. Similarly, host colony identity accounted for

12.9% of variation in inter-community distance, when using JSD (Figure 19). Though the

62

inclusion of host developmental stage only increased the total variation explained to

13.9% (Figure 7A), the effect was significant (LRT P<0.001, F= 5.1053). This is not surprising, as Unifrac distance weights longer branches in a phylogeny more heavily and would accordingly weight the addition of unique taxa acquired during development more heavily than a non-phylogenetic metric. Additionally, we performed unconstrained hierarchical clustering analysis with average mean (UPGMA), which supported colony identity as the driving force shaping β-diversity trends .

A B

Figure 6: Alpha diversity estimates and proportion of shared OTUs between developmental stages and castes. A. Faith’s PD across each caste or developmental stage measured, across all colonies. B. Proportion of shared OTUs between the indicated developmental stage or caste and the associated queen of the same colony.

63

2.2.4 Community complexity varies significantly across host development, but not across host colonies

Our results depict bacterial communities of dynamic nature and complexity across host development. Median alpha diversity indices were both highly similarly low for incipient queen gut communities and L1 larvae (median Faith’s PD: 14.05173, queens; 14.41128, L1; Wilcox P= 0.51). Community complexity increased for L2 larvae, though not significantly from queen gut community diversity (median Faith’s PD: 18.56;

Wilcox P=0.08). P1 pupae consistently displayed the highest community complexity among castes, and this elevation in diversity was significantly higher than queen gut communities (median Faith’s PD: 34.76; Wilcox P<0.001). After shedding of the meconium between the P1 and P2 phase, P2 pupae housed communities of significantly decrease complexity, compared to P1 pupae (median Faith’s PD: 27.12; Wilcox P=0.0053), though this value was still significantly elevated over queen gut communities (Wilcox

P<0.001). Complexity of early stage minor workers was similar to values for queen gut communities and L1 larvae (median Faith’s PD: 14.7, P=0.97) and was not significantly more diverse (Figure 6A). Our diversity estimates for minor workers were consistent with values previously described for camponotine and cephalotine ants [9, 25, 45, 61,

74]. We also calculated the percentage of shared OTUs between each developmental stage and the incipient queen of the associated colony (Figure 6B). In accordance with our alpha diversity measurements, the percentage of OTUs shared between a founding

64

queen and associated castes decreases across host development through the larval stages and P1 pupae. However, the shedding of the meconium prior to the P2 phase did not affect the proportion of OTUs shared with colony queen (P1 median 0.152; P2 median

0.149; Wilcox P=0.99) between P1 and P2 pupae. Nanitic workers, who feed the queen once they eclose from the pupal casing, accordingly share a higher proportion of OTUs that pupal stages and are on par with L1 larvae.

Community complexity across colonies was highly consistent, suggesting that compositional drive inter-community distance. Pairwise comparisons of alpha diversity across all colonies yielded only 3-5 significantly different means across multiples indices and raw OTUs counts. (Figure 17). Significant shifts in mean values were only detected between the least complex colony (836), and the 2-3 most diverse colonies (821, 831, 848;

Figure 17). These results indicate that differences in bacterial community composition, rather than shifts in bacterial community complexity, were responsible for driving the differences detected between host colonies. Conversely, changes in bacterial community complexity are, at least, partially responsible for any differences detected in between bacterial communities between host castes.

65

Figure 7: A. Constrained analysis of Principle Coordinates (CAP) ordination on Unifrac distance, constrained by host colony identity and caste/ developmental stage. B. CAP ordination on Jensen-Shannon distance (JSD), constrained as in subfigure A. C. Jaccard dissimilarity measured between colonies and castes/developmental stages. D. Unifrac distance measured between colonies and castes/developmental stages.

66

2.2.5 The microbiota of P1 pupae displays increased consistency across colonies and contains several environmental microbes

Across all colonies, samples within a caste or developmental stage shared few common OTUs (mean = 3.2). However, across the dataset, the microbiota of P1 pupae displayed an increased level of convergence, compared to other castes or developmental stages. P1 pupae possessed approximately four fold more shared OTUs with each other than any other stage or caste (n=16). Spanning several phyla, these taxa were consistent with OTUs from microbiomes of forest soils and wetlands, suggesting that environmental contributions during ontogeny accumulate largely from soil sources

(Table 11) and reach their apex during P1. Upon shedding of the pupal meconium and reinoculation by the colony queen, shared OTUs among all samples of P2 pupae were reduced and consistent with levels of other stages or castes.

2.3 Discussion

The evolution of sociality has had profound impacts on animal behavior and development [107, 114, 115] and governs all aspects of an animal’s interactions, to include those with bacterial associates [28, 116-118]. Intimate interactions between social colony mates potentially represent a highly reliable conduit for transmission of foodstuffs, immune factors, and bacterial associates over evolutionary timescales [45,

67

119]. Based on the criteria we outlined previously, our results suggest that social interactions foster vertical transmission of bacterial associates of camponotine ants.

2.3.1 Bacterial communities shared between queens and progeny are consistent with vertical inheritance

At higher taxonomic ranks, the profiles of C. chromaiodes gut communities were moderately consistent across colonies (Figure 4), including sporadic representation of the potentially coevolved mutualist, AAB2, described in previous studies [9, 25, 26]. At the individual OTU level, our data suggest that taxa present in a queen gut community are significantly more abundant in associated colony mates than individuals from other colonies (Figure 5 & Figure 23). These data are consistent with trends observed in

Cephalotes ants, which have coevolved with multiple members of their gut consortia [45].

That community complexity was highly static across host colonies indicates that compositional shifts, rather than shifts in complexity, were driving correlations between

β-diversity and host colony. However, we also appreciate that significant differences in bacterial community composition between host colonies may reflect shifts in the bacterial profiles of the colony environment. Thus, we cannot eliminate the physical colony microbiome as a source of results described herein, though we note that such dynamic shifts between colony environments have not been previously reported, nor resemble gut communities [65, 120]. 68

2.3.2 Contributions from environmental sources cause shifts in community complexity

We also observed a significant effect of developmental stage on trends between and within communities (Figure 6 & 7). Our results indicate dynamic shifts in community complexity as C. chromaiodes progresses through development and across castes. The gut communities of colony founding queens displayed consistent complexity and a high proportion of shared OTUs with associated L1 larvae, supporting direct transmission of bacterial communities across groups with minimal environmental contribution. Community complexity continued to increase during development with an apex in diversity occurring during the P1 phase. This correlates with shifts in host biology, as, between the P1 and P2 pupal stage, the insect gut goes through a metamorphosis and the gut epithelium is shed [72], and Blochmannia densities increase.

Following this shift, we report a reduction in gut community complexity and decreased community dissimilarity between queens and associated P2 pupae, relative to the P1 stage. These shifts suggest that reinoculation from the colony queen (or minor workers, if available) is likely driving these changes. After eclosion from the pupal casing, complexity of gut consortia of minor workers resembles that of incipient queen gut communities.

69

Trends in β-diversity were also significantly affected by developmental stage, though less strongly than the colony effect. Clusters in β-diversity ordinations associated with developmental stage were significant, though much more diffuse than those associated with colony identity (Figure 18 & 19). This corresponds with the dynamic trends observed in in α-diversity described above (Figure 6a), as the reduced cluster density is likely due to increased complexity accumulated from environmental sources during development.

2.3.3 How do developmental stage and colony identity shape bacterial communities?

That both host colony identity and developmental stage significantly correlate with trends in bacterial β-diversity belies how distinctly these forces structure bacterial communities. When a vertically inherited compositional signal represents the greatest factor, intra-colony distance should be significantly lower than inter-colony distance and community complexity should be relatively static across colonies. Our results suggest that correlations between inter-community distance and colony identity are due to compositional differences associated with vertical inheritance of bacterial communities from a founding queen. This is evident in the high proportion of shared OTUs within a colony and relatively low intra-community dissimilarity (Figure 5 & Figure 23).

However, the complexity of bacterial communities across host colonies was incredibly

70

static (Figure 17), which strongly supports differentiation at the colony level deriving from compositional differences alone. Conversely, correlations between inter- community distance and host developmental stages are due to shifts in community complexity. We detected significant shifts in community complexity across host development, with the pinnacle of intra-community diversity occurring during the P1 phase. OTUs shared between P1 pupae across colonies were consistent with previously described soil and wetland microbes (Table 11), which suggests environmental acquisition of bacteria over time. These results demonstrate that host development structures inter-community distance by shifting community complexity via accumulation of environmental microbes over time. These data are consistent with vertical transmission of gut associates from queens to progeny and fit with observations in previous studies of ant microbiota [27, 45].

2.3.4 How does bacterial community structure vary across host development?

While the composition of larval and pupal bacterial communities are variably similar to their associated queen’s gut community, the strength of this signal varies over time. The similarity between larval and pupal bacterial communities and their associated queen decreases during development (Figure 6 & 7), and this shift is driven by increases in community complexity, likely due to increased exposure to

71

environmental microbes over time. That most of the variation in our dataset was explained by host colony identity suggests that shifts in community complexity, while significant, are not powerful enough to override the unique compositional differences between colonies. As evident in β-diversity ordinations (Figure 7, 18 and 19), shifts in complexity associated with developmental stage correlates with an increase in distance between related individuals, though in a non-affine manner. Such shifts are consistent with stochastic interactions with environmental associates, which we would expect to drive β-diversity trends in inconsistent directions. Despite these shifts in complexity, bacterial communities of newly eclosed workers resemble that of associated queens and

L1 larvae. This suggests that, at least in early stages of colony founding, bacterial community composition fluctuates with shifts in complexity across development, but retains a high similarity to the founding queen. This is congruent with expectations, as the queen is constantly feeding early workers during development, providing ample opportunities for bacterial reinoculation. Once minor workers commence husbandry duties, their microbiomes appear to be similar in composition and complexity to the colony queen. In concert, these trends demonstrate that shifts in community complexity represent a dynamic force shaping bacterial communities during host development.

Previous studies have established a stark difference in taxonomic profiles between gut associated and environmental (nest material and soil) communities [27, 65,

72

121]. Thus, we would expect phylogenetically aware distance metrics to be more sensitive to the acquisition of environmental microbes than non-phylogenetic sources.

Unifrac distance, which weights longer, deeply diverged branches in a phylogeny more heavily than short branches separating closely related taxa, should magnify the contribution of unique, phylogenetically distinct microbes from environmental sources.

Conversely, non phylogenetic metrics, such as Jaccard dissimilarity or JSD, should attenuate the proportion of variation explained in β-diversity trends by the inclusion of microbes from environmental sources. Our constrained ordinations of Principle

Coordinate Analyses reflect these observations (Figure 7A & 7B). Using the JSD metric, constrained ordination mixed modeling of the effect of host colony identity and developmental stage described 13.9% of the variation (permutational ANOVA P<0.001,

F=4.882), with the addition of developmental stage adding only 1% of described variance to the colony-only model. When employing Unifrac distance, our mixed model describes

16.5% of the variation (permutational ANOVA P<0.001, F=2.916), with developmental stage adding 6.7% of described variance to the colony-only model. Cumulatively, these results support the notion that variations in β-diversity during host development are driven by increases in community complexity via acquisition of bacteria from environmental sources.

73

In concert, our results suggest that social interactions facilitate reliable exchange of bacterial mutualists, and these associations may be persistent over evolutionary timescales. Vertical transmission of extracellular gut associates has been shown to facilitate bacterial coadaptation with hosts across several phyla [5, 22, 45, 101]. Our study represents the first survey of bacterial transmission and community dynamics across ant developmental stages and castes of field-collected colonies. Our data suggest that gut associated bacterial communities are reliably transmitted to offspring via social interactions and that the composition of these communities reflects that of the founding queen and are thus colony specific. These results are in line with previous studies that have suggested social interactions as a conduit for bacterial transmission [5, 9, 23, 28, 43,

119] and are essential for predictions of evolutionary consequences of mutualisms between animals and gut microbiota.

2.4 Methods

2.4.1 Ant collection

Ants were collected across several locations within Duke Forest, Durham, NC between June 9 and June 22 of 2015. This timeframe was chosen to coincide with colloquial nuptial flights of C. chromaiodes. For bacterial community characterization, we collected only incipient colonies of C. chromaiodes. Colonies were considered incipient if

74

the immediate area near an identified queen contained only brood stages and/or dwarfish (nanitic) workers. All samples were collected within a 1km stretch of Duke

Forest to minimize geographic variation, as large scale geographical shifts have been shown to correlate with shifts in gut community profiles in Cephalotes [45], likely due to shifts in bacterial communities of both environmental and nutritional sources. All specimens (Queen and all offspring) were immediately placed in 100% ethanol and transported to Duke University upon collection. For all colonies, vouchers (in the form of whole samples or dissected remains) are available upon request. Where possible, triplicates of each caste or developmental stage were processed and sequenced for each colony. Full sample structure and collection metadata are listed in Table 10.

Developmental stages were assessed and determined in accord with previously published studies [110].

2.4.2 Sample preparation for microbiota characterization: surface sterilization, genomic DNA (gDNA) extraction, and endosymbiont depletion

All samples were surface-sterilized prior to gDNA extraction. Samples were sterilized by immersion in 100% ethanol, followed by a 60 second soak in 5% bleach, and then rinsed in sterile water. After surface sterilization, whole gut tracts were dissected out under microscope for all queen and minor worker samples, and gDNA was

75

extracted from these tissues only. Because tissue dissection of brood is not feasible, gDNA was extracted from whole specimens for larvae and pupae. gDNA extraction was performed using the Qiagen (Hilden, Germany) DNEasy kit according to a standard protocol.

By absolute abundance, gDNA of endosymbiotic bacteria is more abundant than that of extracellular gut associates [9]. In Camponotus, these are Blochmannia, a long-term intracellular endosymbiont of Camponotus and related genera, and Wolbachia, a known bacterial associate of various insect taxa. In order to better characterize the community of extracellular gut associates, we employed two techniques to deplete 16S rDNA abundance of both endosymbionts. To deplete Blochmannia rDNA abundance, we employed enzymatic digestion as described in Brown and Wernegreen [9]. To reduce the abundance of Wolbachia 16S rDNA, we designed and employed a novel peptide nucleic acid (PNA) clamp targeted against the Wolbachia operational taxonomic unit (OTU) identified in C. chromaiodes. PNAs are oligonucleotides with a higher annealing temperature and lower mismatch tolerance than their nucleic acid-based analogs.

Because of these features, PNA clamps have been successfully modified to block specific

OTUs from marker gene studies [76]. We designed a PNA (5’ K-

CACCTGTGTGAAATCCG 3’) to anneal to the C. chromaiodes Wolbachia OTU at E. coli

76

16S position 1030, which blocks the universal reverse primer for the V6 region of 16S (E. coli 3’ coordinate 1046), as described below.

2.4.3 16S rDNA amplicon library construction and sequencing

First PCR: We performed a two step PCR protocol to maximize coverage of the extracellular gut community. From each PacI-digested gDNA sample, we first amplified a ~1037 bp region of 16S rDNA that spans the PacI restriction site in Blochmannia and the annealing site of the Wolbachia PNA. Prior to amplification, the PNA clamp was solubilized at 65°C for 5 minutes. PCR reactions contained 5.52 uL nuclease-free water,

0.975 uL DMSO, 1.63 uL 10 uM forward primer 9F, 1.63 uL 10 uM reverse primer 1046R,

3.25 uL 20 uM PNA, 3.25 uL of PacI-digested gDNA, and 16.25 uL Phusion High-

Fidelity Master Mix with HF Buffer (NEB). Reactions were held at 98°C for 30 seconds to denature the DNA, with 30 cycles of amplification proceeding at 98°C for 10s, 74°C for

10s for PNA annealing, 65°C for 30s, and 72°C for 60s, we added a final extension phase at 72°C for 5 min, then cooled at 4°C. Resulting amplicons were cleaned using the

Agencourt AMPure XP system (Beckman Coulter) following the manufacturer’s instructions.

Second PCR and sequencing: 10-cycle tagging PCRs were designed to amplify the

V4 and V5 hypervariable regions from the amplicons from the initial reaction. Tagging

77

PCR reactions were performed in triplicate and contained 5.5 uL nuclease-free water,

0.75 uL DMSO, 1.25 uL 10 uM barcoded fusion primer mix, 5 uL of PacI-digested gDNA, and 12.5 uL Phusion High-Fidelity Master Mix with HF Buffer (NEB). Reactions were held at 98°C for 30 seconds to denature the DNA, with 10 cycles of amplification proceeding at 98°C for 10s, 55°C for 30s, and 72°C for 60s, we added a final extension phase at 72°C for 5 min, then cooled at 4°C. Tagged amplicons were again cleaned using the Agencourt AMPure XP system (Beckman Coulter) following the manufacturer’s instructions.

All libraries were quantified using a Quant-iT High-Sensitivity dsDNA Assay Kit

(Thermo Fisher) following the manufacturer’s instructions, and then pooled in equimolar concentrations. A 200uL aliquot of pooled amplicon libraries was concentrated with a MinElute PCR Purification kit (Qiagen) and size selected on a 1% agarose gel. The size selected libraries were cleaned using a QIAquick Gel Extraction Kit

(Qiagen) and sequenced with a MiSeq Reagent Kit v3 on an Illumina MiSeq. Sequencing was performed by the Molecular Physiology Institute at Duke University. Of the output from this sequencing run, 17,972,183 raw reads were generated, and ~50% were associated with samples in this study.

78

2.4.4 Microbial community analysis

Raw sequencing reads were processed using UPARSE [81]. All reads were trimmed of sequencing adaptors, barcodes, and forward and reverse primers, then globally trimmed to 367bp. All reads shorter than 367bp were discarded. Expected error filtering parameters were based on the ability to classify a single OTU for a control (E. coli isolate) dataset. The UPARSE pipeline was used to group operational taxonomic units (OTUs) at a 97% similarity threshold and to remove chimeric sequences.

Taxonomy was assigned to UPARSE-identified OTUs using the RDP Classifier [122] against Greengenes database 13.8 [82]. All statistical analyses were performed in R. The phyloseq, vegan, reshape2, randomForest, and ggplot2 packages were used for data manipulation and visualization [111, 123-126].

Phylogenetic inference: To more accurately estimate phylogenetic distance between communities (weighted and unweighted Unifrac), we used Bayesian approaches to infer phylogenetic relationships between OTUs. We employed a General

Time Reversible (GTR) model with a Gamma (Γ) distribution of rate heterogeneity [87] and proportion of invariant site estimation (PInv). MrBayes [90] was used for Bayesian phylogenetic inference with 10 million generations, 4 chains, 1,000 generation sampling

79

interval, and a 0.25 burn-in fraction. This tree was then used in calculations of phylogenetically informed distance estimates (unweighted and weighted Unifrac).

80

3. Genomic erosion and rampant horizontal gene transfer in gut-associated Acetobacteraceae

3.1 Introduction

Evolutionary consequences of symbiotic associations are dynamic and drive genomic evolution in myriad ways. Persistent relationships between animals and microbiota have facilitated a range of adaptations, from niche diversification [2, 22] to dramatic increases in metabolic potential [1], even leading to speciation of one or multiple partners [3, 127]. Gut microbiota represent a unique system for studying the genomic basis of adaptation, due to their diverse range of persistence and intimacy across hosts. Traditionally, proxies for assessing bacterial divergence have relied on inference based on changes to the sequence and structure of 16S rRNA. However, the rate of molecular evolution at 16S rRNA vastly underestimates that of the genome [71] as a whole and, at best, provides a cursory evaluation of strain level diversification.

Thus, elucidation of processes facilitating adaptation and associated consequences require genomic data for comparative analyses.

Comparative analyses of gut associates have revealed a broad spectrum of genomic consequences. At one extreme, genomes of extracellular gut associates of pentatomoid bugs display remarkable erosion, AT-biased nucleotide composition, accelerated rates of molecular evolution, and co-cladogenesis with the host. Conversely, genomes of Lactobacillus reuteri isolates, persistent gut symbionts across several 81

vertebrate families [128], display fewer hallmarks of adaptation, such as strain-level variation in gene content between host populations [129] and monophyletic clades associated with host origin [128]. Likely existing in a middle ground between the above two extremes, strains of extracellular gut associates of social bees possess specialized genomes adapted to their specific host species [23], and confer distinct benefits associated with host lifestyle or nutrition. Indeed, across various hosts, evidence of host specialization, long term adaptation, and/or subspecies divergence are becoming increasingly present in persistent gut microbiota, revealing a spectrum of consequences of adaptation to the gut niche. Recently, studies into the composition and stability of ant gut microbiota have uncovered deeply divergent branches of putatively host-adapted bacterial taxa.

Gut associated communities of ants vary widely across host families, though tend to show a relatively high level of stability within a host species [9, 25, 27, 61].

Camponotine ants enjoy a nearly worldwide range and, despite such a large distribution, persistently associate with select taxa in the Acetobacteraceae (AAB) [9, 25,

26] and Lactobacillaceae (LAB) [66, 75, 130]. Previous studies have shown that gut associated Acetobacteraceae (strains AAB1 and AAB2) [9] display elevated mutation rates, AT bias, significant destabilization of 16S rRNA, and deep monophyly with strains only found in other ants, though insights into genomes of these associates remain

82

undeveloped. Conversely, despite a widespread and persistent association with various host species, Lactobacillus isolates do not appear to display significant differentiation from environmental strains, based on analysis of 16S rRNA. While the nature of the relationship between camponotine ants and bacterial isolates from either family remain uncertain, each group likely represents a distinct positions on the symbiotic spectrum.

Deep divergence from environmental isolates and host-restricted distribution suggest that AAB1 and AAB2 may share a lengthy coevolutionary history with their host, based on comparisons from coevolved gut associates of other social insects. However,

Lactobacillus strains, a genus with a history of host specialization [128], likely represent more promiscuous associates with lesser shared evolutionary history and fewer signatures of specialization. This may be due to a relatively novel nature of the relationship between Camponotine ants and LAB or due to increased fitness outside of the host. Though the mechanisms directing these relationships are unclear, the distinct nature of these associations likely represents a unique position on the symbiotic spectrum. However, there is a paucity of genomic or functional data available for these associates, leaving insights into evolutionary processes structuring these relationships mostly obscured.

Here, I culture and characterize the genomes of multiple strains of AAB and LAB associated with C chromaiodes. I delineate various factors structuring genomic evolution

83

in these gut associates and explore mechanisms by which genomic content and divergence facilitate adaptation to the host gastrointestinal tract. These results illustrate selective forces underlying adaptation and genomic evolution in gut associated bacteria.

3.2 Results

3.2.1 Isolate genome characteristics

From pure monoculture, I sequenced two isolates of the previously described acetic acid bacterial lineage, AAB2, as well as a strain of Lactobacillus that has been documented in association with multiple ant species. Isolate sequencing on the PacBio

RSII generated datasets with mean read counts of 76,659 and mean read N50 of

23,501bp. When assembled, all read datasets formed closed genomes with mean coverage of 443-fold and consensus concordance of 99.9967%. Strain BBS1 possessed on plasmid sequence and strains BBS2 contained multiple plasmid sequences with variable coverage. For all analyses, results are based on chromosomal data only. The AAB2 strain

BBS1 genome is a circular, 2.1 Mb chromosome encoding 2032 CDS sequences, and 342 putatively horizontally transferred genes (Table 3; Figure 8a, 8c). AAB2 strain BBS3 also possessed a single circular chromosome of 1.9Mb and 1798 CDS and 276 putatively horizontally transferred genes. Lactobacillus sp. BBS2 possessed a single 1.5Mb circular

84

chromosome with 1475 CDS sequences and 17 putatively horizontally transferred genes

(Table 3; Figure 8b).

85

Table 3: Genome characteristics of sequenced strains

86

a. b.

c. d. BBS1

87

BBS3

Figure 8: Chromosome maps of each strain sequenced. a-c. Chromosome are colored according to the COG category listed. From outside to the center, rings on each chromosome represent genes on forward strand (color by COG categories), genes on reverse strand (color by COG categories), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, and GC skew. d. whole genome alignment of strains BBS1 (reference) and BBS3. Alignments are colored by locally collinear blocks.

Compared to free-living and other host-associated isolates within the

Acetobacteraceae, both strains of AAB2 displayed drastically distinct metrics. On average, the genomes from both strains were 1.2Mb smaller than other acetic acid bacteria and had a mean GC content that was 15% lower (Table 12). Additionally, both

AAB2 strains possessed an average of 1221 fewer CDS than other Acetobacteraceae, though had a mean percentage of coding bases that was 1.86% higher. By contrast, none of these metrics were distinct between Lactobacillus sp. BBS2 and other members of the

Lactobacillaceae (Table 12).

3.2.2 Metabolic pathway reconstruction

The size and metabolic potential of genomes of both strains of AAB2 are shifted, mainly through gene loss, relative to free living acetic acid bacteria, a trend that is consistent among specialized gut associates of insects [23, 43]. Each strain has shared gene loss in several amino acid biosynthetic pathways as well as multiple central metabolic pathways such as glycolysis (Figure 24), pentose phosphate (Figure 25), and the tricarboxylic acid cycle (TCA; Figure 26). Biosynthetic pathways that have undergone gene loss and are putatively nonfunctional include synthetic pathways for several amino acids and precursors. The synthetic pathway for chorismate, a precursor molecule to the three aromatic amino acids, tyrosine, tryptophan, and phenylalanine, is

88

entirely absent, as are synthetic pathways for each amino acid (Figure 9). Furthermore, both strains appear to have lost the ability to fully or partially synthesize several other amino acids, though appear to have gained the ability to synthesize valine and serine,

2 1

3 2

b.

1

Figure 9: Abundance profiles of functions annotated by IMG networks. a. Abundance profile for acetic acid bacteria. b. Abundance profile for lactic acid bacteria. Fewer categories showed meaningful comparisons and are thus omitted. Numbered black lines are visual aids meant purely to segregate classes of networks, and are numbered as follows. 1: amino acid biosynthesis, 2: carbohydrate metabolism, 3: cofactor biosynthesis.

89

likely from horizontal transfer. With respect to carbohydrate utilization and metabolism, both strains have undergone significant shifts in capacity. Several key genes are absent from glycolysis (Embden-Meyerhof-Parnas), though both have acquired key genes in the Entner-Doudoroff (ED) alternative, facilitating metabolism of dextrose to pyruvate via this alternate mechanism. Distinct from the capabilities of BBS1, strain

BBS3 possesses a complete, standard Entner-Doudoroff pathway (Figure 9). The TCA cycle has been completely broken in both strains (Figure 26), with approximately half of the genes lost, resulting in an inability to convert succinyl CoA into succinate or any of the downstream metabolites. The fate of succinyl CoA is likely tied to glycine metabolism via the unique metabolic capacity of Alphaproteobacteria to synthesize 5- aminolevulinic acid (5ALA) from glycine and succinyl CoA. This reaction, catalyzed by aminolevulinic acid synthase (ALAS) [131], is a key precursor in heme biosynthesis.

Additionally, both strains uniquely possess several genes for thiamine regeneration

(Figure 9).

Whereas both strains of AAB2 exhibited large scale erosion of metabolic pathways, Lactobacillus sp. BBS2 has largely differentiated from other isolates by gene acquisition. Strain BBS2 has acquired chorismate mutase, an essential enzyme in the conversion of chorismate to phenylalanine and tyrosine (via the Shikimate pathway), though its full biosynthetic capacity for these amino acids in unclear.

90

3.2.3 Comparative bacterial genome analysis

Complete annotated genome sequences were downloaded from the JGI IMG database. Whole genome alignment and visualization was performed using the Mauve aligner [132] on nucleotide sequence data. The number of shared genes between BBS1 and BBS3 was 1598 (Figure 10). Whole genome alignment produced 12 locally collinear blocks (LCBs; Figure 8d) revealing several shifts in orientation and location within the genome of each strain. Of the rearrangements of LCBs within the genome, only one caused a reversal in sequence orientation.

91

BBS1 BBS3

Figure 10: Venn diagram of shared genes between isolates BBS1 and BBS3

3.2.4 Horizontal gene transfer and functional profiles of gut associates

Rampant horizontal gene transfer has had a profound effect on the genomes of both AAB strains and, to a much lesser extent, the LAB associate. The total count of horizontally transferred genes within the genomes of AAB2 strains BBS1 and BBS3 was

92

342 and 276, which amounted to 16.2% and 14.73%, respectively (Table 3, Figure11a).

Figure 11: Genomic content of isolates by phylogenetic distribution of genes and COG functional categories. a. Relative abundance of genes horizontally acquired from various phylogenetic sources. b. Relative abundance of COG functional categories across the genome.

Lactobacillus sp. BBS2 housed 17 horizontally transferred genes, which represent

1.09% of the CDS sequences in the genome (Table 3). The functional profile of horizontally transferred genes in BBS1 and BBS3 was diverse and spanned several COG

categories (Figure 12) and multiple phyla (Figure 12). The amount of horizontally transferred genes across COG categories was dynamic and ranged from one to ~30. The

COG functional category with the greatest extent of HGT was “Energy production and

93

conversion” and included acquisition of all 14 subunits of NADH dehydrogenase

a.

Figure 12: Horizontally transferred genes by phylogenetic origin and COG functional category. a. HGT in BBS1. b. HGT in BBS3

94

from Gammaproteobacterial origin (Figure 12). Gammproteobacteria was the taxonomic class with the greatest contribution of horizontally transferred genes, and the

Rhizobiales were the family with greatest contribution. Across both strains, “Coenzyme transport and metabolism” was the COG category with the greatest diversity of phyla.

Comparative Pfam profiles (Figure 13) of various AAB taxa suggested that both strains of AAB2 showed enrichment of Sel1 domains, likely due to the horizontal acquisition

NADH dehydrogenase, as described above. BBS1 and BBS3 also showed enrichment of

LysE type translocators, an amino acid exporter implicated in the extracellular export of

L-lysine [133]. Both of these PFam domains are almost entirely absent in all other AAB taxa, suggesting that their enrichment is unique to isolates BBS1 and BBS3. Furthermore,

AAB2 isolates also displayed a reduction in phage integrase family proteins, a family that cleaves DNA and aids in recombination [134] (Figure 13). Of the horizontally transferred genes in Lactobacillus sp. BBS2, six were annotated with a COG category, which were evenly distributed between metabolite biosynthesis, carbohydrate metabolism, amino acid metabolism, and transcription and post translational modification. Pfam profile comparison across various LAB (Figure 13), suggested that

BBS2 displayed moderate enrichment of helix-turn-helix domains, and modest enrichment of amino acid permeases and Type I restriction enzymes (Figure 13b). These

95

domains correspond with the genes associated with the COG categories described above

(Figure 11b).

a.

Figure 13: PFam function across AAB and LAB. The ten most abundant function categories within the genomes of AAB and LAB isolates as annotated by PFam category. a. Functional profile of AAB. b. Functional profile of LAB.

96

3.2.5 Phylogenomic inference

Phylogenomic inference revealed deep divergence of both AAB strains and surprising, though modest, divergence of the LAB strain (Figure 14). Similar to previous analyses of AAB1 and AAB2 lineages based on 16S rRNA, our estimations placed both

AAB2 strains in a deeply divergent, monophyletic clade with perfect bootstrap support.

Though some shifts in positioning of more distant clades were observed, tree architecture was mostly consistent with 16S rRNA trees. Phylogenomic analysis and placement of strain BBS2 within LAB suggested that L. sanfranciscensis was the closest relative, which is consistent with 16S rRNA trees. However, divergence between strain

BBS2 and L. sanfranciscensis was greater than that between free living and gut associated

(host: Drosophila melanogaster) L. fructivorans relatives (Figure 14).

97

a.

Figure 14: Phylogenomic analysis of AAB and LAB isolates from diverse sources. Ant associates are colored green, other host associated bacterial isolates are colored orange, and environmental isolates are colored black. Node values represent bootstrap support from 1000 replications. Trees are based on codon-aligned nucleotide data from 50 concatenated single copy orthologs. a. Phylogenomic tree of relevant AAB genomes. b. Phylogenomic tree of relevant LAB genomes.

3.3 Discussion

From large-scale genomic erosion [10] to host specialization [23, 128], symbiotic associations have profound effects on the evolution of bacterial associates. Our results shed light on ecological and evolutionary processes shaping genomes of diverse bacterial associates of Camponotine ants. Compared to free living AAB, strains BBS1 and BBS3 possess genomes reduced in both size and biosynthetic capacity. Genome reduction in bacterial associates of insects is well documented and most evident in ancient symbioses with intracellular mutualists [4]. Recently, several studies have

98

demonstrated that, while the magnitude of evolutionary consequences are lessened, extracellular gut associates undergo similar processes, resulting in large scale gene loss and alterations in metabolic capacity [5, 10, 22, 23, 28]. Our findings of mean reductions in genome size by ~1.2 MB and GC content by ~15% are consistent with shifts seen in other extracellular gut associates [10, 22, 23] with long coevolutionary histories [29].

Conversely, LAB strain BBS2 did not display these hallmarks of adaptation, arguing against long term adaptation to the gut.

A hallmark of the gut as an , HGT had a profound impact of the content and functional potential of both AAB strains. While the functionality of horizontally acquired genes in the LAB strain may facilitate gut colonization, the total instances of gene transfer was consistent with environmental isolates. Approximately

15% of the genome of each AAB strain was acquired via horizontal transfer from closely related taxa, as well as across phyla and kingdoms (Figure 11a). COG functional profiles of horizontally transferred genes (Figure 12) suggest that acquisition was non-uniform across functional categories, with genes associated with energy production and conversion having the highest incidence of acquisition (Figures 11b and 12). This may be due to a cascade effect in which shifts in metabolism or substrate availability may require an increased energy investment into salvage pathways or metabolite conversion.

In such cases, an increased repertoire of energy production and conversion pathways

99

would lead to increased fitness of symbionts with such metabolic plasticity. In both

AAB2 strains, the acquisition of the complete NADH-quinone oxidoreductase complex would increase the oxidative capacity of H+ export to create a proton gradient necessary for ATP synthesis via oxidative phosphorylation.

To a much lesser extent than the AAB strains, the Lactobacillus isolate has also gained novel synthetic capacity via HGT. BBS2 has acquired genes associated with amino acid metabolism, defensive systems, and cell-cell interactions. In contrast to AAB isolates, BBS2 has acquired chorismate mutase from Clostridium, which is a key enzyme in the interconversion of chorismate and aromatic amino acids. This increase in biosynthetic capacity likely offers a substantial increase in fitness over strains without such plasticity, especially when scavenging for nutrients not typically usable by closely related taxa. Additionally, BBS2 has acquired a putative adhesion protein and type I restriction enzyme from Bacillus. Bacterial adhesins are key cell surface proteins that facilitate adhesion to other cells. In host associated taxa, bacterial adhesins generally foster colonization of host tissue [135] and are thus a type of factor.

Furthermore, restriction enzymes are defensive proteins that protect against phage DNA insertion and subsequent lysis [136]. In concert, these proteins likely foster colonization of the gut niche by increasing biosynthetic plasticity, facilitating physical adhesion to

100

host tissue, and increasing defensive capabilities against bacteriophage (Figure 9b).

Cumulatively, this fitness benefit may be substantial.

As a whole, central metabolism and oxidative phosphorylation represent highly mosaic processes in both strains of AAB2. Loss of individual enzymes from metabolic cycles has been documented in certain gut associates [10, 23, 29], with mutualists of social bees utilizing an alternate enzyme during the TCA cycle to catalyze succinyl-CoA

-> succinate conversion [137]. However, our results suggest that several key metabolic pathways suffer losses of approximately 50% of key enzymes. Both AAB2 isolates have lost half of the genes required for the standard (Embden-Meyerhof-Parnas) mechanism of glycolysis, as well as approximately half of the TCA cycle, breaking its cyclical structure. The decrease in fitness associated with the loss of these genes is likely partially offset via the acquisition of select genes (though not all) in the ED pathway, resulting in an atypically mosaic approach to the metabolism of glucose to pyruvate (Figures 24 -

26). Specifically, both strains lack genes used in the first five reactions of glucose metabolism, whereby glucose is converted to glyceraldehyde-3-phosphate (Figure 24).

However, via the acquisition of phosphogluconate dehydratase and 2-dehydro-3- deoxyphosphogluconate aldolase, B-D-glucose-6-phosphate can be converted directly into pyruvate in conjunction with standard pentose phosphate pathway enzymes.

Furthermore, enzymes needed to convert D-gluconate-6-phosphate into D-

101

glyceraldehyde-3-phosphate are retained. D-glyceraldehyde-3P is a key byproduct of the standard pentose phosphate pathway, and is the first metabolite able to be processed by the truncated glycolysis cycle retained in BBS1 and BBS3.

Although gene acquisition via HGT seems to have rescued large scale gene loss in the standard Embden-Meyerhof-Parnas pathway, the cyclical nature of the TCA cycle seems to have been completely broken (Figure 26). All enzymes needed to catalyze reactions after the production of succinyl-CoA have been lost. This is especially curious, as the next reaction in the cycle is the conversion of succinyl-CoA to succinate, which generates one molecule of ATP or GTP. While the fate of succinyl-CoA remains unclear, a likely usage is in the 5-aminolevulinic acid synthase (ALAS) mediated reaction with glycine to produce 5-aminolevulinic acid (5ALA), coenzyme A (CoA), and CO2 [131].

This reaction, catalyzed by ALAS, is unique to eukaryotes and Alphaproteobacteria and feeds into porphyrin and heme synthesis.

In addition to these alterations in carbohydrate metabolism, several pathways for the biosynthesis of amino acids and precursor molecules have undergone substantial shifts (Figure 9). Neither AAB strain is predicted to be able to synthesize chorismate, the precursor molecule to aromatic amino acid synthesis, or any of the aromatic amino acids. Compared to other acetic acid bacteria, AAB2 strains have lost genes in biosynthetic or conversion pathways of histidine, leucine, methionine, and arginine,

102

though both have acquired genes for the synthesis or conversion of cysteine and serine.

Additionally, both AAB isolates have acquired genes in pathways for thiamine and derivative synthesis and regeneration, molybdenum cofactor synthesis, and CoA biosynthesis. Furthermore, isolates BBS1 and BBS3 display significant enrichment of Sel1 repeat proteins and LysE type transporters (Figure 13). Sel1 repeat proteins possess diverse roles in cellular processes including serving as adaptor proteins in supermolecular complexes. They have also been implicated in the cellular stress response and in mediating interactions between host and bacterial cells [138]. While it is difficult to pinpoint functional profiles from genetic data alone, both of these interactions would be extremely valuable in gut associates. LysE type translocators are membrane spanning L-lysine exporters that excrete L-lysine into extracellular space

[133], and both AAB strains have retained several pathways associated with lysine synthesis and conversion. The acquisition and maintenance of multiple lysine exporters is suggestive of a physiological role in the relationship between AAB2 isolates and C. chromaiodes: supplementation of L-lysine. This interaction may have a precedent, as the genome of the ancient Camponotus endosymbiont, Blochmannia, [13, 14] has been shown to retain biosynthetic pathways for amino acids utilized by the host [139]. In gut associated bacteria, where reductive evolution is especially quick to deplete

103

functionality of nonessential genes [10, 29], enrichment and maintenance of loci is strongly suggestive of functionality.

Aside from genomic plasticity due to gene gain and loss, persistent associations with hosts also direct selection at the nucleotide level in various ways. We performed phylogenomic analyses of each isolate to estimate divergence from other host associated and free living bacteria (Figure 14). In accord with phylogenetic reconstructions based on 16S rRNA [9], both strains of AAB2 formed a deeply divergent, monophyletic clade that was sister to gut associated Saccharibacter in A. mellifera. Though 16S rRNA divergence generally underestimates genomic divergence [71], our data delineated a surprising, albeit modest, level of divergence between Lactobacillus sp. BBS2 and its closest relative, L. sanfranciscensis. Furthermore, the divergence between these two taxa was much greater than that between environmental and gut associated isolates of L. fructivorans from D. melanogaster. Whereas all ant associated AAB have shown extensive

(7-10%) divergence from environmental isolates at 16S rRNA, the full length 16S rRNA sequence of BBS2 displayed >99% similarity to environmental L. sanfranciscensis, suggesting that it may be an environmental isolate with preadaptive genes facilitating gut colonization. Whether the genomic divergence of BBS2 is due to selective regimes associated with gut adaptation or other factors remains unclear.

104

Collectively, our results delineate ecological and evolutionary factors both priming and resulting from bacterial adaptation to the gut environment. The genomes of both AAB2 strains illustrate consequences of persistent host association. While hallmarks such as genomic erosion and loss of metabolic potential are well documented in reliable gut associates, our data also delineate processes increasing genomic content

(HGT), resulting in strains with mosaic genomes (Figures 12, 24-26). Accelerated horizontal transfer of genomic content has been previously described in gut microbiota

[140]. Our data illustrate patterns of rampant transfer and acquisition of genetic material between closely related taxa and across phyla. Comprising approximately 15% of protein coding sequences in each AAB strain (Figure 11a), horizontally transferred genes have drastically increased the metabolic and biosynthetic potential of each isolate, potentially offsetting the patterns of reductive evolution germane to gut adaptation.

Between these two processes, genomes of gut associated AAB have become rapidly divergent from closely related environmental isolates. While both AAB strains illustrate consequences resulting from adaptation to the gut niche, Lactobacillus isolate BBS2 likely represents ecological factors priming colonization of the gut environment. A seemingly more promiscuous associate [61, 66, 130], BBS2 does not exhibit any distinct properties from environmental isolates using standard 16S based assessments. However, phylogenomic analysis demonstrated a clear distinction from environmental isolates

105

(Figure 14). Furthermore, the increase (via HGT) in biosynthetic capacity as well as in defensive and adhesive mechanisms has likely facilitated persistent colonization of the gut niche. The level of genomic divergence seen in BBS2 may be representative of early stages of symbiotic adaptation, though we are not able to discern the definitive cause of divergence. Together, our results delineate unique positions on the spectrum of symbiotic associations and illustrate processes facilitating colonization of and adaptation to the gut environment by phylogenetically diverse bacterial taxa.

3.4 Methods

3.4.1 Ant collection and sample preparation

Ants colonies were collected from Duke Forest, Durham, NC and Durant Nature

Preserve, Raleigh NC during October and November 2016. Based on the prevalence of

AAB1 and AAB2 among the worker caste [9], only minor and major workers were collected from each colony. All specimens were collected alive and transported to Duke

University immediately. Prior to downstream processing, specimens were anesthetized by placing in 15mL test tubes and immersing in ice. For all colonies, vouchers (in the form of whole samples or dissected remains) are available upon request.

Prior to dissection, all anesthetized ants were surface sterilized. Samples were sterilized by immersion in 100% ethanol, followed by a 60 second soak in a 5% bleach

106

solution, and finally rinsed in sterile water. Upon surface sterilization, the entire alimentary tract (fore-, mid-, and hindgut) was dissected out under microscope.

Dissected tissue samples were pooled in sterile PBS and held on ice until homogenization. Pooled tissue samples were lightly homogenized with a plastic pestle and handheld homogenizer.

3.4.2 gDNA extraction and PCR screens

gDNA was extracted from tissue samples and gram negative bacterial colonies using the Qiagen (Hilden, Germany) DNEasy kit. We used the standard protocol for animal tissue for dissected gut tracts and the recommended gram-negative pretreatment for cell cultures of acetic acid bacteria. For isolates of Lactobacillus, we used the MoBio

(Hilden, Germany) UltraClean Microbial DNA Isolation kit following the standard protocol.

Prior to isolate plating and growth in liquid culture, bacterial colonies were screened using primers specific to 16S rRNA of AAB1 and AAB2, as described previously [9]. Colony PCR was performed using the Phusion High-Fidelity Master Mix with HF Buffer (NEB). PCR programs were held at 98°C for 30 seconds to denature the

DNA, followed by 30 cycles of amplification proceeding at 98°C for 10s, 61°C for 30s,

107

and 72°C for 60s, a final extension phase was held at 72°C for 5 min, then cooled at 4°C.

Colonies with positive amplification were sequenced to confirm identity.

3.4.3 Bacterial isolate culturing

Prior to inoculation, all culture plates and broths were allowed to de-gas for 24 hours at room temperature in an anaerobic chamber with an environment consisting of

5% H, 5% CO2, 90% N. All bacterial cultures were sterilely plated, passaged, and/or inoculated under these conditions, then placed in a GasPak EZ (BD) anaerobic container and transferred to an incubator held at 29’C under the same environment. Homogenized gut tracts suspended in sterile PBS were directly plated onto MRS agar (Hardy

Diagnostics) for the growth and selection of acetic acid bacteria and left to incubate for 2-

5 days, or until colonies appeared. Isolated strains of AAB2 were grown in MRS broth under the same anaerobic conditions and temperature until OD600 > 0.7. Though

Lactobacillus grow well on MRS agar, we isolated the strain described herein by directly plating homogenized gut tracts suspended in sterile PBS onto Blood Agar plates (Hardy

Diagnostics) under the same conditions as above. Growth of Lactobacillus in liquid culture was performed under the same conditions as previously described and in GIFU

Anaerobic broth (Himedia) until OD600 > 0.7. Glycerol stocks of all cultures were diluted to 25% glycerol and stored at -80’C. Liquid cultures were centrifuged at 5000xg for 10

108

minutes and gDNA was extracted from cell pellets as described above. Approximately

15ug of gDNA from each sample was used for full genome sequencing.

3.4.4 Bacterial isolate sequencing, assembly, and analysis

Bacterial isolates were sequenced on a PacBio RSII (Pacific Biosciences), utilizing one SMRT cell per isolate. Large insert library preparation and sequencing was performed by the Genomic and Computational Biology Core Facility at Duke

University. De novo genome assembly was performed using the HGAP assembler [141] and resulted in fully closed genomes for all strains, as well as variable coverage of various plasmid sequences. Full genomic data is available in Table 1.

Closed bacterial genomes were uploaded to the Joint Genome Institute (JGI)

Integrated Microbial Genomes and Microbiomes (IMG) analysis server and database.

Gene calling and annotation were performed using the JGI Microbial Genome

Annotation Pipeline [142]. Whole genome alignment was performed using the progressiveMauve algorithm within Mauve [132]. COG function, Pfam assignment, and

KEGG pathway reconstruction were performed using IMG’s automated toolkit. All statistical analyses were performed in R. The vegan, ape, and ggplot2 packages were used for data manipulation and visualization [111, 126, 143]

109

3.4.5 Phylogenomic inference

Homologs shared among all taxa were determined using IMG’s analysis pipeline. Single copy orthologs were identified using OrthoMCL [144], and amino acid guided nucleotide alignments were performed using MUSCLE [145] within the

TranslatorX framework [146]. Ambiguously aligned regions were removed using

Gblocks [147]. We used RAxML [86] for maximum likelihood based phylogenomic inference, employing a General Time Reversible (GTR) model with a Gamma (Γ) distribution of rate heterogeneity [87] and proportion of invariant site estimation (PInv).

Support for node placement was generated via 1000 bootstrap replications.

3.4.6 Determination of horizontally transferred genes

We used the standard protocol [148] of the IMG comparative analysis system to detect horizontally transferred genes. Translated nucleotide sequences were queried against NCBI’s RefSeq resource [149] and GenBank [150] using default parameters

(Expect threshold: 1e-5) and the best BLAST hit was chosen.

110

4. Discussion and future directions

Taken together, the results of experiments described herein contribute to the elucidation of ecological and evolutionary factors shaping animal-bacterial symbioses.

This dissertation establishes a novel system and framework to test various hypotheses about the causes and consequences of symbioses on bacterial associates. Historically, ant-bacterial symbioses have been largely focused on interactions with obligate endosymbionts [14, 20]. Due to the unique nature of those relationships, they offers vast insights into conflicts in mutualisms [151] and evolutionary outcomes for ancient associations [152]. However, being a binary system, questions about interactions between hosts and heterogeneous communities remained opaque.

The implications of results presented in chapter 1 establish a novel system with a potentially worldwide distribution in which one can test hypotheses about the nature of interactions in the gut. I characterize novel and distinct lineages of AAB that co-occur in various species of ants and form a deeply divergent clade. I establish that these associates are persistent and widespread across ant tribes. Unlike Opitutales symbionts in the Cephalotes gut microbiota [45], I delineate rapid evolutionary rates at 16S rRNA and significant destabilization of its structure [9]. Taken together, these results establish a unique model of host-microbe relationships in the context of extracellular gut associates.

111

The implications of research presented in chapter 2 address mechanisms of bacterial exchange between interacting hosts. Whereas vertebrate gut communities are generally complex and vary with the influx of food [153, 154], the gut microbiota of social insects appears to be vertically inherited from nestmates. Compositional signals inherited from queens or workers are the primary determinant of gut community structure, with acquisition of environmental isolates during development increasing community complexity but not overriding inherited compositional signals. This level of community wide stability is unique to insects, likely due to social transmission of bacterial associates. Social interactions such as trophallaxis, the sharing of nutrients and bacteria via mouth-to-mouth or anus-to-mouth exchange, likely represents a conduit for the reliable transmission of gut associates between colony members. Additionally, I develop a novel tool for interrogating the composition of the gut microbiome in this system. Intracellular symbionts, such as Blochmannia in Camponotus, and Wolbachia across several orders, vastly outnumber extracellular gut associates in absolute copy number of

16S rRNA. Accordingly, 16S based community profiles are incredibly difficult to assess deeply, as the majority of sequencing effort is consumed by endosymbiotic DNA. The development of a Wolbachia-specific PNA clamp is a technical innovation that may help to elucidate gut communities across various hosts infected by related Wolbachia strains.

112

The implications of conclusions discussed in chapter 3 elucidate dynamic forces shaping genomic evolution in gut symbionts. Traditionally, gut communities have been analyzed in the context of community profiles measured by 16S rRNA sequencing [11,

43, 55, 61, 155]. However, 16S rRNA can drastically underestimate genomic divergence, as ribosomal RNA evolves orders of magnitude more slowly than protein coding sequences [71]. Therefore, I perform whole genome sequencing of bacterial isolates of taxa identified in chapter 1 as being abundant and persistent across distinct host species.

I successfully culture multiple strains of the symbiotic bacterium AAB2, as well as an isolate of LAB that has also been documented in various host species. Unlike AAB2, the

LAB isolate does not show the typical characteristics of gut associates, as described previously. Rather, it appears to be a mostly environmental isolate that occasionally inhabits insect guts. Despite the similarity to environmental LAB, the genome of strain

BBS2 shows surprisingly high levels of divergence and has acquired genes that may facilitate colonization of the gut tract. I delineate rampant horizontal gene transfer in the genomes of both AAB2 isolates that serve to rescue large scale genomic erosion, also likely from persistent habitation of the gut. These results characterize non-uniform patterns of gene acquisition via HGT that accumulate mostly in COGs associated with energy production and conversion, leading to a mosaic central metabolism. Finally, I propose a functional link in the symbioses by delineating significant enrichment of L-

113

lysine transporters encoded by chromosomal DNA in both AAB isolates. Nutritional upgrading has been documented in obligate endosymbionts [16], and metabolic diversification by gut microbiota has long been appreciated [156, 157]. However, these data are the first to illustrate how rampant HGT may lead to adaptive evolution of gut bacteria.

Future directions in this system could focus on generating monocultures of

AAB1, or pursue cell sorting for isolation and characterization of this associate. My work demonstrates the cultivability of AAB2 and, based on predictions from the evolutionary theory of gut associates, isolates of AAB1 should be able to survive and reproduce outside of the host as well. Of course, the ability to survive outside of a host does not guarantee cultivability, but further efforts with different media and conditions may yield promising results. If cultivability is achieved, then myriad experiments to delineate larger evolutionary trends and ecological adaptation may become available. Cross host species comparison of isolates of each lineage may reveal parallel evolutionary patterns and provide insights into the evolutionary histories of these unique symbionts. Genome- wide analyses of selection will reveal genomic targets of adaptation to the gut and will foster hypothesis about selective regimes inherent to gut association. Furthermore, symbiont clearing and reinoculation may provide mechanistic insights into the functionality of this relationship and illustrate the host fitness consequences associated

114

with these associates. Comparative transcriptomics of host gut tissue would provide insights into host-mediated mechanisms of communication and interaction with persistent gut associates. Furthermore, symbiont transplantation experiments across host species would elucidate levels of host specialization, if present. Additionally, further efforts could focus on screening the physical colony environment and common sources of nutrition for Camponotine ants, to elucidate or eliminate sources of AAB acquisition.

In concert, the data presented here establish a novel model system for testing hypothesis about symbiotic interactions in the gut and illustrate factors shaping the ecology and evolution of gut associates. These results represent advancements in the fields of microbiology, symbiosis research, and genomic evolution, and may serve as benchmarks for future exploration.

115

Appendix A: supplemental information for chapter 1

Estimation of empirical error rate for Ion PGM amplicon sequencing.

In light of concerns about high error rates for amplicon data generated on the Ion

PGM [158], we determined the empirical error rate for Ion PGM amplicon sequencing by including a sample from a pure culture of the fully sequenced E. coli strain in our library construction and Ion PGM sequencing run.

Methods:

We used a pure culture from a single colony of Escherichia coli K12 substrain

DH10B (TOP10) competent cells (Life Technologies), a line which has been fully sequenced [159], in order to determine sequencing error rates empirically. We prepared gDNA, digested the gDNA and resultant amplicons with PacI, created a barcoded library (as described above), and sequenced the library during the same sequencing run as all other samples. We then aligned reads against the known 16S rRNA sequence for this strain. We assessed the platform’s proclivity for indel and substitution errors and empirically calculated the per-base error rate. Any observed errors in the E. coli sample reflect those introduced during both PCR and Ion PGM sequencing, which was performed exactly as for our experimental (ant) samples.

116

All E. coli reads were stripped of barcodes and forward and reverse primers, and were aligned to the 16S rDNA gene of the K12 substrain DH10B [GenBank:CP000948] using the default parameters for –very-sensitive-local alignments in Bowtie2 [160].

Samtools was used to process all alignment files and create the variant-identified .pileup files [161]. Variants were analyzed using Varscan 2.2.3 [162].

Results:

Based on aligning all resulting E. coli reads >50 bp long (2,676 reads) to the known 16S rDNA sequence of this strain (one operon along V4-V5), we estimated the empirical error frequency at 0.00061 base substitution errors per site, and 0.006 indel errors per site (Table 4). We evaluated the frequency of both error types (base substitutions and indels) across reads. While most errors were unique, some errors were abundant in the dataset, with a maximal read frequency of 0.022 for a given substitution error and a maximal read frequency of 0.624 for a given indel error. This high incidence likely reflects a systematic error profile inherent to Ion Torrent pyrosequencing, as previously suggested [158].

We found that length-based thresholds (universal trimming of reads to 360 bp) followed by run-specific expected error filtering [81] proved to be the most effective way to reduce sequencing errors. In addition to mitigating OTU inflation derived from

117

terminal alignment gaps [81], this strategy truncates the section of pyrosequencing reads with the highest error incidence [163]. For 400bp Ion PGM data, a precipitous increase in error rate characteristically occurs around flow 380 [164]. Accordingly, our approach reduced the total per-base error rate by an order of magnitude as compared to more standard length- and average Q score-based filtering strategies, while retaining up to 4- fold more reads than said strategies (Table 4). Our parameters for error filtering (1.5 expected errors per read) were experiment-specific and selected on the ability to maximize reads and successfully call a single OTU in the E. coli dataset.

118

Table 4: Empirical estimation of error rates for E. coli amplicon library. Statistics are for raw reads, filtered and trimmed as listed. All quality values are listed as Phred quality scores. Variant bases were identified using VarScan. N50, the sequence read length at which > 50% of all bases reside in sequence reads longer than the listed value. N95, the sequence read length at which > 95% of all bases reside in sequence reads longer than the listed value.

Length Stats1 Filtering Criteria Reads surviving filter N50 N95 Avg. length Total base pairs minimum 50bp reads2 2,676 374 184 274.46 733,911 minimum 300bp reads 1,831 375 371 373.41 683,708 minimum 300bp/Q20 reads3 215 375 369 372.1 80,001 minimum 300bp/Q163 724 374 371 371.72 269,122 minimum 400bp reads 1,765 375 372 374.88 661,657 minimum 360bp/ maximum expected error 1.5 894 360 360 360 321,840

Quality Stats % bases > Q20 % bases > Q30 Min Q Max Q Avg. Q minimum 50bp reads 87.04 57.28 5 45 28.19 minimum 300bp reads 89.41 60.31 5 45 28.71 minimum 300bp/Q20 reads 95.93 69.82 7 45 30.17 minimum 300bp/Q16 92.31 65.37 7 45 29.53 minimum 400bp reads 89.75 60.87 6 45 28.79 minimum 360bp/ maximum expected error 1.5 94.71 69.95 7 45 30.08

Error Stats Filtering Criteria Total Substitutions Total Indels Total errors Max indel freq./site Max subs. freq./site minimum 50bp reads 445 4,568 5013 0.6239 0.0219 minimum 300bp reads 405 3,505 3910 0.572 0.0231 minimum 300bp/Q20 reads 42 154 196 0.2698 0.0093 minimum 300bp/Q16 158 872 1030 0.5104 0.0042 minimum 400bp reads 340 3,219 3559 0.5655 0.0054 minimum 360bp/ maximum expected error 1.5 118 58 176 0.4518 0.0054

Error Stats cont. Per-base indel error rate Per-base substitution error rate Per-base combined error rate minimum 50bp reads 0.006 0.001 0.007 minimum 300bp reads 0.005 0.001 0.006 minimum 300bp/Q20 reads 0.002 0.001 0.002 minimum 300bp/Q16 0.003 0.001 0.004 minimum 400bp reads 0.005 0.001 0.005 minimum 360bp/ maximum expected error 1.5 0.000 0.000 0.001

1All values are given for reads after barcode and primer removal 2Retained only those reads 50bp or longer, etc. 3Quality filtering based on maintaining the average listed Q score for 10bp sliding window along read

Table 5: Sampling structure of collected colonies

119

Location Durant Nature Park* Durant Nature Park* Durant Nature Park* Colony Number 2 4 6 Colony ID 789 791 793 GPS coordinate N35°53'31.4 W078°35'09.3 N35°53'31.9 W078°35'10.8 N35°53'31.8 W078°35'09.7 Samples gaster replicate 1 (3 gasters) gaster replicate 1 (5 gasters) gaster replicate 1 (5 gasters) gaster replicate 2 (3 gasters) gaster replicate 2 (5 gasters) gaster replicate 2 (5 gasters) gaster replicate 3 (3 gasters) gaster replicate 3 (5 gasters) gaster replicate 3 (5 gasters) Queen gaster Queen gaster

Location Duke Forest** Duke Forest** Duke Forest** Colony Number 8 9 10 Colony ID 798 799 800 GPS coordinate N36°00'50.7 W078°59'23.5 N36°00'50.7 W078°59'23.3 N/A, within 200m of 799 Samples gaster replicate 1 (5 gasters) gaster replicate 1 (5 gasters) gaster replicate 1 (3 gasters) gaster replicate 2 (5 gasters) gaster replicate 2 (5 gasters) gaster replicate 2 (3 gasters) gaster replicate 3 (5 gasters) gaster replicate 3 (5 gasters) gaster replicate 3 (3 gasters) Queen gaster Queen gaster

* Approximate address: 8305 Camp Durant Rd, Raleigh, NC 27614, United States ** Approximate address: 3900 Kerley Rd, Durham, NC 27705, United States

120

Table 6: Distribution of AAB across ant samples. (a) Result of AAB-specific PCR screen across Camponotus samples from distinct locations. (b) Ant-associated AABs detected in previous studies, with references to the original study.

121

Figure 15: Bootstrap consensus tree based on maximum likelihood analysis of Acetobacteraceae 16S rDNA. Phylogeny is based on the same 1,248 bp alignment that was used for Bayesian analysis shown in Figure 1. The tree was reconstructed using a GTR + Γ model of nucleotide substitution. Node support was generated from 1,000 bootstrap resamplings. Environmental taxa are colored black, taxa associated with various insects are colored in orange, and the monophyletic AAB clade described here is colored green. Roseomonas terrae is the outgroup.

122

Roseomonas terrae Acidiphilium rubrum 1 Acidiphilium cryptum 1 Acidiphilium organovorum Acidocella aluminiidurans 1 1 Acidocella aminolytica Gluconacetobacter liquefaciens 0.89 Gluconacetobacter rhaeticus, Drosophila melanogaster 1 Gluconacetobacter europaeus 0.65 1 Gluconacetobacter oboediens 0.54 Gluconacetobacter intermedius, Drosophila melanogaster Asaia platycodi Asaia siamensis 1 1 Asaia astilbis Asaia bogorensis Gluconobacter cerinus 1 Gluconobacter japonicus 1 Gluconobacter kanchanaburiensis 0.92 1Gluconobacter albidus 1 Gluconobacter oxydans, Saccharicoccus sacchari 0.85 Gluconobacter kondonii Gluconobacter morbifer, Drosophila melanogaster Acetobacter pasteurianus 1 0.99 Acetobacter pomorum, Drosophila melanogaster 0.75 Acetobacter aceti 0.98 Acetobacter indonesiensis 1 Acetobacter cerevisiae 1 0.97 Acetobacter malorum, Drosophila melanogaster uncultured Acetobacteraceae bacterium, Atta colombica 1 uncultured Acetobacteraceae bacterium, Atta laevigata

1 uncultured Acetobacteraceae bacterium, Formica exsecta 1 AAB2 Camponotus chromaiodes 0.98 uncultured Acetobacteraceae bacterium, Formica exsecta 1 uncultured Acetobacteraceae bacterium, Camponotus japonicus 1 uncultured Acetobacteraceae bacterium, Camponotus castaneus 0.6 AAB1 Camponotus chromaiodes

0.3

Figure 16: Bayesian phylogeny of Acetobacteraceae 16S rDNA. Phylogeny is based on the same 1,333 bp alignment that was used for maximum likelihood analysis shown in Figure 2. A Markov chain Monte Carlo approximation was used for Bayesian inference of phylogenetic relationships. A GTR + Γ + PInv model of nucleotide substitution was implemented and the posterior probability of each node was estimated from 10,000,000 generations sampled in intervals of 1,000. Environmental taxa are colored black, taxa associated with various insects are colored in orange, and the monophyletic AAB clade described here is colored green. Roseomonas terrae is the outgroup.

123

Table 7: 16S rRNA stability and AT content, across bacterial symbionts with distinct transmission modes

124

Table 8: Primers used in V4-V5 amplicon sequencing.

125

Table 9: Genbank accession numbers of all taxa used in phylogenetic analyses.

126

Appendix B: supplemental information for chapter 2

30

812 821 831 20 832 833 834 D

P 836 838 839 845 847 848 10 849

0

Figure 17: Faith’s PD across all colonies sampled.

127

Colony 1 812 821 831 832 833 834 836 838

] 0 839 % 7 .

3 845 [

847 2 P

A 848 C 849

SubCaste Queen −1 L1 L2 P1 P2 minor

−2

−1 0 1 2 CAP1 [9.8%]

Figure 18: CAP ordination of Unifrac distance constrained by colony identity.

128

2

Colony 812 821 831 832 1 833 834 836 838

] 839 % 9 .

6 845 [

847 2 P

A 848 C 0 849

SubCaste Queen L1 L2 P1 P2 minor −1

0 1 2 CAP1 [12.9%]

Figure 19: CAP ordination of JSD constrained by colony identity.

129

812 821 831 0.0 832 833 834 836 838 ]

% 839 3 .

6 845 1 [

847 2 . s i 848 x

A −0.2 849

Queen L1 L2 P1 P2 minor −0.4

−0.2 0.0 0.2 0.4 Axis.1 [21.8%]

Figure 20: Unconstrained Principle Coordinate Analysis (PCoA) ordination on JSD

130

0.4

812 821 831 832 833 834 836 0.2 838

] 839 % 7

[ 845

2

. 847 s i x 848 A 849

Queen 0.0 L1 L2 P1 P2 minor

−0.2

−0.2 0.0 0.2 0.4 Axis.1 [27.2%]

Figure 21: Unconstrained Principle Coordinate Analysis (PCoA) ordination on Unifrac distance.

131

Color Key Heatmap example

0 0.05 0.1 0.15 0.2 Value

105 14 6 86 82 83 88 71 57 92 91 69 70 93 43 40 41 39 42 85 89 87 104 5 63 64 65 72 79 77 76 78 75 4 23 21 60 62 58 59 52 46 50 84 11 56 48

19 s 24

66 e 68 l 67 12 p 61 2 3 m

1 a 22 51 74 S 8 55 47 73 20 7 80 81 106 53 10 25 94 96 95 90 15 108 54 44 17 16 107 45 29 97 28 98 103 49 18 26 33 31 100 30 37 38 32 35 36 34 99 27 13 9 6 7 9 1 9 7 6 5 1 1 3 8 0 8 7 0 2 6 6 8 4 6 1 4 5 2 3 8 3 1 5 5 0 2 2 7 9 9 9 0 9 0 3 2 5 1 0 3 1 6 6 2 4 8 3 1 1 3 5 8 9 3 5 4 7 8 5 6 9 0 1 5 9 7 6 9 3 8 8 1 1 9 1 4 6 7 8 8 7 6 9 9 0 6 2 7 1 3 5 1 5 5 9 4 2 2 1 2 0 3 3 3 5 5 9 5 9 5 6 9 8 3 0 1 6 7 6 0 0 2 3 8 3 9 5 5 7 3 1 6 8 1 4 8 8 8 9 9 4 1 3 2 3 8 5 5 4 1 1 6 0 5 3 3 2 8 9 1 2 9 8 1 8 3 6 3 4 8 0 0 6 3 1 8 0 3 8 4 9 4 7 2 2 0 7 2 2 9 4 4 0 5 9 3 9 8 9 3 0 7 9 3 6 4 2 3 7 8 1 9 9 3 5 4 5 4 6 1 7 3 2 9 5 2 2 2 7 1 3 3 7 0 7 6 2 1 0 8 7 9 6 3 9 8 2 0 8 8 9 6 2 1 1 7 1 1 0 3 6 0 0 2 4 4 0 7 9 2 2 9 3 0 3 4 2 2 3 8 5 7 5 2 9 5 7 0 9 5 1 3 2 5 0 1 6 6 1 8 7 0 0 2 2 0 6 6 8 7 9 2 9 8 9 2 1 7 7 0 5 9 3 1 5 6 3 6 5 0 5 5 9 1 0 7 1 8 7 0 4 9 5 7 2 4 2 5 5 6 0 0 3 7 3 3 0 6 6 5 3 1 0 2 0 3 0 1 6 0 2 6 2 8 5 0 4 3 9 7 9 2 7 1 9 7 2 0 2 4 7 6 8 1 9 6 3 7 2 5 2 9 1 5 4 0 8 5 8 4 2 7 7 8 8 7 1 2 6 9 3 6 5 2 8 2 5 5 8 3 2 9 3 5 3 4 2 4 2 0 0 8 6 2 5 1 5 7 5 6 8 8 8 4 3 1 1 5 3 3 5 6 4 4 5 0 4 5 3 0 7 7 0 7 8 8 9 3 5 9 6 0 7 5 5 0 8 8 6 8 8 6 5 8 7 8 3 1 1 8 6 8 6 3 8 7 1 6 0 2 9 5 2 6 1 7 8 1 7 8 7 7 2 0 3 8 6 3 7 7 0 5 8 1 2 7 1 7 9 4 5 3 1 5 0 9 1 9 6 8 4 3 3 3 3 7 2 9 8 3 0 1 5 2 5 8 8 3 4 3 6 9 9 8 8 8 7 5 1 0 1 6 9 4 7 7 6 9 0 4 1 9 1 9 0 5 2 2 9 1 1 0 7 2 0 3 7 6 5 9 1 6 0 4 5 7 4 4 2 9 8 9 3 2 0 6 3 8 5 6 1 1 9 8 2 1 3 4 1 6 7 0 4 3 6 1 0 2 2 0 3 8 0 5 4 9 4 3 6 2 7 6 4 5 4 9 7 6 6 0 8 3 6 7 0 1 4 4 5 2 0 5 2 5 4 9 6 0 2 6 3 5 3 2 5 8 0 6 4 0 4 0 7 3 8 8 4 5 2 3 5 2 2 6 1 6 6 1 3 1 0 4 2 7 5 2 8 0 7 5 5 8 7 4 3 8 4 4 6 6 5 7 8 6 5 1 9 2 1 2 0 7 9 3 3 4 2 3 8 8 0 5 6 0 1 1 1 5 0 6 0 6 1 1 9 2 7 0 3 1 8 2 4 3 0 4 8 2 9 1 3 3 4 0 7 0 8 8 0 4 0 8 7 8 9 4 7 7 6 8 6 3 3 1 7 5 7 4 4 6 2 2 4 6 0 0 7 7 8 3 9 8 2 7 6 3 3 8 7 2 3 1 7 0 7 0 3 8 1 8 1 0 8 2 5 1 1 3 9 6 3 0 6 9 5 5 6 3 4 8 5 1 7 7 6 4 7 6 3 3 6 5 2 8 4 8 5 4 4 2 5 9 8 3 4 5 5 0 6 3 8 2 1 7 7 8 2 7 7 8 1 6 5 2 0 6 4 5 5 8 5 0 9 2 4 5 1 5 5 3 8 5 1 8 2 0 6 6 5 1 2 2 8 3 3 5 0 2 0 6 5 0 3 1 1 2 0 7 6 9 7 0 9 2 0 0 7 U 6 8 3 9 5 8 6 9 5 2 4 4 8 4 2 6 2 6 9 8 1 9 8 9 6 3 6 1 2 2 7 5 9 3 9 5 7 3 2 4 5 7 3 1 9 1 4 5 5 1 8 9 1 1 1 5 2 9 2 5 6 9 3 8 6 1 3 7 3 3 4 3 3 9 1 8 3 2 8 6 2 9 4 8 4 8 3 4 9 1 5 7 8 7 7 3 5 8 1 5 6 9 2 1 3 9 3 U U U U U U U U U U U U U U T U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U U T T T T T T T T T T T T T T O T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T T O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O O

OTUs

Figure 22: Supervised OTU heatmap of all samples. Heatmap construction was supervised by UPGMA hierarchical cluster analysis. All rows are color coded by colony identity.

132

1.00

0.75

812 821 831 832 833

e 834 u l a 836 v 0.50 838 839 845 847 848 849

0.25

81 81 82 82 83 83 83 83 83 83 83 83 83 83 83 83 83 83 84 84 84 84 84 84 84 84 2.q 2.w 1.q 1.w 1.q 1.w 2.q 2.w 3.q 3.w 4.q 4.w 6.q 6.w 8.q 8.w 9.q 9.w 5.q 5.w 7.q 7.w 8.q 8.w 9.q 9.w it it it it it it it it it it it it it hin hin hin hin hin hin hin hin hin hin hin hin hin

Figure 23: Jaccard distance between queens and colonies

133

Table 10: Sample structure and design

Table 11: Core OTUs present across all P1 pupae

134

Appendix C: supplemental information for chapter 3

Figure 24: Glycolysis in BBS1 and BBS3

135

Figure 25: The pentose phosphate and Entner Duodoroff pathways in BBS1 and BBS3

136

Figure 26: The TCA cycle in BBS1 and BBS3.

137

Table 12: Genome characteristics of taxa used in phylogenomic trees

138

References

1. McFall-Ngai M, Hadfield MG, Bosch TC, Carey HV, Domazet-Loso T, Douglas AE, Dubilier N, Eberl G, Fukami T, Gilbert SF et al. Animals in a bacterial world, a new imperative for the life sciences. Proc Natl Acad Sci U S A. 2013; 110:3229- 3236.

2. Ley RE, Hamady M, Lozupone C, Turnbaugh PJ, Ramey RR, Bircher JS, Schlegel ML, Tucker TA, Schrenzel MD, Knight R et al. Evolution of mammals and their gut microbes. Science. 2008; 320:1647-1651.

3. Moeller AH, Caro-Quintero A, Mjungu D, Georgiev AV, Lonsdorf EV, Muller MN, Pusey AE, Peeters M, Hahn BH, Ochman H. Cospeciation of gut microbiota with hominids. Science. 2016; 353:380-382.

4. Moran NA, McCutcheon JP, Nakabachi A. Genomics and evolution of heritable bacterial symbionts. Annu Rev Genet. 2008; 42:165-190.

5. Salem H, Florez L, Gerardo N, Kaltenpoth M. An out-of-body experience: The extracellular dimension for the transmission of mutualistic bacteria in insects. Proc Biol Sci. 2015; 282:20142957.

6. Chen DQ, Montllor CB, Purcell AH. Fitness effects of two facultative endosymbiotic bacteria on the pea , , and the blue alfalfa aphid, a. Kondoi. Entomologia experimentalis et applicata. 2000; 95:315- 323.

7. Eckburg PB, Bik EM, Bernstein CN, Purdom E, Dethlefsen L, Sargent M, Gill SR, Nelson KE, Relman DA. Diversity of the human intestinal microbial flora. Science. 2005; 308:1635-1638.

8. Moran NA, Russell JA, Koga R, Fukatsu T. Evolutionary relationships of three new species of enterobacteriaceae living as symbionts of and other insects. Appl Environ Microbiol. 2005; 71:3302-3310. 139

9. Brown BP, Wernegreen JJ. Deep divergence and rapid evolutionary rates in gut- associated acetobacteraceae of ants. BMC Microbiol. 2016; 16:140.

10. Hosokawa T, Kikuchi Y, Nikoh N, Shimada M, Fukatsu T. Strict host-symbiont cospeciation and reductive genome evolution in insect gut bacteria. PLoS Biol. 2006; 4:e337.

11. Kikuchi Y, Meng XY, Fukatsu T. Gut symbiotic bacteria of the genus burkholderia in the broad-headed bugs riptortus clavatus and leptocorisa chinensis (heteroptera: Alydidae). Appl Environ Microbiol. 2005; 71:4035-4043.

12. Moran NA, Hansen AK, Powell JE, Sabree ZL. Distinctive gut microbiota of honey bees assessed using deep sampling from individual worker bees. PloS one. 2012; 7:e36393.

13. Blochmann F: Ueber das regelmässige vorkommen von bakterienähnlichen gebilden in den geweben und eiern verschiedener insecten; 1888.

14. Degnan PH, Lazarus AB, Brock CD, Wernegreen JJ. Host-symbiont stability and fast evolutionary rates in an ant-bacterium association: Cospeciation of camponotus species and their endosymbionts, candidatus blochmannia. Syst Biol. 2004; 53:95-110.

15. Moran NA, Munson MA, Baumann P, Ishikawa H. A molecular clock in endosymbiotic bacteria is calibrated using the insect hosts. Proceedings of the Royal Society of London B: Biological Sciences. 1993; 253:167-171.

16. Feldhaar H, Straka J, Krischke M, Berthold K, Stoll S, Mueller MJ, Gross R. Nutritional upgrading for omnivorous carpenter ants by the endosymbiont blochmannia. BMC Biol. 2007; 5:48.

17. Schroder D, Deppisch H, Obermayer M, Krohne G, Stackebrandt E, Holldobler B, Goebel W, Gross R. Intracellular endosymbiotic bacteria of camponotus species

140

(carpenter ants): Systematics, evolution and ultrastructural characterization. Mol Microbiol. 1996; 21:479-489.

18. McCutcheon JP, Moran NA. Extreme genome reduction in symbiotic bacteria. Nat Rev Microbiol. 2012; 10:13-26.

19. Shigenobu S, Watanabe H, Hattori M, Sakaki Y, Ishikawa H. Genome sequence of the endocellular bacterial symbiont of aphids buchnera sp. Aps. Nature. 2000; 407:81-86.

20. Wernegreen JJ, Lazarus AB, Degnan PH. Small genome of candidatus blochmannia, the bacterial endosymbiont of camponotus, implies irreversible specialization to an intracellular lifestyle. Microbiology. 2002; 148:2551-2556.

21. Gruwell ME, Morse GE, Normark BB. Phylogenetic congruence of armored scale insects (hemiptera: Diaspididae) and their primary endosymbionts from the phylum . Mol Phylogenet Evol. 2007; 44:267-280.

22. Kaiwa N, Hosokawa T, Nikoh N, Tanahashi M, Moriyama M, Meng XY, Maeda T, Yamaguchi K, Shigenobu S, Ito M et al. Symbiont-supplemented maternal investment underpinning host's ecological adaptation. Curr Biol. 2014; 24:2465- 2470.

23. Kwong WK, Engel P, Koch H, Moran NA. Genomics and host specialization of and bumble bee gut symbionts. Proc Natl Acad Sci U S A. 2014; 111:11509-11514.

24. Wenseleers T, Ito F, Van Borm S, Huybrechts R, Volckaert F, Billen J. Widespread occurrence of the micro-organism wolbachia in ants. Proc Biol Sci. 1998; 265:1447- 1452.

141

25. He H, Wei C, Wheeler DE. The gut bacterial communities associated with lab- raised and field-collected ants of camponotus fragilis (formicidae: Formicinae). Current microbiology. 2014:1-11.

26. Johansson H, Dhaygude K, Lindstrom S, Helantera H, Sundstrom L, Trontti K. A metatranscriptomic approach to the identification of microbiota associated with the ant formica exsecta. PLoS One. 2013; 8:e79777.

27. Lanan MC, Rodrigues PA, Agellon A, Jansma P, Wheeler DE. A bacterial filter protects and structures the gut microbiome of an insect. ISME J. 2016; 10:1866- 1876.

28. Kwong WK, Medina LA, Koch H, Sing KW, Soh EJY, Ascher JS, Jaffe R, Moran NA. Dynamic microbiome evolution in social bees. Sci Adv. 2017; 3:e1600513.

29. Kwong WK, Moran NA. Evolution of host specialization in gut microbes: The bee gut as a model. Gut Microbes. 2015; 6:214-220.

30. Warnecke F, Luginbuhl P, Ivanova N, Ghassemian M, Richardson TH, Stege JT, Cayouette M, McHardy AC, Djordjevic G, Aboushadi N et al. Metagenomic and functional analysis of hindgut microbiota of a wood-feeding higher termite. Nature. 2007; 450:560-565.

31. Takatsuka J, Kunimi Y. Intestinal bacteria affect growth of bacillus thuringiensis in larvae of the oriental tea tortrix, homona magnanima diakonoff (lepidoptera: Tortricidae). J Invertebr Pathol. 2000; 76:222-226.

32. Lize A, McKay R, Lewis Z. Kin recognition in drosophila: The importance of ecology and gut microbiota. ISME J. 2014; 8:469-477.

33. Yoshida N, Oeda K, Watanabe E, Mikami T, Fukita Y, Nishimura K, Komai K, Matsuda K. Protein function. Chaperonin turned insect toxin. Nature. 2001; 411:44.

142

34. Giraud A, Matic I, Tenaillon O, Clara A, Radman M, Fons M, Taddei F. Costs and benefits of high mutation rates: Adaptive evolution of bacteria in the mouse gut. Science. 2001; 291:2606-2608.

35. Crotti E, Rizzi A, Chouaia B, Ricci I, Favia G, Alma A, Sacchi L, Bourtzis K, Mandrioli M, Cherif A et al. Acetic acid bacteria, newly emerging symbionts of insects. Appl Environ Microbiol. 2010; 76:6963-6970.

36. Ceja-Navarro JA, Nguyen NH, Karaoz U, Gross SR, Herman DJ, Andersen GL, Bruns TD, Pett-Ridge J, Blackwell M, Brodie EL. Compartmentalized microbial composition, oxygen gradients and nitrogen fixation in the gut of odontotaenius disjunctus. ISME J. 2014; 8:6-18.

37. Chouaia B, Gaiarsa S, Crotti E, Comandatore F, Degli Esposti M, Ricci I, Alma A, Favia G, Bandi C, Daffonchio D. Acetic acid bacteria genomes reveal functional traits for adaptation to life in insect guts. Genome Biol Evol. 2014; 6:912-920.

38. Sudakaran S, Salem H, Kost C, Kaltenpoth M. Geographical and ecological stability of the symbiotic mid-gut microbiota in european firebugs, pyrrhocoris apterus (hemiptera, pyrrhocoridae). Mol Ecol. 2012; 21:6134-6151.

39. Engel P, Martinson VG, Moran NA. Functional diversity within the simple gut microbiota of the honey bee. Proc Natl Acad Sci U S A. 2012; 109:11002-11007.

40. Feldhaar H. Bacterial symbionts as mediators of ecologically important traits of insect hosts. Ecological Entomology. 2011; 36:533-543.

41. Jing X, Wong AC, Chaston JM, Colvin J, McKenzie CL, Douglas AE. The bacterial communities in plant phloem-sap-feeding insects. Molecular Ecology. 2014; 23:1433-1444.

42. McFall-Ngai M. Adaptive immunity: Care for the community. Nature. 2007; 445:153.

143

43. Engel P, Moran NA. The gut microbiota of insects - diversity in structure and function. FEMS Microbiol Rev. 2013; 37:699-735.

44. Koch H, Abrol DP, Li J, Schmid-Hempel P. Diversity and evolutionary patterns of bacterial gut associates of corbiculate bees. Molecular ecology. 2013; 22:2028- 2044.

45. Sanders JG, Powell S, Kronauer DJ, Vasconcelos HL, Frederickson ME, Pierce NE. Stability and phylogenetic correlation in gut microbiota: Lessons from ants and apes. Mol Ecol. 2014; 23:1268-1283.

46. Martinson VG, Moy J, Moran NA. Establishment of characteristic gut bacteria during development of the honeybee worker. Appl Environ Microbiol. 2012; 78:2830-2840.

47. Moran NA. Accelerated evolution and muller's rachet in endosymbiotic bacteria. Proc Natl Acad Sci U S A. 1996; 93:2873-2878.

48. Ashbolt NJ, Inkerman PA. Acetic acid bacterial biota of the pink sugar cane mealybug, saccharococcus sacchari, and its environs. Appl Environ Microbiol. 1990; 56:707-712.

49. Roh SW, Nam YD, Chang HW, Kim KH, Kim MS, Ryu JH, Kim SH, Lee WJ, Bae JW. Phylogenetic characterization of two novel commensal bacteria involved with innate immune homeostasis in drosophila melanogaster. Appl Environ Microbiol. 2008; 74:6171-6177.

50. Anderson KE, Russell JA, Moreau CS, Kautz S, Sullam KE, Hu Y, Basinger U, Mott BM, Buck N, Wheeler DE. Highly similar microbial communities are shared among related and trophically similar ant species. Molecular Ecology. 2012; 21:2282-2296.

144

51. Kautz S, Rubin BE, Russell JA, Moreau CS. Surveying the microbiome of ants: Comparing 454 pyrosequencing with traditional methods to uncover bacterial diversity. Appl Environ Microbiol. 2013; 79:525-534.

52. Buchner P: Endosymbiosis of animals with plant microorganisms: Interscience Publishers; 1965.

53. Jeyaprakash A, Hoy MA. Long pcr improves wolbachia DNA amplification: Wsp sequences found in 76% of sixty-three species. Insect Mol Biol. 2000; 9:393-405.

54. Frydman HM, Li JM, Robson DN, Wieschaus E. Somatic stem cell niche tropism in wolbachia. Nature. 2006; 441:509-512.

55. Wong AC, Chaston JM, Douglas AE. The inconstant gut microbiota of drosophila species revealed by 16s rrna gene analysis. ISME J. 2013; 7:1922-1932.

56. Turner S, PRYER KM, MIAO VP, PALMER JD. Investigating deep phylogenetic relationships among and plastids by small subunit rrna sequence analysis1. Journal of Eukaryotic Microbiology. 1999; 46:327-338.

57. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, Turner P, Parkhill J, Loman NJ, Walker AW. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol. 2014; 12:87.

58. Cole JR, Wang Q, Fish JA, Chai B, McGarrell DM, Sun Y, Brown CT, Porras- Alfaro A, Kuske CR, Tiedje JM. Ribosomal database project: Data and tools for high throughput rrna analysis. Nucleic Acids Res. 2014; 42:D633-642.

59. Corby-Harris V, Snyder LA, Schwan MR, Maes P, McFrederick QS, Anderson KE. Origin and effect of alpha 2.2 acetobacteraceae in honey bee larvae and description of parasaccharibacter apium gen. Nov., sp. Nov. Appl Environ Microbiol. 2014; 80:7460-7472.

145

60. Lanfear R. The local-clock permutation test: A simple test to compare rates of molecular evolution on phylogenetic trees. Evolution. 2011; 65:606-611.

61. Li X, Nan X, Wei C, He H. The gut bacteria associated with camponotus japonicus mayr with culture-dependent and dgge methods. Curr Microbiol. 2012; 65:610-616.

62. Aylward FO, Burnum KE, Scott JJ, Suen G, Tringe SG, Adams SM, Barry KW, Nicora CD, Piehowski PD, Purvine SO et al. Metagenomic and metaproteomic insights into bacterial communities in leaf-cutter ant fungus gardens. ISME J. 2012; 6:1688-1701.

63. Liberti J, Sapountzis P, Hansen LH, Sorensen SJ, Adams RM, Boomsma JJ. Bacterial symbiont sharing in megalomyrmex social parasites and their fungus- growing ant hosts. Mol Ecol. 2015; 24:3151-3169.

64. Hart AG, Ratnieks FL. Waste management in the leaf-cutting ant atta colombica. Behavioral Ecology. 2002; 13:224-231.

65. Ishak HD, Miller JL, Sen R, Dowd SE, Meyer E, Mueller UG. Microbiomes of ant castes implicate new microbial roles in the fungus-growing ant trachymyrmex septentrionalis. Sci Rep. 2011; 1:204.

66. Russell JA, Moreau CS, Goldman-Huertas B, Fujiwara M, Lohman DJ, Pierce NE. Bacterial gut symbionts are tightly linked with the evolution of herbivory in ants. Proceedings of the National Academy of Sciences of the United States of America. 2009; 106:21236-21241.

67. Shin SC, Kim S-H, You H, Kim B, Kim AC, Lee K-A, Yoon J-H, Ryu J-H, Lee W-J. Drosophila microbiome modulates host developmental and metabolic homeostasis via insulin signaling. Science. 2011; 334:670-674.

146

68. Clark MA, Moran NA, Baumann P. Sequence evolution in bacterial endosymbionts having extreme base compositions. Molecular Biology and Evolution. 1999; 16:1586-1598.

69. Kwong WK, Moran NA. Cultivation and characterization of the gut symbionts of honey bees and bumble bees: Description of snodgrassella alvi gen. Nov., sp. Nov., a member of the family neisseriaceae of the betaproteobacteria, and gilliamella apicola gen. Nov., sp. Nov., a member of orbaceae fam. Nov., orbales ord. Nov., a sister taxon to the order 'enterobacteriales' of the . Int J Syst Evol Microbiol. 2013; 63:2008-2018.

70. Lambert JD, Moran NA. Deleterious destabilize ribosomal rna in endosymbiotic bacteria. Proceedings of the National Academy of Sciences. 1998; 95:4458-4462.

71. Engel P, Stepanauskas R, Moran NA. Hidden diversity in honey bee gut symbionts detected by single-cell genomics. PLoS Genet. 2014; 10:e1004596.

72. Hakim RS, Baldwin K, Smagghe G. Regulation of midgut growth, development, and metamorphosis. Annu Rev Entomol. 2010; 55:593-608.

73. Kohler T, Dietrich C, Scheffrahn RH, Brune A. High-resolution analysis of gut environment and bacterial microbiota reveals functional compartmentation of the gut in wood-feeding higher termites (nasutitermes spp.). Appl Environ Microbiol. 2012; 78:4691-4701.

74. He H, Chen Y, Zhang Y, Wei C. Bacteria associated with gut lumen of camponotus japonicus mayr. Environ Entomol. 2011; 40:1405-1409.

75. Russell JA, Funaro CF, Giraldo YM, Goldman-Huertas B, Suh D, Kronauer DJ, Moreau CS, Pierce NE. A veritable menagerie of heritable bacteria from ants, butterflies, and beyond: Broad molecular surveys and a systematic review. PLoS One. 2012; 7:e51027.

147

76. Lundberg DS, Yourstone S, Mieczkowski P, Jones CD, Dangl JL. Practical innovations for high-throughput amplicon sequencing. Nat Methods. 2013; 10:999-1002.

77. Baker GC, Smith JJ, Cowan DA. Review and re-analysis of -specific 16s primers. J Microbiol Methods. 2003; 55:541-555.

78. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Lozupone CA, Turnbaugh PJ, Fierer N, Knight R. Global patterns of 16s rrna diversity at a depth of millions of sequences per sample. Proc Natl Acad Sci U S A. 2011; 108 Suppl 1:4516-4522.

79. Berry D, Mahfoudh KB, Wagner M, Loy A. Barcoded primers used in multiplex amplicon pyrosequencing bias amplification. Applied and Environmental Microbiology. 2011; 77:7846-7849.

80. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, Fierer N, Pena AG, Goodrich JK, Gordon JI et al. Qiime allows analysis of high- throughput community sequencing data. Nat Methods. 2010; 7:335-336.

81. Edgar RC. Uparse: Highly accurate otu sequences from microbial amplicon reads. Nat Methods. 2013; 10:996-998.

82. DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, Keller K, Huber T, Dalevi D, Hu P, Andersen GL. Greengenes, a chimera-checked 16s rrna gene database and workbench compatible with arb. Appl Environ Microbiol. 2006; 72:5069-5072.

83. de la Bastide M, McCombie WR. Assembling genomic DNA sequences with phrap. Curr Protoc Bioinformatics. 2007; Chapter 11:Unit11 14.

84. Nawrocki EP, Kolbe DL, Eddy SR. Infernal 1.0: Inference of rna alignments. Bioinformatics. 2009; 25:1335-1337.

148

85. Nawrocki EP, Eddy SR. Query-dependent banding (qdb) for faster rna similarity searches. PLoS Comput Biol. 2007; 3:e56.

86. Stamatakis A. Raxml version 8: A tool for phylogenetic analysis and post- analysis of large phylogenies. Bioinformatics. 2014; 30:1312-1313.

87. Yang Z. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol Evol. 1996; 11:367-372.

88. Darriba D, Taboada GL, Doallo R, Posada D. Jmodeltest 2: More models, new heuristics and parallel computing. Nature methods. 2012; 9:772-772.

89. Posada D, Crandall KA. Selecting the best-fit model of nucleotide substitution. Syst Biol. 2001; 50:580-601.

90. Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Hohna S, Larget B, Liu L, Suchard MA, Huelsenbeck JP. Mrbayes 3.2: Efficient bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012; 61:539-542.

91. Yang Z. Paml 4: Phylogenetic analysis by maximum likelihood. Molecular biology and evolution. 2007; 24:1586-1591.

92. Lorenz R, Bernhart SH, Honer Zu Siederdissen C, Tafer H, Flamm C, Stadler PF, Hofacker IL. Viennarna package 2.0. Algorithms Mol Biol. 2011; 6:26.

93. Mathews DH, Disney MD, Childs JL, Schroeder SJ, Zuker M, Turner DH. Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of rna secondary structure. Proc Natl Acad Sci U S A. 2004; 101:7287-7292.

94. Bronstein JL. The exploitation of mutualisms. Ecology Letters. 2001; 4:277-287.

149

95. Herre E, Knowlton N, Mueller U, Rehner S. The evolution of mutualisms: Exploring the paths between conflict and cooperation. Trends in Ecology & Evolution. 1999; 14:49-53.

96. Boucias DG, Garcia-Maruniak A, Cherry R, Lu H, Maruniak JE, Lietze VU. Detection and characterization of bacterial symbionts in the heteropteran, blissus insularis. FEMS Microbiol Ecol. 2012; 82:629-641.

97. Nairz M, Schroll A, Sonnweber T, Weiss G. The struggle for iron - a metal at the host- interface. Cell Microbiol. 2010; 12:1691-1702.

98. Polage CR. Good and bad bacteria fight for iron in the gut. Science Translational Medicine. 2013; 5:199ec140-199ec140.

99. Bäumler AJ, Sperandio V. Interactions between the microbiota and in the gut. Nature. 2016; 535:85-93.

100. Bright M, Bulgheresi S. A complex journey: Transmission of microbial symbionts. Nat Rev Microbiol. 2010; 8:218-230.

101. Fukatsu T, Hosokawa T. Capsule-transmitted gut symbiotic bacterium of the japanese common plataspid stinkbug, megacopta punctatissima. Appl Environ Microbiol. 2002; 68:389-396.

102. Koch H, Schmid-Hempel P. Bacterial communities in central european bumblebees: Low diversity and high specificity. Microb Ecol. 2011; 62:121-133.

103. Caspi-Fluger A, Inbar M, Mozes-Daube N, Katzir N, Portnoy V, Belausov E, Hunter MS, Zchori-Fein E. Horizontal transmission of the insect symbiont is plant-mediated. Proc Biol Sci. 2012; 279:1791-1796.

150

104. Kikuchi Y, Hosokawa T, Fukatsu T. An ancient but promiscuous host-symbiont association between burkholderia gut symbionts and their heteropteran hosts. ISME J. 2011; 5:446-460.

105. Kikuchi Y, Hosokawa T, Fukatsu T. Insect-microbe mutualism without vertical transmission: A stinkbug acquires a beneficial gut symbiont from the environment every generation. Appl Environ Microbiol. 2007; 73:4308-4316.

106. Wilson EO. Chemical communication in the social insects. Science. 1965; 149:1064-1071.

107. Wilson EO. Social insects. Science. 1971; 172:406.

108. Greenwald E, Segre E, Feinerman O. Ant trophallactic networks: Simultaneous measurement of interaction patterns and food dissemination. Sci Rep. 2015; 5:12496.

109. Hamilton C, Lejeune BT, Rosengaus RB. Trophallaxis and prophylaxis: Social immunity in the carpenter ant camponotus pennsylvanicus. Biol Lett. 2011; 7:89- 92.

110. Stoll S, Feldhaar H, Fraunholz MJ, Gross R. dynamics during development of a holometabolous insect, the carpenter ant camponotus floridanus. BMC Microbiol. 2010; 10:308.

111. Oksanen J, Kindt R, Legendre P, O’Hara B, Stevens MHH, Oksanen MJ, Suggests M. The vegan package. Community ecology package. 2007; 10:631-637.

112. Legendre P, Anderson MJ. Distance-based redundancy analysis: Testing multispecies responses in multifactorial ecological experiments. Ecological monographs. 1999; 69:1-24.

151

113. Lin J. Divergence measures based on the shannon entropy. IEEE Transactions on Information theory. 1991; 37:145-151.

114. Hölldobler B: The ants: Harvard University Press; 1990.

115. Strassmann JE, Queller DC. The social organism: Congresses, parties, and committees. Evolution. 2010; 64:605-616.

116. Koch H, Schmid-Hempel P. Socially transmitted gut microbiota protect bumble bees against an intestinal parasite. Proc Natl Acad Sci U S A. 2011; 108:19288- 19292.

117. Moeller AH, Foerster S, Wilson ML, Pusey AE, Hahn BH, Ochman H. Social behavior shapes the chimpanzee pan-microbiome. Sci Adv. 2016; 2:e1500997.

118. Tung J, Barreiro LB, Burns MB, Grenier JC, Lynch J, Grieneisen LE, Altmann J, Alberts SC, Blekhman R, Archie EA. Social networks predict gut microbiome composition in wild baboons. Elife. 2015; 4.

119. Kwong WK, Moran NA. Gut microbial communities of social bees. Nat Rev Microbiol. 2016; 14:374-384.

120. Lucas J, Bill B, Stevenson B, Kaspari M. The microbiome of the ant-built home: The microbial communities of a tropical arboreal ant and its nest. Ecosphere. 2017; 8.

121. Aylward FO, Suen G, Biedermann PH, Adams AS, Scott JJ, Malfatti SA, Glavina del Rio T, Tringe SG, Poulsen M, Raffa KF et al. Convergent bacterial in the fungal agricultural systems of insects. MBio. 2014; 5:e02077.

122. Wang Q, Garrity GM, Tiedje JM, Cole JR. Naive bayesian classifier for rapid assignment of rrna sequences into the new bacterial taxonomy. Appl Environ Microbiol. 2007; 73:5261-5267. 152

123. Liaw A, Wiener M. Classification and regression by randomforest. R news. 2002; 2:18-22.

124. McMurdie PJ, Holmes S. Phyloseq: An r package for reproducible interactive analysis and graphics of microbiome census data. PLoS One. 2013; 8:e61217.

125. Wickham H. Reshape2: Flexibly reshape data: A reboot of the reshape package. R package version. 2012; 1.

126. Wickham H: Ggplot2: Elegant graphics for data analysis: Springer; 2016.

127. Brucker RM, Bordenstein SR. The hologenomic basis of speciation: Gut bacteria cause hybrid lethality in the genus nasonia. Science. 2013; 341:667-669.

128. Oh PL, Benson AK, Peterson DA, Patil PB, Moriyama EN, Roos S, Walter J. Diversification of the gut symbiont lactobacillus reuteri as a result of host-driven evolution. ISME J. 2010; 4:377-387.

129. Frese SA, Benson AK, Tannock GW, Loach DM, Kim J, Zhang M, Oh PL, Heng NC, Patil PB, Juge N et al. The evolution of host specialization in the vertebrate gut symbiont lactobacillus reuteri. PLoS Genet. 2011; 7:e1001314.

130. Ramalho MO, Bueno OC, Moreau CS. Microbial composition of spiny ants (hymenoptera: Formicidae: Polyrhachis) across their geographic range. BMC Evol Biol. 2017; 17:96.

131. Hunter GA, Ferreira GC. Molecular enzymology of 5-aminolevulinate synthase, the gatekeeper of heme biosynthesis. Biochim Biophys Acta. 2011; 1814:1467- 1473.

132. Darling AE, Mau B, Perna NT. Progressivemauve: Multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010; 5:e11147.

153

133. Vrljic M, Sahm H, Eggeling L. A new type of transporter with a new type of cellular function: L-lysine export from corynebacterium glutamicum. Mol Microbiol. 1996; 22:815-826.

134. Kwon HJ, Tirumalai R, Landy A, Ellenberger T. Flexibility in DNA recombination: Structure of the lambda integrase catalytic core. Science. 1997; 276:126-131.

135. Coutte L, Alonso S, Reveneau N, Willery E, Quatannens B, Locht C, Jacob- Dubuisson F. Role of adhesin release for mucosal colonization by a bacterial pathogen. J Exp Med. 2003; 197:735-742.

136. Seed KD. Battling phages: How bacteria defend against viral attack. PLoS pathogens. 2015; 11:e1004847.

137. Kwong WK, Zheng H, Moran NA. Convergent evolution of a modified, acetate- driven tca cycle in bacteria. Nat Microbiol. 2017; 2:17067.

138. Mittl PR, Schneider-Brachert W. Sel1-like repeat proteins in signal transduction. Cell Signal. 2007; 19:20-31.

139. Gil R, Silva FJ, Zientz E, Delmotte F, Gonzalez-Candelas F, Latorre A, Rausell C, Kamerbeek J, Gadau J, Holldobler B et al. The genome sequence of blochmannia floridanus: Comparative analysis of reduced genomes. Proc Natl Acad Sci U S A. 2003; 100:9388-9393.

140. Smillie CS, Smith MB, Friedman J, Cordero OX, David LA, Alm EJ. Ecology drives a global network of gene exchange connecting the . Nature. 2011; 480:241-244.

141. Chin CS, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE et al. Nonhybrid, finished microbial

154

genome assemblies from long-read smrt sequencing data. Nat Methods. 2013; 10:563-569.

142. Huntemann M, Ivanova NN, Mavromatis K, Tripp HJ, Paez-Espino D, Palaniappan K, Szeto E, Pillay M, Chen IM, Pati A et al. The standard operating procedure of the doe-jgi microbial genome annotation pipeline (mgap v.4). Stand Genomic Sci. 2015; 10:86.

143. Paradis E, Claude J, Strimmer K. Ape: Analyses of phylogenetics and evolution in r language. Bioinformatics. 2004; 20:289-290.

144. Li L, Stoeckert CJ, Jr., Roos DS. Orthomcl: Identification of ortholog groups for eukaryotic genomes. Genome Res. 2003; 13:2178-2189.

145. Edgar RC. Muscle: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004; 32:1792-1797.

146. Abascal F, Zardoya R, Telford MJ. Translatorx: Multiple alignment of nucleotide sequences guided by amino acid translations. Nucleic Acids Res. 2010; 38:W7-13.

147. Talavera G, Castresana J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst Biol. 2007; 56:564-577.

148. Markowitz VM, Chen IM, Palaniappan K, Chu K, Szeto E, Pillay M, Ratner A, Huang J, Woyke T, Huntemann M et al. Img 4 version of the integrated microbial genomes comparative analysis system. Nucleic Acids Res. 2014; 42:D560-567.

149. Pruitt KD, Tatusova T, Brown GR, Maglott DR. Ncbi reference sequences (refseq): Current status, new features and genome annotation policy. Nucleic Acids Res. 2012; 40:D130-135.

155

150. Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. Genbank. Nucleic Acids Res. 2013; 41:D36-42.

151. Wernegreen JJ. Mutualism meltdown in insects: Bacteria constrain thermal adaptation. Curr Opin Microbiol. 2012; 15:255-262.

152. Williams LE, Wernegreen JJ. Genome evolution in an ancient bacteria-ant symbiosis: Parallel gene loss among blochmannia spanning the origin of the ant tribe camponotini. PeerJ. 2015; 3:e881.

153. Yatsunenko T, Rey FE, Manary MJ, Trehan I, Dominguez-Bello MG, Contreras M, Magris M, Hidalgo G, Baldassano RN, Anokhin AP et al. Human gut microbiome viewed across age and geography. Nature. 2012; 486:222-227.

154. Backhed F, Roswall J, Peng Y, Feng Q, Jia H, Kovatcheva-Datchary P, Li Y, Xia Y, Xie H, Zhong H et al. Dynamics and stabilization of the human gut microbiome during the first year of life. Cell Host Microbe. 2015; 17:852.

155. Boucias DG, Cai Y, Sun Y, Lietze VU, Sen R, Raychoudhury R, Scharf ME. The hindgut lumen prokaryotic microbiota of the termite reticulitermes flavipes and its responses to dietary lignocellulose composition. Mol Ecol. 2013; 22:1836-1853.

156. Turnbaugh PJ, Bäckhed F, Fulton L, Gordon JI. Diet-induced obesity is linked to marked but reversible alterations in the mouse distal gut microbiome. Cell host & microbe. 2008; 3:213-223.

157. Turnbaugh PJ, Ridaura VK, Faith JJ, Rey FE, Knight R, Gordon JI. The effect of diet on the human gut microbiome: A metagenomic analysis in humanized gnotobiotic mice. Sci Transl Med. 2009; 1:6ra14.

158. Bragg LM, Stone G, Butler MK, Hugenholtz P, Tyson GW. Shining a light on dark sequencing: Characterising errors in ion torrent pgm data. PLoS Comput Biol. 2013; 9:e1003031.

156

159. Durfee T, Nelson R, Baldwin S, Plunkett G, 3rd, Burland V, Mau B, Petrosino JF, Qin X, Muzny DM, Ayele M et al. The complete genome sequence of escherichia coli dh10b: Insights into the biology of a laboratory workhorse. J Bacteriol. 2008; 190:2597-2606.

160. Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 2012; 9:357-359.

161. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Genome Project Data Processing S. The sequence alignment/map format and samtools. Bioinformatics. 2009; 25:2078-2079.

162. Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, Miller CA, Mardis ER, Ding L, Wilson RK. Varscan 2: Somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 2012; 22:568- 576.

163. Huse SM, Huber JA, Morrison HG, Sogin ML, Welch DM. Accuracy and quality of massively parallel DNA pyrosequencing. Genome Biol. 2007; 8:R143.

164. Golan D, Medvedev P. Using state machines to model the ion torrent sequencing process and to improve read error rates. Bioinformatics. 2013; 29:i344-i351.

157

Biography

Bryan Brown was born in Akron, Ohio in 1989. He obtained a BS in Biochemistry and Biology from the University of Akron (UA) in 2011. While at UA, Bryan worked in the laboratories of Stephen Weeks and John Senko, both of which had profound and positive impacts on his professional development. Bryan obtained a Graduate Research

Fellowship and Graduate Research Opportunities Worldwide Fellowship from the

National Science Foundation. He also earned several research grants and fellowships from Duke University. Upon completion of the Doctoral degree, Bryan will become a

Postdoctoral Associate at the University of Washington and Seattle Children’s Hospital in the lab of Heather Jaspan.

158