Host speciation and microbiomes: ecological and evolutionary factors shaping gut microbial communities in Darwin's finches

Citation Loo, Wesley Tsekuang. 2018. Host speciation and microbiomes: ecological and evolutionary factors shaping gut microbial communities in Darwin's finches. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Citable link http://nrs.harvard.edu/urn-3:HUL.InstRepos:41128170

Terms of Use This article was downloaded from Harvard University’s DASH repository, and is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA

Host speciation and microbiomes: ecological and evolutionary factors shaping gut microbial communities in Darwin’s finches

A dissertation presented

by

Wesley Tsekuang Loo

to

The Department of Organismic and Evolutionary Biology

in partial fulfillment of the requirements

for the degree of

Doctor of Philosophy

in the subject of

Biology

Harvard University

Cambridge, Massachusetts

May 2018

© 2018 Wesley Tsekuang Loo All rights reserved.

Dissertation Advisor: Professor Colleen M. Cavanaugh

Wesley Tsekuang Loo

Host speciation and microbiomes: ecological and evolutionary factors shaping gut microbial communities in Darwin’s finches

Abstract

The bacterial communities associated with host organisms, or microbiomes, are an integral aspect of host health, development, and evolution. Characterization of microbiomes in wild animals is essential to our understanding of co-evolutionary histories among lineages. However, it is difficult to disentangle the numerous variables that can affect the composition of this bacterial community. Therefore, I investigated the factors that potentially shape gut microbiomes in one of the best-studied avian lineages in the world, Darwin’s finches in the Galápagos Archipelago. The distribution of species across the islands allowed for independent sampling in ecologically distinct habitats. Given their well-characterized evolutionary history, Darwin’s finch species provided an ideal study system for testing the effect of co-phylogeny on the gut microbiome in the context of ecological factors.

First, I comprehensively sampled nine Darwin’s finch species on Santa Cruz Island, spanning the extremes of the ecological habitats. The microbial diversity was characterized using 16S rRNA sequencing. In addition to microbiome data, host phylogeny, stable isotope analysis, and first foraging observations in the field were used to place the microbiome characterization in its evolutionary and ecological context. Applying a variety of methods to examine the relationship between the gut microbiome and these factors, I detected a clear effect of habitat and correlation of the microbiome with both host phylogeny and dietary preferences, with foraging data uniquely explaining portions of the variation seen in the gut microbiome.

Next, the recent introgression from the medium tree finch (Camarhynchus pauper) into the small tree finch (C. parvulus) on Floreana Island was used as a case study for the effect of species identity and genetic background on the gut microbiome. Due to the morphological similarity between small and hybrid tree finches, I genotyped and morphologically characterized all tree finches in combination with analyzing their microbiomes. Though overall patterns in beta diversity were similar across genetic clusters, differential abundance analysis with both 16S and metagenomic data demonstrated that the gut microbiomes of hybrid tree finches are more similar to those of their paternal species, the small tree finch.

Finally, I interrogated the remaining ground finch species from Floreana Island as an independent sample for signatures of co-diversification and performed a meta-analysis with the overlapping species between the two islands. Darwin’s finch gut microbiome samples were consistently different between the highland and lowland habitats on each island. Additionally, they were significantly different between the highland habitats of the two islands, but not the lowland habitats, signifying a clear environmental effect independent of host species, likely as a result of more divergent flora present in the highlands than the lowlands.

Altogether, this dissertation provides three separate but interconnected lines of evidence that are consistent with a model of microbiome assembly in which environmental filtering via diet and habitat is the primary determinant of the bacterial taxa present, with secondary influence from the evolutionary history between hosts. These studies demonstrate the necessity of comprehensive metadata for the correct interpretation of patterns observed in host-associated bacterial communities and provide foundational knowledge for future work in understanding the complex dynamics of microbiome assembly.

Table of Contents

Abstract

Table of Contents

Acknowledgements

List of Figures

List of Tables

Chapter 1: Introduction

Chapter 2: Diet, habitat, and host phylogeny shape the gut microbiome of Darwin's finches

Abstract

Introduction

Materials and Methods

Results

Discussion

Acknowledgements

Chapter 3: Hybrid Darwin's finches share the gut microbiomes of their paternal species

Abstract

Introduction

Materials and Methods

Results

Discussion

Acknowledgements

Chapter 4: Habitat difference shapes the gut microbiome of Darwin's finches on Floreana Island

Abstract

Introduction

Materials and Methods

Results

Discussion

Acknowledgements

Chapter 5: Conclusion

Bibliography

Supplementary Material

Chapter 2 Supplementary Material

Chapter 3 Supplementary Material

Chapter 4 Supplementary Material

To my family – biological and chosen

One might really fancy that from an original paucity of birds in this archipelago one species had been taken and modified for different ends.

—Charles Darwin, The Voyage of the Beagle

This is truly the ‘age of bacteria’ – as it was in the beginning, is now and ever shall be.

—Stephen Jay Gould, Scientific American

Acknowledgements

This dissertation would not have been possible without the help and support of many people. First, thank you to my academic mentors: my advisor and committee members. Colleen Cavanaugh gave me consistent guidance from planting the initial idea of studying Darwin’s finch microbiomes, to editing grant proposals and seeing this project to completion. Pete Girguis provided unwavering support and an optimistic outlook throughout my graduate career. Eric Alm was my methodological mastermind, always encouraging me to see my data in a new light. Scott Edwards looked at my work from a “bird’s eye view” and never let me lose sight of the larger significance of this project.

Second, I’m grateful to my incredible collaborators on Team Pinzon. Sonia Kleindorfer was unendingly supportive of this project from my first email asking about a potential collaboration in the Galápagos. Her expertise in the field was invaluable to the project, and I appreciate her patience as I figured out my field protocol. Rachael Dudaniec was an excellent early career mentor and provided ongoing encouragement throughout this process. Jefferson Garcia Loor was an unparalleled field assistant and friend.

Third, the wider academic community was an excellent intellectual home. The past and present members of the Cavanaugh lab have always been ready with sage advice, both in the lab and in life (Kristina Fontanez, Shelbi Russell, Oleg Dmytrenko, Li Liao, Jingchun Li, Joey Pakes Nelson, Fatma Gomaa, David Fronk, and Dan Utter). The graduate administrators at the Department of Organismic and Evolutionary Biology (Chris Preheim, Alex Hernandez-Siegel, Lydia Carmasino) made this journey as smooth as possible.

Fourth, graduate school is not only an academic endeavor but a personal one. Thank you to all my friends in the Harvard-Radcliffe Collegium Musicum for brightening my days with song, and to the Harvard Ballroom Dance Team for being the craziest, most entertaining student group. I’m grateful to my OEB cohort for the support (and commiseration), especially Allison Shultz, Seth Donoughe, and Brent Hawkins. To my board game geeks (Tamsin Jones, Aaron DeLoughery, Brielle Bryan, Zach Epstein), thank you for battling the Ancient Ones with me. Sebastian Akle was an excellent companion for many late nights of writing at CGIS. My friends from Princeton (Diana Chien, Kevin Jeng, Gabe Rodriguez) have given me humility at my highest and encouragement at my lowest, especially Andrew Sue-Ako. To Katie Boronow, you evolved from cohort companion to forever friend. Thank you for the good stories over delicious dinner dates. To Jason Smith and Ivo Baca, the countless evenings I spent with you were every bit as formative to who I have become as the hours spent in lab. Thank you for all the time we spent dancing, laughing, and living.

Finally, this dissertation is the culmination of many, many years of love and support from my family. Thank you to my parents for valuing my education from preschool to what I calculate is now grade 23 and for encouraging me to pursue my passion. To my brother Clinton, thank you for being a role model of compassion. My extended family has been a stellar support system at every holiday and family trip. And to my chosen family, Theo Leenman, thank you for your enduring patience and boundless love. You have truly been a partner in every sense, seeing me through every trial, tribulation and now, one notable triumph.

List of Figures

Figure 1.1. Phylogeny of Darwin’s finches based on microsatellite DNA sequences and their distribution across the habitats and islands sampled (adapted from Grant and Grant 2002).

Figure 2.1. Relative abundance of bacterial phyla in the gut microbiota of Darwin’s finch species.

Figure 2.2. Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome communities.

Figure 2.3. δ13C and δ15N stable isotope measurements for Darwin’s finch species.

Figure 2.4. Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome samples overlaid with foraging data.

Figure 2.5. Beta diversity through time applied on the gut microbiome data.

Figure 2.6. Variation partitioning of Darwin’s finch gut microbiome samples by finch phylogeny, stable isotope values, and foraging data.

Figure 3.1. Membership coefficient in STF cluster compared to morphological summary variables.

Figure 3.2. Relative abundance of bacterial taxa in the gut microbiota of Darwin’s finch species.

Figure 3.3. Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome communities in Floreana tree finches.

Figure 3.4. Differential abundance of ribosomal sequence variants (RSVs) in pairwise comparisons between Darwin’s tree finch genetic clusters.

Figure 3.5. Coverage of Lactobacillus species from metagenomic sequencing scaled per sample.

Figure 3.6. Differential abundance of KEGG pathways in pairwise comparisons between Darwin’s tree finch genetic clusters.

Figure 3.7. δ13C and δ15N stable isotope measurements for Darwin’s tree finch genetic populations.

Figure 4.1. Relative abundance of bacterial phyla in the gut microbiota of Darwin’s finch species on Floreana Island.

Figure 4.2. Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome communities.

Figure 4.3. δ13C and δ15N stable isotope measurements for Darwin’s finch species.

Figure 4.4. Variation partitioning results for comparing Darwin’s finch phylogeny, stable isotope values, and foraging data to the weighted UniFrac distances of gut microbiome samples from Floreana.

Figure 4.5. Principal coordinate analysis visualization of weighted UniFrac distances between microbiome samples from Santa Cruz and Floreana Islands.

List of Tables

Table 2.1. Darwin’s finch species sampled for this study, including habitats and specimen number, organized by habitat.

Table 2.2. Procrustes Analysis of Co-Phylogeny.

Table 2.3. Stable isotope (δ13C and δ15N) ratios by Darwin’s finch species.

Table 2.4. First foraging observations across all Darwin’s finch species and both habitats.

Table 2.5. Top ribosomal sequence variants (RSVs) for classifying gut microbiome samples by habitat of origin using the random forest algorithm.

Table 3.1. First foraging observations across Darwin’s tree finch species on Floreana Island.

Table 4.1. Darwin’s finch species and sample number from highland and lowland habitats on Floreana Island, Galápagos Archipelago.

Table 4.2. First foraging observations across all Darwin’s finch species on Floreana Island.

Chapter 1

Introduction

Bacteria are the most abundant and diverse organisms on Earth, with an estimated one trillion species based on extrapolation from current surveys (Locey and Lennon 2016). Whole genome analyses confirm that the bacterial domain of life encompasses a larger proportion of the tree of life than either Archaea or Eukarya (Hug et al. 2016). Two breakthroughs, one conceptual and one technological, have catalyzed the characterization of this unseen majority across many environments (Whitman et al. 1998). First, the determination of ribosomal RNA as a molecular clock with which to build the phylogenetic relationships among diverse organisms provided the foundation for direct sequencing to build our understanding of the tree of life, going so far as to introduce a previously unknown domain of life (Woese 1987; Woese et al. 1990). Second, technological advancement has quickly changed the scale of research in microbial ecology, going from cloning projects with hundreds of DNA sequences to sequencing millions of reads from a single sample using next-generation sequencing (Shokralla et al. 2012). The past decade has seen tremendous progress in profiling the bacterial diversity present in diverse environments, not least the complex communities associated with other organisms.

Microbiome research and progress

Host-associated microbial communities, or microbiomes, are now known to be a key aspect of organismal biology. Some even refer to holobionts, or organisms plus all associated microbes, as the unit of selection for considering evolutionary principles (Rosenberg and Zilber-Rosenberg 2013). It is clear that microbiomes play an important role in host health, development, and evolution. Examples include the treatment of infectious diseases in humans using healthy gut microbiomes (Jorup-Rönström et al. 2012), the development of antibody responses in the gut as seen in mice (Planer et al. 2016), and hybrid incompatibility of microbiomes between wasps in the genus Nasonia (Brucker and Bordenstein 2013). Moreover, host-associated microbial communities are distinct from free-living ones, further emphasizing the untapped diversity present in this unique environment (Ley et al. 2008b).

The vertebrate gut contains some of the densest collections of bacteria, with up to a trillion bacterial cells per gram of colonic content (O'Hara and Shanahan 2006). Not only is the gut home to a vast number of cells, but these communities perform vital functions, including vitamin synthesis (LeBlanc et al. 2013) and metabolism of otherwise indigestible carbohydrates (Hehemann et al. 2010). It is not surprising then that the gut microbiome has been the subject of intense study in relation to obesity and nutrition, with obesity correlating with lower bacterial diversity and a reduced relative abundance of the bacterial phylum Bacteroidetes (Turnbaugh et al. 2008).

Research into the human microbiome has progressed from characterization of the healthy microbiome (Human Microbiome Project Consortium 2012) to interrogating the microbiome for causal relationships with disease states such as diabetes (Kostic et al. 2015) and inflammatory bowel disease (Halfvarson et al. 2017). In contrast to the well-studied human microbiome, research into the microbiomes of non-model organisms is significantly less common, and especially lacking for one of the most diverse tetrapod lineages: birds.

Birds are a successful group of vertebrates with unique life history traits, including hatching from a sterile egg and subsequent feeding via regurgitation, and thus serve as an excellent group in which to study both ecological and evolutionary aspects of host-microbiome associations. While making up a large share of tetrapod diversity, birds are relatively understudied in microbiome research. There are fewer studies on avian microbiota compared to studies on mammals, with a majority focused on domesticated species, such as poultry, or potential pathogens in the fecal material of birds in urban areas (Danzeisen et al. 2011; Kohl 2012). However, recent surveys of the avian microbiome have found that taxonomic categories and the phylogenetic relationships of the host species had the highest correlation with microbiome composition when compared with life history traits and ecological variables (Hird et al. 2015; Kropáčková et al. 2017).

Ecological and evolutionary factors that affect microbiome composition

Beyond determining the bacterial diversity present in microbiome samples, the goal of microbiome research is to attain some understanding of the factors that shape the microbiome. Studies have delved into the effects of diet, geography, and evolutionary history, to name but a few of the possible variables.

Diet is perhaps the most obvious variable to have an influence on the gut microbial community. Not only are food items themselves sources of bacterial inoculation, but their composition can also sustain particular clades of bacterial taxa. For example, in humans, the gut microbiome can rapidly shift in diversity based on changes between plant- or animal-based diets (David et al. 2014). These shifts in diet can be cyclical, as shown in a population of hunter-gatherers in Tanzania, with the gut microbiome composition changing with the diet of the season (Smits et al. 2017). However, the effect of a high-protein, high-fat, low-fiber ‘Western diet’ is not universal across all primate species, indicating that while diet may play a large role, it does not fully explain the diversity observed (Amato et al. 2015). Instead, other factors must also play a role in shaping the microbial community.

In considering environmental effects, physical location is another intuitive explanation for differences observed in the microbiome. This was tested in a study of the brood parasitic brown-headed cowbird, Molothrus ater (Hird et al. 2015). Cowbird eggs are laid in the nests of heterospecific host species and the cowbird hatchlings are subsequently raised by heterospecific foster adults. By sampling the gut microbiome from cowbird and host species populations in California and Louisiana, Hird et al. (2015) investigated whether the gut microbiome reflected the species identity of the individual or the host species that cared for it. Ultimately the sampling locality had the strongest association with the microbiome composition across all the samples characterized, corroborating a strong environmental effect. Another study across the Americas, albeit in mammalian species, demonstrated a correlation between geographic distance and microbiome distances even after taking the phylogenetic relationships into account (Moeller et al. 2017). There, the geographic distance provided additional explanatory power compared to the phylogenetic distances alone. Together, these studies show the potential of geography and environment to affect the microbiome.

Tracing the impact of coevolution on gut microbiomes is a difficult task, not least because of the confounding variables described above. Additionally, the concept of coevolution is broad, making it difficult to develop specific hypotheses. The most intuitive case of coevolution is co-phylogeny, or congruence between the topology of the host phylogenetic tree and that of the microbiome. Most of the methods developed for detecting co-phylogeny come from research into host-parasite interactions and test the significance of the association by comparing the global fit of the two tree topologies as observed against the fit under randomized associations, such as ParaFit (Legendre et al. 2002) and the cospeciation test of Hommola et al. (2009). A more recent method, Procrustes Analysis of Co-phylogeny (PACo), applies principal coordinate analysis to the distance matrices of the host and parasite and then uses Procrustes analysis to find the correlation between these projections (Balbuena et al. 2013). The application of PACo to a survey of temperate bird species found a correlation between the 51 species and their gut microbiomes, but the cause of that correlation requires further investigation.
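The core idea behind PACo, as described above, is an ordination of each distance matrix followed by a Procrustes superimposition. The following is a minimal sketch of that idea and not the published PACo implementation: it assumes exactly one microbiome profile per host, whereas PACo itself handles many-to-many host-symbiont links through an association matrix. The function names `pcoa` and `procrustes_fit`, the number of retained axes, and the simple label-permutation test are all assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import procrustes


def pcoa(dist, n_axes=2):
    """Classical principal coordinate analysis of a square distance matrix."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the axes with the largest eigenvalues; clip tiny negatives to zero.
    order = np.argsort(eigvals)[::-1][:n_axes]
    vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(vals)


def procrustes_fit(host_dist, microbiome_dist, n_axes=2, n_perm=999, seed=0):
    """Procrustes disparity between two ordinations, with a permutation p-value.

    Assumes row i of both matrices refers to the same host (one-to-one links).
    """
    rng = np.random.default_rng(seed)
    host_xy = pcoa(host_dist, n_axes)
    micro_xy = pcoa(microbiome_dist, n_axes)
    _, _, observed = procrustes(host_xy, micro_xy)
    hits = 0
    for _ in range(n_perm):
        # Shuffle which microbiome belongs to which host.
        perm = rng.permutation(len(micro_xy))
        _, _, disparity = procrustes(host_xy, micro_xy[perm])
        if disparity <= observed:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return observed, p_value
```

A small disparity (close to 0) with a low permutation p-value would indicate that hosts that are close in the phylogenetic ordination also tend to carry similar microbiomes. On its own this says nothing about why the two are correlated.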

Correlation between host phylogeny and the microbiome can have different explanations. In the case of vertical transmission of microbes, the bacteria could co-diversify with the host species. That is, the bacterial lineages would evolve in tandem with the separation between host species. Alternatively, the correlation could be the product of environmental filters, such as diet, which are phylogenetically correlated. Distinguishing between these alternative models is an important step in understanding the role of the host’s evolutionary history in shaping the microbiome. A test for co-diversification, beta diversity clustering sensitivity analysis, was developed by Sanders et al. (2014); it leverages different levels of sequence similarity to differentiate between co-phylogeny signal caused by recent or ancient bacterial lineages.

This technique helped reconcile two different results from studying coevolution in humans and closely related great ape species. Ochman et al. first confirmed the complete congruence between the phylogeny of the seven great ape species and their gut microbiomes (Ochman et al. 2010). A subsequent study showed that within chimpanzee communities, diet played a much larger role than the genetic distance between individuals (Degnan et al. 2012). With the beta diversity clustering sensitivity analysis, the signal grouping microbiomes by host species was lost at wider clustering thresholds, implying that these bacterial lineages were the product of recent bacterial evolution, consistent with a model in which species-specific bacteria are acquired horizontally alongside a small proportion of vertically transmitted taxa. The next iteration of this method, termed Beta Diversity Through Time (BDTT), was applied to a dataset of 33 mammalian species and showed that diet and host phylogeny acted on the microbiome on different timescales (Groussin et al. 2017).

Disentangling the varying effects of diet, environment, and host phylogeny requires data from well-characterized species. More importantly, to reflect the diversity of the microbiome as accurately as possible, samples should be collected from wild animals. Captive animals maintain distinct microbiomes in comparison to their wild counterparts and may therefore confound how different factors shape the microbiome (Xenoulis et al. 2010). The importance of wild microbiomes to our understanding of the complex relationship between host species and microbiomes is reviewed by Amato (2013) and Hird (2017).

Adaptive radiations provide the opportunity to investigate the influence of evolutionary and ecological diversification within a group of closely related species. Typically, species in adaptive radiations evolve to occupy different ecological niches, with varying diets and behaviors among sympatric species (Schluter 2000). Microbial surveys across multiple animal groups have shown varying results with respect to coevolution in the lineage. In mammals, a study on different families of bats found congruence between the host phylogeny and the microbiome (Phillips et al. 2012). In the African cichlid fish radiation sampled across two lakes, the host phylogeny had a clear signal in the microbial community (Baldo et al. 2017). Conversely, a study in different ecotypes of the Trinidadian guppy did not find any signal of parallel evolution between the two populations (Sullam et al. 2015). Finally, a study of the different ecotypes within the adaptive radiation of Anolis lizards in Florida and Puerto Rico found that the microbiome composition was partly explained by phylogeny, despite high intraspecific variation (Ren et al. 2016). Further characterization of how these multiple factors shape gut microbiomes is needed in avian lineages. Therefore, one of the best-studied adaptive radiations in birds was chosen for investigation.

Darwin’s finches

Darwin’s finches in the Galápagos archipelago are iconic in evolutionary biology. Though Darwin collected some specimens during his voyage on the H.M.S. Beagle, their first scientific description was written by the ornithologist John Gould (Gould 1837). Their popularization as “Darwin’s Finches” came after the first comprehensive overview of their ecology and evolution by David Lack in 1947 (Lack 1947). In total, fourteen species are spread across the islands in the archipelago and Cocos Island approximately 780 km north. Extensive research into the phylogeny of these species has illuminated their evolutionary history (Freeland et al. 1999; Petren et al. 1999; Sato et al. 1999). The most closely related species is the black-faced grassquit (Tiaris bicolor) in the family and the diversification of the lineage is estimated to have taken place in the past 1 million years, as corroborated with whole genome resequencing (Lamichhaney et al. 2015).

The finches have long been classified into two groups: the ground finches and the tree finches. The ground finch species include the small (Geospiza fuliginosa), medium (G. fortis), large (G. magnirostris), cactus (G. scandens), large cactus (G. conirostris), and sharp-beaked (G. difficilis) finches, which primarily feed on seeds and plant material, though this depends on season and habitat. The tree finch species include the small (Camarhynchus parvulus), medium (C. pauper), large (C. psittacula), mangrove (C. heliobates), and woodpecker (C. pallidus) finches, which primarily feed on insects but are also opportunistic. Aside from these two species groups are the vegetarian (Platyspiza crassirostris), grey warbler (Certhidea fusca), green warbler (Certhidea olivacea), and Cocos (Pinaroloxias inornata) finches. The distribution of the finch species across the archipelago allows for independent sampling of the same species from different environments and habitats.

This dissertation focuses on two of the larger islands in the archipelago: Santa Cruz and Floreana. Santa Cruz is the second largest island in the Galápagos archipelago and is located near the center of the island group (area: 986 km²; 0°37'S, 90°21'W), while Floreana is a smaller island located on the southern end of the archipelago (area: 173 km²; 1°28'S, 90°48'W). Both islands contain study sites that range in ecological features from arid lowlands to moist highlands (Galligan et al. 2012). Lowland habitats range from the coast to 120 m in elevation, receive less than 250 mm of rainfall in a typical year, and are characterized by Opuntia cacti species, brush, and deciduous trees. In contrast, highland habitats range from 600 m to 800 m above sea level, usually receive more than 700 mm of rainfall per year, and consist of forest dominated by the non-deciduous Scalesia pedunculata. The distribution of finch species across islands and habitats is shown in Figure 1.1. Four species occur on both islands, allowing for the additional comparison of habitats between islands while controlling for the host species.

Figure 1.1. Phylogeny of Darwin’s finches based on microsatellite DNA length variation and their distribution across the habitats and islands sampled in this dissertation (adapted from Grant and Grant 2002).

In addition to the well-defined phylogeny for these species, the populations on both islands are exceptionally well characterized ecologically. Foraging observations can be used to quantify the dietary patterns of avian species, and using only the first observation of each bird avoids statistical bias (Morrison 1984). On Santa Cruz, extensive foraging data has been collected for all the tree finch species present across spatial and seasonal variables, noting increased consumption of insects from the small to large to woodpecker to warbler finch (Tebbich et al. 2004). Similarly, foraging observations of the small ground finch across habitats demonstrated adaptive divergence in contiguous populations, which was not driven by genetic isolation given high gene flow across the island (Kleindorfer and Chapman 2006; Galligan et al. 2012).
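As an illustration of the first-observation approach described above, the sketch below keeps only the earliest foraging record for each individual bird and then summarizes the food items per species. The record fields (`bird_id`, `species`, `time`, `item`) and both function names are hypothetical, chosen for this example only; they do not describe the field protocol or data format used in this dissertation.

```python
from collections import Counter, defaultdict


def first_observations(records):
    """Keep only the earliest foraging record for each individual bird."""
    earliest = {}
    for rec in records:
        bird = rec["bird_id"]
        if bird not in earliest or rec["time"] < earliest[bird]["time"]:
            earliest[bird] = rec
    return list(earliest.values())


def diet_proportions(records):
    """Proportion of first observations per food item, grouped by species."""
    counts = defaultdict(Counter)
    for rec in first_observations(records):
        counts[rec["species"]][rec["item"]] += 1
    return {
        species: {item: n / sum(items.values()) for item, n in items.items()}
        for species, items in counts.items()
    }
```

Because each bird contributes a single record, a bird that was watched for a long time does not weigh more heavily in the species-level summary than one that was seen only once.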

An alternative method to characterize dietary differences is the application of stable isotope analysis. δ13C and δ15N stable isotope ratios can provide additional layers of data to interpret the patterns of diversity found in gut microbiomes. Stable isotope values can be used to infer the trophic levels of organisms in the context of potential food sources (Post 2002) and have been used extensively as a non-invasive method of establishing food webs in other avian species (Kelly 2000). For example, δ13C and δ15N stable isotope ratios demonstrated trophic partitioning among bird species in a tropical rain forest (Herrera et al. 2003).
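For reference, the δ notation conventionally expresses the heavy-to-light isotope ratio of a sample relative to an international standard, in parts per thousand (per mil). The definition below is the general one used in isotope ecology and is not specific to this dissertation. The symbols R, λ, and Δ_n are introduced here for illustration, and the trophic position expression is the single-baseline form commonly attributed to Post (2002).

```latex
\[
\delta X = \left( \frac{R_{\text{sample}}}{R_{\text{standard}}} - 1 \right) \times 1000,
\qquad
R = \frac{{}^{13}\mathrm{C}}{{}^{12}\mathrm{C}}
\ \text{or}\
\frac{{}^{15}\mathrm{N}}{{}^{14}\mathrm{N}}
\]

\[
\text{trophic position} = \lambda +
\frac{\delta^{15}\mathrm{N}_{\text{consumer}} - \delta^{15}\mathrm{N}_{\text{base}}}{\Delta_n}
\]
```

Here λ is the trophic position of the organism used as the baseline and Δ_n is the δ15N enrichment per trophic level, often taken to be roughly 3 to 4 per mil. Under this reading, a higher δ15N relative to the baseline suggests a more insectivorous diet, while δ13C mainly reflects the carbon source at the base of the food web.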

The island of Floreana additionally provides a unique case study for the effect of the genetic background on the gut microbiome. Since the early 2000s, an asymmetrical introgression from the critically endangered, endemic medium tree finch into the common small tree finch, driven by female mate choice, has produced a swarm of hybrid and small tree finches (Peters et al. 2017). The hybrid finches are morphologically indistinguishable from small tree finches and require microsatellite genotyping for assignment to the genetic population. Because finches learn songs from their fathers, the hybrid females pair exclusively with small and hybrid tree finch males, whose songs are also indistinguishable (Peters and Kleindorfer 2017). These three populations of tree finches also have different foraging behaviors, segregating by foraging height (Peters and Kleindorfer 2015).

In this dissertation, I focus on Darwin’s finch species on Santa Cruz and Floreana to address the following questions: 1) What is the diversity of bacterial taxa present in Darwin’s finch gut microbiomes? 2) How well do diet, habitat, and host phylogeny explain the variation observed in the gut microbiome? 3) Do hybrid tree finches harbor a unique bacterial profile in their gut microbiomes in comparison to the parental species? 4) Is there an island effect on the composition of the gut microbiome?

In Chapter 2, I comprehensively sampled nine Darwin’s finch species on Santa Cruz Island, spanning the extremes of the ecological habitats. The microbial diversity was characterized using 16S rRNA sequencing. In addition to microbiome data, host phylogeny, stable isotope analysis, and first foraging observations in the field were used to place the microbiome characterization in its evolutionary and ecological context. Applying a variety of methods to examine the relationship between the gut microbiome and these factors, I detected a clear effect of habitat and correlation of the microbiome with both host phylogeny and dietary preferences, with foraging data uniquely explaining portions of the variation seen in the gut microbiome.

In Chapter 3, the recent introgression from the medium tree finch (Camarhynchus pauper) into the small tree finch (C. parvulus) on Floreana Island was used as a case study for the effect of species identity and genetic background on the gut microbiome. Due to the morphological similarity between small and hybrid tree finches, I genotyped and morphologically characterized all tree finches in combination with analyzing their microbiomes.

Though overall patterns in beta diversity were similar across genetic clusters, differential

abundance analysis with both 16S and metagenomic data demonstrated that the gut microbiomes of hybrid tree finches are more similar to those of their paternal species, the small tree finch.

In Chapter 4, I interrogated the remaining ground finch species from Floreana Island as an independent sample for signatures of co-diversification and performed a meta-analysis with the overlapping species between the two islands. Darwin’s finch gut microbiome samples were consistently different between the highland and lowland habitats on each island. Additionally, they were significantly different between the highland habitats of the two islands, but not the lowland habitats, signifying a clear environmental effect independent of host species, likely as a result of more divergent flora present in the highlands than the lowlands.

Chapter 5 provides a summary of the findings and implications of these results in the broader context of host-microbiome associations and the conservation of Darwin’s finches.

Altogether, this dissertation provides three separate but interconnected lines of evidence that are consistent with a model of microbiome assembly in which environmental filtering via diet and habitat is the primary determinant of the bacterial taxa present, with a secondary influence from the evolutionary history of the hosts. These studies demonstrate the necessity of comprehensive metadata for the correct interpretation of patterns observed in host-associated bacterial communities and provide foundational knowledge for future work in understanding the complex dynamics of microbiome assembly.

Chapter 2

Title: Diet, habitat, and host phylogeny shape the gut microbiome of Darwin’s finches

Authors: Wesley T. Loo¹, Jefferson G. Loor², Rachael Y. Dudaniec³, Sonia Kleindorfer⁴, Colleen M. Cavanaugh¹

¹Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA; ²Manuel Cajías E14 122 y Toribio Hidalgo, Quito, Ecuador; ³Department of Biological Sciences, Macquarie University, Sydney, NSW, Australia; ⁴School of Biological Sciences, Flinders University, Adelaide, SA, Australia

Abstract

Darwin’s finches are an iconic example of an adaptive radiation, with well-characterized evolutionary history, dietary preferences, and biogeography. The gut microbiome is known to impact the health and development of hosts, but the impact of evolutionary history on the composition of this microbial community is difficult to disentangle from other effects, such as diet and environment. Adaptive radiations offer the unique opportunity to test for the impact of different ecological niches among closely related species. We investigated the effect of host phylogeny on the gut microbiome using fecal samples from all nine species of Darwin’s finches on Santa Cruz in both lowland and highland habitats. To estimate dietary differences between species, blood samples were analyzed for δ13C and δ15N stable isotope values, and foraging data were collected at each sampling site. Gut microbiome communities were characterized with the V4 region of the bacterial 16S rRNA gene. Across all samples, the bacterial phyla Firmicutes, Actinobacteria, and Proteobacteria comprised the majority of sequences. While species did not cluster strongly by weighted UniFrac distances, Procrustes Analysis of Co-Phylogeny (PACo) revealed moderate congruence between the finch phylogeny and the microbiome. Stable isotope values and foraging data also correlated with the microbiome communities, corroborating the interaction between host phylogeny and environmental factors in shaping the gut microbiome. With beta diversity through time analysis and variation partitioning, foraging data uniquely explained some of the variation in the microbiome. We present the first characterization of the gut microbiome in Darwin’s finches in conjunction with an analysis comparing the impact of host phylogeny and diet

Introduction

All animals evolved in a microbial world and have therefore formed relationships with their microbiomes, i.e., associated microbial communities (McFall-Ngai et al. 2013). In vertebrates, the gut microbiome plays many roles in host health and function, from effects on nutrition (Kohl et al. 2016) and metabolism (Li et al. 2008), to wider ranging effects on host behavior (Ezenwa et al. 2012) and immune system function (Hooper et al. 2012). The composition of the gut microbiome is thus an important biological attribute of all metazoans.

The evolutionary relationship between hosts and their microbiomes has been described by the term phylosymbiosis, which specifies that the composition of the microbiome is related to the phylogeny of the host without assuming that the microbial community is stable or vertically inherited (Brooks et al. 2016). While diet has been shown to strongly influence microbiome composition in a variety of taxa (David et al. 2014; Smits et al. 2017; Ben Yosef et al. 2017), host phylogeny also affects the bacterial taxa present in the gut (Groussin et al. 2017).

The microbiomes of wild animals offer a unique window into the relationship between microbiomes and their hosts, as they better represent the bacterial communities that have been associated with host species over evolutionary timescales given the impact of the environment on microbiome composition (Amato 2013; Hird 2017). Recent investigations into the relationship between animals and their microbiomes have shown evidence of co-divergence between host species and microbial communities in taxa ranging from ants to hominids (Sanders et al. 2014; Moeller et al. 2016). However, animals (e.g., ) raised in captivity develop microbiomes distinct from their wild counterparts (Xenoulis et al. 2010), reducing their usefulness in distinguishing patterns of co-divergence between hosts and microbiomes, though strong phylogenetic signals may still be detected (Ley et al. 2008a).

Birds are a successful group of vertebrates with unique life history traits, including hatching from a sterile egg and subsequent feeding via regurgitation, thus serving as an excellent group to study both ecological and evolutionary aspects of host-microbiome associations. While making up the majority of tetrapod diversity, birds are relatively understudied in microbiome research. There are fewer studies on avian microbiota compared to studies on mammals, with a majority focused on domesticated species, such as poultry, or potential pathogens in the fecal material of birds in urban areas (Danzeisen et al. 2011; Kohl 2012). However, recent studies of the avian microbiome have found that taxonomic categories and the phylogenetic relationships of the host species had the highest correlation with microbiome composition when compared with life history traits and ecological variables (Hird et al. 2015; Kropáčková et al. 2017).

Adaptive radiations provide the opportunity to investigate the influence of evolutionary and ecological diversification within a group of closely related species. Typically, species in adaptive radiations evolve to occupy different ecological niches, with varying diets and behaviors in sympatric species (Schluter 2000). A recent study of the different ecotypes within the adaptive radiation of Anolis lizards in Florida and Puerto Rico found that the microbiome composition was partly explained by phylogeny, despite high intraspecific variation (Ren et al. 2016).

Darwin's finches in the Galápagos archipelago offer a natural experiment for investigating the effects of host phylogeny, diet, habitat, and life history on the bacterial composition of the microbiome. The fourteen currently recognized species diverged over roughly the last 1.5 million years across the islands and now occupy a variety of habitats with varying diets. Their evolutionary history has been extensively studied using microsatellites, mitochondrial DNA, and whole-genome resequencing (Petren et al. 2005; Lamichhaney et al.

2015). Morphologically, the species are split between the ground finches and tree finches, which typically reside in the lowland and highland habitats, respectively, though not exclusively. These morphological differences correlate with dietary niches including granivorous, frugivorous, and insectivorous (Grant 1999). Within the archipelago, all species are sympatric with at least one other species, which affords the ability to compare the microbiomes of individuals captured in the same environment.

Santa Cruz is the second largest island in the Galápagos archipelago and is located near the center of the island group (area: 986 km²; 0°37'S, 90°21'W). Study sites range in ecological features from arid lowlands to moist highlands (Galligan et al. 2012). This study focused on the lowland and highland habitats to maximize the difference in expected diets of the birds sampled.

Lowland habitats range from the coast to 120 m in elevation, receive less than 250 mm of rainfall in a typical year, and are characterized by Opuntia cacti species, brush, and deciduous trees. In contrast, highland habitats range from 600 m to 800 m above sea level, usually receive more than

700 mm of rainfall per year, and are a forest dominated by the non-deciduous Scalesia pedunculata.

Nine Darwin finch species are extant on Santa Cruz Island across the highlands and lowlands (Table 2.1) and are typically separated into two groups: the ground finches in the genus

Geospiza (small, medium, large, and cactus) and the tree finches in the genus Camarhynchus

(small, large, and woodpecker) while the vegetarian finch and warbler finch are outside of these groupings with no close sister species (Figure 2.1A). Ground finches predominantly feed on seeds with the exception of the cactus finch, which like the vegetarian finch, feeds on flowers and leaf material. Tree finches are largely arboreal feeders and diet varies from omnivorous in small tree finches with increasing insectivory from large tree finch to to

warbler finch (Grant 1999; Tebbich et al. 2004). The populations of these species are well studied on Santa Cruz, including the genetics of the small ground finch (Kleindorfer and

Chapman 2006; Galligan et al. 2012; Chaves et al. 2016) and medium ground finch (de León et al. 2010), and foraging patterns of the four tree finch species (Tebbich et al. 2004).

Table 2.1. Darwin’s finch species sampled for this study, including habitats and specimen number, organized by habitat

Common Name          Abb.  Scientific Name            Highland Samples  Lowland Samples  Total per species
Small Ground finch   SGF   Geospiza fuliginosa        8                 5                13
Medium Ground finch  MGF   Geospiza fortis            1                 6                7
Large Ground finch   LGF   Geospiza magnirostris      -                 8                8
Cactus finch         CF    Geospiza scandens          -                 6                6
Small Tree finch     STF   Camarhynchus parvulus      9                 1                10
Large Tree finch     LTF   Camarhynchus psittacula    3                 -                3
Woodpecker finch     WPF   Camarhynchus pallidus      5                 -                5
Vegetarian finch     VF    Platyspiza crassirostris   -                 4                4
Warbler finch        WF    Certhidea olivacea         7                 -                7
Total                                                 33                30               63

Figure 2.1 Relative abundance of bacterial phyla in the gut microbiota of Darwin’s finch species.

A) Phylogeny of Darwin’s finch species on Santa Cruz island based on whole-genome resequencing (Lamichhaney et al. 2015) and the mean relative abundance of bacterial phyla across all gut microbiome samples of each species. Species abbreviations are given in Table 2.1.

B) Relative abundance of the bacterial phyla in individual microbiome samples grouped according to species and habitat. Any bacterial taxa with mean relative abundance within species below 1% was omitted from both plots.

Here we investigate the diversity of the gut microbiomes of nine sympatric species of

Darwin’s finches across both highland and lowland habitats on Santa Cruz Island. Given the well-studied evolutionary relationship between these species, we examine the signal of co-diversification between host species and their gut microbiomes. The distribution of the small ground finch across both habitats provides an opportunity to determine the influence of the environment on the composition of the gut microbial community. Using stable isotope measurements and foraging observations, we quantify the dietary differences between species and interrogate the impact of host phylogeny and diet on the observed variation between microbiomes. The study contributes to the understanding of co-diversification between bird hosts and microbiomes, specifically discerning the effects of host phylogeny, habitat, and diet in one of the best studied adaptive radiations known to biology.

Materials and Methods

Ethics statement

All samples were collected with permission from the Parque Nacional Galápagos and

Ministerio del Ambiente, Ecuador (Research permit No. PC-23-16). All collection protocols were approved by the Institutional Animal Care and Use Committee in the Faculty of Arts and

Sciences at Harvard University (Protocol 15-08-249).

Study sites and species

Fieldwork was conducted in February 2016 on Santa Cruz Island, Galápagos

Archipelago, Ecuador. Sampling sites were located in both highland (0°37’S, 90°23’W) and lowland (0°40’S, 90°13’W) habitats. All nine extant species on Santa Cruz were sampled and the

number of samples per species per habitat are detailed in Table 2.1. Three species were only sampled in the lowlands: the cactus finch (Geospiza scandens; n=6), the large ground finch

(Geospiza magnirostris; n=8), and the vegetarian finch (Platyspiza crassirostris; n=4) and three species were only sampled in the highlands: the large tree finch (Camarhynchus psittacula; n=3), the woodpecker finch (Camarhynchus pallidus; n=5), and the warbler finch (Certhidea olivacea; n=7). Two species had a single sample in the second habitat - the medium ground finch

(Geospiza fortis; n=6L/1H) and the small tree finch (Camarhynchus parvulus; n=1L/9H) - while the small ground finch (Geospiza fuliginosa) was the only species with multiple samples in both habitats (n=5L/8H).

Sample Collection

Finches were caught using mist nets and tagged with an aluminum ring imprinted with a unique identifier. Eight morphological traits and mass were measured as previously described

(Kleindorfer et al. 2014a). Individuals were classified into species using the morphological measurements and established protocols (Lack 1947; Grant 1999; Kleindorfer et al. 2014a).

Blood samples (10 µl) were collected for stable isotope analysis. Samples were dried on small pieces (roughly 0.5 x 0.5 cm) of quartz fiber filter paper (Schleicher and Schuell, Dassel, DE) and stored in microcentrifuge tubes with a silica gel bead as desiccant at room temperature.

After morphological measurements and blood sample collection, fecal samples were collected by placing each finch into a 7" x 7" x 7" cage lined with UV-sterilized parchment paper. Cages were covered with fabric and finches were monitored until defecation for a maximum of 30 min before release. Feces were immediately transferred from the parchment paper with bleach cleaned spatulas into pre-weighed microcentrifuge tubes containing 1 ml of

DNA/RNA Shield (Zymo Research, Irvine, CA) and mixed by shaking the tubes by hand before storage at -20°C within 4 hours of collection to prolong the longevity of the DNA stabilization buffer. Fecal samples were shipped at room temperature and stored at -80°C in the lab until further analysis.

Foraging observations

To quantify the diet patterns across species in both habitats, first foraging observations were collected at both highland and lowland sampling sites (Kleindorfer and Chapman 2006;

Christensen and Kleindorfer 2009). At each site, a single walkthrough of one hour was conducted with no overlaps or doubling back to avoid observing the same individuals. During the walkthrough, individual finches were observed until the first food item was ingested. The food item consumed was recorded as one of five categories: insect, seed, flower, leaf, or fruit.

Due to the tame nature of Darwin’s finches, the majority of observations were made within 8 m of the focal individual.

Fecal DNA Extraction and 16S rRNA gene sequencing

DNA was extracted from feces in the laboratory using the ZR Fecal Miniprep kit (Zymo

Research, Irvine, CA) following manufacturer's instructions with the following changes. To minimize loss of biological material, BashingBeads (Zymo Research, Irvine, CA) were added directly to the collection tubes with the fecal sample in DNA/RNA Shield, which acted as the lysis buffer. Samples were homogenized in a FastPrep FP120 (Qbiogene, Carlsbad, CA) for six rounds of 45 s at speed 6.5 m/s. Between each round, tubes were cooled on ice for 3 min. All liquid transfer steps were performed in a laminar flow hood to minimize environmental contamination.

The V4 region of the 16S rRNA gene was amplified using NEBNext Q5 HotStart HiFi

MasterMix 2x (New England Biolabs, Ipswich, MA) and previously designed dual-index barcoded universal primers (Kozich et al. 2013). Briefly, samples were amplified in triplicate 25 µl PCR reactions for 20 cycles and purified with Aline PCRClean DX (Aline Biosciences,

Woburn, MA) before being pooled in equimolar concentrations and sequenced with the Illumina

MiSeq platform (Illumina, USA). See Supplementary Material for details.

Contamination controls

Given the low DNA content of bird feces (Vo and Jedlicka 2014), we were concerned about the influence of environmental microbial contamination in analyzing the sequences (Salter et al. 2014). To understand the sources of contamination, controls were included at each step of sample processing: DNA extraction and amplification. To evaluate contaminants from the DNA extraction kits, for each kit we included a mock community of bacterial cells using 75 µl of

ZymoBIOMICS Microbial Community Standard (Zymo Research, Irvine, CA) and a no sample extraction control with only DNA/RNA Shield. To assess contaminants from PCR amplification reagents, for each 96-well plate of PCR amplification, a mock community of bacterial DNA was amplified in triplicate with 2 µl of a 1:10 dilution of ZymoBIOMICS Microbial Community

DNA Standard (Zymo Research, Irvine, CA) and a triplicate no template control reaction.

Greater than 99.75% of all reads from ZymoBIOMICS Microbial Community standards and DNA standards mapped to the expected genera. None of the extraction or no-template controls produced quantifiable PCR product, and these controls were excluded from further analysis.

Sequence processing

Sequences were demultiplexed according to the dual-index barcode by the Harvard

Biopolymers Facility (Boston, MA) and all the following sequence processing steps were performed in R version 3.4.0 (R Core Team 2014). The fastq files for each sample were converted into Ribosomal Sequence Variants (RSVs) using DADA2 with parameters as described by Callahan et al. (2016). RSVs were taxonomically classified with the RDP v14 training set (Cole et al. 2009) and chimeras were removed as implemented in DADA2. After initial processing, a total of 1,865,835 reads and 6,225 RSVs were identified across all samples.

Sequence filtering

The following steps were taken to produce the final dataset for analysis. To remove likely environmental DNA contaminant sequences, the frequency-based decontam algorithm (Davis et al. 2017) was applied to the dataset, which removed 26,484 reads (1.42%) and 122 RSVs (1.96%). To reduce the influence of RSVs present in only a few samples, a 5% prevalence filter was applied, which removed 57,335 reads (3.12%) and 2,570 RSVs (42.11%). After taxonomic assignment, any sequences not classified as Bacteria were removed, subtracting 2,518 reads (0.14%) and 23 RSVs (0.65%). Finally, sequences classified as Chloroplasts were removed, subtracting 107,875 reads (6.06%) and 36 RSVs (1.03%). The final dataset included 1,671,623 reads and 3,474 RSVs.
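The filtering totals can be checked with a short bookkeeping sketch. The Python below is illustrative only (the analysis itself was run in R); it uses the counts reported in this section and assumes that each percentage is taken relative to the reads or RSVs remaining before that step, which is the reading that reproduces the reported values. The function and variable names are hypothetical.

```python
# Counts reported in the text: (step name, reads removed, RSVs removed).
START_READS, START_RSVS = 1_865_835, 6_225
STEPS = [
    ("decontam (frequency)", 26_484, 122),
    ("5% prevalence filter", 57_335, 2_570),
    ("non-Bacteria", 2_518, 23),
    ("Chloroplast", 107_875, 36),
]


def apply_filters(reads, rsvs, steps):
    """Return per-step rows and the final (reads, rsvs) after all removals."""
    rows = []
    for name, reads_removed, rsvs_removed in steps:
        # Assumption: percentages are relative to what remained before this step.
        pct_reads = round(100 * reads_removed / reads, 2)
        pct_rsvs = round(100 * rsvs_removed / rsvs, 2)
        reads -= reads_removed
        rsvs -= rsvs_removed
        rows.append((name, pct_reads, pct_rsvs, reads, rsvs))
    return rows, (reads, rsvs)


rows, final = apply_filters(START_READS, START_RSVS, STEPS)
for row in rows:
    print(row)
print(final)  # (1671623, 3474)
```

Under that assumption, the four steps reproduce the reported percentages and the final dataset size.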

Rarefying reads

To ensure sample library sizes were not driving the patterns observed in the data, the following categorical variables were checked for significant differences in mean library size

using the Kruskal-Wallis rank sum test and library size distribution using Levene’s test as implemented in the R package car (Fox and Weisberg 2011): species, habitat, sex, and PCR plate. None of the variables were significantly different in mean library size or library size distribution (Table S1). Therefore, for increased statistical power in detecting differences between microbiome samples, all following analyses were performed with non-rarefied microbiome data (McMurdie and Holmes 2014).

Alpha diversity analyses

To calculate the relative abundance of bacterial phyla present in the gut microbiome of each Darwin’s finch species, reads were transformed to proportions by sample and then averaged across all microbiome samples per finch species. For species richness estimates, observed RSVs and Chao1 estimates were calculated in the R package phyloseq (McMurdie and Holmes 2013) while phylogenetic diversity was calculated with the R package picante (Kembel et al. 2010).
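As an illustration of the two-step averaging described above, the following Python sketch converts read counts to per-sample proportions and then averages those proportions within each finch species. It is not the R code used for the analysis; the read counts are invented toy values and the function names are hypothetical.

```python
from collections import defaultdict


def to_proportions(counts):
    """Convert one sample's read counts per phylum into proportions."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def mean_by_species(samples):
    """Average per-sample proportions within each host species.

    samples: list of (species, {phylum: read_count}) pairs.
    A phylum absent from a sample contributes a proportion of zero.
    """
    sums = defaultdict(lambda: defaultdict(float))
    n_samples = defaultdict(int)
    for species, counts in samples:
        n_samples[species] += 1
        for phylum, p in to_proportions(counts).items():
            sums[species][phylum] += p
    return {
        sp: {ph: s / n_samples[sp] for ph, s in phyla.items()}
        for sp, phyla in sums.items()
    }


# Toy data (invented): two cactus finch samples and one large ground finch sample.
samples = [
    ("CF", {"Firmicutes": 80, "Proteobacteria": 20}),
    ("CF", {"Firmicutes": 600, "Proteobacteria": 400}),
    ("LGF", {"Actinobacteria": 50, "Firmicutes": 50}),
]
means = mean_by_species(samples)
print(means["CF"])
```

Averaging proportions rather than pooling reads gives each sample equal weight regardless of its library size: in the toy data the mean Firmicutes proportion for CF is 0.7, whereas pooling the reads would give 680/1100.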

The Chao1 estimate uses information on the frequency of rare species to estimate the total number of species in the assemblage, including undetected species (Chao 1984). Phylogenetic diversity sums the total branch length of the resulting bacterial phylogeny (Faith 1992). To check for phylogenetic signal of the alpha diversity, Pagel’s lambda was calculated using a

Markov-chain Monte Carlo generalized linear mixed-effects model as implemented in the R package MCMCglmm (Hadfield 2010). The phylogenetic signal was calculated as a random effect, incorporating both variation between species as well as within species between the multiple measurements. The model was run with 5,000,000 iterations, a 10,000 step burn in, and

500 step thinning intervals.
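For readers unfamiliar with the Chao1 estimator, a minimal Python sketch follows. It implements the classic form, S_obs + F1²/(2·F2), where F1 and F2 are the numbers of variants observed exactly once and twice, and falls back to the bias-corrected form when no doubletons are present. This is an assumption about the variant of the formula; the R implementation used in the analysis may apply the bias-corrected form throughout. The counts are invented.

```python
def chao1(counts):
    """Chao1 richness estimate from a vector of per-variant read counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)  # F1
    doubletons = sum(1 for c in counts if c == 2)  # F2
    if doubletons > 0:
        # Classic estimator: S_obs + F1^2 / (2 * F2).
        return observed + singletons ** 2 / (2 * doubletons)
    # Bias-corrected form, used here only when F2 == 0 to avoid division by zero.
    return observed + singletons * (singletons - 1) / 2


# Toy sample (invented): 7 observed variants, 3 singletons, 2 doubletons.
toy_counts = [10, 4, 2, 2, 1, 1, 1, 0]
estimate = chao1(toy_counts)
print(estimate)  # 9.25
```

The estimate exceeds the observed richness whenever singletons are present, reflecting variants that were likely missed by sampling.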

Beta diversity analyses

To visualize differences between microbiome samples, double principal coordinate analysis (DPCoA) was applied to the log-transformed RSV table as implemented in the R package phyloseq (McMurdie and Holmes 2013). DPCoA is a dissimilarity metric which incorporates both quantitative and phylogenetic information about the microbiome samples

(Pavoine et al. 2004). To assess the differences in community composition of the gut microbiomes between samples, weighted UniFrac distances (Lozupone et al. 2010) were calculated between all samples. All abundance data were log transformed prior to distance calculations as an approximate variance stabilization method. To check for the homogeneity of the multivariate dispersions of the distance metrics, the betadisper function was used as implemented in the R package vegan (Oksanen et al. 2017). To test the significance of categorical variables, permutational analysis of variance (PERMANOVA) was used as implemented in the R package vegan function adonis (Oksanen et al. 2017).
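The logic of a one-way PERMANOVA can be sketched in a few lines of Python. This is a simplified stand-in for the vegan adonis function, not a reimplementation of it: it computes the pseudo-F statistic from a distance matrix for a single grouping factor and obtains a p-value by permuting group labels. The distance matrix and labels are invented toy values.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """Pseudo-F for one grouping factor from a square distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in labels:
        idx = [i for i in range(n) if groups[i] == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    a = len(labels)
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permanova(dist, groups, permutations=999, seed=0):
    """Return (pseudo-F, permutation p-value) for the observed grouping."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example (invented): six samples placed on a line, three per habitat.
positions = [0, 1, 2, 10, 11, 12]
habitat = ["highland"] * 3 + ["lowland"] * 3
dist = [[abs(a - b) for b in positions] for a in positions]
f_stat, p_value = permanova(dist, habitat)
print(f_stat, p_value)
```

Because PERMANOVA is sensitive to differences in group dispersion as well as location, a dispersion check such as betadisper is run alongside it, as described above.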

Comparative metadata analysis

Procrustean approach to co-phylogeny (PACo)

To assess congruence between the phylogenetic diversification of Darwin’s finches and their gut microbiomes, the Procrustean approach to co-phylogeny (PACo) (Balbuena et al. 2013) was applied to the data. PACo was designed to detect the similarity of evolutionary patterns in host-parasite associations. Here, the microbiome samples are treated as the ‘parasites’ to compare with the host species genetic distances. Darwin’s finch species’ genetic distances for the

PACo analysis were based on whole genome resequencing encompassing more than 44 million variable sites with representative individuals chosen from Santa Cruz when available

(Lamichhaney, personal communication; Lamichhaney et al. 2015). Microbiome distances were calculated using the weighted UniFrac metric (Lozupone et al. 2010), to produce a quantitative distance comparison that incorporated the phylogeny of the microbial community. PACo analysis was run as implemented in the R package paco (Hutchinson et al. 2017), with 10,000 permutations to test the significance of the signal. Using the symmetric calculation, the correlation coefficient r was calculated as r = (1-ss), where ss is the residual sum of squares of the Procrustes fit.

As three samples lacked stable isotope data and one sample lacked foraging data, a total of four samples were excluded from PACo analysis with the finch phylogeny.

Stable isotope analysis

To assess differences in diet between the finches sampled, stable isotope analysis was performed using blood samples dried on quartz fiber filter paper. These were packaged in 5 x 9 mm tin capsules for analysis (041077, Costech Analytical Technologies, Inc, Valencia, CA).

δ13C and δ15N values were measured on a Thermo Scientific Delta V paired with a Costech 4010 elemental analyzer and a high-temperature conversion elemental analyzer at the Center for Stable Isotopes at the University of New Mexico (Albuquerque, NM). A known protein standard was run at multiple concentrations as a run-to-run control. δ13C values were adjusted by the mean difference between the measured values for the protein and the known value (-1.64 per mil). δ15N values for samples below 1000 mV were error corrected using a linear regression on the protein standard (R² = 0.77).

Two small tree finches and one large ground finch did not have stable isotope samples collected and were excluded from all analyses using stable isotope ratios.

PACo analysis was also used to calculate the correlation between stable isotope values and microbiome community composition. Euclidean distance matrices were calculated between

the stable isotope signatures of each sample and between the mean values of stable isotopes in each species and each habitat combination. The stable isotope distance matrices were used in

PACo analyses of the microbiome weighted UniFrac beta diversity metric.

Foraging data analysis

To summarize the foraging data, the food items seed, flower, leaf, and fruit were combined into the category plant. The proportions of plant and insect food items therefore sum to 1. These observations provide knowledge of the broad diet patterns for each finch species in both habitats. Since no observations were made of small tree finches in the lowlands, the single small tree finch sample from the lowlands was excluded from any analysis using foraging data.

For PACo analyses and analysis of beta diversity through time (see below), Euclidean distances were calculated between the proportion of observations across the five food items. For variation partitioning, principal component analysis was applied to the proportion data using the function rda as implemented in the R package vegan (Oksanen et al. 2017).
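The foraging summaries described above can be sketched as follows. This Python example is illustrative rather than the code used in the study: it tabulates first foraging observations into proportions across the five food items, collapses them into plant and insect, and computes the Euclidean distance between two groups. The observations are invented and the function names are hypothetical.

```python
import math
from collections import Counter

FOOD_ITEMS = ("insect", "seed", "flower", "leaf", "fruit")
PLANT_ITEMS = {"seed", "flower", "leaf", "fruit"}


def proportions(observations):
    """Proportion of first foraging observations in each of the five food items."""
    counts = Counter(observations)
    total = len(observations)
    return [counts[item] / total for item in FOOD_ITEMS]


def plant_insect(observations):
    """Collapse the five food items into plant and insect proportions."""
    total = len(observations)
    plant = sum(1 for o in observations if o in PLANT_ITEMS) / total
    insect = sum(1 for o in observations if o == "insect") / total
    return {"plant": plant, "insect": insect}


def euclidean(p, q):
    """Euclidean distance between two proportion vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Toy observations (invented) for two species-by-habitat groups.
group_a = ["seed"] * 8 + ["insect"] * 2
group_b = ["insect"] * 9 + ["flower"]
diet_distance = euclidean(proportions(group_a), proportions(group_b))
print(plant_insect(group_a), round(diet_distance, 3))
```

Keeping all five categories for the distance calculation preserves differences among plant food types that the plant versus insect summary discards.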

Beta diversity through time (BDTT) analysis

To further disentangle the contribution of host phylogeny and diet to the microbiome composition, the beta diversity through time (BDTT) metric (Groussin et al. 2017) was applied to the dataset. Only species and habitat combinations with multiple samples were included in the analysis, which removed the medium ground finch sample from the highlands. Only the small ground finch had multiple samples in both habitats and therefore was split into highland and lowland microbiome profiles. The mean relative abundance of RSVs was calculated across all samples of each species and habitat combination. BDTT was calculated as described in

(Groussin et al. 2017). Briefly, the RSV sequences were aligned using the SINA algorithm

(Pruesse et al. 2012) and all sites with more than 95% gaps were removed. The bacterial phylogenetic tree was calculated with FastTree (Price et al. 2010) using the GTR model and default CAT approximation to model rate heterogeneity across sites and with the constraint that all bacterial phyla and all classes within Proteobacteria were monophyletic. The root of the tree was placed between Actinobacteria and the rest of the taxa. PATHd8 (Britton et al. 2007) was used to time-calibrate the tree with a maximum age of 3.8 Gya for the root as an estimate of the earliest signs of life (Dodd et al. 2017). The time calibrated bacterial phylogeny was sliced in units of 10 Mya and Spearman correlations of Bray-Curtis dissimilarity between species were calculated against three distance matrices using Mantel tests: host phylogenetic distance,

Euclidean distance between the mean stable isotope values, and Euclidean distance between foraging data across the five food items.
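The correlation step applied at each time slice can be illustrated with a small Mantel test. The Python sketch below computes Bray-Curtis dissimilarities between group-level profiles and a one-sided Mantel test using the Spearman correlation. It does not implement the tree slicing or aggregation of RSVs from BDTT itself, and the abundance profiles and host distances are invented.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    return sum(abs(a - b) for a, b in zip(u, v)) / sum(a + b for a, b in zip(u, v))


def ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def upper(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    return [matrix[order[i]][order[j]] for i, j in combinations(range(len(order)), 2)]


def mantel_spearman(m1, m2, permutations=999, seed=0):
    """One-sided Mantel test: Spearman correlation and permutation p-value."""
    identity = list(range(len(m1)))
    x = ranks(upper(m1, identity))
    observed = pearson(x, ranks(upper(m2, identity)))
    rng = random.Random(seed)
    perm = identity[:]
    hits = 0
    for _ in range(permutations):
        rng.shuffle(perm)  # permute rows and columns of the second matrix together
        if pearson(x, ranks(upper(m2, perm))) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy data (invented): four host groups, three bacterial lineages.
profiles = [[10, 0, 0], [8, 2, 0], [2, 8, 0], [0, 2, 8]]
microbiome = [[bray_curtis(p, q) if p is not q else 0.0 for q in profiles] for p in profiles]
host = [[0, 1, 3, 5], [1, 0, 2, 4], [3, 2, 0, 4], [5, 4, 4, 0]]
rho, p_value = mantel_spearman(microbiome, host)
print(round(rho, 3), p_value)
```

In the full analysis this test would be repeated for each slice of the bacterial phylogeny and for each of the three explanatory distance matrices.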

Variation partitioning

To compare the amount of variation explained by host genetic distance, stable isotope values, and diet distance as calculated from first foraging observations, variation partitioning by redundancy analysis (Legendre 2008) was used as implemented with the varpart function in the

R package vegan (Oksanen et al. 2017). The microbiome distance matrix was used as a response variable with three explanatory tables: the first two principal coordinate axes of the host genetic distance, the δ13C and δ15N stable isotope values, and the first two principal component axes of the first foraging observations. Significance of the distance based redundancy analysis was assessed using the anova.cca function implemented in vegan.

Random forest analysis

To determine whether the gut microbiome communities could differentiate between categorical variables of interest, a random forest classifier (Breiman 2001) was applied to the

RSV table as implemented in the R package randomForest (Liaw et al. 2002), using leave-one-out cross validation. RSVs with the top importance for classification were determined by calculating the mean decrease in accuracy across all models. Accuracy was defined by the number of samples correctly classified based on the category of interest, which was calculated along with confusion matrices in the R package caret (Kuhn et al. 2017).
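The evaluation scheme (leave-one-out cross validation, accuracy, and a confusion matrix) can be sketched independently of the classifier. In the Python example below, a nearest-centroid rule is swapped in as a simple stand-in so that the example stays self-contained; it is not a random forest and is not the method used in the study. The feature values are invented.

```python
from collections import Counter, defaultdict


def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT a random forest): label of the closest class mean."""
    sums, n = {}, Counter(train_y)
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value

    def squared_distance(label):
        return sum((s / n[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sorted(sums), key=squared_distance)


def leave_one_out(features, labels, predict):
    """Predict each sample from a model fitted on all other samples."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predictions.append(predict(train_x, train_y, features[i]))
    return predictions


def confusion_matrix(truth, predicted):
    """Nested dict of counts: table[true_label][predicted_label]."""
    table = defaultdict(Counter)
    for t, p in zip(truth, predicted):
        table[t][p] += 1
    return {t: dict(c) for t, c in table.items()}


def accuracy(truth, predicted):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


# Toy feature table (invented): two RSV abundances per sample.
features = [[9, 1], [8, 2], [7, 3], [1, 9], [2, 8], [3, 7]]
labels = ["highland"] * 3 + ["lowland"] * 3
predicted = leave_one_out(features, labels, nearest_centroid_predict)
print(accuracy(labels, predicted), confusion_matrix(labels, predicted))
```

Because the `predict` argument is a plain function, any other classifier with the same three-argument signature could be evaluated with the same loop.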

Results

All nine species of Darwin’s finches on Santa Cruz Island were sampled over February

2016, with a total of 63 fecal samples successfully sequenced (Table 2.1). Three species were sampled in the lowland habitat exclusively: the cactus finch (G. scandens), the large ground finch (G. magnirostris), and the vegetarian finch (P. crassirostris), while three species were sampled in the highland exclusively: the large tree finch (C. psittacula), the woodpecker finch

(C. pallidus), and the warbler finch (Certhidea olivacea). Three species were sampled in both highland and lowland habitats: the small ground finch (G. fuliginosa), the medium ground finch (G. fortis), and the small tree finch (C. parvulus). The small ground finch was the only species with multiple samples from both habitats.

Using next-generation sequencing, a total of 1,671,623 sequences were generated across all samples (mean = 26,367; range = 2,196 to 63,974), representing a total of 3,474 ribosomal sequence variants (RSVs) (mean = 685; range = 105 to 1,483). Sequence numbers were not significantly

different across variables of interest and all following analyses are based on the non-rarefied data for increased statistical power (see Methods – Rarefying reads).

Alpha diversity analyses

A total of nineteen bacterial phyla were detected in the gut microbiome in Darwin’s finches. Using DADA2, all sequences were assigned to ribosomal sequence variants (RSVs) and taxonomically classified using the RDP v14 database (Cole et al. 2009). Across all samples, the bacterial phyla Firmicutes, Actinobacteria, and Proteobacteria composed the majority of the sequences, representing 35%, 31%, and 28% of RSVs respectively. Bacterial taxa unclassified at the phylum level made up 4% of the sequences while all other bacterial phyla detected were represented by less than 1% of sequences across all samples (Table S2). Comparing across host species, the cactus finch and small ground finch had the highest proportion of Firmicutes (82% and 53% respectively), while the large ground finch had the highest proportion of Actinobacteria

(56%) (Figure 2.1A; Table S3). There was significant variation between individual finches within a species and marked differences between intraspecific individuals across habitats (Figure

2.1B).

At lower bacterial taxonomic levels, a few genera comprise a significant portion of total sequences (Figure S1; Table S4). Notably, the bacterial genus Lactobacillus in the phylum

Firmicutes is the most abundant genus across all samples with a mean relative abundance of

26%. Calculation of relative abundance by species demonstrates that it is the most abundant bacterial genus in five Darwin finch species: small ground finch, medium ground finch, cactus finch, small tree finch, and large tree finch (Table S5). In the cactus finch, Lactobacillus comprises the majority of sequences at 77%. The most abundant bacterial genera in other finch species were Diplorickettsia, Rhodospirillum, and Acinetobacter from the phylum


Proteobacteria in the warbler finch (14%), woodpecker finch (16%), and vegetarian finch (17%), respectively, while Kocuria from the phylum Actinobacteria was the most abundant genus in the large ground finch (23%).

To estimate the total diversity present in the gut microbiome, three metrics were calculated across all samples (mean ± SE): observed RSVs (679.7 ± 41.9), Chao1 (879.5 ± 47.4), and phylogenetic diversity (60.7 ± 2.5). Estimates between species were not significantly different across all alpha diversity measures (Table S6 and Table S7). To determine the contribution of shared evolutionary history to the microbial alpha diversity, Pagel’s λ was calculated for all three metrics using a generalized linear mixed model where host phylogeny and intra-specific variation were both random effects. In all cases, values of Pagel’s λ were relatively small with large 95% confidence intervals: observed RSVs (0.10; 0 to 0.69), Chao1 (0.10; 0 to 0.68), and phylogenetic diversity (0.14; 0 to 0.74).
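As an illustration of one of the richness metrics reported above, the following is a minimal sketch of the bias-corrected Chao1 estimator, which adds to the observed richness a term based on the number of taxa seen exactly once and exactly twice. The function name and the toy counts are hypothetical and are not taken from the dissertation; the software actually used to compute Chao1 is not specified in this section.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-taxon read counts.

    Illustrative sketch only; not the dissertation's implementation.
    """
    observed = sum(1 for c in counts if c > 0)      # taxa detected at all
    singletons = sum(1 for c in counts if c == 1)   # taxa seen exactly once (F1)
    doubletons = sum(1 for c in counts if c == 2)   # taxa seen exactly twice (F2)
    # Bias-corrected form: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical read counts for eight taxa in one sample.
toy_counts = [10, 5, 2, 2, 1, 1, 1, 0]
print(chao1(toy_counts))  # 8.0
```

In this toy sample, seven taxa are observed and the estimator adds one more, reflecting the expectation that rare taxa imply further undetected taxa.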

Beta diversity analyses

To visualize differences in the beta diversity of microbiomes across samples, double principal coordinate analysis (DPCoA), a phylogenetic ordination method that allows both the samples and the RSVs to be plotted in the coordinate space, was applied to the log-transformed

RSV table. A single warbler finch was an outlier based on DPCoA and was excluded from further analysis (Figure S2). DPCoA demonstrated a demarcation between samples from the highland or lowland habitats, but no clear clustering by species (Figure 2.2A). Indeed, even within a single species, the small ground finch, there was a separation between those collected in the highland or lowland habitats. Plotting the RSVs in the same ordination space, the bacterial phyla Proteobacteria and Actinobacteria appeared to drive the location of the highland and lowland samples, respectively (Figure 2.2B).


Figure 2.2 Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome communities.

A) Gut microbiome samples are plotted on the first two principal coordinate axes, with point color and point shape indicating host species and habitat, respectively. A clear demarcation between highland and lowland samples can be seen along the first axis, with lowland samples and highland samples mostly to the right and left, respectively.

B) Individual ribosomal sequence variants (RSVs) are plotted in the same ordination space as the gut microbiome samples. Bacterial phyla with at least 1% relative abundance across samples are color-coded; all other RSVs are gray. The demarcation between highland and lowland samples is recapitulated by the RSVs, with Proteobacteria and Actinobacteria corresponding to highland

(left) and lowland (right) samples, respectively.


To examine whether the community composition of Darwin’s finch gut microbiomes varied with categorical variables of interest, permutational analysis of variance

(PERMANOVA) was used as implemented in the vegan function adonis (Oksanen et al. 2017).

Weighted UniFrac distances were used as the response variable with the following categorical variables: habitat, species, sex, and PCR plate. Since not all species occur in both habitats, a nested PERMANOVA structure was used, where permutations amongst species were restricted by habitat. Both habitat and species showed significant differences in the weighted UniFrac distances amongst categories (R2 = 0.15 and 0.21, respectively; p = 0.033 for both), while sex and PCR plate did not (Table S8). To determine which species were most dissimilar, pairwise post hoc analyses of variance were performed with a Bonferroni correction for multiple hypothesis testing. Three pairs of species were significantly different: warbler finch vs large ground finch

(R2 = 0.28; adjusted p = 0.036), vegetarian finch vs small tree finch (R2 = 0.31; adjusted p =

0.036), and small tree finch vs large ground finch (R2 = 0.22; adjusted p = 0.036). All other pairwise comparisons did not show significant differences in weighted UniFrac distances (Table

S9). Significant differences as detected by PERMANOVA may be driven by differences in the dispersion, or multivariate variance, of the samples. To check whether dispersion was driving the difference between habitats and species, the PERMDISP2 metric (Anderson 2006) was calculated with the betadisper function in the R package vegan (Oksanen et al. 2017).

Microbiome samples were significantly different in their dispersion depending on habitat (F value = 7.37, p = 0.01), but not depending on species (F value = 1.78, p = 0.10) (Figure S3).
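The Bonferroni correction applied to the pairwise comparisons above can be sketched as follows. This is a minimal illustration under two assumptions that are not stated in the text: that all 36 pairwise comparisons among the nine species were corrected together, and that the smallest raw p-value was 0.001 (the minimum attainable with 999 permutations). Under those assumptions the adjusted value is 0.036, matching the identical adjusted p-values reported for the three significant pairs. The function name and the placeholder p-values are hypothetical.

```python
from itertools import combinations


def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests, capping at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Species abbreviations as used in the tables of this chapter.
species = ["SGF", "MGF", "LGF", "CF", "STF", "LTF", "WPF", "VF", "WF"]
pairs = list(combinations(species, 2))
n_pairs = len(pairs)  # 9 choose 2 = 36 pairwise comparisons

# Hypothetical raw p-values: one comparison at 0.001, the rest at 0.5.
raw = [0.001] + [0.5] * (n_pairs - 1)
adjusted = bonferroni(raw)
print(n_pairs, round(adjusted[0], 3))  # 36 0.036
```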

Testing co-phylogeny of Darwin's finches and their microbiomes

To test congruence between the phylogeny of Darwin’s finches and their gut microbiomes, the correlation between the weighted UniFrac distances of the microbiome


samples and the genetic distances of the host species was tested using the Procrustean approach to cophylogeny (PACo) (Balbuena et al. 2013). PACo analysis showed a significant correlation between the host phylogeny and the microbiome (R2 = 0.18, p < 0.001) (Table 2.2). Visualizing the Procrustes transformation, most microbiome samples (circles) are closest to their respective hosts (arrowheads), demonstrating the congruence of the two distance matrices (Figure S4A).

PACo analysis was also applied to the stable isotope and foraging data (see below). Foraging data had a similar Procrustean correlation coefficient (R2 = 0.18, p < 0.001), while the coefficient for stable isotope values was lower (R2 = 0.06, p = 0.013).

Table 2.2. Procrustes Analysis of Co-Phylogeny.

           Host phylogeny   Stable isotope*   Diet**
R2         0.178            0.064             0.175
p-value    <0.0001          0.013             <0.0001

* Euclidean distances were calculated between δ13C and δ15N values for all microbiome samples

** Euclidean distances were calculated based on proportion of food items in first foraging observations

Stable isotope (δ13C and δ15N) analysis

To characterize potential dietary differences between and within species, stable isotope ratios (δ13C and δ15N) were determined using blood samples collected from each finch used for microbiome analyses. The carbon isotope values for the tree finches (mean δ13C = -25 to -26.7‰) were lighter than those for the ground finches (mean δ13C = -21.9 to -24.5‰), with the cactus finch having the heaviest signal (mean δ13C = -16.2‰) (Figure 2.3A; Table 2.3). The δ15N measurements showed less of a pattern between the ground and tree finches; however, the cactus finch exhibited the most enriched stable isotope value (δ15N = 11.4‰). The mean values


of all other finch species ranged from δ15N = 8.1‰ to 10.5‰. Stable isotope values showed a distinction by habitat of origin. Lowland samples generally had lighter carbon isotope values and heavier nitrogen isotope values, even within species such as the small ground finch (Figure 2.3B). The single samples for medium ground finch in the highlands and small tree finch in the lowlands also follow this pattern, clustering more closely by habitat than by species.

Figure 2.3. δ13C and δ15N stable isotope measurements for Darwin’s finch species.

Point color and point shape indicate host species and habitat, respectively. A) Individual δ13C and δ15N values for each finch with gut microbiome samples. B) Mean δ13C and δ15N values for each species and habitat with standard deviation. The small ground finch (SGF) is the only species with multiple samples per habitat and has distinct δ13C and δ15N values dependent on the habitat of origin.


Table 2.3. Stable isotope (δ13C and δ15N) ratios by Darwin’s finch species

Species   Habitat   δ13C mean (‰)   δ13C SD (‰)   δ15N mean (‰)   δ15N SD (‰)
SGF       H         -20.9           5.3           6.9             1.5
SGF       L         -23.6           0.7           10.1            1.8
MGF       H         -24.6           -*            8.7             -*
MGF       L         -23.0           2.0           10.1            1.1
LGF       L         -23.3           1.0           9.3             0.7
CF        L         -16.2           1.6           11.4            1.2
STF       H         -25.2           1.4           7.7             1.0
STF       L         -23.7           -*            12.9            -*
LTF       H         -25.7           0.2           8.9             0.5
WPF       H         -25.6           0.8           9.2             0.8
VF        L         -24.5           0.2           10.5            1.0
WF        H         -26.7           0.7           9.3             0.3

* Single sample for this species/habitat so standard deviation was not calculated

First foraging observations

To quantify the diet patterns across species in both highland and lowland habitats, first foraging observations were collected at the same sampling sites. A total of 201 observations were collected across all species. Based on these observations, lowland finch diets are characterized by a high percentage of seeds, flowers, and leaves while highland finch diets are characterized by insects, with two species, the warbler finch and the woodpecker finch, being exclusively insectivorous (Table 2.4; Figure S5). Within a species, the diet can shift between habitat as shown by the medium ground finch which includes more insects in its diet in the highlands, while flowers and leaf material were only observed to be consumed in the lowlands.

For the small ground finch, its diet was similar across both habitats, excepting an exchange of flowers over fruit in the lowland and highland, respectively.


Table 2.4. First foraging observations across all Darwin’s finch species and both habitats

                       Counts (n)                           Proportion of counts (%)             Summary (%)
Sp.   Hab   Total (n)  Flower  Fruit  Leaf  Seed  Insect    Flower  Fruit  Leaf  Seed  Insect    Plant*  Insect
SGF   L     12         2       -      -     6     4         17      -      -     50    33        67      33
SGF   H     25         -       3      -     15    7         -       12     -     60    28        72      28
MGF   L     20         5       -      2     12    1         25      -      10    60    5         95      5
MGF   H     12         -       -      -     6     6         -       -      -     50    50        50      50
LGF   L     11         -       -      1     9     1         -       -      9     82    9         91      9
CF    L     6          3       -      3     -     -         50      -      50    -     -         100     0
STF   H     41         -       -      -     4     37        -       -      -     10    90        10      90
LTF   H     15         -       3      -     -     12        -       20     -     -     80        20      80
WPF   H     19         -       -      -     -     19        -       -      -     -     100       0       100
VF    L     6          2       -      4     -     -         33      -      67    -     -         100     0
WF    H     34         -       -      -     -     34        -       -      -     -     100       0       100

* The category ‘Plant’ is the sum of all plant derived food items (flower, fruit, leaf, and seed).

To summarize diet differences, the proportions for food items seed, flower, fruit, and leaf were combined into the food category plant to compare with the proportion of insect consumption. Using the broad diet categories of plant vs insect, the cactus finch and vegetarian finch were only observed consuming plant material, while the woodpecker finch and warbler finch were only observed consuming insects. The small ground finch, small tree finch, and medium ground finch fell along the spectrum between herbivore and insectivore. By overlaying the broad diet categorization on the DPCoA plot, it is clear that habitat and diet are interconnected, with highland species consuming more insects and lowland species consuming more plant material (Figure 2.4).
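The plant versus insect summary described above can be reproduced from the raw counts with a few lines of arithmetic. The sketch below uses the lowland counts for the small and medium ground finch as listed in Table 2.4; the function and variable names are hypothetical, and rounding to whole percentages is an assumption based on how the table is presented.

```python
PLANT_ITEMS = ("flower", "fruit", "leaf", "seed")


def diet_summary(counts):
    """Return (plant %, insect %) from first foraging observation counts."""
    total = sum(counts.values())
    plant = sum(counts.get(item, 0) for item in PLANT_ITEMS)
    insect = counts.get("insect", 0)
    return round(100 * plant / total), round(100 * insect / total)


# Lowland counts taken from Table 2.4 (items never observed are omitted).
sgf_lowland = {"flower": 2, "seed": 6, "insect": 4}
mgf_lowland = {"flower": 5, "leaf": 2, "seed": 12, "insect": 1}

print(diet_summary(sgf_lowland))  # (67, 33)
print(diet_summary(mgf_lowland))  # (95, 5)
```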


Figure 2.4. Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome samples overlaid with foraging data.

Darwin’s finch gut microbiome samples are plotted with DPCoA ordination and faceted by finch species. The color of each point corresponds to the proportion of the diet coming from plants vs insects as estimated by foraging observations. Points are colored gray if no foraging observations were collected for that species and habitat combination.


Beta diversity through time (BDTT) analysis

To test whether the correlations observed with PACo are dependent on more recent or more ancient bacterial lineages, the recently developed beta diversity through time (BDTT) analysis (Groussin et al. 2017) was applied to the microbiome data. BDTT analysis samples the bacterial phylogeny at given intervals, providing a correlation profile between the bacterial taxa and corresponding metadata at different phylogenetic resolutions. Profiles that show high correlation coefficients further back in evolutionary time indicate that more ancient bacterial lineages are driving the observed correlation. In contrast, profiles with high correlation coefficients at more recent dates but low correlation further back in time signify that recent bacterial diversification is responsible for the similarity between the microbiome and metadata.

Mean relative abundances were calculated across samples for each species, with the small ground finch split by habitat, and an ultrametric bacterial phylogeny was calculated as previously described. The bacterial phylogeny was subdivided into slices every 10 million years and a new RSV table was calculated by collapsing the RSVs into taxonomic units at that age. Beta diversity distances were then calculated using the Bray-Curtis dissimilarity index and a Mantel test was run against each of the explanatory distance matrices: the finch phylogeny, stable isotope values, and foraging data.

The correlation coefficient for each slice was then plotted against time.
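The per-slice step described above (a Bray-Curtis distance matrix compared against an explanatory distance matrix with a Mantel test) can be sketched as below. This is a simplified illustration, not the BDTT implementation of Groussin et al. (2017): it omits the slicing of the bacterial phylogeny, uses hypothetical toy data and function names, and returns the Mantel statistic as a Pearson r, whereas the text reports R2.

```python
import random
from math import sqrt


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def pairwise(rows, metric):
    """Square distance matrix for a list of samples."""
    return [[metric(r, s) for s in rows] for r in rows]


def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: permute the sample labels of d2 and count how
    often the permuted correlation is at least as large as the observed one."""
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    r_obs = pearson(x, upper_triangle(d2, identity))
    rng = random.Random(seed)
    at_least_as_large = 1  # the observed arrangement counts once
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if pearson(x, upper_triangle(d2, order)) >= r_obs:
            at_least_as_large += 1
    return r_obs, at_least_as_large / (permutations + 1)


# Hypothetical data: four samples, three collapsed taxa, one explanatory trait.
abundance = [[10, 0, 5], [8, 1, 6], [0, 9, 3], [1, 10, 2]]
trait = [0.0, 0.1, 1.0, 1.1]
microbiome_dist = pairwise(abundance, bray_curtis)
trait_dist = pairwise(trait, lambda u, v: abs(u - v))
r, p = mantel(microbiome_dist, trait_dist)
print(round(r, 2), p)
```

Repeating this calculation on the RSV table collapsed at each time slice, and plotting the resulting coefficient against slice age, would give a correlation profile of the kind described in the text.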

The finch phylogeny and foraging data had the highest correlation coefficients with the beta diversity distance, though at slightly different time scales. For all three explanatory variables, the correlation coefficient remained below 0.15 until 2000 mya, indicating a lack of sufficient bacterial diversity prior to this time point (Figure S6). The finch phylogeny and foraging data both increased past R2 = 0.20 around 750 mya (finch phylogeny: 690 mya, R2 =

0.23, p = 0.003; foraging data: 770 mya, R2 = 0.22, p = 0.007) and remained between R2 = 0.20 and


R2 = 0.25 until present, with the maxima occurring at 30 mya (R2 = 0.26, p = 0.001) and 350 mya (R2 = 0.25, p = 0.002), respectively (Figure 2.5). The maintenance of the correlation well beyond the divergence time of Darwin finch species (~1 mya) indicates that the correlation seen at present is due to ancient bacterial lineages instead of recent bacterial evolution in conjunction with host speciation. Conversely, the correlation coefficient between beta diversity distance and stable isotope values did not exceed 0.10 at any time point (peak at 1920 mya; R2 = 0.10; p = 0.18).

[Figure 2.5 plot. Title: Beta Diversity Through Time. x-axis: Millions of years (0 to 2000). y-axis: R2 (0.05 to 0.25). Legend (Correlated Data): Diet-Foraging, Diet-Stable Isotope, Finch Phylogeny.]

Figure 2.5. Beta diversity through time applied on the gut microbiome data.

Lines show pairwise Sorensen dissimilarities of Darwin’s finch gut microbiome samples determined by time slices every 10 million years correlated to pairwise dietary distances calculated with first foraging observations (red), pairwise dietary distances using δ13C and δ15N stable isotope measurements (green), and pairwise host phylogenetic distances (blue).


Variation partitioning

Variation partitioning was used to compare the amount of variance in the microbiome explained by the finch phylogeny, stable isotope values, and foraging data. This method takes a response variable, in this case the weighted UniFrac distances between gut microbiome samples, and calculates how multiple explanatory tables correlate with the variation seen in the response variable. Total explained variance with all three explanatory tables had an adjusted R2 = 0.16

(Figure 2.6). Foraging data had the highest correlation with the microbiome samples (adjusted R2

= 0.175, p = 0.001) while the finch phylogeny (adjusted R2 = 0.066, p = 0.010) and stable isotope values (adjusted R2 = 0.072, p = 0.009) showed a weaker correlation and were comparable in their explained variance. The total explained variance is lower than the sum of the individual explanatory tables due to the overlapping nature of the partitions. After controlling for variation explained by the overlap between explanatory tables, only foraging data uniquely explained variation in the microbiome (adjusted R2 = 0.053, p = 0.014).


Figure 2.6. Variation partitioning of Darwin’s finch gut microbiome samples by finch phylogeny, stable isotope values, and foraging data.

Results of variation partitioning using weighted UniFrac distances between gut microbiome samples against Darwin’s finch phylogeny (first two principal coordinate axes), stable isotope values (δ13C and δ15N values), and foraging data (first two principal component axes) visualized with a Venn diagram. Adjusted R2 values for each component are plotted inside the circles. All testable components include the p-value calculated using distance-based redundancy analysis.

Adjusted R2 values in [a], [b], and [c] are the amount of variation explained uniquely by the corresponding explanatory table. Parts [d], [e], and [f] are amounts of variation that can be explained by either table in the overlap and part [g] is shared by all three tables. Foraging data is the only table with a positive adjusted R2 value after controlling for overlapping variance.


Random forest classification of gut microbiome communities

To determine how well the gut microbiome community can distinguish between habitats and host species, a random forest algorithm was applied to the samples, which creates a classification model using the log transformed abundance of each RSV across samples (Breiman

2001; Statnikov et al. 2013). The algorithm was applied to the dataset twice, classifying samples by habitat or host species using leave-one-out cross validation, where one sample is used as the test data and the remaining samples are used as the training data. Classification of the gut microbiome samples by habitat was significantly more accurate than the no information rate, or the percentage of the largest class, with only two highland samples misclassified as coming from lowland finches (accuracy = 0.96, no information rate = 0.55, p < 0.0001). Conversely, classifying samples by host species was unsuccessful, with the accuracy of the model equal to the no information rate (accuracy = 0.23, p = 0.55) (Figure S7).
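The evaluation scheme described above, leave-one-out cross validation compared against the no information rate, can be sketched as follows. To keep the example short and dependency-free, a nearest-centroid classifier is substituted for the random forest used in the study, so this shows only the evaluation logic, not the classifier itself. All names and the toy abundance values are hypothetical.

```python
from collections import Counter


def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid is closest to `query`.

    Stand-in classifier for illustration; the study used a random forest.
    """
    sums = {}
    sizes = Counter(train_y)
    for features, label in zip(train_x, train_y):
        running = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            running[k] += value

    def squared_distance(label):
        return sum((s / sizes[label] - q) ** 2 for s, q in zip(sums[label], query))

    return min(sorted(sums), key=squared_distance)


def leave_one_out_accuracy(x, y):
    """Hold out each sample in turn, train on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_x, train_y, x[i]) == y[i]:
            correct += 1
    return correct / len(x)


def no_information_rate(y):
    """Accuracy obtained by always predicting the largest class."""
    return max(Counter(y).values()) / len(y)


# Hypothetical log-transformed abundances of two RSVs in five samples.
x = [[5.0, 0.1], [4.8, 0.3], [5.2, 0.2], [0.2, 4.9], [0.1, 5.1]]
y = ["lowland", "lowland", "lowland", "highland", "highland"]

accuracy = leave_one_out_accuracy(x, y)
nir = no_information_rate(y)
print(accuracy, nir)  # 1.0 0.6
```

A classifier is informative only to the extent that its cross-validated accuracy exceeds the no information rate, which is the comparison reported for habitat and for host species in the text.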

Random forest classification allows RSVs to be ranked by their importance in the classification, signaling bacterial taxa unique to a category. The top 10 RSVs for habitat classification ranked by the mean decrease in accuracy across all 57 models are shown in Table

2.5. In concordance with the DPCoA plot, these RSVs are all in the bacterial phyla

Actinobacteria or Proteobacteria. Plotting the abundance of the most important RSV (phylum

Actinobacteria, genus Tetrasphaera) shows its presence in most lowland samples and absence in most highland samples (Figure S8).


Table 2.5 Top ribosomal sequence variants (RSVs) for classifying gut microbiome samples by habitat of origin using the random forest algorithm

Phylum           Family                Genus              meanMDA*
Actinobacteria   Intrasporangiaceae    Tetrasphaera       0.009
Actinobacteria   Mycobacteriaceae      Mycobacterium      0.008
Actinobacteria   Nocardiaceae          Williamsia         0.008
Actinobacteria   Micromonosporaceae    Unclassified       0.008
Actinobacteria   Patulibacteraceae     Patulibacter       0.007
Proteobacteria   Methylobacteriaceae   Methylobacterium   0.006
Actinobacteria   Mycobacteriaceae      Mycobacterium      0.005
Actinobacteria   Mycobacteriaceae      Mycobacterium      0.005
Actinobacteria   Mycobacteriaceae      Mycobacterium      0.005
Actinobacteria   Nocardioidaceae       Nocardioides       0.005

* Mean decrease in accuracy averaged across all 62 random forest classifiers

Discussion

We present the first characterization of gut microbiome composition in Darwin’s finches, with comprehensive sampling of 63 individuals across nine extant finch species on Santa Cruz

Island. This iconic adaptive radiation presents a natural experiment to investigate the factors that influence the gut microbial community given their well understood evolutionary relationships, dietary habits, and biogeography. The gut microbiome composition across all samples was dominated by the bacterial phyla Firmicutes, Actinobacteria, and Proteobacteria. Visualization of the beta diversity of the gut microbiome using double principal coordinate analysis (Figure 2.4) showed that samples were more clearly separable by habitat than by species identity alone. Using

PACo and BDTT, both the finch phylogeny and foraging data correlated with beta diversity


distances between samples; however, only foraging data uniquely explained variation in the microbiome composition as determined by variation partitioning. Together, these findings support a stronger influence of diet and habitat than host phylogeny on the gut microbiome in this recent adaptive radiation.

Composition and diversity of Darwin finch gut microbiomes

The overall gut microbiome composition across Darwin’s finch species is dominated by the bacterial phyla Firmicutes (35%), Actinobacteria (31%), and Proteobacteria (28%). The broad taxonomic characterization is consistent with previous surveys of avian gut microbiome diversity (Hird et al. 2015; Kropáčková et al. 2017), though Darwin’s finches are relatively enriched for the bacterial phylum Actinobacteria and relatively depleted of Tenericutes and

Bacteroidetes in comparison with other bird orders. At lower taxonomic levels, the most abundant bacterial genera are all soil and plant associated taxa: Lactobacillus within Firmicutes,

Acinetobacter within Proteobacteria, and Kocuria within Actinobacteria. The high abundance of these genera in Darwin finch species may reflect the large percentage of plant material consumed by the finch species. Among the predominantly insectivorous finch species, the warbler finch is characterized by a bacterial genus not associated with soil and plants. The most abundant bacterial genus in warbler finches, Diplorickettsia, includes bacterial species characterized as obligate intracellular bacteria associated with arthropods (Mediannikov et al. 2010) and was previously noted as a common bacterial taxon in the gut microbiome of temperate bird species

(Kropáčková et al. 2017). The abundance of arthropod associated bacterial taxa may similarly be reflective of the increased proportion of insects in the diet of the warbler finch.

The three measures of alpha diversity calculated (observed RSVs, Chao1, and phylogenetic diversity) were not significantly different between species and showed almost no


phylogenetic signal, with large confidence intervals on Pagel’s λ for all three metrics, similar to previous findings in Old World passerine species (Kropáčková et al. 2017).

Habitat and diet effects

The distribution of Darwin’s finch species across both arid lowland and humid highland habitats provides an opportunity to disentangle the effects of phylogeny and locality on host microbiome composition. Though the two clades of closely related species, the ground finches and tree finches, have higher population densities in the lowland and highland habitat, respectively, the vegetarian finch and warbler finch allow for the comparison of phylogenetically distant host species sampled in the same habitat. Visualization of the diversity of the gut microbiome using double principal coordinate analysis (Figure 2.4) showed that samples were more clearly separable by habitat than by species alone. This was likely due in part to the high variation between samples within each species (Figure 2.1), similar to previous surveys of avian gut microbiomes (Kropáčková et al. 2017). While almost all species have been documented in both habitats (Dvorak et al. 2012), the small ground finch was the only species with high enough density to have multiple samples in each habitat. Small ground finch microbiomes display the same differentiation by habitat within a single species. The small ground finch populations on

Santa Cruz have previously been shown to be a single, panmictic population, reducing the likelihood that the observed differences between habitats are due to fine-scale genetic differences

(Galligan et al. 2012). Since most species were sampled in a single habitat, a nested

PERMANOVA was used to test whether habitat and species assignments were significantly different when comparing the weighted UniFrac distances. Both habitat and species were significantly different; however, all three pairwise post hoc species comparisons which were significant after correction were between species in different habitats. That is, none of the finch


species with significantly different microbiome composition are from the same habitat. Further corroborating the correlation of habitat with microbiome diversity, the random forest model had a high accuracy in using the abundance of RSVs across samples to classify the samples by habitat, but was not able to classify samples according to species.

In Darwin’s finches, habitat and diet are interconnected but not interchangeable. We used foraging observations to characterize the broad dietary patterns in each species and habitat.

While the foraging data obtained in this study show a strong correlation between insect consumption in the highlands and plant material in the lowlands, the finch species are opportunistic in their diets. Previous characterization of foraging behavior in the tree finches on

Santa Cruz Island showed no exclusively insectivorous species, though the proportion of the diet from insects as quantified by foraging observations increases from the small tree to large tree to woodpecker to warbler finch (Tebbich et al. 2004). Notably, this diet pattern held across habitats for three of the species, excepting the warbler finch, which was not found in the lowlands.

Similarly, the foraging patterns observed for the omnivorous small ground finch in both habitats are consistent with previous surveys, with the diet consisting of a high proportion of seeds across both habitats but an increase in flower consumption in the lowlands (Kleindorfer and Chapman

2006).

In our study, stable carbon and nitrogen isotope values provided context for the diet of each individual and represent the first time that differences in the diet of Darwin’s finches have been characterized using this method. The measurements were used as proxies for distances between individuals in their dietary intake. δ13C and δ15N ratios can differentiate between the relative contribution of C-3 or C-4/CAM food sources in the animal’s diet and trophic levels in the ecosystem, respectively (Herrera et al. 2003), and can also elucidate trophic partitioning


between sympatric species (Rakotondranary et al. 2011). Similar to the visualization of the microbiome distances with DPCoA, stable isotope values showed a demarcation by habitat, with highland samples generally more depleted in 13C and less enriched in 15N (Figure 2.3). The cactus finch had the heaviest δ13C values at -16.2‰, which are consistent with its feeding on

Opuntia cacti species that fix carbon via the Crassulacean acid metabolism (CAM) pathway

(Winter and Smith 2012). There are a few small ground finches with similar δ13C signatures collected in the highlands where Opuntia cacti are not present; however, these samples have a lower δ15N signature than the cactus finches (mean = 6.9‰). Higher δ15N values in the lowlands are unlikely to reflect trophic partitioning given the high proportion of plant material ingested as determined by foraging data and the difficulties in establishing trophic relationships in herbivores (Stapp et al. 1999). Without comprehensive sampling of potential food sources in both habitats, it is difficult to establish the exact diets of the finch species with stable isotope values. These values were instead used as alternative explanatory variables for testing co-diversification between Darwin finch species and their microbiomes.

Co-phylogeny between Darwin finch species and their microbiomes

Co-diversification between host species and their gut microbiome communities has been investigated across the animal kingdom and at multiple scales (Sanders et al. 2014; Moeller et al.

2016; Brooks et al. 2016). Using Procrustes analysis of co-phylogeny (PACo) (Balbuena et al.

2013), we found a significant correlation between the host phylogeny and weighted UniFrac microbiome distances with the procrustean correlation coefficient R2 = 0.18, about half the correlation coefficient previously described across 51 passerine species, where R2 = 0.35

(Kropáčková et al. 2017). This analysis indicates some congruence between the phylogeny of the finch species and their microbiomes despite high variation between samples from individuals of


the same species. The recent divergence of Darwin’s finch species ca. 1 million years ago

(Lamichhaney et al. 2015) may explain the weak phylogenetic congruence. While there have been documented examples of strong host selection in the gut microbiome of other species, such as wasps (Brucker and Bordenstein 2013), the moderate correlation observed in Darwin’s finches indicates that other factors may explain more of the variation observed. For comparison with the correlation observed with the host phylogeny, PACo was also applied to foraging data and stable isotope values, revealing a similar level of correlation between the gut microbiome and foraging data (R2 = 0.18) while stable isotopes had a much smaller coefficient (R2 = 0.06).

To better understand the phylogenetic scale of the bacterial taxa driving the congruence observed with PACo with both foraging data and host phylogeny, the beta diversity through time analysis (BDTT) was applied (Groussin et al. 2017). BDTT provides additional information about the phylogenetic scale of the observed correlation between the microbiome and explanatory variables by slicing the bacterial phylogeny at regular intervals and collapsing bacterial RSVs into larger taxonomic units which are then correlated with the metadata.

Correlation profiles that maintain significant coefficient values at older evolutionary time points demonstrate that ancient bacterial lineages grouped at larger taxonomic levels drive the correlation observed. In contrast, profiles where the correlation drops off more immediately show that recent bacterial evolution is the cause of the congruence between the microbiome distances and the explanatory variable.

In Darwin’s finches, host phylogeny and foraging data have similar correlation profiles with correlation coefficients between 0.20 and 0.25 as far back as 750 mya while stable isotope

R2 values never rise above 0.10 (Figure 2.5). This suggests the correlation is driven by differences between microbial lineages that diverged much further back in time than the


emergence of the Darwin finch adaptive radiation. That is, the congruence between the host phylogeny and microbiome distances is not diminished when recent bacterial evolution is not included. The signal of co-diversification relies on recent bacterial evolution to drive the congruence observed (Sanders et al. 2014). The pattern found here implies that an environmental filter, such as diet, that correlates with the host phylogeny is responsible for the observed co-phylogeny, not co-diversification between the gut microbiome and the host species.

This similarity in correlation profile between the host phylogeny and foraging data contrasts with the pattern seen in the initial description of BDTT applied to a dataset of 33 mammalian species (Groussin et al. 2017). There, the correlation with host phylogeny peaked around 100 mya, near the estimated time of divergence for the mammalian host species, and then dropped off significantly, while the correlation with diet peaked around 500 mya but was not significant at present. Though it is possible the divergence between Darwin finch species is simply too recent for BDTT to disentangle, the similarity in correlation profile between host phylogeny and foraging data is consistent with dietary preferences driving the differences observed between microbiome samples. Variation partitioning further demonstrates the unique contribution of foraging data to explain the differences between microbiome samples. Foraging data was the only variable to uniquely explain any of the variation in the microbiome when analyzed with variation partitioning (Figure 2.6). The variation explained by both host phylogeny and stable isotope values was shared and could not be attributed solely to either of these variables. The opportunistic nature of the dietary patterns observed in Darwin finch species suggests that the microbiome community is likely plastic in comparison to the tight association of herbivore or carnivore specific bacterial communities seen in mammals.


Congruent with the wide dietary patterns observed, Darwin’s finch species do not harbor species-specific bacterial taxa. The random forest algorithm uses the differential abundance of bacterial taxa across samples to classify the samples and was able to effectively identify mammalian host species using strain level resolution of the microbiome (Eren et al. 2015). Here, the random forest classifier was unable to differentiate samples by the RSV distribution across host species, corroborating the lack of clear species clusters in the DPCoA visualization (Figure 2.2). Conversely, the highland and lowland habitats have bacterial taxa whose abundances are clearly demarcated, primarily within the bacterial phyla Proteobacteria and Actinobacteria, respectively (Figure S8). The lack of species-specific bacterial taxa in Darwin’s finches supports a model of microbiome assembly more dependent on locality than host species identity, which is consistent with previous findings in the brood-parasitic cowbird where geography explained the most variation in the gut microbiome (Hird et al. 2014).

Dietary shifts across seasons are known to impact the gut microbiome composition, such as in humans (Smits et al. 2017), and foraging patterns in Darwin’s finches change significantly between the wet and dry seasons (Grant 1999; Tebbich et al. 2004). Food scarcity in the dry seasons drove the most impressive examples of natural selection, as documented on the island of Daphne Major (Grant 2002). It is possible that the samples presented here, collected in the wet season, demonstrate the breadth of microbial diversity in the Darwin finch species but do not reflect the restricted dietary niches that facilitated the speciation events. Cataloging the temporal variation in the gut microbiome of Darwin’s finches could provide further insight into the pattern of co-phylogeny observed here.

Interestingly, a study on the gut microbiome of the invasive parasitic fly Philornis downsi noted host species specific patterns in the associated bacterial taxa (Ben Yosef et al. 2017). P. downsi larvae feed on the blood and tissue of Darwin’s finch nestlings and induce high mortality rates across all species (see review in Kleindorfer and Dudaniec 2016). The gut microbiome of the larvae sampled from warbler finch nests clustered tightly while larvae from the nests of small ground finch and small and medium tree finch overlapped in ordination space. Since all larvae were isolated from nests in the highland habitat, the differences seen in the microbiome are most likely attributable to dietary differences in the host finch species. More notably, larvae from small ground finch nests sampled on Floreana Island were not significantly different from those on Santa Cruz, suggesting that the ecological zones across islands are similar. The separation of the gut microbiome in P. downsi larvae collected from warbler finch nests contrasts with the lack of a species specific cluster in the gut microbiome of the Darwin finch species sampled here.

Though nestlings were not sampled in this study, a previous study in black-legged kittiwakes demonstrated that nestlings harbor distinct and more variable gut microbiomes than adults (van Dongen et al. 2013). P. downsi larvae feeding on the nestlings therefore may encounter different bacterial communities from those represented in Darwin finch adults.

Conclusion

The combination of a well-studied host phylogeny, individual stable isotope values, and foraging data specific to the field season enabled us to effectively characterize the variables that play a role in the composition of the gut microbiome in Darwin’s finches. While host phylogeny and foraging data both correlate with beta diversity in the microbiome samples, stable isotope values did not. Notably, habitat was a strong predictor of the microbiome composition, even within a single species, the small ground finch. This study demonstrates the need for comprehensive metadata to correctly interpret the many dimensions of microbiome data. Without foraging data, the co-phylogenetic correlation detected between the host phylogeny and the microbiome could have been interpreted as signal for co-diversification when it was instead driven by diet. Thus, multiple methods are needed to better comprehend the complex evolutionary relationship between animals and their microbiomes.

Acknowledgements

We are grateful to the Charles Darwin Foundation and Galápagos National Park for allowing us to work in the Galápagos Islands, with special thanks to Solanda Rea, Marta Romoleroux, and the rest of the staff at the Charles Darwin Research Station for facilitating logistics. We thank Arno Cimadom and Sabine Tebbich for help with fieldwork. Noreen Tuross provided invaluable feedback on stable isotope sample preparation and analysis. Sangeet Lamichhaney generously provided the genetic distance matrix calculated from whole genome resequencing of Darwin’s finches. This work was supported by a National Science Foundation Graduate Research Fellowship, a Graduate Research Opportunities Worldwide grant in collaboration with Flinders University, fieldwork funding from Macquarie University, and the Dean’s Competitive Fund at Harvard University.


Chapter 3

Title: Hybrid Darwin’s finches share the gut microbiomes of their paternal species

Authors: Wesley T. Loo1, Rachael Y. Dudaniec2, Sonia Kleindorfer3, Colleen M. Cavanaugh1

1Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA; 2Department of Biological Sciences, Macquarie University, Sydney, NSW, Australia; 3School of Biological Sciences, Flinders University, Adelaide, SA, Australia

Abstract

Animal gut microbiomes play an important role in the health and development of the host organism. Hybridization events are a rare opportunity to investigate the influence of the genetic background of parental species on phenotypes of interest, such as the microbiome. Previous work has demonstrated reduced fitness in hybrid individuals modulated by the microbiome.

Darwin’s tree finch species on Floreana Island in the Galápagos Archipelago are currently undergoing a hybridization event driven by asymmetrical introgression from females of the endemic, critically endangered medium tree finch (Camarhynchus pauper) mating with males of the common small tree finch (Camarhynchus parvulus), producing a hybrid genetic cluster.

Recent characterization of Darwin’s finch species on Santa Cruz indicated the effects of habitat, diet, and host phylogeny on the gut microbiome communities. However, these analyses did not include higher resolution genotyping of individuals within a species. Here, we examined the relationship between the genetic assignment of individuals and their microbiomes. Using amplicon sequencing of the 16S rRNA gene and whole genome shotgun metagenome sequencing, we characterized the gut microbiome in both parental tree finch species and the hybrid tree finch genetic cluster. Though gut microbiome distances did not correlate with host genetic distances, tests of differential abundance displayed enrichment of the bacterial genus Lactobacillus in the medium tree finch (maternal species) relative to the hybrid and small tree finch (paternal species) clusters. This pattern coincided with observed increases in the consumption of Scalesia seeds in the maternal species. Metagenomic characterization of the gut microbiome also displayed a similar pattern with greater similarity between the hybrid tree finch and the small tree finch.

Together, these findings are congruent with a diet-driven model of gut microbiome composition in Darwin’s finch species. They also raise questions about how hybrid offspring learn to forage and from whom, given that hybrid foraging behavior and gut microbiome were more similar to that of the paternal species.

Introduction

The microbial communities associated with host organisms, or microbiomes, are an essential aspect of host health and development (McFall-Ngai et al. 2013). Their importance can be expressed in the hologenome concept, where the host organism is considered along with all the associated microbiome communities (Theis et al. 2016). Amongst the various sites of microbiomes, the intestinal tracts of vertebrates represent some of the densest collections of bacteria and harbor unique consortia of bacteria in comparison to free-living communities (Ley et al. 2008b). Understanding the effect of different factors on the composition of these communities has implications both for applied life sciences, such as medical interventions to treat infectious diseases (Jorup-Rönström et al. 2012), and for foundational biological concepts, such as the process of evolution and speciation (Brucker and Bordenstein 2013). Specifically, tracing co-evolution between host species and microbiomes can give insight into the ways in which these bacterial communities shape the evolutionary trajectories of the hosts. However, finding evidence for these complex interactions requires the characterization of wild animal microbiomes, which better represent the broader environmental context that the host species encountered as it evolved (Amato 2013; Hird 2017).

Hybridization events offer natural experiments to investigate species-specific effects on the microbiome. At its most extreme, the microbiome may reinforce biological species boundaries, as shown in two species of Nasonia wasps, where the microbiome conferred genetic incompatibility between sister species (Brucker and Bordenstein 2013). In a less extreme example, Wang et al. (2015) demonstrated that hybrid individuals had reduced fitness in a hybrid zone between two subspecies of house mice. Not only did the microbial community structure shift between the two parental species, but immunological assays showed increased intestinal pathology which correlated with these changes. The microbiome can thus play an important role in mediating the fitness of hybrid individuals and maintaining species boundaries. Despite the wide biodiversity in avian species and recent microbiome surveys (Hird et al. 2015; Kropáčková et al. 2017), the effects of hybridization on the microbiome have not been interrogated in any avian species to date.

Darwin’s finch species in the Galápagos archipelago are a quintessential example of adaptive radiation, spanning fourteen species across the many islands. Intensive study over decades has made these species an excellent case study for the effects of natural selection on divergent populations of finches and the process of speciation (Lack 1947; Grant 1999; Grant and Grant 2011). As an example, food availability strongly influences the evolutionary trajectory of these species, driving divergence between populations and species and changing the fitness landscape, sometimes as the consequence of climatic fluctuations (Grant and Grant 1996).

Multiple hybridization events over the years have demonstrated the short timescales in which evolution can take place in Darwin's finches (Grant et al. 2004). Most recently, a single immigrant Geospiza conirostris male to Daphne Major was chosen by a resident G. fortis female, and together they spawned a morphologically and genetically distinct ‘Big Bird’ population that is thriving despite heavy inbreeding (Lamichhaney et al. 2018). Therefore, Darwin’s finch species are clear candidates to elucidate the relationship between gut microbiomes and their hosts.


A previous study conducted on Santa Cruz Island demonstrated an influence of host diet, habitat and phylogeny on the microbiome (Loo et al. 2018a). Habitat was a strong predictor of the microbial community, even within the small ground finch (G. fuliginosa), which was distributed across both the arid lowland and humid highland habitats. Analysis of foraging patterns across all nine extant finch species showed a correlation between the host phylogeny and foraging patterns, both of which explained some of the variation amongst microbiomes. Overall, the host species identity did not seem to drive divergence between microbiomes. However, the Santa Cruz study relied on morphological identification of finch species identity and did not incorporate genetic information on the finches sampled. Here, we focus on a hybridizing population of Camarhynchus tree finches on Floreana Island (Kleindorfer et al. 2014a; Peters et al. 2017) to further interrogate the effect of host species' genetic similarity on microbiome composition.

The tree finch species on Floreana Island are currently undergoing an introgression from the endemic, critically endangered medium tree finch (C. pauper) into the common small tree finch (C. parvulus), creating a population of hybrid tree finches with all three genetic populations living in sympatry (Kleindorfer et al. 2014a). The documentation of the hybridization between the medium and small tree finch was accompanied by the genetic confirmation of the local extinction of the large tree finch (C. psittacula) on Floreana, estimated to have taken place in the early 2000s (Kleindorfer et al. 2014a). Recent analysis of a decade of morphological and genetic data on the tree finch individuals showed that female mate choice drives this asymmetrical introgression from the medium tree finch into the small tree finch population, with female medium tree finches not pairing assortatively with their conspecifics (Peters et al. 2017). In 2005, hybrid tree finches could be identified as intermediate in body size compared with the small and medium tree finch (Kleindorfer et al. 2014a). But by 2014, hybrid tree finches were morphologically indistinguishable from the small tree finches, leading to their designation as a hybrid swarm. Concurrent with the emergence of the hybrid swarm, previous work demonstrated a lack of hybrid specific song types (Peters and Kleindorfer 2017).

Characterization of the foraging patterns amongst these populations also showed differences in both food items consumed and foraging strategies. Based on foraging observations in 2005 and 2006, the three morphological clusters identified as small, medium, and large tree finch showed all finches to be primarily insectivorous but with increased proportions of flower and fruit consumption during the wetter year (Christensen and Kleindorfer 2009). Specifically, the medium tree finch (confusingly referred to as the large tree finch as the original study was done prior to genetic confirmation) was more specialized during the driest year (2005) and had broader foraging breadth during the moderately wet year (2006) consuming both invertebrates and plant material. Across high rainfall years in 2010-2013, using color-banded and genetically assigned birds, medium tree finch consumed insects and foraged more on foliage compared to small tree finch and hybrid birds (Peters and Kleindorfer 2015). The well-characterized genetic populations provide fundamental ecological context to understand differences in the microbiome between species and within the hybrid population.

In this study, we characterized the microbiome in Darwin’s tree finch species on Floreana Island. Using the phenomenon of hybridization between the small and medium tree finches, we investigated whether the hybrid individuals show novel or unique microbiomes compared to the parental species. Secondarily, the detailed genotypes of each finch allow for testing correlation between host genetic distances and microbiome distances. The higher resolution genetic data for all host finches were compared to the stable isotope and foraging observations to evaluate which variable best explains the microbiome diversity. Finally, we used metagenomic sequencing for higher resolution identification of bacterial species in each finch, along with functional characterization of the microbiome in Darwin’s finch species.

Materials and Methods

Ethics statement

All samples were collected with permission from the Parque Nacional Galápagos and Ministerio del Ambiente, Ecuador (Research permit No. PC-23-16). All collection protocols were approved by the Institutional Animal Care and Use Committee in the Faculty of Arts and Sciences at Harvard University (Protocol 15-08-249). Samples for the medium tree finch (C. pauper) were imported under U.S. Fish and Wildlife Service permit number MA05827C-0.

Study sites and species

Fieldwork was conducted in February 2016 on Floreana Island, Galápagos Archipelago, Ecuador. The sampling site was located in the highland habitat at the base of the Cerro Pajas volcano (1°17’S, 90°27’W). Due to morphological similarities amongst tree finch individuals, all finches were assigned to genetic populations after microsatellite genotyping (see below). After genetic population assignment, there were 4 samples from the medium tree finch (C. pauper), 14 samples from the small tree finch (C. parvulus), and 11 samples from the admixed population.

Sample Collection

Finches were caught using mist nets and tagged with an aluminum ring imprinted with a unique identifier prior to all sample collections. Eight morphological traits and mass were measured as previously described (Kleindorfer et al. 2014a) and are detailed in Supplementary Methods (see Morphological measurements).

Blood samples were collected for genetic and stable isotope analyses. Samples for genetic analysis were preserved on Whatman FTA Paper (GE Healthcare Life Sciences, Pittsburgh, PA) and stored at room temperature. Samples for stable isotope analysis were dried on small pieces (roughly 0.5 x 0.5 cm) of quartz fiber filter paper (Schleicher and Schuell, Dassel, DE) and stored in microcentrifuge tubes with a silica gel bead as desiccant at room temperature.

After morphological measurements and blood sample collection, fecal samples were collected by placing each finch into a 7" x 7" x 7" cage lined with UV-sterilized parchment paper. Cages were covered with fabric and finches were monitored until defecation for a maximum of 30 min before release. Feces were immediately transferred from parchment paper with bleach-cleaned spatulas into pre-weighed microcentrifuge tubes containing 1 ml of DNA/RNA Shield (Zymo Research, Irvine, CA) and mixed by shaking the tubes by hand before storage at -20°C within 4 h of collection to prolong the longevity of the DNA stabilization buffer. Fecal samples were shipped at room temperature and stored at -80°C in the lab until further analysis.

To quantify the diet patterns across species, first foraging observations were collected at the sampling site. A single walkthrough of one hour was conducted with no overlaps or doubling back to avoid observing the same individuals. During the walkthrough, individual finches were observed until the first food item was ingested. The food item consumed was recorded as one of five categories: insect, seed, flower, leaf, or fruit. Due to the tame nature of Darwin’s finches, the majority of observations were made within 8 m of the focal individual.


Microsatellite genotyping

Blood DNA extraction

To characterize the genetic populations of tree finches on Floreana, DNA was extracted from blood samples using a protocol modified from Smith and Burgoyne (2004). Specifically, two 3 mm discs of blood-soaked FTA paper were cut using sterile biopsy punches (Integra Miltex, York, PA). Discs were processed with one wash of 200 µL FTA lysis buffer (100 mM Tris pH 8.0, 0.1% SDS) for 30 min, two washes of 200 µL DNAzol (ThermoFisher Scientific, Waltham, MA) for 10 min, two washes of 200 µL molecular grade water for 5 min, and one wash of 200 µL 95% ethanol for 10 min, discarding the solutions after each step. All steps were performed at room temperature on an orbital shaker at 150 rpm (Lab-Line Instruments, Melrose Park, IL). DNA was eluted from the discs by incubating in 50 µL molecular grade water at 90°C for 10 min, followed by storage at 4°C overnight prior to use in PCR amplification reactions.

Nine autosomal microsatellite loci were amplified and analyzed for fragment sizes (Gf01, Gf03-Gf07, Gf11-Gf13). Thirty-two tree finches were genotyped, including 29 tree finches with microbiome samples and 3 previously genotyped individuals. PCR primer sequences were redesigned by Galligan et al. (2012) to enable multiplex genotyping based on the sequences first isolated and designed by Petren (1998) (Table S10). The forward primer in each pair was labeled with one of four 5’ fluorescent tags: 6-FAM, NED, PET or VIC (Applied Biosystems, Beverly, MA). PCR amplification reactions were performed separately for each locus in 15 µL volumes containing 1x PCR Gold Buffer (Applied Biosystems, Beverly, MA), 4 mM MgCl2, 1 mM dNTP, 0.025 U AmpliTaq Gold DNA Polymerase (Applied Biosystems, Beverly, MA), 0.3 µM of each primer and 10-20 ng of DNA template. PCR conditions were as follows: initial denaturation for 9 min at 94°C followed by 40 cycles of 45 sec at 94°C, 45 sec at 54°C, 1 min at 72°C, and a final extension for 30 min at 72°C. PCR products were pooled and fragment sizes analyzed using multiplex capillary electrophoresis (ABI 3730xl DNA Analyzer) at the Massachusetts General Hospital Center for Computational and Integrative Biology (Cambridge, MA). The R package Fragman was used to size the PCR products for each locus (Covarrubias-Pazaran et al. 2016).

To improve the power of genetic cluster assignment (see below), microsatellite loci genotyped in this study were combined with data from 357 Floreana tree finches that were previously genotyped at the same loci (Peters et al. 2017). Five individuals characterized in this study also had microsatellite data from the previous study. Since the absolute size of microsatellite PCR products can vary between studies, systematic shifts in allele sizes at each locus were corrected by comparing the allele calls in these five individuals. With this calibration, allele calls for the 32 individuals from this study were adjusted to match the previous study. A total of 384 genotypes at all nine microsatellite loci were used for genetic population assignment.
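The calibration step can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the use of a per-locus median offset, and all allele sizes below are assumptions made for illustration only.

```python
from statistics import median

def locus_offsets(reference_calls, new_calls):
    """Estimate one size shift per locus from individuals typed in both studies.

    Both arguments map individual -> locus -> (allele_a, allele_b) in base pairs.
    ASSUMPTION: the shift is taken as the rounded median of (reference - new)
    over all alleles of the shared individuals at that locus.
    """
    diffs = {}
    for individual, loci in new_calls.items():
        if individual not in reference_calls:
            continue  # only individuals present in both datasets inform the shift
        for locus, alleles in loci.items():
            ref_alleles = reference_calls[individual].get(locus)
            if ref_alleles is None:
                continue
            # Sort so the shorter allele is compared with the shorter allele.
            for ref, new in zip(sorted(ref_alleles), sorted(alleles)):
                diffs.setdefault(locus, []).append(ref - new)
    return {locus: round(median(values)) for locus, values in diffs.items()}

def apply_offsets(calls, offsets):
    """Shift every allele call by its locus offset (unknown loci are unchanged)."""
    return {
        individual: {
            locus: tuple(a + offsets.get(locus, 0) for a in alleles)
            for locus, alleles in loci.items()
        }
        for individual, loci in calls.items()
    }

# Hypothetical calls for two shared birds at two loci (not data from this study).
previous = {"bird_A": {"Gf01": (150, 154), "Gf03": (201, 201)},
            "bird_B": {"Gf01": (152, 156), "Gf03": (203, 207)}}
current = {"bird_A": {"Gf01": (151, 155), "Gf03": (199, 199)},
           "bird_B": {"Gf01": (153, 157), "Gf03": (201, 205)},
           "bird_C": {"Gf01": (155, 159), "Gf03": (199, 203)}}

offsets = locus_offsets(previous, current)
calibrated = apply_offsets(current, offsets)
print(offsets)
print(calibrated["bird_C"])
```

A median is used here so that a single miscalled allele in one shared individual does not move the estimated shift; any other robust summary would serve the same purpose.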

Genetic assignment of Floreana tree finches

To assign tree finch individuals into genetic populations, the program STRUCTURE (Pritchard et al. 2000; Hubisz et al. 2009) was used with the LOCPRIOR setting to incorporate putative population information based on morphology as previously described (Peters et al. 2017). A threshold of 8.2 mm was chosen to classify individuals to phenotype: those with beak length-naris ≥ 8.2 mm and < 8.2 mm were classified as C. pauper and C. parvulus, respectively. This threshold matches that used in Peters et al. (2017), as the 32 individuals genotyped for this study also displayed a bimodal distribution in beak length-naris (Figure S9).

To determine the number of genetic clusters which best fit the data, STRUCTURE with LOCPRIOR was run for K=1-4 with 10 replicates at each K value. Each replicate was run with a burn-in of 100,000 iterations and post burn-in chain length of 500,000. Optimal K was determined using the delta K method (Evanno et al. 2005) as implemented in STRUCTURE HARVESTER (Earl and vonHoldt 2011) and K=2 was determined to provide the best fit to the microsatellite genotypes (Table S11; Figure S10).
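For orientation, the delta K statistic can be sketched as below. This is an illustrative reimplementation computed from the mean ln P(D) at each K, not the STRUCTURE HARVESTER code, and the ln P(D) values are invented. Because delta K needs both neighbours K-1 and K+1, a run over K=1-4 yields a value only for K=2 and K=3.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K -> list of ln P(D) values, one per replicate run.
    delta K = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    Note: replicates with identical ln P(D) give sd = 0 and raise ZeroDivisionError.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_difference / stdev(lnp_by_k[k])
    return result

# Hypothetical ln P(D) values, three replicates per K (not output from this study).
runs = {
    1: [-9000.0, -9002.0, -8998.0],
    2: [-8500.0, -8501.0, -8499.0],
    3: [-8450.0, -8470.0, -8430.0],
    4: [-8440.0, -8500.0, -8380.0],
}

dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

In this toy input the likelihood rises sharply from K=1 to K=2 and then plateaus with growing variance between replicates, which is the pattern the statistic is designed to detect.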

Assessing accuracy of tree finch membership coefficient thresholds

To calculate the optimal threshold for classifying tree finch individuals based on the membership coefficient (qi), simulations were run with three thresholds: 0.75, 0.80 and 0.85. First, STRUCTURE was run using K=2 for 20 replicates, each with a burn-in of 100,000 iterations and post burn-in chain length of 100,000. Label mismatch was corrected and mean qi between runs was calculated using CLUMPP (Jakobsson and Rosenberg 2007). For each threshold value, ten simulated datasets were generated. Individuals above the chosen threshold were used as parental populations for simulations. To avoid pseudoreplication, nine times as many genotypes as represented in the parental populations were generated using HybridLab 1.0 (Nielsen et al. 2006). Simulated parental individuals were then combined with the original parental individuals and then split into ten equal datasets. For each of these datasets, hybrid tree finch individuals were simulated using HybridLab 1.0 and combined to create a simulation dataset with the same number of individuals as in the original data. Each simulated dataset was run with the same STRUCTURE parameters as the original dataset, described above.

To assess the performance of classification at each threshold, efficiency, accuracy, and overall performance were calculated as defined by Vähä and Primmer (2006). Efficiency refers to the proportion of finches that were correctly identified to their group while accuracy refers to the proportion of an identified group that truly belongs there. Shifts in threshold value usually improve one to the detriment of the other. An additional metric, overall performance, is the product of these two measures. The inclusive threshold of 0.75 had the highest overall performance, with accuracy and efficiency above 75% across all three genetic clusters. In comparison, the thresholds at 0.80 and 0.85 had marginal increases in accuracy but much lower efficiency rates (Table S12). Therefore, an inclusive threshold of 0.75 was used to classify the 29 tree finches with microbiome samples into one of three genetic clusters: 1) C. parvulus (qi ≥ 0.75), 2) C. pauper (qi ≤ 0.25), and 3) admixed population (0.25 < qi < 0.75). Membership coefficients for the combined dataset and for the 29 individuals with microbiome samples are shown in Figure S11.
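The classification rule and the three performance measures can be written out as a short sketch. The function names and the membership coefficients below are hypothetical and serve only to show how the definitions fit together; they are not results from the simulations.

```python
def classify(q_parvulus, threshold=0.75):
    """Assign a genetic cluster from the membership coefficient to C. parvulus."""
    lower = round(1 - threshold, 10)  # rounding avoids float noise such as 1 - 0.8
    if q_parvulus >= threshold:
        return "C. parvulus"
    if q_parvulus <= lower:
        return "C. pauper"
    return "admixed"

def performance(true_labels, assigned_labels, group):
    """Return (efficiency, accuracy, overall performance) for one group.

    efficiency = correctly assigned members / all true members of the group
    accuracy   = correctly assigned members / all individuals assigned to the group
    overall    = efficiency * accuracy
    """
    pairs = list(zip(true_labels, assigned_labels))
    correct = sum(1 for t, a in pairs if t == group and a == group)
    n_true = sum(1 for t, _ in pairs if t == group)
    n_assigned = sum(1 for _, a in pairs if a == group)
    efficiency = correct / n_true if n_true else 0.0
    accuracy = correct / n_assigned if n_assigned else 0.0
    return efficiency, accuracy, efficiency * accuracy

# Hypothetical simulated individuals with known origin (not data from this study).
true = ["C. parvulus"] * 4 + ["admixed"] * 4 + ["C. pauper"] * 2
q = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.78, 0.10, 0.30]
assigned = [classify(value) for value in q]

for group in ("C. parvulus", "admixed", "C. pauper"):
    print(group, performance(true, assigned, group))
```

Raising the threshold in this toy example would move the admixed bird with q = 0.78 out of the C. parvulus group (better accuracy) while also excluding the true C. parvulus bird with q = 0.80 or lower (worse efficiency), which is the trade-off described above.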

To determine the direction of introgression, private allele frequency and allele richness were calculated with the R package PopGenReport (Adamack and Gruber 2014). These calculations were performed on the combined dataset (n=384) after splitting the individuals with a qi threshold of 0.5. Heterozygosity of these groups was calculated with the R package adegenet (Jombart 2008).

Morphological analysis of genetic clusters

To examine morphological differences between the identified genetic clusters, the following morphological measurements were used for further analysis in the 29 individuals with microbiome samples: beak-head (beak tip to back of head), beak-naris (beak tip to anterior end of the naris), beak-feather (tip of beak to feather line), beak depth (at the base of the beak), beak width (at the base of the beak), tarsus length and wing length. Morphological variables were tested for normality using the Shapiro-Wilk method (Shapiro and Wilk 1965). Two variables followed a normal distribution and were tested with ANOVA and Tukey post hoc pairwise comparisons: beak depth and tarsus length. The remaining variables were tested with the Kruskal-Wallis and post hoc Dunn’s tests as implemented in the R package dunn.test using the Bonferroni correction for multiple hypothesis comparison (Dinno 2017).

To facilitate calculation of the relationship between genetic clusters and morphological variables, principal component analysis was used to reduce the dimensions of the data into two summary variables: PCA_beak (beak-head, beak-naris, beak-feather, beak depth, and beak width) and PCA_body (tarsus length and wing length). PCA_beak and PCA_body explained 85% and 91% of the variation across the component variables, respectively. Correlation between these summary variables and the membership coefficient to the C. parvulus genetic cluster was calculated using the R function cor.test (R Core Team 2017).

Molecular analyses of microbiome samples

DNA was extracted from feces in the laboratory using the ZR Fecal Miniprep kit (Zymo Research, Irvine, CA) following manufacturer's instructions with the following changes. To minimize loss of biological material, BashingBeads were added directly to the collection tubes with the fecal sample in DNA/RNA Shield, which acted as the lysis buffer. Samples were homogenized in a FastPrep FP120 (Qbiogene, Carlsbad, CA) for six rounds of 45 s at speed 6.5 m/s. Between each round, tubes were cooled on ice for 3 min. All liquid transfer steps were performed in a laminar flow hood to minimize environmental contamination.

16S rRNA gene sequencing

The V4 region of the 16S rRNA gene was amplified using NEBNext Q5 HotStart HiFi MasterMix 2x (New England Biolabs, Ipswich, MA) and previously designed dual-index barcoded universal primers (Kozich et al. 2013). For each fecal DNA sample, triplicate 25 µl PCR reactions were performed containing 12.5 µl master mix, 9.5 µl molecular grade water, 0.5 µl of 10 µM stock for each primer, and 2 µl of DNA template. PCR conditions consisted of initial denaturation at 94°C for 5 min followed by 20 cycles of 98°C for 20 s, 55°C for 15 s, 72°C for 40 s, and a final extension at 72°C for 5 min.
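As a quick consistency check on the reaction setup, the component volumes and the resulting final concentrations can be tallied as follows. The sketch assumes that the primer stock concentration is 10 µM (micromolar); this assumption and the variable names are not taken from the protocol itself, and the programmed time ignores temperature ramping.

```python
# Per-reaction volumes in microlitres, as listed in the text.
components = {
    "master mix (2x)": 12.5,
    "molecular grade water": 9.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "DNA template": 2.0,
}
total_volume_ul = sum(components.values())

# ASSUMPTION: primer stocks are 10 micromolar.
PRIMER_STOCK_UM = 10.0
final_primer_um = PRIMER_STOCK_UM * components["forward primer"] / total_volume_ul

# A 2x master mix should end up at 1x in the final reaction.
master_mix_fold = 2.0 * components["master mix (2x)"] / total_volume_ul

# Programmed time: 5 min denaturation, 20 cycles of 20 + 15 + 40 s, 5 min extension.
cycle_seconds = 20 + 15 + 40
programmed_minutes = (5 * 60 + 20 * cycle_seconds + 5 * 60) / 60

print(total_volume_ul, final_primer_um, master_mix_fold, programmed_minutes)
```

Under these assumptions the listed volumes sum to the stated 25 µl, each primer is present at 0.2 µM, and the master mix is diluted to 1x.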

All PCR products were purified using 0.66X Aline PCRClean DX (Aline Biosciences, Woburn, MA) to size select for the ~450 bp PCR product. Purified PCR products were visualized and quantified using High Sensitivity D1000 ScreenTape on an Agilent 2200 TapeStation (Agilent, Santa Clara, CA) and pooled in equimolar concentrations for sequencing on a single MiSeq run (Illumina, USA) using v2 chemistry and 2 x 250-bp paired-end reads at the Harvard Biopolymers Facility (Boston, MA).

Contamination controls

Given the low DNA content of bird feces (Vo and Jedlicka 2014), we were concerned about the influence of environmental microbial contamination in analyzing the sequences (Salter et al. 2014). To understand the sources of contamination, controls were included at the DNA extraction and amplification steps of the sequencing preparation. To evaluate contaminants from the DNA extraction kits, for each kit we included a mock community extraction with 75 µl of bacterial cells from ZymoBIOMICS Microbial Community Standard (Zymo Research, Irvine, CA) and a no sample extraction control with only DNA/RNA Shield. To assess contaminants from PCR amplification reagents, each 96-well plate of PCR amplification included a triplicate mock community amplification with 2 µl of a 1:10 dilution of ZymoBIOMICS Microbial Community DNA Standard (Zymo Research, Irvine, CA) and a triplicate no template control reaction. Greater than 99.75% of all reads from ZymoBIOMICS Microbial Community standards and DNA standards mapped to the expected genera. None of the extraction or no template controls produced quantifiable PCR product, and these controls were excluded from further analysis.


Sequence processing

Sequences were demultiplexed according to the dual-index barcode by the Harvard Biopolymers Facility (Boston, MA) and all the following sequence processing steps were performed in R version 3.4.0 (R Core Team 2014). The fastq files for each sample were converted into Ribosomal Sequence Variants (RSVs) using DADA2 with parameters as described in Callahan et al. (2016). RSVs were taxonomically classified with the RDP v14 training set (Cole et al. 2009) and chimeras were removed as implemented in DADA2. After initial processing, a total of 976,026 reads and 4,513 RSVs were identified across all 29 finch fecal samples.

Sequence filtering

The following steps were taken to produce the final dataset for analysis. As it is impossible to prevent all environmental contamination in PCR amplification, the frequency-based decontam algorithm (Davis et al. 2017) was applied to the dataset to identify reads from likely contaminants based on the concentration of the PCR products. This removed 13,837 reads (1.42%) and 75 RSVs (1.66%). To reduce the influence of RSVs present in only a few samples, a 5% prevalence filter was applied, which removed 53,977 reads (5.61%) and 1,422 RSVs (32.04%). After taxonomic assignment, any sequences not classified as Bacteria were removed, subtracting 824 reads (0.09%) and 20 RSVs (0.66%). Finally, sequences classified as Chloroplasts were removed, subtracting 12,691 reads (1.40%) and 29 RSVs (0.97%). The final dataset included 894,697 reads and 2,967 RSVs.
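The filtering totals above can be checked with a short bookkeeping sketch. It uses only the read and RSV counts reported in the text. The function name and data layout are illustrative assumptions, and the percentages are computed relative to the counts remaining before each step, which is the convention that reproduces the reported values.

```python
# Counts reported in the text; each step lists the reads and RSVs it removed.
START_READS, START_RSVS = 976_026, 4_513
STEPS = [
    ("decontam (frequency-based)", 13_837, 75),
    ("5% prevalence filter", 53_977, 1_422),
    ("not classified as Bacteria", 824, 20),
    ("Chloroplasts", 12_691, 29),
]


def apply_filters(reads, rsvs, steps):
    """Return final read/RSV counts and a per-step log of percentages removed."""
    log = []
    for name, reads_removed, rsvs_removed in steps:
        # Percentages are relative to what remained before this step.
        pct_reads = round(100 * reads_removed / reads, 2)
        pct_rsvs = round(100 * rsvs_removed / rsvs, 2)
        reads -= reads_removed
        rsvs -= rsvs_removed
        log.append((name, pct_reads, pct_rsvs))
    return reads, rsvs, log


final_reads, final_rsvs, log = apply_filters(START_READS, START_RSVS, STEPS)
for name, pct_reads, pct_rsvs in log:
    print(f"{name}: {pct_reads}% of reads, {pct_rsvs}% of RSVs removed")
print(final_reads, final_rsvs)
```

Running the sketch recovers the final dataset size of 894,697 reads and 2,967 RSVs.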


Rarefying reads

To ensure that sample library sizes were not driving the patterns observed in the data, the following categorical variables were checked for significant differences in mean library size using the Kruskal-Wallis rank sum test and in library size distribution using Levene’s test as implemented in the R package car (Fox and Weisberg 2011): species, sex, and PCR plate. None of the variables differed significantly in mean library size or library size distribution after Bonferroni correction for multiple comparisons (Table S13). Therefore, to increase statistical power in detecting differences between microbiome samples, all subsequent analyses were performed with non-rarefied microbiome data (McMurdie and Holmes 2014).

16S rRNA sequence analysis of Darwin’s finch microbiomes

Alpha diversity analyses

To calculate the relative abundance of bacterial phyla present in the gut microbiome of each Darwin’s finch species, reads were transformed to proportions by sample and then averaged across all microbiome samples per finch species.
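The two-step calculation described above (proportions within each sample, then an average across samples of the same species) can be sketched as follows. The function name, data layout, and toy counts are hypothetical and are not data from this study.

```python
from collections import defaultdict


def mean_relative_abundance(counts, species_of):
    """counts: {sample: {phylum: reads}}; species_of: {sample: species}.

    Returns {species: {phylum: mean proportion across that species' samples}}.
    """
    per_species = defaultdict(list)
    for sample, phylum_counts in counts.items():
        total = sum(phylum_counts.values())
        # Step 1: transform reads to proportions within the sample.
        proportions = {p: n / total for p, n in phylum_counts.items()}
        per_species[species_of[sample]].append(proportions)

    result = {}
    for species, samples in per_species.items():
        phyla = {p for s in samples for p in s}
        # Step 2: average proportions across samples (absent phylum = 0).
        result[species] = {
            p: sum(s.get(p, 0.0) for s in samples) / len(samples) for p in phyla
        }
    return result


# Hypothetical toy counts for two samples of one species.
toy_counts = {
    "s1": {"Firmicutes": 80, "Proteobacteria": 20},
    "s2": {"Firmicutes": 600, "Actinobacteria": 400},
}
toy_species = {"s1": "MTF", "s2": "MTF"}
print(mean_relative_abundance(toy_counts, toy_species))
```

Averaging proportions rather than pooling raw reads keeps a deeply sequenced sample from dominating the species mean.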

Beta diversity analyses

To visualize differences between microbiome samples, double principal coordinate analysis (DPCoA) was applied to the log-transformed RSV table as implemented in the R package phyloseq (McMurdie and Holmes 2013). DPCoA is an ordination method that incorporates both quantitative and phylogenetic information about the microbiome samples (Pavoine et al. 2004). To assess the differences in community composition of the gut microbiomes between samples, weighted UniFrac distances (Lozupone et al. 2010) were calculated between all samples. All abundance data were log transformed prior to distance calculations as an approximate variance stabilization method. To check the homogeneity of the multivariate dispersions of the distance metrics, the betadisper function was used as implemented in the R package vegan (Oksanen et al. 2017). To test the significance of categorical variables, the adonis function was applied, also as implemented in vegan (Oksanen et al. 2017).
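For readers unfamiliar with the permutation test behind adonis, the following is a minimal sketch of a one-factor PERMANOVA on a precomputed distance matrix. It is not the vegan implementation: the function names are hypothetical, the distance matrix is a toy example rather than weighted UniFrac, and only the standard pseudo-F statistic with label permutation is shown.

```python
import random


def _scaled_ss(dist, idx):
    """Sum of squared pairwise distances among idx, divided by len(idx)."""
    n = len(idx)
    return sum(
        dist[idx[a]][idx[b]] ** 2 for a in range(n) for b in range(a + 1, n)
    ) / n


def pseudo_f(dist, groups):
    """Return (pseudo-F, R2) for a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    ss_total = _scaled_ss(dist, list(range(n)))
    ss_within = sum(
        _scaled_ss(dist, [i for i in range(n) if groups[i] == g]) for g in labels
    )
    ss_among = ss_total - ss_within
    f = (ss_among / (len(labels) - 1)) / (ss_within / (n - len(labels)))
    return f, ss_among / ss_total


def permanova(dist, groups, permutations=999, seed=0):
    """Permutation p-value: share of label shuffles with F >= observed F."""
    rng = random.Random(seed)
    f_obs, r2 = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled)[0] >= f_obs - 1e-12:
            hits += 1
    return f_obs, r2, (hits + 1) / (permutations + 1)


# Toy example: six points on a line forming two well-separated groups.
points = [0, 1, 2, 10, 11, 12]
toy_dist = [[abs(a - b) for b in points] for a in points]
toy_groups = ["A", "A", "A", "B", "B", "B"]
f_obs, r2, p_value = permanova(toy_dist, toy_groups)
print(f_obs, r2, p_value)
```

With only six samples the smallest attainable p-value is limited by the number of distinct label arrangements, which is one reason small groups, such as the four medium tree finch samples here, constrain the power of this test.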

Differential abundance testing of bacterial taxa between tree finches with DESeq2

To test whether any bacterial taxa differed significantly in abundance between the tree finch genetic clusters, the R package DESeq2 was applied to the microbiome data (Love et al. 2014). DESeq2 is a method for detecting differential abundance originally developed for RNA-seq experiments. It uses shrinkage estimation for dispersions and fold changes to deal with small replicate numbers, varying numbers of reads, and the presence of outliers, and it has been shown to be effective when applied to non-rarefied microbiome count data (McMurdie and Holmes 2014). Multiple hypothesis testing was corrected using the Benjamini-Hochberg method, as implemented in DESeq2 (Benjamini and Hochberg 1995).
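The Benjamini-Hochberg adjustment itself is simple enough to sketch directly. The function below is an illustrative stand-alone version, not the DESeq2 code, and it does not reproduce the additional filtering steps that DESeq2 applies before adjustment. The p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is capped by the adjusted values of all larger p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four RSVs.
toy_pvalues = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_pvalues)
print(toy_adjusted)
```

An RSV would then be reported as differentially abundant when its adjusted p-value falls below the chosen threshold (0.05 in this chapter).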

Metagenomic library preparation and sequencing

The microbiome was further characterized by shotgun metagenomic sequencing of the fecal DNA. Prior to preparation for metagenomic sequencing, extracted DNA samples were concentrated using a 3x Aline PCRClean DX cleanup (Aline Biosciences, Woburn, MA) and quantified using the Qubit High Sensitivity Assay (ThermoFisher Scientific, Waltham, MA).

Sequencing libraries were prepared using the KAPA HyperPlus kit and barcoded with KAPA Dual-Indexed Illumina adapters (Kapa Biosystems, Wilmington, MA) according to the manufacturer’s instructions with the following parameters. All samples were enzymatically fragmented for 8 min at 37°C. The adapter concentration and number of PCR cycles depended on the input DNA amount (Table S14). Prepared libraries were quantified on the Agilent TapeStation 4400 using D1000 Screen Tapes (Agilent, Santa Clara, CA) and pooled in equimolar concentrations. The final library was size selected for molecules between 450 and 750 bp using the Pippin Prep (Sage Science, Beverly, MA) prior to sequencing. Metagenomic libraries were sequenced with two high-output, paired-end 2 x 150 bp runs on the Illumina NextSeq 500 (Illumina, USA) at the Bauer Core Facility at Harvard University (Cambridge, MA).

Metagenomic sequence analysis

Sequences were demultiplexed by the Bauer Core Facility (Cambridge, MA) and quality filtered using the Minoche parameters as implemented in the Illumina-utils library (Minoche et al. 2011; Eren et al. 2013). A mean ± s.e. of 30 ± 0.5 million reads was generated per sample, and 92.8% ± 0.3% passed quality filtering. Metagenomic reads were aligned to the ProGenomes database (Mende et al. 2017), which consists of over 5000 representative bacterial genomes, using bwa mem (Li and Durbin 2009). Reads were separately aligned to the reference genome for Darwin’s finches, Geospiza fortis (Zhang et al. 2014), using the same algorithm. Coverage and the percentage of each genome covered were calculated using metaSNV, and single nucleotide variants were called for genomes with at least 5x coverage and 40% horizontal coverage in a sample (Costea et al. 2017). Principal coordinate analysis was performed using the R package ape (Paradis et al. 2004) and plotted with ggplot2 (Wickham 2016).
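The genome-level filter used before variant calling can be illustrated with a small sketch. It assumes that "coverage" means mean read depth across the genome and that "horizontal coverage" means the fraction of positions covered by at least one read. metaSNV computes these quantities itself, so the function names and toy depth vectors below are hypothetical.

```python
def coverage_stats(depths):
    """depths: per-position read depth for one genome in one sample.

    Returns (mean depth, fraction of positions with depth > 0).
    """
    length = len(depths)
    mean_depth = sum(depths) / length
    breadth = sum(1 for d in depths if d > 0) / length
    return mean_depth, breadth


def passes_filter(depths, min_depth=5.0, min_breadth=0.40):
    """Apply the thresholds from the text: >= 5x depth and >= 40% breadth."""
    mean_depth, breadth = coverage_stats(depths)
    return mean_depth >= min_depth and breadth >= min_breadth


# Toy genomes of 10 positions each.
even_genome = [10] * 5 + [0] * 5    # 5x mean depth, 50% of positions covered
spiky_genome = [20] * 3 + [0] * 7   # 6x mean depth, but only 30% covered
print(passes_filter(even_genome), passes_filter(spiky_genome))
```

Requiring both thresholds excludes genomes whose apparent depth comes from a few heavily covered regions, such as conserved genes shared with other taxa.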

Metagenomic samples were functionally characterized by analyzing only the reads that mapped to the ProGenomes database using two approaches. First, FMAP was applied to test for differentially abundant KEGG pathways between the genetic clusters (Kim et al. 2016). FMAP uses the Kruskal-Wallis rank-sum test to identify differentially abundant genes in pairwise comparisons between genetic clusters and then performs enrichment analysis using Fisher’s exact test. Second, mi-faser was applied for more exact functional assignment (Zhu et al. 2018). Non-metric multidimensional scaling was performed on the resulting count table using the function metaMDS as implemented in the R package vegan (Oksanen et al. 2017). The number of hits to each Enzyme Commission (EC) number was normalized by sample, and the mean relative abundance was calculated across all samples. The top 5 percent of EC numbers were mapped to KEGG pathways using KEGG Mapper (Kanehisa et al. 2016).
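The normalization and ranking of EC numbers can be sketched as below. The function name and toy hit table are hypothetical, and the handling of the 5% cutoff (rounding up, with at least one EC number retained) is an assumption, since the text does not specify it.

```python
import math


def top_ec_numbers(hits, fraction=0.05):
    """hits: {sample: {ec_number: hit count}}.

    Returns (EC numbers in the top fraction by mean relative abundance,
    {ec_number: mean relative abundance across samples}).
    """
    ecs = sorted({ec for sample in hits.values() for ec in sample})
    mean_rel = {}
    for ec in ecs:
        # Normalize within each sample, then average across samples.
        rel = [s.get(ec, 0) / sum(s.values()) for s in hits.values()]
        mean_rel[ec] = sum(rel) / len(rel)
    ranked = sorted(ecs, key=lambda ec: mean_rel[ec], reverse=True)
    keep = max(1, math.ceil(fraction * len(ranked)))
    return ranked[:keep], mean_rel


# Hypothetical table: 20 EC numbers with increasing hit counts in two samples.
toy_hits = {
    "s1": {f"1.1.1.{i}": i + 1 for i in range(20)},
    "s2": {f"1.1.1.{i}": 2 * (i + 1) for i in range(20)},
}
top, mean_rel = top_ec_numbers(toy_hits)
print(top)
```

Because each sample is normalized to proportions first, the second toy sample contributes the same relative profile as the first despite having twice as many hits.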

Comparative metadata: stable isotopes and foraging data

To assess the differences in diet between the genetic populations of tree finches, stable isotope analyses were performed using blood samples dried on quartz fiber filter paper. These were packaged in 5 x 9 mm tin capsules for analysis (041077, Costech Analytical Technologies, Inc, Valencia, CA). δ13C and δ15N values were measured on a Thermo Scientific Delta V paired with a Costech 4010 elemental analyzer and a high-temperature conversion elemental analyzer at the Center for Stable Isotopes at the University of New Mexico (Albuquerque, NM). A known protein standard was run at multiple concentrations as a run-to-run control. δ13C values were adjusted by the mean difference between the measured values for the protein standard and the known value (-1.18‰). δ15N values for samples below 1000 mV were error corrected using a linear regression on the protein standard (R2 = 0.39).

For summarizing foraging data, the food items seed, flower, leaf, and fruit were combined into the category plant. The proportions of plant and insect food items therefore sum to 1. These observations provide knowledge of the broad diet patterns for each finch species in both habitats.
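As a worked example of this summary, the sketch below collapses food items into plant and insect proportions using the medium tree finch counts reported in Table 3.1 (2 seed and 17 insect observations out of 19). The function name and category labels are illustrative assumptions.

```python
PLANT_ITEMS = {"seed", "flower", "leaf", "fruit"}


def summarize_foraging(observations):
    """observations: {food_item: count}. Returns (plant, insect) proportions."""
    total = sum(observations.values())
    plant = sum(n for item, n in observations.items() if item in PLANT_ITEMS)
    insect = observations.get("insect", 0)
    if plant + insect != total:
        raise ValueError("unexpected food item category")
    return plant / total, insect / total


# Medium tree finch first foraging observations from Table 3.1.
mtf_observations = {"seed": 2, "insect": 17}
plant, insect = summarize_foraging(mtf_observations)
print(round(100 * plant), round(100 * insect))
```

The rounded percentages match the 11% plant and 89% insect summary given for the medium tree finch.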


Testing co-diversification of the microbiome and the finch phylogeny

To assess congruence between the genetic distances among the tree finch genetic clusters and their gut microbiomes, the Procrustean approach to co-phylogeny (PACo) (Balbuena et al. 2013) was applied to the data. PACo was designed to detect the similarity of evolutionary patterns in host-parasite associations. Here, the microbiome samples are treated as the ‘parasites’ to compare with the three metadata variables: host genetic distance, stable isotope values, and foraging patterns. Genetic distances between tree finch individuals were calculated from the nine microsatellite loci using the dissimilarity metric developed by Kosman and Leonard as implemented in the function gd.kosman in the R package PopGenReport (Kosman and Leonard 2005; Adamack and Gruber 2014). Euclidean distances were calculated for stable isotopes and foraging data. Microbiome distances were calculated using the weighted UniFrac metric (Lozupone et al. 2010) to produce a quantitative distance comparison that incorporated the phylogeny of the microbial community. PACo analysis was run as implemented in the R package paco (Hutchinson et al. 2017), with 10,000 permutations to test the significance of the signal. Using the symmetric calculation, the correlation coefficient r was calculated as r = (1-ss).

To compare the amount of variation explained by host genetic distance, stable isotope values, and diet distance as calculated from first foraging observations, variation partitioning by redundancy analysis (Legendre 2008) was used as implemented with the varpart function in the R package vegan (Oksanen et al. 2017). The microbiome distance matrix was used as a response variable with three explanatory tables: the first two principal coordinate axes of the host genetic distance, the δ13C and δ15N stable isotope values, and the first two principal component axes of the first foraging observations. Significance of the distance-based redundancy analysis was assessed using the anova.cca function implemented in vegan.


Results

A total of 29 Darwin’s tree finch individuals in the highlands of Floreana Island were sampled for microbiome analyses. Due to the morphological similarity between the small tree finch (C. parvulus) and hybrids, all finches were genotyped to assign individuals to their correct genetic population.

Microsatellite genotyping

Based on previous work classifying Floreana tree finches to genetic clusters (Peters et al. 2017), individuals were first putatively assigned to either the small or medium tree finch phenotype by morphology. A threshold of 8.2 mm was applied to beak-naris length since the finches caught for microbiome samples also followed a bimodal distribution (≥ 8.2 mm, C. pauper; < 8.2 mm, C. parvulus; Figure S9).

Nine previously designed microsatellite loci were used to genotype tree finches (Gf01, Gf03-Gf07, Gf11-Gf13) (Petren 1998; Galligan et al. 2012). To facilitate the assignment of tree finches sampled in this study, microsatellite data from an additional 357 finches collected between 2004 and 2014 were included in the following per-locus statistics. Three loci, Gf05-Gf07, were in Hardy-Weinberg equilibrium (HWE) in both putative populations, five loci were not in HWE in either population (Gf01, Gf03, Gf04, Gf11, Gf13), and Gf12 was not in HWE in the C. parvulus population. In tests of linkage disequilibrium, ten of the pairwise tests showed linkage between loci (Table S15). Due to the small number of loci and close genetic relationship between populations, loci were not tested for neutrality (Flanagan and Jones 2017). However, as these loci have previously been used successfully to classify individuals into genetic clusters, all loci were included for genetic assignment. Allele number ranged from 5 to 19 (mean 11.2, SE ± 1.8) and expected heterozygosity ranged from 0.11 to 0.89 (mean 0.56, SE ± 0.09) per locus.

Assigning tree finches to genetic clusters

To determine the number of genetic clusters that best fit the microsatellite genotype data, the program STRUCTURE was run with the LOCPRIOR parameter set on the putative populations above. K=2 had the best fit to the data using the delta K method (Evanno et al. 2005), and additional replicates were run with K=2 to calculate the mean membership coefficient (qi) for each individual. Using simulations, an inclusive threshold of 0.75 was chosen to assign individuals to genetic clusters, as it showed the best overall performance compared to thresholds of 0.80 and 0.85 (Table S12). Individuals were assigned as follows: qi ≥ 0.75, small tree finch cluster; qi ≤ 0.25, medium tree finch cluster; 0.75 > qi > 0.25, hybrid tree finch cluster (Figure S11).
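The assignment rule above amounts to a three-way threshold on the membership coefficient. A minimal sketch follows, where qi is taken to be the mean membership coefficient to the small tree finch cluster; the function name and the STF/MTF/HTF return labels are illustrative.

```python
def assign_cluster(q_small, threshold=0.75):
    """Assign a genetic cluster from the membership coefficient to the
    small tree finch cluster, using an inclusive threshold."""
    if not 0.0 <= q_small <= 1.0:
        raise ValueError("membership coefficient must lie in [0, 1]")
    if q_small >= threshold:
        return "STF"  # small tree finch cluster
    if q_small <= 1.0 - threshold:
        return "MTF"  # medium tree finch cluster
    return "HTF"      # hybrid tree finch cluster


for q in (0.90, 0.75, 0.50, 0.25, 0.10):
    print(q, assign_cluster(q))
```

Raising the threshold to 0.80 or 0.85, the alternatives considered in the text, widens the hybrid interval and moves borderline individuals out of the two parental clusters.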

Private allele frequency and heterozygosity were calculated on populations split on a threshold of 0.5 to confirm the direction of introgression. Individuals predominantly assigned as small tree finches (qi > 0.5) had more private alleles (33, 34% of total alleles), higher heterozygosity (0.56), and higher allele richness (8.80) compared to those predominantly assigned as medium tree finches (4 private alleles (6%); heterozygosity 0.51; AR 7.05).

Morphological differences between genetic clusters

To associate the genetic clusters with the phenotypes of the individuals with microbiome samples (n=29), the following morphological traits for beak and body size were analyzed: beak-head, beak-naris, beak-feather, beak depth, beak width, tarsus length, and wing length. All morphological traits showed a significant difference between the genetic clusters except for beak depth (Table S16). Post hoc pairwise comparisons demonstrated a consistent difference between the small and medium tree finch clusters across these traits as well as a difference between the hybrid and medium clusters for beak-feather and tarsus length.

To calculate the correlation of beak and body size with the membership coefficient, the morphological variables were reduced using principal component analysis to two summary variables, PCA_beak and PCA_body. PCA_beak and PCA_body explained 85% and 91% of the variation in their component morphological variables, respectively. PCA_body was negatively correlated with the membership coefficient to the small tree finch cluster (-0.56, p=0.002), while PCA_beak showed a weak visual correlation that was not significant (-0.28, p=0.141) (Figure 3.1). Therefore, as expected, tree finches with the largest body size and beak size were assigned to the medium tree finch genetic cluster and those with the smallest body size and beak size were assigned to the small tree finch genetic cluster.


Figure 3.1. Membership coefficient in STF cluster compared to morphological summary variables.

The color and shape of each point correspond to the genetic cluster assignment with a 0.75 inclusive threshold and sex, respectively. A) PCA_Body includes tarsus length and wing length. There is a negative correlation between the membership coefficient (qi) and the body size of the individual. B) PCA_Beak includes beak-head, beak-feather, beak-naris, beak depth, and beak width. Two outliers compress the data, but there remains a slight negative correlation between qi and the beak size of the individual.


16S characterization of Darwin’s tree finch gut microbiomes on Floreana Island

Using next-generation sequencing, a total of 894,697 sequences were generated across all finch microbiome samples (mean = 30,852; range = 13,328 to 49,605), comprising a total of 2,967 ribosomal sequence variants (RSVs) (mean = 590, range = 109 to 1,265). The sizes and distribution of the sequencing libraries were not significantly different across genetic cluster, sex, or PCR plate, and all following analyses are based on the non-rarefied data to increase statistical power (Table S13; see Methods – Rarefying reads).

Darwin’s finch microbiome alpha diversity analyses

A total of nineteen bacterial phyla were detected in the gut microbiome in Darwin’s finches. Using DADA2, all sequences were assigned to ribosomal sequence variants (RSVs) and taxonomically classified using the RDP v14 database (Cole et al. 2009). Across all samples, the bacterial phyla Firmicutes, Actinobacteria, and Proteobacteria composed the majority of the sequences, representing 47%, 28%, and 23% of RSVs respectively. All other bacterial phyla detected were represented by less than 1% of sequences across all samples (Table S17). The medium tree finch samples had a much higher mean relative abundance of Firmicutes at 89% with all other bacterial phyla below 10%, whereas the small and hybrid tree finch samples were more evenly spread across the top three bacterial phyla (25-42% for each phylum) and the variation amongst individuals was relatively high (Figure 3.2A; Table S18).

At lower taxonomic levels, the bacterial genus Lactobacillus was clearly dominant in all four medium tree finch samples as well as a few samples in the other genetic populations (Figure 3.2B). The high level of inter-individual variation is also apparent when visualizing the bacterial genera with greater than 10% relative abundance in individual samples. Other bacterial genera at high mean relative abundance in each genetic cluster include Kocuria, Acinetobacter, Methylobacterium, and Enterococcus, each with at least 5% relative abundance in one of the three populations (Table S19).


Figure 3.2 Relative abundance of bacterial taxa in the gut microbiota of Darwin’s finch species.

A) Relative abundance of bacterial phyla over 1% in each finch are separated by the genetic cluster assignment. Finches are ordered in decreasing order by the membership coefficient to the small tree finch genetic population (Figure S11). B) Relative abundance of the bacterial genera above 10% in each finch.


Beta diversity analyses

To visualize differences in bacterial community composition between microbiome samples, double principal coordinate analysis (DPCoA) was applied to the data. DPCoA is a multivariate ordination method that takes the phylogeny of the bacteria into account when calculating pairwise distances. Plotting samples by genetic population assignment did not reveal strong clustering amongst the small tree finch, hybrid tree finch, and medium tree finch (Figure 3.3). To test whether the differences in beta diversity were statistically significant, the weighted UniFrac distances were tested using PERMANOVA with the categorical variable genetic population assignment, which did not demonstrate significant differences between samples (R2 = 0.11, p = 0.09).

[Figure 3.3 plot: DPCoA ordination with Axis1 [43.4%] on the x-axis and Axis2 [23.2%] on the y-axis; legend "Species": STF, HTF, MTF.]

Figure 3.3 Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome communities in Floreana tree finches.

The three genetic clusters of Floreana tree finches do not cluster separately using DPCoA ordination.


Differential abundance among the three tree finch genetic clusters was tested using DESeq2, a method of variance stabilization and coverage estimation that can be applied to non-rarefied microbiome count data prior to statistical tests of significance. DESeq2 revealed a total of ten RSVs that were differentially abundant between the three genetic clusters. Six of the RSVs were present in both comparisons with the medium tree finch, while the comparison between the hybrid cluster and the medium tree finch included three additional RSVs. Only one RSV was different between the small tree finch and the hybrid cluster (Figure 3.4; Table S20).

The medium tree finch was significantly enriched in four RSVs classified to the genus Lactobacillus compared to the small tree finch and the hybrid cluster but depleted in two RSVs classified to the order Clostridiales and the family Enterobacteriaceae. In the comparison with the hybrid cluster, the medium tree finch was additionally enriched in two other RSVs classified to the order Lactobacillales and to the genus Corynebacterium but was depleted in an RSV classified to the genus Methylobacterium. The sole RSV that was differentially abundant between the hybrid cluster and the small tree finch was enriched in the small tree finch and classified to the genus Williamsia, in the phylum Actinobacteria.


Figure 3.4. Differential abundance of ribosomal sequence variants (RSVs) in pairwise comparisons between Darwin’s tree finch genetic clusters.

The log2 fold change is plotted for the ten RSVs that were significantly different in their abundance between the compared genetic clusters (adjusted p-value < 0.05). For each comparison, the first genetic population was the numerator and the second the denominator (e.g., a positive value on the y-axis corresponds to enrichment in the first genetic population). RSVs are labeled by their classification at the lowest taxonomic rank available: most are classified to the genus level, except for those denoted with * and +, which are classified at the bacterial Order and Family, respectively.


Metagenomic characterization

To further characterize the microbiome in Darwin’s tree finches, metagenomic libraries were prepared and sequenced for each individual. About 30% of the reads from each sample mapped to the ProGenomes database with coverage across the 5306 bacterial reference genomes ranging from 0 to 272. Total coverage of the reference genomes in the database across finch samples ranged from 44 to 589.

To investigate sample-specific strains within these bacterial species, single nucleotide variants (SNVs) were called within each sample. Fourteen bacterial species had enough coverage in at least four samples to calculate distances based on the bacterial alleles present (Table S21), but only four included an individual from each tree finch genetic population: Rhodococcus fascians, Microbacterium testaceum, Curtobacterium sp. UNCCL17, and Bradyrhizobium sp. DFCI-1. None of the principal coordinate plots among these four bacterial species showed clustering by the genetic population (Figure S12).

To compare the bacterial taxa that were flagged as differentially abundant from the 16S data to the metagenomic data, Lactobacillus species with at least 1x coverage were scaled to total genome coverage per sample and plotted. The metagenomic sequences recapitulate the enrichment found by the 16S data comparison amongst the genetic populations, with a total of 7 species of Lactobacillus found in the metagenomes (Figure 3.5). The relative coverage of Lactobacillus is much higher across all of the medium tree finch samples in comparison to the other two genetic populations, similar to the pattern observed in the 16S data.
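The scaling step can be sketched as follows, assuming that each genome's coverage is divided by the summed coverage of all reference genomes in that sample and that the 1x cutoff is applied to the unscaled value. The function name and genome names are hypothetical.

```python
def scaled_genus_coverage(coverage, genus="Lactobacillus", min_coverage=1.0):
    """coverage: {genome name: coverage} for a single sample.

    Returns {genome name: coverage / total coverage in the sample} for
    genomes of the given genus with at least min_coverage.
    """
    total = sum(coverage.values())
    return {
        name: value / total
        for name, value in coverage.items()
        if name.startswith(genus) and value >= min_coverage
    }


# Hypothetical coverage values for one sample.
toy_sample = {
    "Lactobacillus sp. A": 30.0,
    "Lactobacillus sp. B": 0.5,   # below the 1x cutoff, so it is dropped
    "Kocuria sp. C": 69.5,
}
toy_scaled = scaled_genus_coverage(toy_sample)
print(toy_scaled)
```

Scaling by the per-sample total makes samples with different sequencing depth or mapping rates comparable on the same axis.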


Figure 3.5. Coverage of Lactobacillus species from metagenomic sequencing scaled per sample.

Metagenomic reads were aligned to the ProGenomes database composed of over 5000 representative bacterial genomes. Coverage was scaled by the total coverage in the database for each sample.


Metagenomic sequences were also characterized for functional differences in the microbiomes across the three genetic clusters by classifying reads that mapped to bacterial genomes to a KEGG Filtered UniProt database and testing for differential abundance. The medium tree finch cluster had differentially abundant genes in four KEGG pathways when compared to both the small tree finch and hybrid tree finch clusters: ABC transporters, two-component system, cationic antimicrobial peptide resistance, and biofilm formation in Pseudomonas aeruginosa (Figure 3.6). The comparison between the medium tree finch and small tree finch additionally had differentially abundant genes in the pathway for biofilm formation in Escherichia coli. Three pathways were differentially abundant between the small tree finch and hybrid tree finch: terpenoid backbone biosynthesis, mRNA surveillance, and autophagy.


[Figure 3.6 plot: bar chart with two panels, "Enriched" and "Depleted", each with an x-axis from 0 to 30; legend "Comparison": MTF-STF, MTF-HTF, STF-HTF. Pathways on the y-axis: ABC transporters; Two-component system; Cationic antimicrobial peptide (CAMP) resistance; Biofilm formation - Pseudomonas aeruginosa; Biofilm formation - Escherichia coli; Riboflavin metabolism; Terpenoid backbone biosynthesis; mRNA surveillance pathway; Autophagy - other.]

Figure 3.6. Differential abundance of KEGG pathways in pairwise comparisons between Darwin’s tree finch genetic clusters.

The number of genes that are significantly different in their abundance between compared genetic clusters (p-value < 0.05) are plotted and labeled with the corresponding KEGG pathway.


To further characterize the broader functions found in the metagenomic sequences, metagenomic reads were classified to their Enzyme Commission (EC) numbers using mi-faser. Visualization was performed using non-metric multidimensional scaling, which showed no clear clustering based on the counts across the EC numbers (Figure S13). EC numbers in the top 5% of coverage across all samples were mapped to the corresponding KEGG pathways. The top KEGG pathways almost exclusively deal with the metabolism of carbon, nucleotides, and amino acids (Figure S14).

Stable isotope values and foraging data

To estimate the dietary differences between individuals sampled for the microbiome, δ13C and δ15N stable isotope ratios were analyzed. In general, the three genetic clusters had similar ranges for both δ13C and δ15N values, ranging from -22.8‰ to -28‰ and 5.0‰ to 10.0‰, respectively (Figure 3.7; Table S22). Neither δ13C nor δ15N values were significantly different across the three genetic clusters (ANOVA F-value = 0.91 and 0.44; p-value = 0.42 and 0.65, respectively). To assess the microbiome samples and stable isotope values in the context of food items, foraging observations were made in both habitats (Table 3.1). By classifying the observations for each species, it is possible to get a sense of the broad dietary patterns. Since species classification was made visually, observations for the small tree finch and the hybrid cluster were combined. The tree finch species (STF/HTF and MTF) primarily consumed insects, though the medium tree finch included 11% seed (MTF were observed consuming the seed head of Scalesia pedunculata).


Figure 3.7. δ13C and δ15N stable isotope measurements for Darwin’s tree finch genetic populations.

Point color and point shape indicate host species and habitat, respectively. A) Individual δ13C and δ15N values for each finch with gut microbiome samples. B) Mean δ13C and δ15N values for each species and habitat with standard deviation. Neither δ13C nor δ15N is significantly different amongst the genetic populations.

Table 3.1. First foraging observations across Darwin’s tree finch species on Floreana Island

Food item columns give counts (% of counts); summary columns give % of observations. A dash marks an empty cell.

Species   Hab   Total   Flower   Fruit   Leaf   Seed     Insect     Plant* (%)   Insect (%)
MTF       H     19      -        -       -      2 (11)   17 (89)    11           89
STF/HTF   H     23      -        -       -      -        23 (100)   -            100

* The category ‘Plant’ is the sum of all plant derived food items (flower, fruit, leaf, and seed).


Correlation between host genetic distance and microbiome distance

To evaluate the correlation between the gut microbiome and the metadata variables, the Procrustean approach to co-phylogeny (PACo) was applied to host genetic distance, stable isotope values, and foraging data separately. The host genetic distance was based on the nine microsatellite loci used to assign each tree finch to the genetic population, while Euclidean distances were used for both stable isotope and foraging data. For host genetic distance, though the procrustean correlation coefficient was moderate, permutation tests showed that the value was not significantly different from random associations (R2=0.32, p=0.17). Stable isotope values were also not significant (R2 = 0.09, p=0.17), and the foraging data did not provide enough distance between samples since small and hybrid tree finches were not distinguishable when performing foraging observations.

To better separate any variation explained by the overlap between metadata variables, variation partitioning was applied to the host genetic distance and stable isotope values together with the microbiome distance matrix as the response variable. The model including both metadata variables was not significant in explaining any of the variation in the microbiome samples (p=0.24).

Discussion

This study characterized the gut microbiomes in three closely related genetic clusters of Darwin’s tree finches: the small tree finch (C. parvulus), the medium tree finch (C. pauper), and the genetic cluster of hybrid tree finches. Hybridization events offer natural experiments in how the genetic background of individuals may affect the microbial communities associated with the finches. In addition to the categorical assignment of finches into genetic clusters, the microsatellite data also allowed for comparison of genetic distance amongst these closely related individuals to interrogate possible correlation between the gut microbiome distance and host genetic distance. The 29 finches sampled for this study are representative of the three genetic clusters characterized in previous studies. Using both 16S rRNA and metagenomic sequencing data, we show that the medium tree finch (maternal species for the observed hybridization) has significant enrichment in bacterial species in the genus Lactobacillus compared to the small tree finch (paternal species for the observed hybridization) and hybrid tree finch populations. However, the overall bacterial communities are not significantly different amongst the genetic populations, nor do they correlate with individual genetic distances, pointing to factors aside from host genetics as major influences in determining the composition of the gut microbiome.

Tree finches sampled are representative of the three genetic populations

The tree finches sampled for this study are representative of previously analyzed populations. Similar to the analysis of 368 finches sampled over a decade (2004-2014), the genetic cluster assignment reflects an asymmetrical introgression from the rare medium tree finch into the common small tree finch (Kleindorfer et al. 2014a; Peters et al. 2017). The small tree finch cluster had a higher proportion of private alleles, higher heterozygosity, and higher mean allele richness than the medium tree finch cluster, which is consistent with backcrossing of hybrid individuals with the small tree finch population. The number of finches assigned to each of the three genetic clusters is also commensurate with previous results, showing a smaller proportion of individuals caught assigned as medium tree finches compared to both the small and hybrid clusters (Figure S11).

The small and hybrid tree finch clusters are morphologically more similar to each other than either is to the medium tree finch cluster. In the 29 finches with gut microbiome samples, the medium tree finch was significantly different from the small tree finch across six of the seven morphological variables tested, but only significantly different from the hybrid tree finches in two of the seven (Table S16). These results differ slightly from previous studies, where the small and hybrid tree finches were indistinguishable across all morphological traits. Two possibilities explain this discrepancy. First, more recent hybrids should display intermediate morphology (Steeves et al. 2010). However, the nine microsatellite loci do not provide enough resolution to assign individuals to specific hybrid generations, and it is thus difficult to assess whether this is the cause of the significant differences in morphology. Second, the number of individuals used in this study is an order of magnitude fewer than in previous characterizations, which likely reduced the statistical power for differentiating the morphological traits, since previous studies with larger sample sizes found the small and hybrid tree finch populations to be morphologically indistinguishable (Peters et al. 2017). Additionally, song characteristics do not differ between the small and hybrid tree finch populations (Peters and Kleindorfer 2017). Therefore, the small and hybrid tree finch populations are likely a hybrid swarm and the number of individuals sampled was simply too low to recapitulate this pattern. The difference in correlations with beak and body size can also be attributed to small sample size. The significant negative correlation between the membership coefficient to the small tree finch cluster and body size is consistent with previous studies, though the negative correlation with beak size was a non-significant trend (Kleindorfer et al. 2014a; Peters et al. 2017).

Darwin’s tree finch gut microbiomes

Broadly, the bacterial taxa detected with 16S sequencing in Darwin’s tree finch species are consistent with previous characterization on Santa Cruz Island (Loo et al. 2018a), with ribosomal sequence variants (RSVs) mapped to the bacterial phyla Firmicutes, Actinobacteria, and Proteobacteria dominating the bacterial communities. The microbiomes of small tree finches on Santa Cruz had similar, relatively even distributions across these three phyla, but the microbiome of the medium tree finch on Floreana, with 88% Firmicutes, was very different. At lower taxonomic levels, Lactobacillus was the most abundant bacterial genus in all three genetic clusters, which was also the case for the small and large tree finch species on Santa Cruz.

Pairwise comparisons between tree finch genetic populations showed more differences between the medium tree finch and the other two genetic populations than between the small and hybrid populations, specifically with the enrichment of four RSVs classified to the genus Lactobacillus (Figure 3.4). Two other RSVs were additionally depleted in the medium tree finch compared to the two other genetic populations, classified to the phylum Firmicutes, order Clostridiales and phylum Proteobacteria, family Enterobacteriaceae, respectively. In contrast, only one RSV was significantly different in abundance between the small and hybrid tree finch populations, which was classified in the phylum Actinobacteria, genus Williamsia. The similarity of abundance profiles for the small and hybrid tree finch populations matches the model of the hybrid swarm from the introgression of the medium tree finch into the small tree finch.

Metagenomic sequencing allows for more specific characterization of the patterns observed in the 16S data. Four bacterial species had enough coverage to call sample-specific alleles with at least one finch from each genetic population; however, none of these showed clustering based on the genetic population (Figure S12). Though the sample-specific strain calling was not particularly illuminating, the metagenomic reads allowed for detection of bacterial species based on the reference genomes available. Looking at the scaled coverage across the bacterial genomes classified to the genus Lactobacillus, the medium tree finch shows the same enrichment relative to the other two genetic populations (Figure 3.5). A total of seven Lactobacillus species were identified with at least 1x coverage across all samples using the metagenomic analysis. To properly interpret these signals of differential abundance, it is necessary to look at the corresponding metadata collected in this field season.

Metagenomic sequences can also be used to assess the functional capacity of the sequenced microbial community. Similar to the results of differential abundance analysis on 16S data, the hybrid tree finch cluster was more similar to its paternal species, the small tree finch, than to its maternal species, the medium tree finch. The top enzymatic pathways represented in the finch microbiomes mapped to various metabolic capabilities, including carbon, fatty acid, nucleotide, and amino acid metabolism. Even in diverse host-associated microbial habitats where very different bacterial species are present, such as the human gut versus the human skin microbiomes, the broader functional pathways remain relatively stable (Human Microbiome Project Consortium 2012). The broad functional characterization of Darwin’s finch microbiomes here is consistent with previously described host-associated microbiomes.

Comparative metadata for interpreting microbiome differences

δ¹³C and δ¹⁵N stable isotope values can serve to identify carbon sources and trophic levels, respectively. The values reported here represent the first stable isotope characterization of Darwin’s finch species on Floreana Island. Without comprehensive sampling of potential food sources, the δ¹³C and δ¹⁵N stable isotope values cannot definitively connect the individual tree finches to prey items, but they can be used as proxies for broader differences in diet amongst the three genetic populations. Using ANOVA, no significant differences were detected among the populations in either δ¹³C or δ¹⁵N values (Figure 3.7).
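As a rough illustration of the kind of test described above, the following sketch computes a one-way ANOVA F statistic for a single isotope across the three genetic populations. The function name and the isotope values are hypothetical placeholders, not measurements from this study, and the sketch stops at the F statistic; a statistics package would normally be used to obtain the corresponding p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict mapping group -> values."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of individual observations around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical carbon isotope values (per mil) for illustration only.
placeholder_d13c = {
    "small": [-24.1, -23.8, -24.5, -24.0],
    "hybrid": [-24.3, -23.9, -24.2],
    "medium": [-24.0, -24.4, -23.7],
}
f_stat, df_b, df_w = one_way_anova_f(placeholder_d13c)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A small F relative to the F distribution with the reported degrees of freedom would correspond to the absence of a detectable difference among populations.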

Using the genetic data in combination with the stable isotopes, we looked for signal from the host genetic distances in two ways: correlation with Procrustes Analysis of Co-phylogeny (PACo) and variation partitioning. PACo analysis did not reveal any constraint on the microbiome composition when using the genetic distances calculated from the microsatellite data. For comparison, the stable isotope values also did not significantly correlate with the microbiome data. Foraging observations for each genetic cluster were too similar, given the inability to separate the small and hybrid tree finch populations when documenting food items, and were therefore not used in PACo analysis. The lack of correlation with individual host genetic distance is congruent with a model of microbiome composition primarily driven by diet rather than host species or genetic similarity. Previous work on Santa Cruz did not show a strong effect of finch species identity on the microbiome between much more diverged finch species (Loo et al. 2018a), so the lack of signal here is unsurprising. Variation partitioning similarly showed no significant amount of variation explained by either the host genetic distances or the stable isotope values.

Foraging data were not specific to the individuals sampled, and observations could not be separated between the morphologically similar small and hybrid tree finches. Therefore, the foraging data were by definition the same for the small and hybrid populations, showing 100% insect consumption, while medium tree finches were observed to consume seeds around 11% of the time (Table 3.1). While these observations characterize the small and hybrid tree finches as exclusively insectivorous, Darwin’s finch species are known to be extremely opportunistic in their diets, and previous foraging observations in 2005 (~30 mm rain Jan-Mar) and 2006 (~150 mm rain Jan-Mar) show a significant increase in the consumption of fruits and flowers under conditions of moderately higher rainfall, with plant consumption (fruit, flower, leaf, seed) ranging from 35 to 77% under wetter conditions (Christensen and Kleindorfer 2009). Notably, the medium tree finch increased its dietary breadth during wetter conditions, which we found again with our data. The classification of 2016 as a moderately wet year with ~144 mm rain Jan-Mar was associated with greater dietary breadth in the medium tree finch (invertebrates, plants) than in the small or hybrid tree finch (invertebrates) (Table 3.1). In addition to their diet, the three genetic populations of Darwin’s tree finches have also been documented to change foraging techniques and heights, possibly serving as an expansion of their ecological niches (Peters and Kleindorfer 2015). Some Darwin’s finches escape the trap imposed by seed size and hardness, for example, and exploit novel foraging techniques using innovative behavior that results in novel foods (Tebbich et al. 2002; 2010). One phase we know little about is how Darwin’s finch fledglings learn to forage, which is a crucial step in the ontogeny of foraging behavior (Marchetti and Price 1989; Tebbich et al. 2001). In tropical and southern hemisphere species, parental care of fledglings may extend to three months (Russell 2000). If Darwin’s finch fledglings remain with their father, perhaps they learn to forage according to the paternal diet, and vice versa if they remain with their mother. In the case of the Camarhynchus hybridization, where the maternal species is likely to be C. pauper and the paternal species is likely to be C. parvulus, the findings show dietary differences in C. pauper and a differentiated gut microbiome in C. pauper compared with both C. parvulus and hybrid birds. Based on this finding, we predict that fathers remain with fledglings for extended post-fledging care in Darwin’s tree finches, which remains to be tested.

Placing the microbiome pattern into the context of the foraging data, the primary difference observed between the medium tree finch and the small/hybrid swarm was an increase in seed consumption. This suggests the possibility of seed consumption as a predictor of Lactobacillus abundance in the gut microbiome. Lactobacillus has previously been identified in other avian surveys across both old and new world passerine species (Hird et al. 2015; Kropáčková et al. 2017) as well as in the microbiomes of pet birds (Garcia-Mazcorro et al. 2016). It is also a prominent member of the gut microbiome of the parasitic fly Philornis downsi (Ben-Yosef et al. 2017), which is the cause of high mortality amongst all Darwin’s finch species in the archipelago (Kleindorfer and Dudaniec 2016). Ben-Yosef et al. (2017) found Lactobacillus to associate specifically with the adult females, which feed on decaying fruit. Given the strong environmental and diet-driven composition of the microbiome in Darwin’s finches (Loo et al. 2018a), it seems most likely that Lactobacillus is present in the fruit and flowers of plants and is therefore better represented in the medium tree finch population, which was observed to consume plant material (Scalesia flower heads) in 2016. The microbiome pattern from Santa Cruz is also congruent with this hypothesis, with the cactus finch having the highest relative abundance of Lactobacillus at 78% and being observed to consume flower and leaf material. To better understand the dietary underpinnings of the microbiome diversity patterns, future work should aim to connect individual microbiome samples with foraging observations, perhaps using color banding to identify the individuals observed.

The results presented here do not show significant differences in the overall gut microbiome community of the hybrid tree finches. Instead, differential abundance in a few RSVs was observed between the hybrid swarm of small and hybrid finches and the medium tree finch, with the microbiomes of the hybrid finches most similar to those of the paternal species, the small tree finch. In Darwin’s tree finches, hybrid status does not confer any unique combinations of bacterial taxa, unlike the novel phenotypes found in a hybrid zone between subspecies of house mice (Wang et al. 2015). Apart from the mammalian host, several important factors differ between that study and the current study. For example, co-diversification of the microbiome and host species has a much stronger signal in mammals than in birds and can be disentangled from the effects of diet when studying the broader mammalian phylogeny (Groussin et al. 2017). Additionally, the introgression from the medium tree finch into the small tree finch on Floreana is occurring in a single, sympatric habitat, whereas the hybrid zone in house mice represents a small portion of the total range of each individual subspecies.

Conclusion

The study of tree finches on Floreana allowed for investigation into the effect of hybridization on the gut microbiome of two closely related Darwin’s finch species and the resulting hybrid cluster. Overall, the gut microbiome of the hybrid tree finches is more similar to that of their paternal species, the small tree finch, based on differential abundance analyses. The lack of differences in the broader microbiome makeup between the genetic clusters and the lack of correlation between individual genetic distances and microbiome distances imply that factors other than host genetic background contribute to the composition of the microbiome.

Within the individuals studied, the stable isotopes and foraging data did not provide the necessary resolution to accurately assess their contribution. However, the broader patterns of increased seed consumption match with the identified enrichment in Lactobacillus species in the medium tree finch. By interrogating the microbiome within these closely related, sympatric species, this study further corroborates the dietary impact on the microbiome in Darwin’s finches.


Acknowledgements

We are grateful to the Charles Darwin Foundation and Galápagos National Park for allowing us to work in the Galápagos Islands, with special thanks to Solanda Rea, Marta Romoleroux, and the rest of the staff at the Charles Darwin Research Station for facilitating logistics. We especially thank our field assistant, Jefferson Garcia Loor, for his dedication to this project. We thank Katharina J Peters for help with microsatellite analysis and Arno Cimadom for rainfall data. This work was supported by a National Science Foundation Graduate Research Fellowship, a Graduate Research Opportunities Worldwide grant in collaboration with Flinders University, and additional funding from Macquarie University as well as the Dean’s Competitive Fund from Harvard University.


Chapter 4

Title: Habitat difference shapes the gut microbiome of Darwin’s finches on Floreana Island

Authors: Wesley T. Loo¹, Rachael Y. Dudaniec², Sonia Kleindorfer³, Colleen M. Cavanaugh¹

¹Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA; ²Department of Biological Sciences, Macquarie University, Sydney, NSW, Australia; ³School of Biological Sciences, Flinders University, Adelaide, SA, Australia

Abstract

Darwin’s finch species in the Galápagos Archipelago are an iconic adaptive radiation that offers a natural experiment to test the various factors that influence gut microbiome composition. While previous work has shown a significant impact of diet on the bacterial community in Darwin’s finches, the samples were collected from a single island (Santa Cruz) and from a single age class (adult). The island of Floreana has the longest history of human settlement within the archipelago and offers an opportunity to compare island and habitat effects on Darwin’s finch microbiomes. In this study, we compare gut microbiomes in Darwin’s finch species on Floreana Island to test for effects of (1) host phylogeny, (2) age class (nestlings, adults), (3) habitat (lowlands, highlands), and (4) island (Floreana, Santa Cruz). We used 16S rRNA Illumina sequencing of fecal samples to assess the gut microbiome composition of Darwin's finches, complemented by analyses of stable isotope values and foraging data to provide ecological context to the patterns observed. Overall bacterial composition of the gut microbiome demonstrated co-phylogeny with Floreana hosts, differed across age class (adult vs nestling), recapitulated the effect of habitat and diet, and showed differences across islands. The finch phylogeny uniquely explained more variation in the microbiome than did foraging data. Nestlings harbored a greater proportion of unclassified bacterial taxa but did not significantly differ in beta diversity from adults. Finally, there were interaction effects for island × habitat, whereby the same Darwin’s finch species sampled on two islands differed in microbiome for highland samples (highland finches also had different diets across islands) but not for lowland samples (lowland finches across islands had comparable diets). Together, these results corroborate the influence of phylogeny, age, diet, and sampling location on microbiome composition and emphasize the necessity for comprehensive sampling given the multiple factors that influence the gut microbiome in Darwin’s finches.

Introduction

The microbial communities associated with animals, or microbiomes, are now understood to play significant roles in the biology of the host organism. This is perhaps unsurprising given that all metazoans evolved in a microbial world and microorganisms were an essential aspect of the environment in which each species lived (McFall-Ngai et al. 2013). Most microbiome research has focused on its role for human health and disease states (Human Microbiome Project Consortium 2012; Hall et al. 2017) as well as studying potential co-evolution between gut microbiomes and host species in mammals (Ley et al. 2008a; Muegge et al. 2011; Groussin et al. 2017). In contrast, considerably less attention has been paid to the composition of the microbiome in non-mammalian species.

Birds are a diverse vertebrate clade with unique life history traits compared to mammals, and are widely useful as global indicators of ecosystem health (Caro and O'Doherty 1999). From the perspective of the microbiome, recent surveys in both old world and new world passerine species have shown the composition of their gut microbiomes to be distinct from mammals in broad taxonomic characterization. Passerine species contain relatively fewer bacteria from the phylum Bacteroidetes and more Proteobacteria, which is opposite to the relative abundance of these bacterial phyla found in mammalian species (Hird et al. 2015; Kropáčková et al. 2017).

Correlations between the diversity of bacteria in the microbiome and phylogenetic relationship of the host species have been established; in one case, this signal was found to be stronger than ecological life history traits amongst 51 bird species sampled in the Czech Republic (Kropáčková et al. 2017). However, much remains to be learned about the effects of different factors, such as sampling location, diet and host age, on avian microbiome composition.

Changes in the microbiome over the course of development have been documented in great detail in humans, such as by tracking individual strains of bacteria between mother and infant (Nayfach et al. 2016). However, relatively few studies have focused on the ontogeny of the gut microbiome in avian species. One study using cloning libraries in a member of the gull family, the black-legged kittiwake (Rissa tridactyla), found a clear separation between nestling and adult cloacal microbiomes with minimal overlap between the operational taxonomic units (OTUs) (van Dongen et al. 2013). Nestlings also appeared to harbor a greater diversity of bacterial taxa based on differences in the rarefaction curves. Another study in chinstrap penguins (Pygoscelis antarctica) similarly showed clear clustering by age, but in this case adults had greater inferred richness of bacteria compared to chicks (Barbosa et al. 2016). In two songbird species, the local nest environment was experimentally shown to influence the microbiome. Using a cross-fostering approach, great tit (Parus major) and blue tit (P. caeruleus) nestlings were swapped between nests; heterospecific nestlings reared in the same nest had more similar microbiome communities than conspecific nestlings reared in different nests, pointing to a strong effect of the environment on the assemblage of the microbiome (Lucas and Heeb 2005).

In considering environmental effects, geographical location is a key variable that may influence the microbiome. In birds, the effect of location was studied in a brood parasitic system using the brown-headed cowbird (Molothrus ater) (Hird et al. 2015). Cowbird eggs are laid in the nests of heterospecific host species and the cowbird hatchlings are subsequently reared by heterospecific foster adults. By sampling the gut microbiome from cowbird and host species populations in California and Louisiana, Hird et al. investigated whether an individual’s gut microbiome would reflect its species identity (e.g. same in cowbirds from California or Louisiana) or that of its foster host species (e.g. different in cowbirds reared by host species from California versus Louisiana). The sampling locality had the strongest association with the microbiome composition across all the samples characterized, corroborating a strong environmental effect on the gut microbiome. Another study across the Americas in mammalian species demonstrated a correlation between geographic distance and microbiome distances even after taking the phylogenetic relationships into account (Moeller et al. 2017). The geographic distance provided additional explanatory power beyond an interpretation based on phylogenetic distance alone. Notably, environmental drivers of microbial community composition are increasingly being found with the emergence of novel genomic and spatial analyses (Dudaniec and Tesson 2016).

Darwin’s finches provide an opportunity to compare the effect of locality and phylogeny on gut microbiome in a replicated natural laboratory in the Galápagos Archipelago. With an extensive body of research into the ecology and evolution of these species, each island within the Galápagos archipelago serves as an independent environment to characterize the microbiome in this extremely well-studied adaptive radiation (Grant 1999). Previous work on the island of Santa Cruz documented effects of habitat, foraging behavior, and finch phylogeny on the Darwin’s finch bacterial community (Loo et al. 2018a). Sampling from another island will allow us to test if the patterns hold across islands and test for island-specific effects by comparing species present on both islands.

Floreana Island is located in the south of the archipelago (total area: 173 km², 1°28’ S, 90°48’ W). Similar to the larger islands in the Galápagos, Floreana Island is characterized by multiple ecological zones ranging from the arid lowlands to the humid highlands. It harbors five extant species of Darwin’s finches: the small ground finch (Geospiza fuliginosa), the medium ground finch (G. fortis), the cactus finch (G. scandens), the small tree finch (Camarhynchus parvulus), and the endemic, critically endangered medium tree finch (Camarhynchus pauper) (Dvorak et al. 2017). Within the Galápagos system, Floreana Island has the longest history of human settlement and, likely for this reason, the highest record of local extinction of land birds within the archipelago. The avian species extinction list includes the large ground finch (G. magnirostris) and the sharp-beaked ground finch (G. difficilis) by ~1870 (Steadman 1986), the Floreana mockingbird (Mimus trifasciatus) by 1895 (Curry 1986), the warbler finch (Certhidea fusca) by 2004 (Grant et al. 2005), the large tree finch (C. psittacula), genetically confirmed absent by 2010 (Kleindorfer et al. 2014a), and most recently the vegetarian finch (Platyspiza crassirostris), which was not detected in surveys in 2015 (Dvorak et al. 2017). Currently, the impact of the invasive parasitic fly, Philornis downsi, which was first discovered in Darwin’s finch nests in 1997 (Fessl and Tebbich 2002), is considered especially detrimental to the persistence of critically endangered species (reviewed in Kleindorfer and Dudaniec 2016).

Floreana Island harbors the only population of the critically endangered medium tree finch (O’Connor et al. 2009), which is currently hybridizing with the small tree finch (Kleindorfer et al. 2014a; Peters et al. 2017). Since 2004, Kleindorfer’s group has studied the Floreana Island Darwin’s finch group with insights into nesting behavior (O'Connor et al. 2010; O’Connor et al. 2010; Kleindorfer et al. 2014b; O’Connor et al. 2014), song (Christensen and Kleindorfer 2007; Peters and Kleindorfer 2017), foraging behavior (Peters and Kleindorfer 2015), and genetic admixture (Peters et al. 2017). The hybridization event between the small tree finch and medium tree finch is of particular interest, and specific microbiome differences between the genetic clusters are the subject of a separate study (Loo et al. 2018b).


Here we characterize the gut microbiomes of five species of Darwin’s finch found on Floreana Island across both highland and lowland habitats with the aim to answer four questions: (1) What is the association between Darwin’s finch phylogeny and microbiome community? (2) Does age class affect microbiome in small ground finches sampled within the lowlands? (3) Do habitat and diet affect finch microbiome? and (4) How do microbiome patterns differ between Floreana Island and Santa Cruz Island? Since the species inhabit similar ecological niches on both Floreana Island and Santa Cruz Island, we expect to observe similar patterns in the effects of host phylogeny, habitat, and diet. With opportunistic sampling of nestlings, we also examine differences between life stages within the small ground finch. Given that four of our focal Darwin’s finch species occur on both of the islands we examine (Santa Cruz, Floreana), we are able to compare conspecific host microbiomes across islands. Controlling for the effect of species allows us to interrogate whether the island of origin affects the microbiome, in combination with comparisons of foraging patterns and stable isotope analysis as proxies for dietary differences between species and locations. This study leverages the natural, replicated distribution of isolated populations of conspecific Darwin’s finch species to illuminate factors that contribute to gut microbial community structure in birds.

Materials and Methods

Ethics statement

All samples were collected with permission from the Parque Nacional Galápagos and Ministerio del Ambiente, Ecuador (Research permit No. PC-23-16). All collection protocols were approved by the Institutional Animal Care and Use Committee in the Faculty of Arts and Sciences at Harvard University (Protocol 15-08-249). Samples for the medium tree finch (Camarhynchus pauper) were imported under U.S. Fish and Wildlife Service permit number MA05827C-0.

Study sites and species

Fieldwork was conducted during February 2016 on Floreana Island, Galápagos Archipelago, Ecuador. Sampling sites were located in both highland (1°17’S, 90°27’W) and lowland (1°16’S, 90°29’W) habitats. Five Darwin’s finch species on Floreana Island were sampled, and the numbers of samples per species and per habitat are detailed in Table 4.1. Samples from two of the ground finch species were primarily collected in the lowlands (L) but also included at least one sample in the highlands (H): the medium ground finch (Geospiza fortis) (8L, 2H) and the cactus finch (Geospiza scandens) (6L, 1H). The small ground finch included multiple samples from both habitats (15L, 13H), of which three lowland samples were from a set of nestling siblings. All tree finch samples were collected exclusively in the highlands and were previously assigned to genetic populations after microsatellite genotyping (Loo et al. 2018b). After genetic population assignment, there were 4 samples from the medium tree finch (Camarhynchus pauper), 14 samples from the small tree finch (Camarhynchus parvulus), and 11 samples from the admixed population.


Table 4.1. Darwin’s finch species and sample number from highland and lowland habitats on Floreana Island, Galápagos Archipelago.

Common Name          Abb.    Scientific Name          Highland   Lowland   Total per
                                                      Samples    Samples   species
Small Ground finch   SGF     Geospiza fuliginosa      13         15*       28
Medium Ground finch  MGF     Geospiza fortis          2          8         10
Cactus finch         CF      Geospiza scandens        1          6         7
Small Tree finch     STF**   Camarhynchus parvulus    14         -         14
Hybrid Tree finch    HTF**   N/A                      11         -         11
Medium Tree finch    MTF**   Camarhynchus pauper      4          -         4
Total                                                 44         30        74

* 3 samples were from small ground finch nestlings

** All tree finches were previously genotyped using microsatellite loci and assigned to genetic clusters (Loo et al. 2018b)

Sample Collection

Finches were caught using mist nets and tagged with an aluminum ring imprinted with a unique identifier prior to all sample collections. Eight morphological traits and mass were measured as previously described (Kleindorfer et al. 2014a) and are detailed in Supplementary Methods (see Morphological measurements). Ground finch individuals were classified into species using the morphological measurements and established protocols (Lack 1947; Grant 1999; Kleindorfer et al. 2014a), while the tree finches required microsatellite genotyping for accurate classification (Kleindorfer et al. 2014a; Peters et al. 2017; Loo et al. 2018a).

Blood samples were collected for genetic and stable isotope analyses. Samples for genetic analysis were preserved on Whatman FTA Paper (GE Healthcare Life Sciences, Pittsburgh, PA) and stored at room temperature. Samples for stable isotope analysis were dried on small pieces (roughly 0.5 × 0.5 cm) of quartz fiber filter paper (Schleicher & Schuell, Dassel, Germany) and stored in microcentrifuge tubes with a silica gel bead as desiccant at room temperature.

After morphological measurements and blood sample collection, fecal samples were collected by placing each finch into a 7" x 7" x 7" cage lined with UV-sterilized parchment paper. Cages were covered with fabric and finches were monitored until defecation for a maximum of 30 min before release. Feces were immediately transferred from parchment paper with bleach-cleaned spatulas into pre-weighed microcentrifuge tubes containing 1 ml of DNA/RNA Shield (Zymo Research, Irvine, CA) and mixed by shaking the tubes by hand before storage at -20°C within 4 h of collection to prolong the longevity of the DNA stabilization buffer. Fecal samples were shipped at room temperature and stored at -80°C in the lab until further analysis.

Small ground finch nestling fecal samples were collected directly off the bird bag on which the individuals were placed for weighing. To minimize contamination from the bird bag, only the top layer of the fecal sample was taken, taking care to avoid collecting material in contact with the bird bag.

Foraging observations

To quantify the diet patterns across species in both habitats, first-foraging observations were collected at both highland and lowland sampling sites (Kleindorfer and Chapman 2006). At each site, a single one-hour walkthrough was conducted with no overlaps or doubling back, to avoid observing the same individuals. During the walkthrough, individual finches were observed until the first food item was ingested. The food item consumed was recorded as one of five categories: insect, seed, flower, leaf, or fruit. Due to the tame nature of Darwin’s finches, the majority of observations were made within 8 m of the focal individual.
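The sketch below shows one way such first-food-item records could be summarized as the proportion of each food category per species. The function name, the record layout, and the example records are hypothetical and are not data from this study; only the five food categories come from the protocol described above.

```python
from collections import Counter, defaultdict

# The five food categories recorded during the walkthroughs.
FOOD_CATEGORIES = ("insect", "seed", "flower", "leaf", "fruit")


def diet_proportions(observations):
    """Summarize (species, food_item) records as proportions per species.

    Each record represents the first food item ingested by one focal bird.
    """
    counts = defaultdict(Counter)
    for species, item in observations:
        if item not in FOOD_CATEGORIES:
            raise ValueError(f"unknown food category: {item}")
        counts[species][item] += 1

    proportions = {}
    for species, tally in counts.items():
        total = sum(tally.values())
        # Counter returns 0 for categories that were never observed.
        proportions[species] = {cat: tally[cat] / total for cat in FOOD_CATEGORIES}
    return proportions


# Hypothetical records for illustration only.
example = [("MTF", "insect")] * 8 + [("MTF", "seed")] + [("STF", "insect")] * 5
summary = diet_proportions(example)
for species, props in summary.items():
    print(species, {cat: round(p, 2) for cat, p in props.items() if p > 0})
```

Because each focal bird contributes a single record, the proportions describe the share of observed individuals per food category rather than the amount of food consumed.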

Fecal DNA Extraction and 16S rRNA gene sequencing

DNA was extracted from feces in the laboratory using the ZR Fecal Miniprep kit (Zymo Research, Irvine, CA) following the manufacturer's instructions with the following changes. To minimize loss of biological material, BashingBeads were added directly to the collection tubes with the fecal sample in DNA/RNA Shield, which acted as the lysis buffer. Samples were homogenized in a FastPrep FP120 (Qbiogene, Carlsbad, CA) for six rounds of 45 s at a speed of 6.5 m/s. Between each round, tubes were cooled on ice for 3 min. All liquid transfer steps were performed in a laminar flow hood to minimize environmental contamination.

The V4 region of the 16S rRNA gene was amplified using NEBNext Q5 HotStart HiFi MasterMix 2x (New England Biolabs, Ipswich, MA) and previously designed dual-index barcoded universal primers (Kozich et al. 2013). For each fecal DNA sample, triplicate 25 µl PCR reactions were performed containing 12.5 µl master mix, 9.5 µl molecular grade water, 0.5 µl of 10 µM stock for each primer, and 2 µl of DNA template. PCR conditions consisted of initial denaturation at 94°C for 5 min followed by 20 cycles of 98°C for 20 s, 55°C for 15 s, 72°C for 40 s, and a final extension at 72°C for 5 min.
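The reaction recipe can be checked with a few lines of arithmetic. The sketch below sums the listed component volumes and applies the dilution relation C1·V1 = C2·V2 to estimate the final primer concentration. The primer stock concentration of 10 µM is an assumption about the intended unit, and the function names are hypothetical.

```python
# Per-reaction component volumes in microliters, as listed in the text.
REACTION_UL = {
    "master_mix_2x": 12.5,
    "water": 9.5,
    "forward_primer": 0.5,
    "reverse_primer": 0.5,
    "dna_template": 2.0,
}

# Assumption: each primer stock is 10 micromolar.
PRIMER_STOCK_UM = 10.0


def reaction_total_ul(recipe):
    """Total reaction volume in microliters."""
    return sum(recipe.values())


def final_primer_conc_um(recipe, stock_um=PRIMER_STOCK_UM):
    """Final concentration of one primer, from C1 * V1 = C2 * V2."""
    return stock_um * recipe["forward_primer"] / reaction_total_ul(recipe)


def template_per_sample_ul(recipe, replicates=3):
    """DNA template consumed per sample across replicate reactions."""
    return recipe["dna_template"] * replicates


print(reaction_total_ul(REACTION_UL))       # total volume per reaction
print(final_primer_conc_um(REACTION_UL))    # final primer concentration (uM)
print(template_per_sample_ul(REACTION_UL))  # template used per sample (ul)
```

Under the stated assumption, the components sum to the 25 µl reaction volume and each primer ends up at 0.2 µM.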

All PCR products were purified using 0.66X Aline PCRClean DX (Aline Biosciences, Woburn, MA) to size select for the ~450 bp PCR product. Purified PCR products were visualized and quantified using High Sensitivity D1000 ScreenTape on an Agilent 2200 TapeStation (Agilent, Santa Clara, CA) and pooled in equimolar concentrations for sequencing on a single MiSeq run (Illumina, USA) using v2 chemistry and 2 x 250-bp paired-end reads at the Harvard Biopolymers Facility (Boston, MA).
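Equimolar pooling amounts to converting each library's mass concentration to molarity and then taking a volume that contains the same number of molecules from every sample. The sketch below is a minimal illustration under stated assumptions: an average mass of roughly 660 g/mol per base pair of double-stranded DNA, a 450 bp amplicon, and an arbitrary target of 10 fmol per sample. The function names and example concentrations are hypothetical and do not describe the facility's actual procedure.

```python
# Approximate molar mass of one base pair of double-stranded DNA (g/mol).
AVG_BP_MASS = 660.0


def ng_per_ul_to_nm(conc_ng_per_ul, length_bp):
    """Convert a dsDNA concentration from ng/ul to nanomolar."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def pooling_volumes(concs_ng_per_ul, length_bp=450, fmol_per_sample=10.0):
    """Volume (ul) of each library needed to contribute the same molar amount.

    1 nM is equivalent to 1 fmol/ul, so volume = fmol wanted / nM.
    """
    volumes = {}
    for sample, conc in concs_ng_per_ul.items():
        nm = ng_per_ul_to_nm(conc, length_bp)
        volumes[sample] = fmol_per_sample / nm
    return volumes


# Hypothetical library concentrations (ng/ul) for illustration only.
example_concs = {"finch_01": 2.0, "finch_02": 4.0, "finch_03": 0.8}
for sample, vol in pooling_volumes(example_concs).items():
    print(f"{sample}: {vol:.2f} ul")
```

More dilute libraries receive proportionally larger volumes, so each sample contributes a comparable number of amplicon molecules to the sequencing run.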


Contamination controls

Given the low DNA content of bird feces (Vo and Jedlicka 2014), we were concerned about the influence of environmental microbial contamination in analyzing the sequences (Salter et al. 2014). To understand the sources of contamination, controls were included at the DNA extraction and amplification steps of the sequencing preparation. To evaluate contaminants from the DNA extraction kits, for each kit we included a mock community extraction with 75 µl of bacterial cells from ZymoBIOMICS Microbial Community Standard (Zymo Research, Irvine, CA) and a no-sample extraction control with only DNA/RNA Shield. To assess contaminants from PCR amplification reagents, for each 96-well plate of PCR amplification we included a triplicate mock community amplification with 2 µl of a 1:10 dilution of ZymoBIOMICS Microbial Community DNA Standard (Zymo Research, Irvine, CA) and a triplicate no-template control reaction. Greater than 99.75% of all reads from ZymoBIOMICS Microbial Community standards and DNA standards mapped to the expected genera. None of the extraction or no-template controls produced quantifiable PCR product, and these controls were excluded from further analysis.

Sequence processing

Sequences were demultiplexed according to the dual-index barcode by the Harvard Biopolymers Facility (Boston, MA), and all subsequent sequence processing steps were performed in R version 3.4.0 (R Core Team 2014). The fastq files for each sample were converted into Ribosomal Sequence Variants (RSVs) using DADA2 with parameters as described in Callahan et al. (2016). RSVs were taxonomically classified with the RDP v14 training set (Cole et al. 2009) and chimeras were removed as implemented in DADA2. After initial processing, a total of 3,709,205 reads and 6,015 RSVs were identified across all 74 finch fecal samples.

Sequence filtering

The following steps were taken to produce the final dataset for analysis; percentages are relative to the reads and RSVs remaining before each step. As it is impossible to prevent all environmental contamination in PCR amplification, the frequency-based decontam algorithm (Davis et al. 2017) was applied to the dataset to identify reads from likely contaminants based on the concentration of the PCR products. This removed 22,612 reads (0.61%) and 100 RSVs (1.66%). To reduce the influence of RSVs present in only a few samples, a 5% prevalence filter was applied, which removed 96,837 reads (2.63%) and 2,379 RSVs (40.22%). After taxonomic assignment, any sequences not classified as Bacteria were removed, subtracting 2,334 reads (0.07%) and 24 RSVs (0.68%). Finally, sequences classified as Chloroplasts were removed, subtracting 115,724 reads (3.23%) and 36 RSVs (1.03%). The final dataset included 3,471,698 reads and 3,476 RSVs.
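The bookkeeping for these filtering steps can be reproduced with a short tally. The sketch below uses only the counts reported in this section; the step labels and variable names are illustrative, and each percentage is computed against what remained before that step.

```python
# Read and RSV counts before filtering, as reported in the text.
reads, rsvs = 3_709_205, 6_015

# (step, reads removed, RSVs removed), in the order the filters were applied.
steps = [
    ("decontam (frequency)", 22_612, 100),
    ("5% prevalence filter", 96_837, 2_379),
    ("non-Bacteria", 2_334, 24),
    ("chloroplasts", 115_724, 36),
]

report = []
for name, reads_removed, rsvs_removed in steps:
    # Percentages are relative to what remained before this step.
    pct_reads = 100 * reads_removed / reads
    pct_rsvs = 100 * rsvs_removed / rsvs
    reads -= reads_removed
    rsvs -= rsvs_removed
    report.append((name, round(pct_reads, 2), round(pct_rsvs, 2), reads, rsvs))

for row in report:
    print(row)
```

Running the tally ends at 3,471,698 reads and 3,476 RSVs, matching the final dataset size.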

Rarefying reads

To ensure sample library sizes were not driving the patterns observed in the data, the following categorical variables were checked for significant differences in mean library size (using the Kruskal-Wallis rank sum test) and library size distribution (using Levene’s test as implemented in the R package car (Fox and Weisberg 2011)): species, habitat, and sex. None of the variables were significantly different in mean library size or library size distribution after Bonferroni correction for multiple comparisons (Table S23). Therefore, for increased statistical power in detecting differences between microbiome samples, all following analyses were performed with non-rarefied microbiome data (McMurdie and Holmes 2014).
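The Bonferroni correction mentioned above multiplies each raw p-value by the number of comparisons and caps the result at 1. A minimal sketch follows; the raw p-values are hypothetical placeholders and are not taken from Table S23.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values for the three variables (NOT values from Table S23).
raw = {"species": 0.04, "habitat": 0.20, "sex": 0.60}

adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
significant = [name for name, p in adjusted.items() if p < 0.05]

print(adjusted, significant)
```

In this toy case the nominally significant value (0.04) no longer passes the 0.05 threshold once it is adjusted for three comparisons.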


16S rRNA sequence analysis of Darwin’s finch microbiomes

Alpha diversity analyses

To calculate the relative abundance of bacterial phyla present in the gut microbiome of each Darwin’s finch species, reads were transformed to proportions by sample and then averaged across all microbiome samples per finch species.
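The two-step calculation (proportions within each sample, then a mean across samples of a species) can be sketched as follows. The sample names and read counts are hypothetical and serve only to show the order of operations.

```python
from collections import defaultdict

# Hypothetical read counts per phylum (not data from this study).
samples = {
    "s1": {"species": "SGF", "counts": {"Firmicutes": 60, "Actinobacteria": 30, "Proteobacteria": 10}},
    "s2": {"species": "SGF", "counts": {"Firmicutes": 40, "Actinobacteria": 120, "Proteobacteria": 40}},
    "s3": {"species": "MTF", "counts": {"Firmicutes": 90, "Actinobacteria": 5, "Proteobacteria": 5}},
}

def to_proportions(counts):
    """Convert raw read counts for one sample into proportions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}

def mean_by_species(samples):
    """Average per-sample proportions across all samples of each species."""
    grouped = defaultdict(list)
    for sample in samples.values():
        grouped[sample["species"]].append(to_proportions(sample["counts"]))
    means = {}
    for species, props in grouped.items():
        taxa = {taxon for p in props for taxon in p}
        means[species] = {
            taxon: sum(p.get(taxon, 0.0) for p in props) / len(props) for taxon in taxa
        }
    return means

species_means = mean_by_species(samples)
print(species_means["SGF"]["Firmicutes"])
```

Averaging proportions gives every sample equal weight regardless of library size. Pooling the raw reads of the two hypothetical SGF samples would instead give 100/300 for Firmicutes, because the larger library would dominate.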

Beta diversity analyses

To visualize differences between microbiome samples, double principal coordinate analysis (DPCoA) was applied to the log-transformed RSV table as implemented in the R package phyloseq (McMurdie and Holmes 2013). DPCoA is an ordination method that incorporates both quantitative and phylogenetic information about the microbiome samples (Pavoine et al. 2004). To assess the differences in community composition of the gut microbiomes between samples, weighted UniFrac distances (Lozupone et al. 2010) were calculated between all samples. All abundance data were log transformed prior to distance calculations as an approximate variance stabilization method. To check the homogeneity of the multivariate dispersions of the distance metrics, the betadisper function was used as implemented in the R package vegan (Oksanen et al. 2017). To test the significance of categorical variables, PERMANOVA was used as implemented with the function adonis in the R package vegan (Oksanen et al. 2017).
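The logic of a PERMANOVA test can be illustrated with a small one-way sketch. This is not the vegan adonis implementation: it is a simplified stand-in for a single grouping factor, and the function name, toy distances, and group labels are all assumptions made for illustration.

```python
import random
from itertools import combinations

def permanova(dist, groups, n_perm=999, seed=0):
    """One-way PERMANOVA on a square distance matrix.

    Returns (pseudo_F, R2, p_value). Sums of squares are computed directly
    from pairwise distances, and the p-value comes from label permutation.
    """
    n = len(groups)

    def stats(labels):
        ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
        ss_within = 0.0
        for g in set(labels):
            idx = [i for i in range(n) if labels[i] == g]
            ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
        ss_among = ss_total - ss_within
        a = len(set(labels))
        pseudo_f = (ss_among / (a - 1)) / (ss_within / (n - a))
        return pseudo_f, ss_among / ss_total

    f_obs, r2 = stats(groups)
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stats(shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)

# Toy example: one-dimensional "communities" with a clear habitat split.
points = [0.0, 0.1, 0.2, 0.3, 1.0, 1.1, 1.2, 1.3]
dist = [[abs(a - b) for b in points] for a in points]
groups = ["highland"] * 4 + ["lowland"] * 4

pseudo_f, r2, p_value = permanova(dist, groups)
print(round(pseudo_f, 1), round(r2, 3), p_value)
```

The R2 returned here is the share of the total sum of squared distances attributable to the grouping, which is the quantity reported alongside PERMANOVA p-values.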


Comparative metadata and analyses

Stable isotope analysis

To assess the differences in diet between the finches sampled, stable isotope analyses were performed using blood samples dried on quartz fiber filter paper. These were packaged in 5 x 9 mm tin capsules for analysis (041077, Costech Analytical Technologies, Inc, Valencia, CA). δ13C and δ15N values were measured on a Thermo Scientific Delta V paired with a Costech 4010 elemental analyzer and a high-temperature conversion elemental analyzer at the Center for Stable Isotopes at the University of New Mexico (Albuquerque, NM). A known protein standard was run at multiple concentrations as a run-to-run control. δ13C values were adjusted by the mean difference between the measured values for the protein standard and the known value (-1.18‰). δ15N values for samples below 1000 mV were error corrected using a linear regression on the protein standard (R2 = 0.39).

Foraging data

For summarizing foraging data, the food items seed, flower, leaf, and fruit were combined into the category plant. The proportion of plant and insect food items therefore sum to 1. These observations provide knowledge of the broad diet patterns for each finch species in both habitats. Since few observations were made of cactus finches and medium ground finches in the highlands (zero and one, respectively), samples from these species in the highlands were excluded from any analysis that used foraging data.
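Collapsing the food items into plant and insect proportions is simple arithmetic. The sketch below uses the medium tree finch highland counts from Table 4.2 (2 seed and 17 insect observations); the function and variable names are illustrative.

```python
PLANT_ITEMS = ("seed", "flower", "leaf", "fruit")

def summarize(counts):
    """Collapse first-foraging counts into plant and insect proportions."""
    total = sum(counts.values())
    plant = sum(counts.get(item, 0) for item in PLANT_ITEMS)
    return {"plant": plant / total, "insect": counts.get("insect", 0) / total}

# Medium tree finch, highlands (Table 4.2): 2 seed and 17 insect observations.
mtf_highland = {"seed": 2, "insect": 17}
summary = summarize(mtf_highland)

print(round(100 * summary["plant"]), round(100 * summary["insect"]))
```

The rounded percentages reproduce the 11% plant and 89% insect summary given for this species in Table 4.2.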


Testing co-diversification of the microbiome and the finch phylogeny

To assess congruence between the phylogenetic diversification of Darwin’s finches and their gut microbiomes, the Procrustean approach to co-phylogeny (PACo) (Balbuena et al. 2013) was applied to the data. PACo was designed to detect the similarity of evolutionary patterns in host-parasite associations. Here, the microbiome samples are treated as the ‘parasites’ to compare with the host species genetic distances. Darwin’s finch species’ genetic distances for the PACo analysis were based on whole genome resequencing encompassing more than 44 million variable sites with representative individuals chosen from Santa Cruz when available (Lamichhaney, personal communication; Lamichhaney et al. 2015). Microbiome distances were calculated using the weighted UniFrac metric (Lozupone et al. 2010) to produce a quantitative distance comparison that incorporated the phylogeny of the microbial community. PACo analysis was run as implemented in the R package paco (Hutchinson et al. 2017), with 10,000 permutations to test the significance of the signal. Using the symmetric calculation, the squared correlation coefficient was calculated as R2 = 1 - ss, where ss is the Procrustes residual sum of squares. Because three samples lacked stable isotope data and one sample lacked foraging data, a total of four samples were excluded from PACo analysis with the finch phylogeny.
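In a symmetric Procrustes analysis the residual sum of squares ss lies between 0 and 1 and is related to the Procrustes correlation r by r² = 1 − ss. The conversion can be sketched as below; the function names and the example ss value are hypothetical, and nothing here calls the paco package.

```python
import math

def procrustes_r2(ss):
    """Squared correlation from a symmetric Procrustes sum of squares."""
    if not 0.0 <= ss <= 1.0:
        raise ValueError("symmetric Procrustes ss must lie in [0, 1]")
    return 1.0 - ss

def procrustes_r(ss):
    """Correlation-like statistic r = sqrt(1 - ss)."""
    return math.sqrt(procrustes_r2(ss))

# Hypothetical residual sum of squares, chosen only for illustration.
ss_example = 0.89
print(round(procrustes_r2(ss_example), 2), round(procrustes_r(ss_example), 3))
```

A smaller ss therefore means a closer fit between the two ordinations, and the permutation test asks how often shuffled host-microbiome links produce an ss at least as small as the observed one.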

Variation partitioning

To compare the amount of variation explained by host genetic distance, stable isotope values, and diet distance as calculated from first foraging observations, variation partitioning by redundancy analysis (Legendre 2008) was used as implemented with the varpart function in the R package vegan (Oksanen et al. 2017). The microbiome distance matrix was used as a response variable with three explanatory tables: the first two principal coordinate axes of the host genetic distance, the δ13C and δ15N stable isotope values, and the first two principal component axes of the first foraging observations. Significance of the distance based redundancy analysis was assessed using the anova.cca function implemented in vegan.
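Variation partitioning works on adjusted R2 values: the fraction explained uniquely by one table is the adjusted R2 of the full model minus the adjusted R2 of the model fitted without that table. The sketch below shows this subtraction for three tables. It does not call vegan, the function names are illustrative, and all adjusted R2 inputs are hypothetical.

```python
def adjusted_r2(r2, n, m):
    """Ezekiel's adjusted R2 for n samples and m explanatory variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)

def unique_fractions(adj):
    """Unique fraction for each table: full model minus the model without it.

    `adj` maps a frozenset of table names to the adjusted R2 of the model
    fitted on those tables jointly.
    """
    tables = set().union(*adj.keys())
    full = adj[frozenset(tables)]
    return {t: full - adj[frozenset(tables - {t})] for t in tables}

# Hypothetical adjusted R2 values (NOT results from this study).
adj = {
    frozenset({"phylogeny", "isotopes", "foraging"}): 0.20,
    frozenset({"isotopes", "foraging"}): 0.15,
    frozenset({"phylogeny", "isotopes"}): 0.12,
    frozenset({"phylogeny", "foraging"}): 0.19,
}

unique = unique_fractions(adj)
print({name: round(value, 2) for name, value in unique.items()})
print(round(adjusted_r2(0.01, 70, 2), 4))
```

The last line shows that the adjustment can push a weakly explanatory table below zero, which is why negative adjusted R2 values can appear in variation partitioning output.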

Beta diversity through time (BDTT) analysis

To further disentangle the contribution of host phylogeny and diet to the microbiome composition, the beta diversity through time (BDTT) metric (Groussin et al. 2017) was applied to the dataset as described previously (Loo et al. 2018a).

Random forest analysis

To determine whether the gut microbiome communities could differentiate between categorical variables of interest, a random forest classifier (Breiman 2001) was applied to the RSV table as implemented in the R package randomForest (Liaw et al. 2002), using leave-one-out cross validation. RSVs with the top importance for classification were determined by calculating the mean decrease in accuracy across all models. Accuracy was defined by the number of samples correctly classified based on the category of interest, which was calculated along with confusion matrices in the R package caret (Kuhn et al. 2017).
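The leave-one-out procedure and the accuracy and confusion matrix it yields can be sketched independently of the classifier. The example below swaps in a nearest-centroid classifier as a stand-in (it is not a random forest), and the two-feature abundance values and labels are hypothetical.

```python
from collections import Counter

def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT a random forest): label of the closest class centroid."""
    centroids = {}
    for label in set(train_y):
        rows = [row for row, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(sorted(centroids), key=lambda label: sqdist(centroids[label], x))

def leave_one_out(xs, ys, predict):
    """Hold out each sample once; return accuracy and a Counter keyed by (true, predicted)."""
    confusion = Counter()
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        confusion[(ys[i], predict(train_x, train_y, xs[i]))] += 1
    correct = sum(n for (true, pred), n in confusion.items() if true == pred)
    return correct / len(xs), confusion

# Hypothetical log abundances of two RSVs in six samples.
xs = [[5.0, 1.0], [4.5, 1.5], [5.5, 0.5], [1.0, 4.0], [1.5, 5.0], [0.5, 4.5]]
ys = ["highland"] * 3 + ["lowland"] * 3

accuracy, confusion = leave_one_out(xs, ys, nearest_centroid_predict)
print(accuracy, dict(confusion))
```

Each sample is predicted by a model that never saw it, so the resulting accuracy is not inflated by training on the held-out sample.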

Inter-island comparison with Santa Cruz data

To investigate whether Darwin’s finch microbiome communities are affected by the island of origin, the samples sequenced in this study were compared with the microbiome samples from the island of Santa Cruz for the four species present on both islands: the small ground finch, medium ground finch, cactus finch, and small tree finch (Loo et al. 2018a). To more accurately characterize the microbiomes across islands, only species and habitat combinations with at least three samples were included and a summary table of the combined dataset can be found in Table S24. Tests of difference in beta diversity by island were performed using PERMANOVA as described above (Beta diversity analyses).

To evaluate whether foraging patterns differed between islands, Euclidean distances were calculated between the weighted average foraging pattern for each habitat on both islands. The weighted average foraging pattern was calculated by weighting each food category by the number of samples of each species included in the microbiome comparison. The broad categorization ‘plant’ was calculated by combining seed, leaf, fruit, and flower.
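The weighted average foraging pattern and the Euclidean distance between two such patterns can be sketched as follows. The plant and insect proportions are the Floreana summary values from Table 4.2, but the per-species sample counts used as weights are hypothetical, so the resulting distance is illustrative and is not a value from Table S36.

```python
import math

def weighted_average_pattern(patterns, weights):
    """Average per-species foraging proportions, weighted by microbiome sample counts."""
    total = sum(weights.values())
    categories = next(iter(patterns.values())).keys()
    return {
        c: sum(patterns[s][c] * weights[s] for s in patterns) / total for c in categories
    }

def euclidean(p, q):
    """Euclidean distance between two foraging patterns with the same categories."""
    return math.sqrt(sum((p[c] - q[c]) ** 2 for c in p))

# Proportions from Table 4.2; the weights (sample counts) are hypothetical.
highland = weighted_average_pattern(
    {"SGF": {"plant": 0.55, "insect": 0.45}, "STF": {"plant": 0.0, "insect": 1.0}},
    {"SGF": 10, "STF": 14},
)
lowland = weighted_average_pattern(
    {
        "SGF": {"plant": 0.83, "insect": 0.17},
        "MGF": {"plant": 0.80, "insect": 0.20},
        "CF": {"plant": 1.0, "insect": 0.0},
    },
    {"SGF": 15, "MGF": 8, "CF": 6},
)

print(round(euclidean(highland, lowland), 3))
```

Weighting by the number of microbiome samples means the habitat-level pattern reflects the species that actually contribute to the microbiome comparison.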

Results

Using next-generation sequencing, a total of 3,471,698 sequences were generated across all finch microbiome samples (mean = 31,272; range = 1,792 to 54,585; one outlier with 1,094,988) across a total of 3,476 ribosomal sequence variants (RSVs) (mean = 597, range = 43 to 1,396). Sequence numbers were not significantly different across variables of interest and all following analyses are based on the non-rarefied data to increase statistical power (Table S23; see Methods – Rarefying reads).

Darwin’s finch microbiome alpha diversity analyses

A total of nineteen bacterial phyla were detected in the gut microbiome in Darwin’s finches, though only three were present at a relative abundance of >1% across all samples. Using DADA2, all sequences were assigned to ribosomal sequence variants (RSVs) and taxonomically classified using the RDP v14 database (Cole et al. 2009). Across all samples, the bacterial phyla Firmicutes, Actinobacteria, and Proteobacteria composed the majority of the sequences, representing 51%, 27%, and 19% of sequences, respectively. Bacterial taxa unclassified at the phylum level made up 3% of the sequences while all other bacterial phyla detected were represented by less than 1% of sequences across all samples (Table S25). Comparing across host species, the medium tree finch and the cactus finch had the highest proportion of Firmicutes (89% and 69%, respectively), while the medium ground finch had the highest proportion of Actinobacteria (36%) (Figure 4.1A; Table S26). Proteobacteria were most abundant in the small tree finch and the hybrid tree finch genetic cluster at 25% and 27%, respectively.

At lower taxonomic levels, the genus Lactobacillus in the phylum Firmicutes was the most abundant bacterial genus across all samples (44%) and within each species (Table S27; Table S28). For the medium tree finch, Lactobacillus dominated the microbiome, with at least 82% of reads within each individual sample classified to this genus. It was also the most abundant bacterial genus in the other Darwin’s finch species on Floreana Island, ranging from a mean relative abundance of 30% in the hybrid tree finch to 48% in the cactus finch. The bacterial genus Kocuria in phylum Actinobacteria was the second most abundant bacterial genus in the medium ground finch, small tree finch, and medium tree finch at 10%, 12% and 4%, respectively, and was the third most abundant bacterial genus across all samples at 5% mean relative abundance (Table S29). The remaining three finch species had different bacterial genera as the second most abundant genus: Acinetobacter at 5% in the small ground finch, Enterococcus at 13% in the cactus finch, and Helicobacter at 7% in the hybrid tree finch (Figure S15).

Figure 4.1 Relative abundance of bacterial phyla in the gut microbiota of Darwin’s finch species on Floreana Island.

A) Phylogeny of Darwin’s finch species on Floreana Island based on whole-genome resequencing (Lamichhaney et al. 2015) and the mean relative abundance of bacterial phyla across all gut microbiome samples of each species. Species abbreviations are given in Table 4.1.

B) Relative abundance of the bacterial phyla in individual microbiome samples grouped according to species and habitat. The three nestlings are bracketed within the small ground finch lowland. Any bacterial phylum with mean relative abundance within a given finch species below 1% was omitted from both plots.

The small ground finch provided an opportunity to compare the effect of habitat and age on the composition of the microbiome with multiple samples from the highlands and lowlands and samples from three nestlings. Within samples from the small ground finch, highland samples were characterized by a higher proportion of Firmicutes (70% vs 40%), which were primarily assigned to the genus Lactobacillus (65% vs 38% of all reads) (Table S30). In contrast, lowland samples from adult small ground finches had a higher proportion of Actinobacteria (40% vs 14%), which were assigned to many more genera at lower relative abundances (all < 4%; Table S27). In comparison with adults from the lowland habitat, nestling samples were significantly enriched in bacterial taxa that were unclassified at the phylum level (42% vs 1% in adults; Table S31; Figure 4.1B).

Beta diversity analyses

To visualize differences in bacterial community composition between microbiome samples, double principal coordinate analysis (DPCoA) was applied to the data. DPCoA is a multivariate ordination method that takes the phylogeny of the bacteria into account when calculating pairwise distances. Plotting samples by habitat and species revealed a separation between highland and lowland samples but no clear clustering by species (Figure 4.2A). The small ground finch was the only species with multiple samples in both habitats and also showed this separation (Figure S16). By plotting the RSVs in the same ordination space, the bacterial phyla Proteobacteria and Actinobacteria were revealed as enriched in the highland and lowland samples, respectively (Figure 4.2B). DPCoA did not show visual differences between the small ground finch nestling and adult samples (Figure S17).

Figure 4.2 Double principal coordinate analysis (DPCoA) of Darwin’s finch gut microbiome communities.

A) Gut microbiome samples from Floreana are plotted on the first two principal coordinate axes, with point color and point shape indicating host species and habitat, respectively. Overall a separation between highland and lowland samples can be seen along the first axis, with highland samples and lowland samples mostly to the left and right, respectively.

B) Individual ribosomal sequence variants (RSVs) are plotted in the same ordination space as the gut microbiome samples. Bacterial phyla with at least 1% relative abundance across samples are color-coded; all other RSVs are gray. The demarcation between highland and lowland samples is recapitulated by the RSVs, with Proteobacteria and Actinobacteria corresponding to highland (left) and lowland (right) samples, respectively.

To test whether the differences in beta diversity were statistically significant, the weighted UniFrac distances were tested using PERMANOVA with the categorical variables habitat, species, and sex. Habitat and species were tested in a two-way PERMANOVA to account for the presence of the small ground finch in both habitats. Age was tested within small ground finch samples collected in the lowland. Only habitat showed a significant difference between microbiome communities (R2 = 0.10, p = 0.001; Table S32).

Stable isotope values and foraging data

To estimate the dietary differences between individuals sampled for the microbiome, δ13C and δ15N stable isotope ratios were analyzed. In general, the samples separated more by habitat than by species, with highland samples generally lighter in both 13C and 15N (Figure 4.3; Table S33). The tree finch samples in the highland had δ13C values from -28‰ to -22.8‰ and δ15N values from 5.0‰ to 10.0‰. In contrast, the lowland samples across the ground finch species (SGF, MGF, CF) had a wider range in both elements, with δ13C from -26.8‰ to -16.3‰ and δ15N from 5.7‰ to 14.9‰. Surprisingly, small ground finch samples in the highland had the heaviest δ13C values, around -15‰.

Figure 4.3 δ13C and δ15N stable isotope measurements for Darwin’s finch species.

Point color and shape indicate host species and habitat, respectively. A) Individual δ13C and δ15N values for each finch with gut microbiome samples. B) Mean δ13C and δ15N values for each species and habitat with standard deviation. The ground finch species (SGF, MGF, and CF) all had at least one sample in both habitats and have distinct δ13C and δ15N values dependent on the habitat of origin.

To assess the microbiome samples and stable isotope values in the context of food items, foraging observations were made in both habitats (Table 4.2). Classifying the observations for each species gives a sense of the broad dietary patterns. Since species classification was made visually, observations for the small tree finch and the hybrid cluster were combined. The tree finch species (STF/HTF and MTF) primarily consumed insects, though the medium tree finch also consumed 11% seed. The cactus finch exclusively foraged on plant material, though this was spread between flower, leaf, and seed items. The medium and small ground finches consumed both plant material and insects, with an increasing proportion of the latter in the highlands.

Table 4.2. First foraging observations across all Darwin’s finch species on Floreana Island

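Columns: Species; Hab. (habitat: L = lowland, H = highland); Total (n); Counts (% of counts) for Flower, Fruit, Leaf, Seed, and Insect; Summary % for Plant* and Insect. Counts and summary values are listed in column order with empty cells omitted.

SGF, L, n = 60: counts 22 (36), 4 (7), 24 (40), 10 (17); summary 83, 17
SGF, H, n = 49: counts 1 (2), 26 (53), 22 (45); summary 55, 45
MGF, L, n = 30: counts 1 (3), 11 (37), 12 (40), 6 (20); summary 80, 20
MGF, H, n = 1: counts 1 (100); summary 100
CF, L, n = 20: counts 12 (60), 7 (35), 1 (5); summary 100
STF/HTF, H, n = 23: counts 23 (100); summary 100
MTF, H, n = 19: counts 2 (11), 17 (89); summary 11, 89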

* The category ‘Plant’ is the sum of all plant derived food items (flower, fruit, leaf, and seed).


Testing co-diversification with Floreana species

PACo analysis showed significant correlation between the host phylogeny and the microbiome (R2 = 0.11, p = 0.002). PACo analysis was also applied to the stable isotope and foraging data. Foraging data had a similar Procrustean correlation coefficient (R2 = 0.12, p < 0.001) while the correlation with stable isotope values was not significant (R2 = 0.03, p = 0.25).

Variation partitioning was used to compare the amount of variance in the microbiome explained by the finch phylogeny, stable isotope values, and foraging data. Total explained variance with all three explanatory tables had an adjusted R2 = 0.15 (Figure 4.4). The finch phylogeny and foraging data had comparable correlation with the microbiome samples (adjusted R2 = 0.06, p = 0.002 for both), while stable isotope values were not significant (adjusted R2 = -0.001, p = 0.14). After controlling for variation explained by the overlap between explanatory tables, the finch phylogeny uniquely explained more variation in the microbiome than foraging data (adjusted R2 = 0.071 v 0.044, p = 0.012 v 0.029, respectively).

To further parse the correlation between gut microbiome and explanatory variables, the beta diversity through time (BDTT) metric was applied to the Floreana samples. There was no significant correlation between gut microbiome distance and explanatory variables at any time point (data not shown).

Figure 4.4. Variation partitioning results for comparing Darwin’s finch phylogeny, stable isotope values, and foraging data to the weighted UniFrac distances of gut microbiome samples from Floreana.

Results of variation partitioning using weighted UniFrac distances between gut microbiome samples against Darwin’s finch phylogeny (first two principal coordinate axes), stable isotope values (δ13C and δ15N values), and foraging data (first two principal component axes) visualized with a Venn diagram. Adjusted R2 values for each component are plotted inside the circles. All testable components include the p-value calculated using distance based redundancy analysis.

Adjusted R2 values in [a], [b], and [c] are the amount of variation explained uniquely by the corresponding explanatory table. Parts [d], [e], and [f] are amounts of variation that can be explained by either table in the overlap and part [g] is shared by all three tables. Foraging data is the only table with a positive adjusted R2 value after controlling for overlapping variance.

Inter-island comparison with Santa Cruz

To evaluate whether island of origin affects the gut microbiome composition of Darwin’s finches, data from this study were analyzed in conjunction with samples from Santa Cruz Island for the species that occur on both islands: small ground finch (Floreana: 25, Santa Cruz: 13), medium ground finch (F: 8, S: 6), cactus finch (F: 6, S: 6), and small tree finch (F: 14, S: 9), totaling 87 fecal samples (Loo et al. 2018a; Table S24). Though visualization with DPCoA did not reveal strong clustering by island (Figure 4.5), when tested with a two-way PERMANOVA, both habitat and the interaction term island × habitat were significantly different in weighted UniFrac distances (p = 0.001 and 0.048, respectively) while species nested within habitat or island were not (p = 0.202 and 0.064, respectively; Table S34). To further determine which combinations of island × habitat were contributing to the difference, pairwise tests were performed in each category. After correcting for multiple hypothesis testing, three of the four pairwise tests were significant: highland on Floreana versus highland on Santa Cruz (R2 = 0.07, p = 0.008), lowland versus highland on Santa Cruz (R2 = 0.19, p = 0.001), and lowland versus highland on Floreana (R2 = 0.13, p = 0.001) (Table S35). The lowland comparison between islands was not significantly different in weighted UniFrac distances (R2 = 0.02, p = 0.431).

[Figure 4.5 plot: two panels (Highland, Lowland); x-axis Axis.1 [31.4%]; y-axis Axis.2 [24.3%]; legend: Habitat (Highland, Lowland), Island (Santa Cruz, Floreana)]

Figure 4.5. Principal coordinate analysis visualization of weighted UniFrac distances between microbiome samples from Santa Cruz and Floreana Islands.

The four species found on both islands (SGF, MGF, CF, and STF) are plotted with shape and color representing the habitat and island of origin, respectively.


To check whether foraging patterns may explain the difference in microbiome samples between the highland habitats, pairwise Euclidean distances were calculated between the weighted average foraging pattern in each island and habitat combination. Euclidean distances between habitats within each island (0.71 and 0.63 for Floreana and Santa Cruz, respectively) were higher than the differences between islands within habitats (0.16 and 0.05 for highland and lowland, respectively; Table S36). This pattern held even after collapsing the four plant based food items (seed, leaf, flower, fruit) into a single ‘plant’ category. Comparing the proportion of observed food items consumed in the highlands between islands, the small ground finch on Floreana had a higher proportion of insect consumption compared to Santa Cruz (45% v 18%, respectively). The small tree finch was only observed to eat insects on Floreana while on Santa Cruz it consumed a small proportion of seeds as well (10%).

Discussion

Comparison of Darwin’s finch microbiomes between Floreana Island and Santa Cruz Island offers the opportunity to test factors that affect microbiome composition in a group of birds with well-documented rapid adaptive radiation into novel niches within and across islands. We build on previous microbiome research into nine extant Darwin’s finch species on Santa Cruz Island and, using analysis of the 16S rRNA sequences from an additional 74 finches on Floreana Island, compare the microbiome in five extant species on Floreana Island to study the effects of: (1) phylogeny, (2) age class (nestlings, adults), (3) habitat (lowlands, highlands), and (4) island (Floreana, Santa Cruz). Overall, the composition and diversity observed in the samples from Floreana Island were congruent with findings from other avian studies. Interrogating host phylogeny, the Procrustean approach to co-phylogeny (PACo) revealed a significant correlation between the host genetic distance and microbiome community. However, in contrast with Darwin’s finch microbiome research from Santa Cruz Island, Beta Diversity Through Time analysis did not recapitulate the correlation between phylogeny and microbiome on Floreana Island. Nestlings had more unclassified bacterial phyla than adults, but did not appear to harbor phylogenetically distinct bacterial taxa. Within Floreana samples, habitat and diet showed significant effects on the microbiome. Finally, a comparison between the microbiome of species that occur on both Santa Cruz and Floreana Islands showed an interaction effect between island × habitat: the microbiome differed for species sampled in the highlands of each island, but did not differ for species sampled in the lowlands. This pattern was congruent with observations on foraging behavior and stable isotope analysis. Highland foraging behavior differed across the two islands, whereas lowland foraging behavior was comparable. Similarly, stable isotope values had larger differences between the highlands of each island compared to the lowlands.

Composition and diversity of Darwin’s finch microbiomes on Floreana

The gut microbiomes of Darwin’s finch species from Floreana Island are broadly similar to previously characterized avian microbiomes. The bacterial phyla Firmicutes, Actinobacteria, and Proteobacteria comprise the majority of sequences across five Darwin’s finch species, with no other phyla rising above 1% in mean relative abundance. These three phyla are well represented in surveys of broader bird species and were also dominant in the nine finch species from Santa Cruz; however, on Santa Cruz Island, the phyla Bacteroidetes, Chloroflexi, and Tenericutes also rose above this threshold (Hird et al. 2015; Kropáčková et al. 2017; Loo et al. 2018a). At lower taxonomic levels, the genus Lactobacillus dominates the microbiome in the five species sampled, ranging from 30% to 89% in the hybrid tree finch and medium tree finch, respectively. The prevalence of Lactobacillus was previously noted in the gut microbiomes of four Darwin’s finch species that occur on both Floreana Island and Santa Cruz Island, namely small ground finch, medium ground finch, cactus finch, and small tree finch (Loo et al. 2018a).

Nestlings harbor unclassified bacteria at higher abundances

The gut microbiome samples from small ground finch nestlings offered an opportunity to investigate age related differences in Darwin’s finches. The comparison of the nestlings to small ground finch adults within the lowlands revealed a much higher proportion of unclassified bacterial taxa at the phylum level, controlling for host species and habitat effects (Figure 4.1). However, in the lowlands, DPCoA revealed no difference between nestling and adult small ground finches (Figure S17). DPCoA calculates distances between microbial communities that incorporate the phylogenetic structure of the bacterial taxa. Therefore, the lack of significant differences in beta diversity between adults and nestlings implies that while the enriched bacterial taxa are unclassified, they are not phylogenetically distinct. Though the sample number is small, the lack of clear clustering by age differs from other bird species in which nestling to adult comparisons have been made, such as the black legged kittiwake and the chinstrap penguin, both of which showed a distinct microbial community in the nestlings (van Dongen et al. 2013; Barbosa et al. 2016).

Nestling microbiomes have ecological consequences because of the invasive parasite Philornis downsi, which threatens all species of Darwin’s finches on both Santa Cruz and Floreana Islands. Philornis downsi larvae feed on the blood and tissue of nestlings and result in average mortality around 55% (Kleindorfer and Dudaniec 2016). Previous investigation into the gut microbiome of P. downsi larvae demonstrated a marked difference between the parasitic larvae and the adult fly (Ben Yosef et al. 2017). The number of P. downsi larvae isolated from the nest sampled for this study was consistent with published research. In a comparison with classified bacterial taxa in the gut microbiome of the parasitic larvae, which had a majority of sequences from the phylum Proteobacteria, the nestlings showed a distinct community with a majority of reads coming from unclassified bacterial taxa. Further investigation of the relationship between the microbiome of nestlings and P. downsi is needed to confirm whether the differences found here are generalizable across all Darwin’s finch species and levels of parasitization.

Habitat effects in the context of foraging and stable isotope analysis

In beta diversity tests of habitat, species, and age, only habitat showed a significant difference in the microbiome community. On Floreana Island, only the small ground finch was present at high enough densities for multiple samples in both the humid highlands and arid lowlands. The other ground finches (medium ground finch and cactus finch) were primarily sampled in the lowlands and the tree finches (small, medium, and hybrid tree finch) were sampled in the highlands. Although the close phylogenetic relationships among the species sampled within each habitat leave open the possibility that host phylogeny explains the difference, visualization of the samples with DPCoA shows a shift in the microbiome community even within the small ground finch (Figure S16). Additionally, a similar analysis of the communities on Santa Cruz Island showed the same habitat effect after including two phylogenetically more distant species, the vegetarian finch and the warbler finch (Loo et al. 2018a). Therefore, it is likely that the habitat effect is not due solely to the phylogenetic relationship between Darwin’s finch species.

Foraging data provide context to the observed differences in gut microbiomes between habitats. The observations analyzed in this study across five food categories indicate differences in foraging behavior across species and habitats. Given the short time frame for foraging observations in this study, it is possible that Darwin’s finches were foraging opportunistically in relation to easily accessible food items during the early wet season. In the lowlands, the cactus finch was only observed foraging on plant material, and the medium ground finch foraged mostly on plant material (Table 4.2). The small ground finch was observed in both the lowlands and highlands, and had markedly more insect consumption in the highlands. This pattern of greater invertebrate consumption in the highlands was also reported in a study of small ground finch foraging behavior on Santa Cruz Island (Kleindorfer and Chapman 2006). Both tree finch species observed in the highlands largely overlapped in foraging behavior and diet; they mostly consumed insects. Notably, medium tree finches consumed more seeds than small tree finches in this study. Previous research has documented different foraging behaviors between the tree finches on Floreana Island, with more chipping, prying and biting by foraging medium tree finch compared with more gleaning and probing from leaves by foraging small tree finch (Peters and Kleindorfer 2015). In conclusion, the dietary observations from this study align with those from previous studies and point to species and habitat differences in foraging behavior. The dietary observations were made on a random sample of birds in the field that were not subsequently sampled for microbiome characterization. Thus, while they cannot be aligned directly with the microbiome characterization, they are an important consideration in interpreting the analyses of the microbiome.

δ13C and δ15N stable isotope ratios are representative of the diet of the individual sampled and have been used to characterize differences in diets and trophic partitioning (Herrera et al. 2003; Rakotondranary et al. 2011). The measured stable isotope values from Floreana Island were partitioned by habitat of origin, with lowland samples generally heavier in both δ13C and δ15N. The exception to this pattern was the small ground finch in the highlands, which had eight samples with δ13C values around -15‰. These are likely driven by the consumption of seeds from a grass species within the genus Paspalum (SK, personal observations), which is known to use the C4 carbon fixation pathway and therefore has a δ13C ratio between -12 and -15‰ (Cernusak et al. 2013; Caemmerer et al. 2014). Speaking to the broader dietary habits of this species, the small ground finch also had the largest range in δ13C measurements, with other highland individuals at -24‰. The general utility of δ15N values is in determining trophic levels between organisms, with an average enrichment of ~3‰ per trophic level (Kelly 2000). Within the primarily insectivorous tree finches, the δ15N range of 5-10‰ suggests the possibility of the consumption of insects at different trophic levels. The wider range of lowland samples, from 6-15‰, is less easily explained given the observed primary consumption of plant material. The interpretation of these values is difficult without comprehensive sampling of potential food sources.
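As a rough illustration of the trophic-level reasoning above, the following sketch converts a δ15N range into an approximate number of trophic levels using the ~3‰ enrichment figure. The function name and the flat per-level enrichment are illustrative assumptions; the calculation ignores baseline differences between food webs, which is part of why the values are hard to interpret without food-source samples.

```python
# Sketch only: assumes the ~3 per mil per-trophic-level enrichment cited above.
ENRICHMENT_PER_LEVEL = 3.0  # per mil of delta-15N per trophic level


def trophic_span(d15n_min, d15n_max, enrichment=ENRICHMENT_PER_LEVEL):
    """Approximate number of trophic levels spanned by a delta-15N range."""
    if enrichment <= 0:
        raise ValueError("enrichment must be positive")
    if d15n_max < d15n_min:
        raise ValueError("d15n_max must be >= d15n_min")
    return (d15n_max - d15n_min) / enrichment


tree_finch_span = trophic_span(5.0, 10.0)  # highland tree finches, 5-10 per mil
lowland_span = trophic_span(6.0, 15.0)     # lowland samples, 6-15 per mil

print(f"tree finches: ~{tree_finch_span:.1f} trophic levels")
print(f"lowland samples: ~{lowland_span:.1f} trophic levels")
```

Under these assumptions the tree finch range corresponds to roughly 1.7 trophic levels and the lowland range to 3.0, which is why the lowland spread is surprising for birds observed feeding mainly on plant material.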

Both foraging data and stable isotopes were used as alternative explanatory variables in testing for correlation between the gut microbiome and the finch phylogeny. Two methods of correlating the microbiome samples and the metadata, Procrustes Application to Cophylogenetic Analysis (PACo) and variation partitioning, found significant correlation for both finch phylogeny and foraging data, but not for stable isotope values. However, the Beta Diversity Through Time (BDTT) analysis did not show correlation for any of the three metadata variables with the gut microbiome, in contrast with the previous characterization of Darwin’s finch microbiomes on Santa Cruz, which found significant correlation using all three methods (Loo et al. 2018a). There are a couple of possible reasons for the discrepancy between these methods and previous results. First, the sample sizes for the tests differ: BDTT requires a single microbiome per species, which collapses the samples into five data points, whereas both PACo and variation partitioning are run on individual samples. Second, nine species were tested on Santa Cruz compared to five on Floreana, with an added difference in the phylogenetic structure of the represented finch species. Santa Cruz included two phylogenetically distinct species, the vegetarian finch and the warbler finch, which may provide a larger portion of the phylogenetic signal. The results of these tests on Floreana are therefore consistent with the results from Santa Cruz, taking into account the change in sample size for the BDTT calculation.
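The loss of sample size when moving from per-individual tests to a per-species test can be sketched as follows. This is a minimal illustration, not the procedure used in the analyses: the choice of a simple mean, the function name, and the toy data are all assumptions made for the example.

```python
from collections import defaultdict


def collapse_by_species(samples):
    """Average per-sample taxon abundances into one profile per host species.

    samples: iterable of (species, {taxon: relative_abundance}) pairs.
    A taxon absent from a sample is treated as abundance 0 for that sample.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for species, profile in samples:
        counts[species] += 1
        for taxon, abundance in profile.items():
            sums[species][taxon] += abundance
    return {
        species: {taxon: total / counts[species] for taxon, total in taxa.items()}
        for species, taxa in sums.items()
    }


# Hypothetical toy data: three samples from two host species.
samples = [
    ("small ground finch", {"taxon_A": 0.75, "taxon_B": 0.25}),
    ("small ground finch", {"taxon_A": 0.25, "taxon_B": 0.75}),
    ("cactus finch", {"taxon_A": 1.0}),
]
per_species = collapse_by_species(samples)
print(len(samples), "samples ->", len(per_species), "species-level profiles")
```

Whatever the number of individuals sampled, a per-species test sees only as many data points as there are species, which limits its power relative to tests run on individual samples.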

Inter-island comparison shows a convergence in lowland but not highland microbiome samples

The characterization of Darwin’s finch gut microbiomes from Floreana not only provides an independent case study of previous patterns, but also provides a unique opportunity to investigate the effects of habitat and diet alone by controlling for the host species. Four species of Darwin’s finches were sampled on both islands: the small ground finch, medium ground finch, cactus finch, and small tree finch. Additionally, the small ground finch was sampled in all four habitat and island combinations. Testing for an island effect using pairwise comparison of habitat and island combinations showed significant differences between habitats on each island and between the highlands of each island, but not the lowlands. Differences in overall foraging patterns between islands and habitats were calculated using a weighted average across the individuals represented with microbiome samples. These differences were highest between highland and lowland habitats on each island and between the highlands of Santa Cruz and Floreana but much smaller between the lowlands, which is consistent with the differences seen in the microbiome, suggesting that foraging is a primary factor in microbiome composition.
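One way such a weighted comparison of foraging patterns could be computed is sketched below. The weighting by number of microbiome-sampled individuals follows the description above, but the difference measure (half the summed absolute difference), the function names, the food categories, and the numbers are hypothetical choices made for illustration.

```python
def weighted_foraging_profile(profiles, n_samples):
    """Average per-species foraging proportions, weighted by sample counts.

    profiles:  {species: {food_category: proportion of observations}}
    n_samples: {species: number of individuals with microbiome samples}
    """
    total = sum(n_samples[species] for species in profiles)
    if total <= 0:
        raise ValueError("at least one sampled individual is required")
    combined = {}
    for species, profile in profiles.items():
        weight = n_samples[species] / total
        for category, proportion in profile.items():
            combined[category] = combined.get(category, 0.0) + weight * proportion
    return combined


def profile_difference(a, b):
    """Half the summed absolute difference: 0 = identical, 1 = no overlap."""
    categories = set(a) | set(b)
    return 0.5 * sum(abs(a.get(c, 0.0) - b.get(c, 0.0)) for c in categories)


# Hypothetical foraging proportions and sample counts for two habitats.
lowland = weighted_foraging_profile(
    {"species_1": {"insects": 0.0, "seeds": 1.0},
     "species_2": {"insects": 0.5, "seeds": 0.5}},
    {"species_1": 3, "species_2": 1},
)
highland = weighted_foraging_profile(
    {"species_3": {"insects": 0.75, "seeds": 0.25}},
    {"species_3": 2},
)
difference = profile_difference(lowland, highland)
print(lowland, highland, difference)
```

Weighting by sampled individuals keeps the foraging summary aligned with the composition of the microbiome dataset, so a habitat dominated by one species in the microbiome samples is also dominated by that species in the foraging profile.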

In addition to different foraging behavior of the finch species per habitat and island, there are also expected differences in the distribution and abundance of flora and fauna across islands (McMullen 1999). To our knowledge, there is no species list for the highland and lowland plants on Floreana Island in particular. The impact of human history has also differed among the inhabited islands, and likely resulted in different agricultural practices, introduced crops, and environmental weeds that have yet to be formally described. For example, though both Santa Cruz and Floreana highlands are dominated by Scalesia pedunculata, it has been noted that the Scalesia on Floreana are unlikely to be representative of undisturbed vegetation, in part due to the island having the longest history of human habitation within the archipelago (Eliasson 1984).

The divergence in finch gut microbiome communities we observe across islands highlights the role of dispersal limitation in determining microbial community structure within wild host species (reviewed in Dudaniec and Tesson 2016). Previous work has shown that geographic distance increased the distance between gut microbiomes in mammalian species, tested in 17 species across the Americas (Moeller et al. 2017). However, the distances tested in that study were orders of magnitude larger than the geographic separation between Santa Cruz and Floreana, with pairs of species up to 7,000 miles apart compared to the roughly 40 miles between islands. The similarity of microbiomes in lowland finches implies that the bacteria observed are not limited to single islands in their distribution. Therefore, the differences seen between highland samples across islands must be attributed to factors other than geographic distance alone. This pattern is also consistent with other work that found sampling locality to be the most detectable signal in a brood parasite and its host species (Hird et al. 2014). There, the cowbird species did not show a species-specific signal but clustered by geographic location.

Given the detectable difference between the highland, but not the lowland, habitats of the two islands in Darwin’s finches, it is possible that avian microbiomes reflect the ecological environment in which the birds reside. Taken together, the inter-island comparison shows the importance of comprehensive sampling of multiple individuals in all habitats and islands under comparison, due to the significant effects of both these variables on microbiome composition. Notably, our results suggest that analyses of host species effects on the microbiome of avian species should take geographic location into account, given the contrasting habitats which many conspecific birds occupy.

Conclusion

Our study further resolves the factors that affect microbiome composition in Darwin’s finches within and across islands of the Galápagos, revealing potential drivers of host-microbial co-evolutionary patterns in this iconic adaptive radiation. Findings from the five species characterized on Floreana Island recapitulate many of the broad patterns observed on Santa Cruz and provide an independent sampling event to interrogate the interplay between island and habitat. The difference in gut bacterial community observed between the highlands of Santa Cruz and Floreana demonstrates a clear environmental effect independent of host species and shows that foraging habits play a critical role in determining the composition of the gut microbiome. Given the importance of dietary niches in the diversification of Darwin’s finches, our results emphasize the importance of the microbiome in the ecology and evolution of species within this adaptive radiation.

Acknowledgements

We are grateful to the Charles Darwin Foundation and Galápagos National Park for allowing us to work in the Galápagos Islands, with special thanks to Solanda Rea, Marta Romoleroux, and the rest of the staff at the Charles Darwin Research Station for facilitating logistics. We especially thank our field assistant, Jefferson Garcia Loor, for his dedication to this project. This work was supported by a National Science Foundation Graduate Research Fellowship, a Graduate Research Opportunities Worldwide grant in collaboration with Flinders University, fieldwork funding support from Macquarie University, and the Dean’s Competitive Fund from Harvard University.

Chapter 5

Conclusion

The ecological and evolutionary factors that shape gut microbiomes are fundamental to our understanding of the bacterial diversity within these important microbial communities. Darwin’s finches provided an excellent study system with which to disentangle the complex relationship between microbiome composition and the many variables that influence it. This dissertation provides three separate but interconnected lines of evidence that are consistent with a model of microbiome assembly in which environmental filtering via diet and habitat is the primary determinant of the bacterial taxa present, with a secondary influence from the evolutionary history between hosts.

First, the gut microbiomes of the nine Darwin’s finch species present on Santa Cruz Island correlated with both host phylogeny and diet on similar timescales. The comparable patterns imply that co-diversification of the bacterial taxa is not the primary cause of the correlation between the host phylogeny and the microbiome distances. Instead, it is likely that diet is correlated with host phylogeny and therefore drives the signal seen from the host phylogenetic distances. Habitat differences were recognizable even within a single species, indicating strong differentiation based on the environment. Second, host genetic background did not correlate with microbiome distances, as shown by the hybrid introgression from the medium tree finch (Camarhynchus pauper) into the small tree finch (C. parvulus). However, the gut microbiome of the hybrid tree finch cluster was more similar to that of its paternal species, the small tree finch, as shown by the differentially abundant taxa and functional annotations. Finally, Floreana Island offered an independent replicate for sampling across Darwin’s finch species and recapitulated the correlation observed on Santa Cruz. Restricting the comparison to the species sampled on both islands controlled for potential species effects and revealed significant differences between the highlands of the two islands, but not the lowlands.

Determining the effects of these evolutionary and ecological variables on the gut microbiome in Darwin’s finches was only possible due to the well-characterized nature of these populations and the natural experiment of the adaptive radiation of finches across multiple islands and habitats. Here, I consider the consequence of these studies for our understanding of the evolution of Darwin’s finches and possible effects of habitat, diet, and temporal differences. I end with thoughts on the trajectory of Darwin’s finches in the face of human habitation and other challenges to their survival and the potential for microbiome research to better inform conservation strategies.

Diversification of Darwin’s finches and the role of the microbiome

The mechanisms of diversification in the adaptive radiation of Darwin’s finches have long been of research interest, first synthesized in modern times in Lack’s treatise (Lack 1947). The varying roles of allopatric and sympatric speciation, along with hybridization and immigration, have been continually discussed as new evidence on the evolutionary history of these species is gathered using updated genetic methods. Mitochondrial DNA sequences (Sato et al. 2001) and microsatellite genotyping (Petren et al. 1999) have elucidated the phylogenetic relationship among all species and at times revealed surprising conflicts with the morphological species assignments. For example, warbler finches on different islands are morphologically and ecologically similar, but are considered two separate species based on DNA sequencing. More recently, a phylogeny based on whole genome resequencing across all species in the archipelago demonstrated that two other species, the sharp-beaked ground finch (G. difficilis) and the large cactus finch (G. conirostris), in fact comprise three and two phylogenetically distinct groups, respectively (Lamichhaney et al. 2015). Additionally, the genetic distances between species groups show signs of hybridization between sister taxa as a generator of morphological variation. A case study on Daphne Major shows that a single immigrant can spawn a genetically and morphologically distinct species (Lamichhaney et al. 2018). Thus, the diversification of Darwin’s finches is a complex interplay between sympatric adaptive divergence, allopatric speciation, and hybridization.

Given the impact of diet on the composition of the microbiome, the correlation seen between the finch phylogeny and microbiome distances makes sense as a result of, rather than the cause of, divergent foraging behaviors. In this model, as the finch populations undergo waves of selection and divergent lineages succeed, the diets become increasingly divergent and the microbiome follows. Two additional lines of evidence point to strong environmental filtering in Darwin’s finch microbiomes. First, the habitat effect observed within the small ground finch on both islands shows the lack of a species-specific microbiome. Second, the hybrid introgression from the medium tree finch into the small tree finch and subsequent formation of the hybrid swarm did not produce any novel microbiome phenotypes. These results are consistent with microbiome assembly in these species as primarily a stochastic process dependent on food resources.

Importantly, this contrasts with the pattern observed in mammalian species. Groussin et al. demonstrated a clear signal of co-diversification between mammals and their gut microbiomes that was separable from the influence of ancient dietary shifts (Groussin et al. 2017). One major difference between the mammalian study and this study is the timescale and host diversity represented. Here, Darwin’s finches have diversified on the order of 1 million years compared to the 33 million years of evolutionary history among the mammals studied. While previous large scale surveys of avian microbiome diversity have found that phylogeny and are correlated with microbiome distance (Hird et al. 2015; Kropáčková et al. 2017), neither has tested for the specific signal of co-diversification using comparable methods. A broader study across a wider range of the avian phylogeny could answer whether the lack of co-diversification observed in Darwin’s finches is generalizable to birds as a whole. Given the significant differences in life history traits, notably in reproduction and initial food sources, it is likely that birds possess a different model of gut microbiome acquisition and assembly compared to mammals.

Environmental effects on microbiome composition

Habitat had the most obvious effect on Darwin’s finch microbiomes; the pattern was especially clear within the well-distributed small ground finch. On both Santa Cruz and Floreana, the populations of small ground finches diverged in microbiome composition based on the habitat of origin. These results are consistent with the model of microbiome assembly proposed above, in which the bacteria in and on the food items consumed are the primary source of constituents in the gut microbiome. Dietary shifts between habitats were apparent in both δ13C and δ15N stable isotope values and foraging data. Though the stable isotope values did not correlate with microbiome distances with the methods applied, qualitatively the separation of highland and lowland stable isotope samples reinforces the ecological differences between habitats. Similarly, the foraging data showed shifts in the dietary patterns of the small ground finch. Previous research has demonstrated adaptive divergence within the population of small ground finches on Santa Cruz along the ecological cline, with highland birds displaying larger and more pointed compared to the lowland birds (Kleindorfer and Chapman 2006; Sulloway and Kleindorfer 2013). The morphological differences seen in the highland and lowland populations are not attributable to genetic drift given evidence for strong gene flow across the entire island (Galligan et al. 2012). Thus, microbiome differences observed between habitats are largely attributable to the characteristics of the ecological niche rather than geographic distance.

Comparing Darwin’s finch microbiomes between Santa Cruz and Floreana corroborated the ecological distinction noted between habitats. First, as an independent case study on a separate island, Floreana recapitulated the same pattern of habitat differences seen on Santa Cruz. Second, the similarity in lowland Darwin’s finch microbiome samples confirms that ecological similarity drives the composition of the microbiome, not simple geographic isolation. The differences seen in the highland samples are less obvious in their implications, but interpretation is aided by the δ13C stable isotope values.

The highlands on both Santa Cruz and Floreana contain forests made up of the endemic Scalesia pedunculata, but the sampling sites differed in their floral makeup. On Santa Cruz, the understory was thick and ground cover consisted of a diverse array of species, whereas on Floreana, the sampling site was along a cleared trail covered primarily in grass species including some in the genus Paspalum (S Kleindorfer, personal observation). The δ13C stable isotope values from small ground finches in the Floreana highlands confirm the greater presence and consumption of C4 grass species, which have a narrow isotopic signature range of -12 to -14‰ (reviewed in Caemmerer et al. 2014). Though stable isotope measurements have not been widely used in the Galápagos, the studies here demonstrate their usefulness in clarifying the ecology of sympatric Darwin’s finch species. Future work should aim to collect samples of potential food sources to improve resolution of inferred diets in coordination with foraging observations.

Longitudinal studies can further elucidate the role of ecological variables

How the microbiome varies over time is an important question and requires more detailed sampling strategies. Understanding the time lag between changes in diet and changes in microbiome composition would further inform our model of microbiome assembly. Though longitudinal studies are informative, they are difficult to implement in wild populations. For example, only a handful of individuals were banded and recaptured in the 2016 field season over 10-15 sampling days per site. Meanwhile, sampling microbiomes in captivity defeats the purpose of characterizing the natural variation and timing of shifts in diet. Alternatively, individually identified foraging patterns may provide better context for relating particular food items to corresponding bacterial taxa. Previous studies have used colored bands to monitor foraging patterns, which could be matched to individuals (Peters and Kleindorfer 2015). These data, in combination with stable isotope samples from the observed food sources being consumed, could provide a better basis for elucidating specific effects of different food items on the microbiome.

Another aspect to be investigated is the annual weather pattern in the Galápagos archipelago. On average, the months of January to April or May constitute the warm, wet season, with June to December as the cool, dry season. All samples examined in this dissertation were collected in the wet season of 2016, but it is during the dry season that food resources become scarce and the selection patterns known to drive divergence between populations and species become most apparent (Grant 1999). For example, based on data from 2003-2007, drought conditions shifted sympatric ground finch species to their ‘private’ food resources and away from the shared resources which were no longer available (De León et al. 2014). Future research should aim to characterize Darwin’s finch microbiomes in the dry season, when foraging patterns should be the most divergent. Based on the studies here, the expectation would be a stronger species-specific signal caused by the shift to a species-specific diet during food scarcity. This pattern would corroborate a model of changes to the microbiome following shifts in the diets of the host species.

Finally, long term differences in climate are likely to have an impact on microbiome composition. Rainfall is the main determinant of food sources and availability and therefore represents a major selective variable, as demonstrated on Daphne Major with drought years causing high mortality (Boag and Grant 1981; Price et al. 1984; Grant 2002). The year-to-year rainfall is highly variable and unpredictable. Notably, the El Niño Southern Oscillation (ENSO) occurs irregularly every 2 to 11 years with an average of 7 years between events (Grant 1999). These years can bring extremely high rainfall and prolong the breeding season; for example, an ENSO event in 1983 allowed the first set of offspring to fledge and begin breeding within the same wet season (Gibbs et al. 1984). The unpredictability of year-to-year rainfall totals contributes to the unpredictable nature of evolution. Numerous studies on both Santa Cruz and Floreana have demonstrated varying shifts in diet dependent on the rainfall of the given year (Tebbich et al. 2004; Kleindorfer and Chapman 2006; Christensen and Kleindorfer 2009; Peters and Kleindorfer 2015). Therefore, to properly understand the role of the microbiome in the diversification of Darwin’s finches, longitudinal sampling of the birds over multiple years is required.

Though non-invasive fecal sampling of birds is not trivial, continual study of the microbiome in Darwin’s finches holds great potential for deepening our understanding of this complex process. As demonstrated by the longitudinal field study on Daphne Major spearheaded by Peter and Rosemary Grant, long term studies reveal aspects of the evolutionary process that are impossible to document otherwise. Research interest into Darwin’s finches shows no signs of slowing down and adding microbiome collection efforts as a standard protocol would be well worth the effort.

Future prospects for Darwin’s finches

Though the Galápagos archipelago remains one of the best preserved sites of natural biodiversity, the human impact on the islands is significant. Both Santa Cruz and Floreana have permanent human settlements which are growing each year. These impact the ecosystem in multiple ways. First, the direct changes to the environment are best shown by the agricultural zones present on both islands, which require clearing of the native flora and cause subsequent shifts in native fauna. On Santa Cruz, the Scalesia forest occupies only 1% of its original distribution (Mauchamp and Atkinson 2010). Second, the indirect effects of invasive plants and animals are equally devastating. Invasive plant species on the islands have been steadily increasing over the decades (Mauchamp 1997) and the blackberry Rubus niveus on Santa Cruz is now undergoing active management, with documented effects on the foraging patterns of small tree and warbler finch populations in response to the application of herbicides (Filek et al. 2018). Additionally, the introduction of rodents and other small mammals has significantly affected populations of Darwin’s finches along with the distribution of native flora, such as Opuntia cacti. These invasive mammals are also undergoing active management campaigns (Hanson and Campbell 2013). These larger ecological shifts will likely impact food sources and therefore microbiome composition in Darwin’s finch species as well.

The newest threat to the survival of Darwin’s finches is an invasive fly, Philornis downsi, whose larvae parasitize nestlings and cause up to 90% mortality (reviewed in Kleindorfer and Dudaniec 2016). This parasite is the primary cause of mortality in the critically endangered medium tree finch on Floreana (O’Connor et al. 2009) and is now well distributed across the archipelago. Given the parasite’s importance in the conservation of Darwin’s finch species, the gut microbiome of the parasitic larvae and adult flies was recently characterized using a similar molecular biomarker approach. Significant differences were found between the adults and larvae, and larval microbiomes clustered by host species (Ben Yosef et al. 2017). Larvae parasitizing the warbler finch had the most tightly clustered microbiomes while those parasitizing small ground finches were more dispersed. This clustering pattern is similar to the observations in the preceding chapters with regard to the microbiome of adult Darwin’s finch individuals. Small ground finches have the broadest diet and therefore are more diverse in microbiome makeup.

The opportunistic sampling and characterization of small ground finch nestlings on Floreana in chapter 4 provide preliminary results for future investigations into the relationship between the microbiomes of the host and parasite. Nestlings had a significantly higher proportion of bacterial taxa unclassified at the phylum level, but their microbiomes were not significantly different by the weighted UniFrac metric, which incorporates the phylogenetic structure of the bacteria. Future efforts should quantify the level of parasitization in each nest as well as comprehensively sample the corresponding larvae and nestlings. New methods allow for the tracking of exact bacterial strains, which could be used to detect a possible direct inoculation of bacteria from the nestlings to the parasitic larvae or vice versa (Costea et al. 2017). Better understanding of the relationship between the parasite and host can inform conservation practices, such as the development of chemical attractants to trap the flies prior to egg laying in nests.

Summary

Darwin’s finches remain one of the best characterized adaptive radiations in the world, with most of its original biodiversity retained even in the face of permanent human settlement. Microbiomes are an integral part of organismal biology and can influence health, immune system function, and host evolution. This dissertation has provided a characterization of this important aspect for a large fraction of Darwin’s finch species across two different islands. Leveraging the distribution of finches across multiple islands and habitats, these three studies give insight into how the ecological and evolutionary factors can affect microbiome composition. Altogether, this research demonstrates the necessity of comprehensive metadata for the correct interpretation of patterns observed in bacterial communities and provides foundational knowledge for future work to understand the complex dynamics of microbiome assembly.

Bibliography

Adamack AT, Gruber B. 2014. PopGenReport: simplifying basic population genetic analyses in R. Methods in Ecology and Evolution 5: 384–387.

Amato KR. 2013. Co-evolution in context: the importance of studying gut microbiomes in wild animals. Microbiome Science and Medicine.

Amato KR, Yeoman CJ, Cerda G, Schmitt C, Cramer JD, Miller MEB, Gomez A, Turner T, Wilson BA, Stumpf RM, Nelson KE, White BA, Knight R, Leigh SR. 2015. Variable responses of human and non-human primate gut microbiomes to a Western diet. Microbiome 3: 53.

Anderson MJ. 2006. Distance-Based Tests for Homogeneity of Multivariate Dispersions. Biometrics 62: 245–253.

Balbuena JA, Míguez-Lozano R, Blasco-Costa I. 2013. PACo: A Novel Procrustes Application to Cophylogenetic Analysis. PLoS ONE 8: e61048.

Baldo L, Pretus JL, Riera JL, Musilova Z, Nyom ARB, Salzburger W. 2017. Convergence of gut microbiotas in the adaptive radiations of African cichlid fishes. The ISME Journal 11: 1975–1987.

Barbosa A, Balagué V, Valera F, Martínez A, Benzal J, Motas M, Diaz JI, Mira A, Pedrós-Alió C. 2016. Age-Related Differences in the Gastrointestinal Microbiota of Chinstrap Penguins (Pygoscelis antarctica). PLoS ONE 11: e0153215.

Ben Yosef M, Zaada DSY, Dudaniec RY, Pasternak Z, Jurkevitch E, Smith RJ, Causton CE, Lincango MP, Tobe SS, Mitchell JG, Kleindorfer S, Yuval B. 2017. Host-specific associations affect the microbiome of Philornis downsi, an introduced parasite to the Galápagos Islands. Molecular Ecology 26: 4644–4656.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 57: 289–300.

Boag PT, Grant PR. 1981. Intense Natural Selection in a Population of Darwin's Finches (Geospizinae) in the Galápagos. Science 214: 82–85.

Breiman L. 2001. Random forests. Machine Learning 45: 5–32.

Britton T, Anderson CL, Jacquet D, Lundqvist S, Bremer K, Anderson F. 2007. Estimating Divergence Times in Large Phylogenetic Trees. Systematic biology 56: 741–752.

Brooks AW, Kohl KD, Brucker RM, van Opstal EJ, Bordenstein SR. 2016. Phylosymbiosis: Relationships and Functional Effects of Microbial Communities across Host Evolutionary History. PLoS Biology 14: e2000225.

Brucker RM, Bordenstein SR. 2013. The Hologenomic Basis of Speciation: Gut Bacteria Cause Hybrid Lethality in the Genus Nasonia. Science 341: 667–669.

Caemmerer von S, Ghannoum O, Pengelly JJL, Cousins AB. 2014. Carbon isotope discrimination as a tool to explore C4 photosynthesis. Journal of Experimental Botany 65: 3459–3470.

Callahan BJ, Sankaran K, Fukuyama JA, McMurdie PJ, Holmes SP. 2016. Bioconductor Workflow for Microbiome Data Analysis: from raw reads to community analyses. F1000Research 5: 1492.

Caro TM, O'Doherty G. 1999. On the Use of Surrogate Species in Conservation Biology. Conservation Biology 13: 805–814.

Cernusak LA, Ubierna N, Winter K, Holtum JAM, Marshall JD, Farquhar GD. 2013. Environmental and physiological determinants of carbon isotope discrimination in terrestrial plants. New Phytologist 200: 950–965.

Chao A. 1984. Nonparametric estimation of the number of classes in a population. Scandinavian Journal of Statistics 11: 265–270.

Chaves JA, Cooper EA, Hendry AP, Podos J, De León LF, Raeymaekers JAM, MacMillan WO, Uy JAC. 2016. Genomic variation at the tips of the adaptive radiation of Darwin's finches. Molecular Ecology 25: 5282–5295.

Christensen R, Kleindorfer S. 2007. Assortative pairing and divergent evolution in Darwin's Small Tree Finch, Camarhynchus parvulus. Journal of Ornithology.

Christensen R, Kleindorfer S. 2009. Jack-of-all-trades or master of one? Variation in foraging specialisation across years in Darwin’s Tree Finches (Camarhynchus spp.). Journal of Ornithology 150: 383–391.

Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, Tiedje JM. 2009. The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic acids research 37: D141–D145.

Costea PI, Munch R, Coelho LP, Paoli L, Sunagawa S, Bork P. 2017. metaSNV: A tool for metagenomic strain level analysis. PLoS ONE 12: e0182392.

Covarrubias-Pazaran G, Diaz-Garcia L, Schlautman B, Salazar W, Zalapa J. 2016. Fragman: an R package for fragment analysis. BMC genetics 17: 62.

Curry R. 1986. Whatever happened to the Floreana mockingbird? Noticias de Galápagos 43: 13–15.

Danzeisen JL, Kim HB, Isaacson RE, Tu ZJ, Johnson TJ. 2011. Modulations of the Chicken Cecal Microbiome and Metagenome in Response to Anticoccidial and Growth Promoter Treatment. PLoS ONE 6: e27949.

David LA, Maurice CF, Carmody RN, Gootenberg DB, Button JE, Wolfe BE, Ling AV, Devlin AS, Varma Y, Fischbach MA, Biddinger SB, Dutton RJ, Turnbaugh PJ. 2014. Diet rapidly and reproducibly alters the human gut microbiome. Nature 505: 559–563.

Davis NM, Proctor D, Holmes SP, Relman D, Callahan BJ. 2017. Simple statistical identification and removal of contaminant sequences in marker-gene and metagenomics data. bioRxiv.

De León LF, Bermingham E, Podos J, Hendry AP. 2010. Divergence with gene flow as facilitated by ecological differences: within-island variation in Darwin's finches. Philosophical transactions of the Royal Society of London. Series B, Biological sciences 365: 1041–1052.

De León LF, Podos J, Gardezi T, Herrel A, Hendry AP. 2014. Darwin's finches and their diet niches: the sympatric coexistence of imperfect generalists. Journal of evolutionary biology 27: 1093–1104.

Degnan PH, Pusey AE, Lonsdorf EV, Goodall J, Wroblewski EE, Wilson ML, Rudicell RS, Hahn BH, Ochman H. 2012. Factors associated with the diversification of the gut microbial communities within chimpanzees from Gombe National Park. Proceedings of the National Academy of Sciences 109: 13034–13039.

Dinno A. 2017. Dunn's Test of Multiple Comparisons Using Rank Sums. R package version 1.3.5.

Dodd MS, Papineau D, Grenne T, Slack JF, Rittner M, Pirajno F, O’Neil J, Little CTS. 2017. Evidence for early life in Earth’s oldest hydrothermal vent precipitates. Nature 543: 60–64.

Dudaniec RY, Tesson SVM. 2016. Applying landscape genetics to the microbial world. Molecular Ecology 25: 3266–3275.

Dvorak M, Fessl B, Nemeth E, Kleindorfer S, Tebbich S. 2012. Distribution and abundance of Darwin’s finches and other land birds on Santa Cruz Island, Galápagos: evidence for declining populations. Oryx 46: 78–86.

Dvorak M, Nemeth E, Wendelin B, Herrera P, Mosquera D, Anchundia D, Sevilla C, Tebbich S, Fessl B. 2017. Conservation status of landbirds on Floreana: the smallest inhabited Galápagos Island. Journal of Field Ornithology 88: 132–145.

Earl DA, vonHoldt B. 2011. STRUCTURE HARVESTER: a website and program for visualizing STRUCTURE output and implementing the Evanno method. Conservation Genetics Resources 4: 359–361.

Eliasson U. 1984. Native climax forests. In: Perry R (ed), Key Environments Galápagos. pp. 101–114.

Eren AM, Sogin ML, Morrison HG, Vineis JH, Fisher JC, Newton RJ, McLellan SL. 2015. A single genus in the gut microbiome reflects host preference and specificity. The ISME Journal 9: 90–100.

Eren AM, Vineis JH, Morrison HG, Sogin ML. 2013. A Filtering Method to Generate High Quality Short Reads Using Illumina Paired-End Technology. PLoS ONE 8: e66643.

Evanno G, Regnaut S, Goudet J. 2005. Detecting the number of clusters of individuals using the software structure: a simulation study. Molecular Ecology 14: 2611–2620.

Ezenwa VO, Gerardo NM, Inouye DW, Medina M, Xavier JB. 2012. Animal Behavior and the Microbiome. Science 338: 198–199.

Faith DP. 1992. Conservation evaluation and phylogenetic diversity. Biological Conservation 61: 1–10.

Fessl B, Tebbich S. 2002. Philornis downsi–a recently discovered parasite on the Galápagos archipelago–a threat for Darwin's finches? Ibis 144: 445–451.

Filek N, Cimadom A, Schulze CH, Jäger H, Tebbich S. 2018. The impact of invasive plant management on the foraging ecology of the Warbler Finch (Certhidea olivacea) and the Small Tree Finch (Camarhynchus parvulus) on Galápagos. Journal of Ornithology 159: 129–140.

Flanagan SP, Jones AG. 2017. Constraints on the FST–Heterozygosity Outlier Approach. Journal of Heredity 108: 561–573.

Fox J, Weisberg S. 2011. An R Companion to Applied Regression. SAGE.

Freeland JR, Boag PT. 1999. Phylogenetics of Darwin's finches: paraphyly in the tree-finches, and two divergent lineages in the warbler finch. The Auk 116: 577–588.

Galligan TH, Donnellan SC, Sulloway FJ, Fitch AJ, Bertozzi T, Kleindorfer S. 2012. Panmixia supports divergence with gene flow in Darwin's small ground finch, Geospiza fuliginosa, on Santa Cruz, Galápagos Islands. Molecular Ecology 21: 2106–2115.

Garcia-Mazcorro JF, Castillo-Carranza SA, Guard B, Gomez-Vazquez JP, Dowd SE, Brightsmith DJ. 2016. Comprehensive Molecular Characterization of Bacterial Communities in Feces of Pet Birds Using 16S Marker Sequencing. Microbial ecology 73: 224–235.

Gibbs HL, Grant PR, Weiland J. 1984. Breeding of Darwin's finches at an unusually early age in an El Niño year. The Auk 101: 872–874.

Gould J. 1837. Description of new species of finches collected by Darwin in the Galápagos. Proceedings of the Zoological Society of London 5: 4–7.

Grant BR, Grant PR. 1996. High survival of Darwin's finch hybrids: effects of beak morphology and diets. Ecology 77: 500–509.

Grant PR. 1999. Ecology and Evolution of Darwin's Finches. Princeton, NJ: Princeton University Press.

Grant PR. 2002. Unpredictable Evolution in a 30-Year Study of Darwin's Finches. Science 296: 707–711.

Grant PR, Grant BR. 2002. Adaptive radiation of Darwin's finches: Recent data help explain how this famous group of Galápagos birds evolved, although gaps in our understanding remain. American Scientist.

Grant PR, Grant BR. 2011. How and Why Species Multiply. Princeton University Press.

Grant PR, Grant BR, Markert JA, Keller LF, Petren K. 2004. Convergent evolution of Darwin's finches caused by introgressive hybridization and selection. Evolution 58: 1588–1599.

Grant PR, Grant BR, Petren K, Keller LF. 2005. Extinction behind our backs: the possible fate of one of the Darwin’s finch species on Isla Floreana, Galápagos. Biological Conservation 122: 499–503.

Groussin M, Mazel F, Sanders JG, Smillie CS, Lavergne S, Thuiller W, Alm EJ. 2017. Unraveling the processes shaping mammalian gut microbiomes over evolutionary time. Nature Communications 8: ncomms14319.

Hadfield J. 2010. MCMC methods for multi-response generalized linear mixed models: the MCMCglmm R package. Journal of Statistical Software 33: 1–22.

Halfvarson J, Brislawn CJ, Lamendella R, Vazquez-Baeza Y, Walters WA, Bramer LM, D'Amato M, Bonfiglio F, McDonald D, Gonzalez A, McClure EE, Dunklebarger MF, Knight R, Jansson JK. 2017. Dynamics of the human gut microbiome in inflammatory bowel disease. Nature Microbiology 2: 17004.

Hall AB, Tolonen AC, Xavier RJ. 2017. Human genetic variation and the gut microbiome in disease. Nature Reviews Genetics 18: 690–699.

Hanson C, Campbell K. 2013. Floreana Island Ecological Restoration: Rodent and Cat Eradication Feasibility Analysis. Santa Cruz, CA: Island Conservation.

Hehemann J-H, Correc G, Barbeyron T, Helbert W, Czjzek M, Michel G. 2010. Transfer of carbohydrate-active enzymes from marine bacteria to Japanese gut microbiota. Nature 464: 908–912.

Herrera LG, Hobson KA, Rodríguez M, Hernandez P. 2003. Trophic partitioning in tropical rain forest birds: insights from stable isotope analysis. Oecologia 136: 439–444.

Hird SM. 2017. Evolutionary Biology Needs Wild Microbiomes. Frontiers in microbiology 8: 689.

Hird SM, Carstens BC, Cardiff SW, Dittmann DL, Brumfield RT. 2014. Sampling locality is more detectable than taxonomy or ecology in the gut microbiota of the brood-parasitic Brown-headed Cowbird (Molothrus ater). PeerJ 2: e321.

Hird SM, Sánchez C, Carstens BC, Brumfield RT. 2015. Comparative Gut Microbiota of 59 Neotropical Bird Species. Frontiers in microbiology 6: 1403.

Hommola K, Smith JE, Qiu Y, Gilks WR. 2009. A Permutation Test of Host–Parasite Cospeciation. Molecular biology and evolution 26: 1457–1468.

Hooper LV, Littman DR, Macpherson AJ. 2012. Interactions between the microbiota and the immune system. Science 336: 1268–1273.

Hubisz MJ, Falush D, Stephens M, Pritchard JK. 2009. Inferring weak population structure with the assistance of sample group information. Molecular ecology resources 9: 1322–1332.

Hug LA, Baker BJ, Anantharaman K, Brown CT, Probst AJ, Castelle CJ, Butterfield CN, Hernsdorf AW, Amano Y, Ise K, Suzuki Y, Dudek N, Relman DA, Finstad KM, Amundson R, Thomas BC, Banfield JF. 2016. A new view of the tree of life. Nature Microbiology 1: 16048.

Human Microbiome Project Consortium. 2012. Structure, function and diversity of the healthy human microbiome. Nature 486: 207–214.

Hutchinson MC, Cagua EF, Balbuena JA, Stouffer DB, Poisot T. 2017. paco: implementing Procrustean Approach to Cophylogeny in R. Methods in Ecology and Evolution 69: 82.

Jakobsson M, Rosenberg NA. 2007. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics (Oxford, England) 23: 1801–1806.

Jombart T. 2008. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics (Oxford, England) 24: 1403–1405.

Jorup-Rönström C, Håkanson A, Sandell S, Edvinsson O, Midtvedt T, Persson A-K, Norin E. 2012. Fecal transplant against relapsing Clostridium difficile-associated diarrhea in 32 patients. Scandinavian journal of gastroenterology 47: 548–552.

Kanehisa M, Sato Y, Kawashima M, Furumichi M, Tanabe M. 2016. KEGG as a reference resource for gene and protein annotation. Nucleic acids research 44: D457–D462.

Kelly JF. 2000. Stable isotopes of carbon and nitrogen in the study of avian and mammalian trophic ecology. Canadian Journal of Zoology 78: 1–27.

Kembel SW, Cowan PD, Helmus MR, Cornwell WK, Morlon H, Ackerly DD, Blomberg SP, Webb CO. 2010. Picante: R tools for integrating phylogenies and ecology. Bioinformatics (Oxford, England) 26: 1463–1464.

Kim J, Kim MS, Koh AY, Xie Y, Zhan X. 2016. FMAP: functional mapping and analysis pipeline for metagenomics and metatranscriptomics studies. BMC bioinformatics 17.

Kleindorfer S, Chapman TW. 2006. Adaptive divergence in contiguous populations of Darwin's small ground finch (Geospiza fuliginosa). Evolutionary Ecology Research.

Kleindorfer S, Dudaniec RY. 2016. Host-parasite ecology, behavior and genetics: a review of the introduced fly parasite Philornis downsi and its Darwin’s finch hosts. BMC Zoology 1: 1.

Kleindorfer S, O’Connor JA, Dudaniec RY, Myers SA, Robertson J, Sulloway FJ. 2014a. Species collapse via hybridization in Darwin's tree finches. The American Naturalist 183: 325–341.

Kleindorfer S, Peters KJ, Custance G, Dudaniec RY. 2014b. Changes in Philornis infestation behavior threaten Darwin's finch survival. Curr Zool: 1–9.

Kohl KD. 2012. Diversity and function of the avian gut microbiota. Journal of Comparative Physiology B 182: 591–602.

Kohl KD, Stengel A, Dearing MD. 2016. Inoculation of tannin-degrading bacteria into novel hosts increases performance on tannin-rich diets. Environmental Microbiology 18: 1720–1729.

Kosman E, Leonard KJ. 2005. Similarity coefficients for molecular markers in studies of genetic relationships between individuals for haploid, diploid, and polyploid species. Molecular Ecology 14: 415–424.

Kostic AD, Gevers D, Siljander H, Vatanen T, Hyötyläinen T, Hämäläinen A-M, Peet A, Tillmann V, Pöhö P, Mattila I, Lähdesmäki H, Franzosa EA, Vaarala O, de Goffau M, Harmsen H, Ilonen J, Virtanen SM, Clish CB, Orešič M, Huttenhower C, Knip M, Xavier RJ. 2015. The Dynamics of the Human Infant Gut Microbiome in Development and in Progression toward Type 1 Diabetes. Cell Host and Microbe 17: 260–273.

Kozich JJ, Westcott SL, Baxter NT, Highlander SK, Schloss PD. 2013. Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Applied and Environmental Microbiology 79: 5112–5120.

Kropáčková L, Těšický M, Albrecht T, Kubovčiak J, Čížková D, Tomášek O, Martin JF, Bobek L, Králová T, Procházka P, Kreisinger J. 2017. Co-diversification of gastrointestinal microbiota and phylogeny in passerines is not explained by ecological divergence. Molecular Ecology.

Kuhn M, Wing J, Weston S, Williams A, Keefer C, Engelhardt A, Cooper T, Mayer Z, Kenkel B, R Core Team, Benesty M, Lescarbeau R, Ziem A, Scrucca L, Tang Y, Candan C, Hunt T. 2017. Caret: Classification and Regression Training.

Lack D. 1947. Darwin's Finches. Cambridge Univ Press.

Lamichhaney S, Berglund J, Almén MS, Maqbool K, Grabherr M, Martinez-Barrio A, Promerová M, Rubin C-J, Wang C, Zamani N, Grant BR, Grant PR, Webster MT, Andersson L. 2015. Evolution of Darwin's finches and their beaks revealed by genome sequencing. Nature 518: 371–375.

Lamichhaney S, Han F, Webster MT, Andersson L, Grant BR, Grant PR. 2018. Rapid hybrid speciation in Darwin’s finches. Science 359: 224–228.

LeBlanc JG, Milani C, de Giori GS, Sesma F, van Sinderen D, Ventura M. 2013. Bacteria as vitamin suppliers to their host: a gut microbiota perspective. Current Opinion in Biotechnology 24: 160–168.

Legendre P. 2008. Studying beta diversity: ecological variation partitioning by multiple regression and canonical analysis. Journal of Plant Ecology 1: 3–8.

Legendre P, Desdevises Y, Bazin E. 2002. A Statistical Test for Host–Parasite Coevolution. Systematic biology 51: 217–234.

Ley RE, Hamady M, Lozupone C, Turnbaugh PJ, Ramey RR, Bircher JS, Schlegel ML, Tucker TA, Schrenzel MD, Knight R, Gordon JI. 2008a. Evolution of mammals and their gut microbes. Science 320: 1647–1651.

Ley RE, Lozupone CA, Hamady M, Knight R, Gordon JI. 2008b. Worlds within worlds: evolution of the vertebrate gut microbiota. Nature Reviews Microbiology 6: 776–788.

Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics (Oxford, England) 25: 1754–1760.

Li M, Wang B, Zhang M, Rantalainen M, Wang S, Zhou H, Zhang Y, Shen J, Pang X, Zhang M, Wei H, Chen Y, Lu H, Zuo J, Su M, Qiu Y, Jia W, Xiao C, Smith LM, Yang S, Holmes E, Tang H, Zhao G, Nicholson JK, Li L, Zhao L. 2008. Symbiotic gut microbes modulate human metabolic phenotypes. Proceedings of the National Academy of Sciences of the United States of America 105: 2117–2122.

Liaw A, Wiener MR. 2002. Classification and regression by randomForest. R News 2: 18–22.

Locey KJ, Lennon JT. 2016. Scaling laws predict global microbial diversity. Proceedings of the National Academy of Sciences of the United States of America 113: 5970–5975.

Loo WT, Loor JG, Dudaniec RY, Kleindorfer S, Cavanaugh CM. 2018a. Diet, habitat, and host phylogeny shape the gut microbiome of Darwin's finches.

Loo WT, Loor JG, Dudaniec RY, Kleindorfer S, Cavanaugh CM. 2018b. Hybrid Darwin's finches share the gut microbiomes of their paternal species.

Love MI, Huber W, Anders S. 2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15: 31.

Lozupone C, Lladser ME, Knights D, Stombaugh J, Knight R. 2010. UniFrac: an effective distance metric for microbial community comparison. The ISME Journal 5: 169–172.

Lucas FS, Heeb P. 2005. Environmental factors shape cloacal bacterial assemblages in great tit Parus major and blue tit P. caeruleus nestlings. Journal of Avian Biology 36: 510–516.

Marchetti K, Price T. 1989. Differences in the foraging of juvenile and adult birds: the importance of developmental constraints. Biological Reviews 64: 51–70.

Mauchamp A. 1997. Threats from Alien Plant Species in the Galápagos Islands. Conservation Biology 11: 260–263.

Mauchamp A, Atkinson R. 2010. Rapid, recent and irreversible habitat loss: Scalesia forest on the Galápagos Islands. Galápagos Report: 108–112.

McFall-Ngai M, Hadfield MG, Bosch TCG, Carey HV, Domazet-Lošo T, Douglas AE, Dubilier N, Eberl G, Fukami T, Gilbert SF, Hentschel U, King N, Kjelleberg S, Knoll AH, Kremer N, Mazmanian SK, Metcalf JL, Nealson K, Pierce NE, Rawls JF, Reid A, Ruby EG, Rumpho M, Sanders JG, Tautz D, Wernegreen JJ. 2013. Animals in a bacterial world, a new imperative for the life sciences. Proceedings of the National Academy of Sciences 110: 3229–3236.

McMullen CK. 1999. Flowering Plants of the Galápagos. Ithaca, NY: Cornell University Press.

McMurdie PJ, Holmes S. 2013. phyloseq: an R package for reproducible interactive analysis and graphics of microbiome census data. PLoS ONE 8: e61217.

McMurdie PJ, Holmes S. 2014. Waste Not, Want Not: Why Rarefying Microbiome Data Is Inadmissible. PLoS Computational Biology 10: e1003531.

Mediannikov O, Sekeyová Z, Birg M-L, Raoult D. 2010. A Novel Obligate Intracellular Gamma-Proteobacterium Associated with Ixodid Ticks, Diplorickettsia massiliensis, Gen. Nov., Sp. Nov. PLoS ONE 5: e11478.

Mende DR, Letunic I, Huerta-Cepas J, Li SS, Forslund K, Sunagawa S, Bork P. 2017. proGenomes: a resource for consistent functional and taxonomic annotations of prokaryotic genomes. Nucleic acids research 45: D529–D534.

Minoche AE, Dohm JC, Himmelbauer H. 2011. Evaluation of genomic high-throughput sequencing data generated on Illumina HiSeq and Genome Analyzer systems. Genome Biology 12: R112.

Moeller AH, Caro-Quintero A, Mjungu D, Georgiev AV, Lonsdorf EV, Muller MN, Pusey AE, Peeters M, Hahn BH, Ochman H. 2016. Cospeciation of gut microbiota with hominids. Science 353: 380–382.

Moeller AH, Suzuki TA, Lin D, Lacey EA, Wasser SK, Nachman MW. 2017. Dispersal limitation promotes the diversification of the mammalian gut microbiota. Proceedings of the National Academy of Sciences of the United States of America 114: 13768–13773.

Morrison ML. 1984. Influence of sample size and sampling design on analysis of behavior. Condor 86: 146.

Muegge BD, Kuczynski J, Knights D, Clemente JC, Gonzalez A, Fontana L, Henrissat B, Knight R, Gordon JI. 2011. Diet Drives Convergence in Gut Microbiome Functions Across Mammalian Phylogeny and Within Humans. Science 332: 970–974.

Nayfach S, Rodriguez-Mueller B, Garud N, Pollard KS. 2016. An integrated metagenomics pipeline for strain profiling reveals novel patterns of bacterial transmission and biogeography. Genome Research 26: 1612–1625.

Nielsen EE, Bach LA, Kotlicki P. 2006. hybridlab (version 1.0): a program for generating simulated hybrids from population samples. Molecular Ecology Notes 6: 971–973.

O'Connor JA, Dudaniec RY, Kleindorfer S. 2010. Parasite infestation and predation in Darwin's small ground finch: contrasting two elevational habitats between islands. Journal of Tropical Ecology 26: 285–292.

O'Hara AM, Shanahan F. 2006. The gut flora as a forgotten organ. EMBO reports 7: 688–693.

Ochman H, Worobey M, Kuo C-H, Ndjango J-BN, Peeters M, Hahn BH, Hugenholtz P. 2010. Evolutionary Relationships of Wild Hominids Recapitulated by Gut Microbial Communities. PLoS Biology 8: e1000546.

Oksanen J, Blanchet FG, Friendly M, Kindt R, Legendre P, McGlinn D, Minchin PR, OHara RB, Simpson GL, Solymos P, Stevens MHH, Szoecs E, Wagner H. 2017. Vegan: Community ecology package. R package version 2.4-5. 2016.

O’Connor JA, Robertson J, Kleindorfer S. 2010. Video analysis of host–parasite interactions in nests of Darwin’s finches. Oryx 44: 588–594.

O’Connor JA, Robertson J, Kleindorfer S. 2014. Darwin's Finch Begging Intensity Does Not Honestly Signal Need in Parasitised Nests. Ethology 120: 228–237.

O’Connor JA, Sulloway FJ, Robertson J, Kleindorfer S. 2009. Philornis downsi parasitism is the primary cause of nestling mortality in the critically endangered Darwin’s medium tree finch (Camarhynchus pauper). Biodiversity & Conservation 19: 853–866.

Paradis E, Claude J, Strimmer K. 2004. APE: Analyses of Phylogenetics and Evolution in R language. Bioinformatics (Oxford, England) 20: 289–290.

Pavoine S, Dufour A-B, Chessel D. 2004. From dissimilarities among species to dissimilarities among communities: a double principal coordinate analysis. Journal of Theoretical Biology 228: 523–537.

Peters KJ, Kleindorfer S. 2015. Divergent foraging behavior in a hybrid zone: Darwin's tree finches (Camarhynchus spp.) on Floreana Island. Current Zoology 61: 181–190.

Peters KJ, Kleindorfer S. 2017. Avian population trends in Scalesia forest on Floreana Island (2004–2013): Acoustical surveys cannot detect hybrids of Darwin's tree finches Camarhynchus spp. Bird Conservation International 38: 1–17.

Peters KJ, Myers SA, Dudaniec RY, O'Connor JA, Kleindorfer S. 2017. Females drive asymmetrical introgression from rare to common species in Darwin's tree finches. Journal of evolutionary biology 16: 613.

Petren K. 1998. Microsatellite primers from Geospiza fortis and cross-species amplification in Darwin's finches. Molecular Ecology 7: 1782–1784.

Petren K, Grant BR, Grant PR. 1999. A phylogeny of Darwin's finches based on microsatellite DNA length variation. … of the Royal ….

Petren K, Grant PR, Grant BR, Keller LF. 2005. Comparative landscape genetics and the adaptive radiation of Darwin's finches: the role of peripheral isolation. Molecular Ecology 14: 2943–2957.

Phillips CD, Phelan G, Dowd SE, McDonough MM, Ferguson AW, Delton Hanson J, Siles L, Ordóñez-Garza N, San Francisco M, Baker RJ. 2012. Microbiome analysis among bats describes influences of host phylogeny, life history, physiology and geography. Molecular Ecology.

Planer JD, Peng Y, Kau AL, Blanton LV, Ndao IM, Tarr PI, Warner BB, Gordon JI. 2016. Development of the gut microbiota and mucosal IgA responses in twins and gnotobiotic mice. Nature 534: 263–266.

Post DM. 2002. Using stable isotopes to estimate trophic position: Models, methods, and assumptions. Ecology 83: 703–718.

Price MN, Dehal PS, Arkin AP. 2010. FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments. PLoS ONE 5: e9490.

Price TD, Grant PR, Gibbs HL, Boag PT. 1984. Recurrent patterns of natural selection in a population of Darwin's finches. Nature 309: 787–789.

Pritchard JK, Stephens M, Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics 155: 945–959.

Pruesse E, Peplies J, Glöckner FO. 2012. SINA: Accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics (Oxford, England) 28: 1823–1829.

R Core Team. 2014. R: A language and environment for statistical computing. Vienna, Austria. URL http://www.R-project.org/.

R Core Team. 2017. R: A language and environment for statistical computing.

Rakotondranary SJ, Struck U, Knoblauch C, Ganzhorn JU. 2011. Regional, seasonal and interspecific variation in 15N and 13C in sympatric mouse lemurs. Naturwissenschaften 98: 909–917.

Ren T, Kahrl AF, Wu M, Cox RM. 2016. Does adaptive radiation of a host lineage promote ecological diversity of its bacterial communities? A test using gut microbiota of Anolis lizards. Molecular Ecology 25: 4793–4804.

Rosenberg E, Zilber-Rosenberg I. 2013. The hologenome concept: human, animal and plant microbiota.

Russell EM. 2000. Avian life histories: is extended parental care the southern secret? Emu 100: 377–399.

Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, Turner P, Parkhill J, Loman NJ, Walker AW. 2014. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biology 12: 118.

Sanders JG, Powell S, Kronauer DJC, Vasconcelos HL, Frederickson ME, Pierce NE. 2014. Stability and phylogenetic correlation in gut microbiota: lessons from ants and apes. Molecular Ecology 23: 1268–1283.

Sato A, O'hUigin C, Figueroa F, Grant PR, Grant BR, Tichy H, Klein J. 1999. Phylogeny of Darwin's finches as revealed by mtDNA sequences. Proceedings of the National Academy of Sciences of the United States of America 96: 5101–5106.

Sato A, Tichy H, O'hUigin C, Grant PR, Grant BR, Klein J. 2001. On the origin of Darwin's finches. Molecular biology and evolution 18: 299–311.

Schluter D. 2000. The Ecology of Adaptive Radiation. OUP Oxford.

Shapiro SS, Wilk MB. 1965. An analysis of variance test for normality (complete samples). Biometrika 52: 591–611.

Shokralla S, Spall JL, Gibson JF, Hajibabaei M. 2012. Next-generation sequencing technologies for environmental DNA research. Molecular Ecology 21: 1794–1805.

Smith LM, Burgoyne LA. 2004. Collecting, archiving and processing DNA from wildlife samples using FTA® databasing paper. BMC Ecology 4: 4.

Smits SA, Leach J, Sonnenburg ED, Gonzalez CG, Lichtman JS, Reid G, Knight R, Manjurano A, Changalucha J, Elias JE, Dominguez-Bello MG, Sonnenburg JL. 2017. Seasonal cycling in the gut microbiome of the Hadza hunter-gatherers of Tanzania. Science 357: 802–806.

Stapp P, Polis GA, Piñero FS. 1999. Stable isotopes reveal strong marine and El Niño effects on island food webs. Nature 401: 467–469.

Statnikov A, Henaff M, Narendra V, Konganti K, Li Z, Yang L, Pei Z, Blaser MJ, Aliferis CF, Alekseyenko AV. 2013. A comprehensive evaluation of multicategory classification methods for microbiomic data. Microbiome 1: 11.

Steadman DW. 1986. Holocene Vertebrate Fossils from Isla Floreana, Galápagos.

Steeves TE, Maloney RF, Hale ML, Tylianakis JM, Gemmell NJ. 2010. Genetic analyses reveal hybridization but no hybrid swarm in one of the world’s rarest birds. Molecular Ecology 19: 5090–5100.

Sullam KE, Rubin BE, Dalton CM, Kilham SS, Flecker AS, Russell JA. 2015. Divergence across diet, time and populations rules out parallel evolution in the gut microbiomes of Trinidadian guppies. The ISME Journal 9: 1508–1522.

Sulloway FJ, Kleindorfer S. 2013. Adaptive divergence in Darwin's small ground finch (Geospiza fuliginosa): divergent selection along a cline. Biological journal of the Linnean Society. Linnean Society of London 110: 45–59.

Tebbich S, Sterelny K, Teschke I. 2010. The tale of the finch: adaptive radiation and behavioural flexibility. Philosophical Transactions of the Royal Society B: Biological Sciences 365: 1099–1109.

Tebbich S, Taborsky M, Fessl B, Blomqvist D. 2001. Do woodpecker finches acquire tool-use by social learning? Proceedings of the Royal Society of London B 268: 2189–2193.

Tebbich S, Taborsky M, Fessl B, Dvorak M. 2002. The ecology of tool-use in the woodpecker finch (Cactospiza pallida). Ecology Letters 5: 656–664.

Tebbich S, Taborsky M, Fessl B, Dvorak M, Winkler H. 2004. Feeding behavior of four arboreal Darwin's finches: Adaptations to spatial and seasonal variability. Condor 106: 95.

Theis KR, Dheilly NM, Klassen JL, Brucker RM, Baines JF, Bosch TCG, Cryan JF, Gilbert SF, Goodnight CJ, Lloyd EA, Sapp J, Vandenkoornhuyse P, Zilber-Rosenberg I, Rosenberg E, Bordenstein SR, Gilbert JA. 2016. Getting the Hologenome Concept Right: an Eco-Evolutionary Framework for Hosts and Their Microbiomes. mSystems 1: e00028–16.

Turnbaugh PJ, Hamady M, Yatsunenko T, Cantarel BL, Duncan A, Ley RE, Sogin ML, Jones WJ, Roe BA, Affourtit JP, Egholm M, Henrissat B, Heath AC, Knight R, Gordon JI. 2008. A core gut microbiome in obese and lean twins. Nature 457: 480–484.

van Dongen WF, White J, Brandl HB, Moodley Y, Merkling T, Leclaire S, Blanchard P, Danchin É, Hatch SA, Wagner RH. 2013. Age-related differences in the cloacal microbiota of a wild bird species. BMC Ecology 13: 11.

Vähä JP, Primmer CR. 2006. Efficiency of model-based Bayesian methods for detecting hybrid individuals under different hybridization scenarios and with different numbers of loci. Molecular Ecology 15: 63–72.

Vo A-TE, Jedlicka JA. 2014. Protocols for metagenomic DNA extraction and Illumina amplicon library preparation for faecal and swab samples. Molecular ecology resources 14: 1183–1197.

Wang J, Kalyan S, Steck N, Turner LM, Harr B, Künzel S, Vallier M, Häsler R, Franke A, Oberg H-H, Ibrahim SM, Grassl GA, Kabelitz D, Baines JF. 2015. Analysis of intestinal microbiota in hybrid house mice reveals evolutionary divergence in a vertebrate hologenome. Nature Communications 6: 6440.

Whitman WB, Coleman DC, Wiebe WJ. 1998. Prokaryotes: the unseen majority. Proceedings of the National Academy of Sciences of the United States of America 95: 6578–6583.

Wickham H. 2016. ggplot2: elegant graphics for data analysis.

Winter K, Smith JAC. 2012. Crassulacean Acid Metabolism. Berlin, Heidelberg: Springer Science & Business Media.

Woese C. 1987. Bacterial Evolution. Microbiological Reviews 51: 221–271.

Woese CR, Kandler O, Wheelis ML. 1990. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proceedings of the National Academy of Sciences of the United States of America 87: 4576–4579.

Xenoulis PG, Gray PL, Brightsmith D, Palculict B, Hoppes S, Steiner JM, Tizard I, Suchodolski JS. 2010. Molecular characterization of the cloacal microbiota of wild and captive parrots. Veterinary Microbiology 146: 320–325.

Zhang G, Li B, Li C, Gilbert MTP, Jarvis ED, Wang J. 2014. Comparative genomic data of the Avian Phylogenomics Project. GigaScience 3: 1–8.

Zhu C, Miller M, Marpaka S, Vaysberg P, Rühlemann MC, Wu G, Heinsen F-A, Tempel M, Zhao L, Lieb W, Franke A, Bromberg Y. 2018. Functional sequencing read annotation for high precision microbiome analysis. Nucleic acids research 46: e23–e23.

Supplementary Material

Chapter 2 Supplementary Material

Supplementary Figures

[Figure S1: two horizontal stacked bar charts of mean relative abundance (x-axis) by finch species (y-axis: SGF, MGF, LGF, CF, STF, LTF, WPF, VF, WF). Order legend: Actinomycetales, Bacillales, Burkholderiales, Clostridiales, Enterobacteriales, Erysipelotrichales, Lactobacillales, Legionellales, Pseudomonadales, Rhizobiales, Rhodospirillales, Unclassified. Genus legend: Acinetobacter, Catellicoccus, Cellulomonas, Clostridium_XVIII, Corynebacterium, Diplorickettsia, Enterobacter, Enterococcus, Geodermatophilus, Kocuria, Lactobacillus, Methylobacterium, Mycobacterium, Rhodospirillum, Sanguibacter, Unclassified.]

Figure S1. Mean relative abundance of bacterial taxa grouped at varying taxonomic levels.

Relative abundance was calculated for each bacterial ribosomal sequence variant (RSV) according to the total number of sequences per sample. Only taxa with mean relative abundance above 5% were included. A) Mean relative abundance for RSVs grouped at bacterial order. B) Mean relative abundance for RSVs grouped at bacterial genera. Note that taxa labeled as ‘Unclassified’ at these taxonomic levels may be classified at larger taxonomic levels.
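The calculation described in this caption can be sketched as follows. This is a minimal illustration, not the code used in the dissertation: the sample names, taxa, and counts are hypothetical, and the function names are invented for the example. Each count is divided by the total number of sequences in its sample, the per-sample values are averaged, and only taxa whose mean exceeds the 5% threshold are kept.

```python
# Hypothetical sequence counts per sample, already grouped at one taxonomic
# level (illustrative values only, not data from this study).
counts = {
    "sample_1": {"Lactobacillales": 66, "Actinomycetales": 30, "Rhizobiales": 4},
    "sample_2": {"Lactobacillales": 20, "Actinomycetales": 78, "Rhizobiales": 2},
}


def relative_abundance(sample_counts):
    """Divide each taxon count by the total number of sequences in the sample."""
    total = sum(sample_counts.values())
    return {taxon: n / total for taxon, n in sample_counts.items()}


def mean_relative_abundance(all_counts):
    """Average the per-sample relative abundance of every taxon across samples."""
    per_sample = [relative_abundance(c) for c in all_counts.values()]
    taxa = sorted({taxon for ra in per_sample for taxon in ra})
    # A taxon absent from a sample contributes a relative abundance of zero.
    return {
        taxon: sum(ra.get(taxon, 0.0) for ra in per_sample) / len(per_sample)
        for taxon in taxa
    }


def filter_by_mean(mean_ra, threshold=0.05):
    """Keep only taxa whose mean relative abundance is above the threshold."""
    return {taxon: value for taxon, value in mean_ra.items() if value > threshold}


mean_ra = mean_relative_abundance(counts)
kept = filter_by_mean(mean_ra)
print(kept)
```

Normalizing within each sample before averaging keeps samples with large libraries from dominating the mean.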

[Figure S2: ordination scatter plot. x-axis: Axis1 [40.5%]; y-axis: Axis2 [22.2%]. Legends: Habitat (Highland, Lowland) and Species (CF, LGF, LTF, MGF, SGF, STF, VF, WF, WPF).]

Figure S2. Double principal coordinate analysis with all samples showing the outlier warbler finch sample.

This sample was excluded from further analysis.

Figure S3. Beta dispersion of weighted UniFrac distances categorized by (A) habitat and (B) species.

Weighted UniFrac distances were plotted on the first two principal coordinate axes and colored by category. Habitat displayed a significant difference in dispersion while species did not.

Figure S4. Procrustes Analysis of Co-Phylogeny plots.

Central points indicate the principal coordinates of Darwin’s finch gut microbiome samples. A) Arrows point to the principal coordinates of the finch phylogenetic distances. Many microbiome samples are the nearest points to their corresponding finch coordinates (e.g. the right-most microbiome samples are all from warbler finches and are the nearest points to the warbler finch phylogenetic coordinates at 0.25, 0). B) Arrows point to the stable isotope value coordinates. The pattern is less apparent from the microbiome points to the stable isotope values. C) Arrows point to the Euclidean distances between principal components of foraging data across five diet categories.

[Figure S5: horizontal stacked bar chart. y-axis: SGF_Highland, MGF_Highland, STF_Highland, LTF_Highland, WPF_Highland, WF_Highland, SGF_Lowland, MGF_Lowland, LGF_Lowland, CF_Lowland, VF_Lowland. x-axis: Proportion of first foraging observations (0.00 to 1.00). Legend (Food Item): insect, seed, flower, plant, fruit.]

Figure S5. First foraging observations of Darwin’s finch species across both habitats.

Two species, the small ground finch (SGF) and the medium ground finch (MGF), occur in both habitats.

[Figure S6: line plot titled "Beta Diversity Through Time". x-axis: Millions of years (0 to 3000); y-axis: R² (0.05 to 0.25). Legend (Correlated Data): Diet−Foraging, Diet−Stable Isotope, Finch Phylogeny.]

Figure S6. Beta Diversity Through Time with full bacterial timescale.

A) Actual class (rows) by predicted class (columns)

Actual  CF   LGF  LTF  MGF  SGF  STF  VF   WF   WPF  Accuracy
CF      0    1    0    2    3    0    0    0    0    0.00
LGF     2    0    0    4    1    0    0    0    0    0.00
LTF     1    0    0    0    0    1    0    1    0    0.00
MGF     0    3    0    0    2    0    1    0    0    0.00
SGF     2    2    0    0    8    1    0    0    0    0.62
STF     0    0    0    0    2    3    0    1    1    0.43
VF      2    0    0    0    2    0    0    0    0    0.00
WF      0    0    0    0    3    1    0    1    1    0.17
WPF     0    0    0    0    0    1    0    3    1    0.20

B) Actual class (rows) by predicted class (columns)

Actual    Highland  Lowland  Accuracy
Highland  27        2        0.93
Lowland   0         28       1.00

Figure S7. Confusion matrices for the random forest classifier using Darwin’s finch gut microbiome communities with leave-one-out cross-validation.

A) Prediction of finch species. Overall accuracy of the predictions was 0.23 and was equal to the no information rate. B) Prediction of habitat. Overall accuracy was 0.96 with a 95% confidence interval of 0.88 to 1.00. This was significantly higher than the no information rate (0.51, p<0.0001).
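The accuracy figures quoted for Figure S7 can be rechecked directly from the confusion matrices. The following is a minimal sketch, not the analysis code used for the dissertation; the function names are illustrative, and the no information rate is assumed to be the proportion of samples in the largest actual class.

```python
# Rows are actual classes, columns are predicted classes (values from Figure S7).
SPECIES = ["CF", "LGF", "LTF", "MGF", "SGF", "STF", "VF", "WF", "WPF"]
SPECIES_MATRIX = [
    [0, 1, 0, 2, 3, 0, 0, 0, 0],
    [2, 0, 0, 4, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 3, 0, 0, 2, 0, 1, 0, 0],
    [2, 2, 0, 0, 8, 1, 0, 0, 0],
    [0, 0, 0, 0, 2, 3, 0, 1, 1],
    [2, 0, 0, 0, 2, 0, 0, 0, 0],
    [0, 0, 0, 0, 3, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 1, 0, 3, 1],
]
HABITAT_MATRIX = [[27, 2], [0, 28]]  # Highland, Lowland


def per_class_accuracy(matrix):
    """Fraction of each actual class that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def no_information_rate(matrix):
    """Accuracy obtained by always predicting the largest actual class (assumed definition)."""
    total = sum(sum(row) for row in matrix)
    return max(sum(row) for row in matrix) / total


if __name__ == "__main__":
    for label, m in (("species", SPECIES_MATRIX), ("habitat", HABITAT_MATRIX)):
        print(label, round(overall_accuracy(m), 2), round(no_information_rate(m), 2))
```

For the species classifier, 13 of 57 samples fall on the diagonal and the largest class (SGF) also holds 13 samples, which is why overall accuracy and the no information rate coincide.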

[Plot for Figure S8: histograms of the number of samples (0 to 20) in Highland and Lowland panels; x-axis Abundance of discriminative bacteria (0 to 6)]

Figure S8. Abundance of the ribosomal sequence variant with the most importance for classification by habitat, Tetrasphaera.

This RSV is present in almost all of the lowland samples but absent from almost all highland samples. This strong pattern of presence/absence explains why this RSV was so useful in classifying gut microbiome samples as coming from the highland or lowland habitats.

Supplementary Tables

Table S1. Statistical tests on amplicon library size mean and distribution across categorical variables of interest

            Kruskal-Wallis test for    Levene’s test for
            library size mean          library size distribution
Variable    χ²       p-value           F value   p-value
Species     13.17    0.11              0.58      0.79
Habitat     0.11     0.74              0.05      0.82
Sex         0.51     0.48              2.23      0.14
PCRPlate    6.75     0.24              1.52      0.20

Table S2. Relative abundance of bacterial phyla across all samples

Phylum               meanRA  sdRA   minRA  maxRA
Firmicutes           35.4%   32.3%  0.2%   98.3%
Actinobacteria       31.3%   24.2%  0.2%   89.9%
Proteobacteria       27.5%   22.7%  0.7%   93.9%
Unclassified         3.6%    7.2%   0.1%   38.2%
Chloroflexi          0.9%    1.6%   0.0%   8.9%
Tenericutes          0.4%    2.3%   0.0%   17.2%
Acidobacteria        0.3%    0.4%   0.0%   1.6%
Planctomycetes       0.3%    0.3%   0.0%   1.5%
Bacteroidetes        0.2%    0.9%   0.0%   7.0%
Cyanobacteria        0.1%    0.2%   0.0%   1.1%
Spirochaetes         0.1%    0.4%   0.0%   2.8%
Deinococcus-Thermus  0.0%    0.1%   0.0%   0.7%
Verrucomicrobia      0.0%    0.0%   0.0%   0.2%
Chlamydiae           0.0%    0.0%   0.0%   0.0%
Gemmatimonadetes     0.0%    0.0%   0.0%   0.1%
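The meanRA, sdRA, minRA, and maxRA columns summarize per-sample relative abundances. The sketch below shows one way such a summary could be computed; the read counts are hypothetical, the function names are illustrative, and the use of the sample standard deviation is an assumption rather than something stated in the table.

```python
from statistics import mean, stdev


def relative_abundance(counts):
    """Convert one sample's read counts per taxon into percentages."""
    total = sum(counts.values())
    return {taxon: 100.0 * n / total for taxon, n in counts.items()}


def summarize(samples, taxon):
    """Mean, sample SD, minimum, and maximum relative abundance of a taxon."""
    values = [relative_abundance(s).get(taxon, 0.0) for s in samples]
    return {
        "meanRA": mean(values),
        "sdRA": stdev(values),  # assumption: sample (n - 1) standard deviation
        "minRA": min(values),
        "maxRA": max(values),
    }


# Hypothetical read counts for three samples (not data from this study).
samples = [
    {"Firmicutes": 50, "Actinobacteria": 30, "Proteobacteria": 20},
    {"Firmicutes": 10, "Actinobacteria": 60, "Proteobacteria": 30},
    {"Firmicutes": 90, "Actinobacteria": 5, "Proteobacteria": 5},
]
summary = summarize(samples, "Firmicutes")
```

Because relative abundances are computed within each sample before averaging, a phylum that dominates a few samples can show a standard deviation close to its mean, as seen for Firmicutes above.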

Table S3. Relative abundance of bacterial phyla across Darwin’s finch species

Phylum               CF     LGF    SGF    MGF    STF    LTF    WPF    VF     WF
Firmicutes           81.0%  12.6%  53.1%  34.6%  34.2%  35.2%  15.9%  12.4%  18.9%
Actinobacteria       9.8%   53.2%  24.5%  40.6%  38.6%  35.3%  28.5%  13.9%  27.9%
Proteobacteria       7.4%   27.2%  19.6%  15.9%  24.4%  28.8%  48.1%  54.9%  43.7%
Unclassified         1.1%   4.0%   1.5%   6.1%   0.7%   0.3%   1.3%   17.3%  6.0%
Chloroflexi          0.3%   2.2%   0.5%   2.0%   1.3%   0.3%   0.4%   0.1%   1.1%
Tenericutes          0.1%   0.0%   0.0%   0.0%   0.0%   0.0%   3.7%   0.0%   0.9%
Acidobacteria        0.1%   0.3%   0.3%   0.5%   0.4%   0.0%   0.2%   0.3%   0.2%
Planctomycetes       0.1%   0.2%   0.3%   0.3%   0.4%   0.1%   0.3%   0.1%   0.4%
Bacteroidetes        0.1%   0.0%   0.0%   0.0%   0.1%   0.0%   1.5%   0.2%   0.0%
Cyanobacteria        0.0%   0.1%   0.1%   0.0%   0.1%   0.0%   0.1%   0.5%   0.1%
Spirochaetes         0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.7%
Deinococcus-Thermus  0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.0%   0.3%   0.0%

Table S4. Most abundant bacterial genera across all Darwin finch gut microbiome samples above 1 percent mean relative abundance.

Phylum          Genus              meanRA  sdRA   minRA  maxRA
Firmicutes      Lactobacillus      26.2%   32.8%  0.0%   98.2%
Proteobacteria  Acinetobacter      6.4%    16.0%  0.0%   86.0%
Actinobacteria  Kocuria            4.5%    12.2%  0.0%   69.7%
Proteobacteria  Methylobacterium   3.8%    4.9%   0.0%   25.6%
Actinobacteria  Mycobacterium      2.6%    6.6%   0.0%   51.3%
Actinobacteria  Nocardioides       1.8%    1.9%   0.0%   9.0%
Actinobacteria  Geodermatophilus   1.8%    3.2%   0.0%   16.3%
Proteobacteria  Diplorickettsia    1.7%    11.3%  0.0%   89.5%
Actinobacteria  Solirubrobacter    1.7%    2.1%   0.0%   9.1%
Firmicutes      Enterococcus       1.7%    4.6%   0.0%   22.6%
Actinobacteria  Cellulomonas       1.4%    3.1%   0.0%   21.3%
Proteobacteria  Enterobacter       1.3%    4.8%   0.0%   29.8%
Actinobacteria  Curtobacterium     1.3%    2.6%   0.0%   13.5%
Proteobacteria  Rhodospirillum     1.3%    9.9%   0.0%   78.9%
Proteobacteria  Rhizobium          1.2%    1.6%   0.0%   7.7%
Firmicutes      Clostridium_XVIII  1.1%    8.4%   0.0%   67.0%
Actinobacteria  Corynebacterium    1.0%    5.0%   0.0%   39.6%

Table S5. Most abundant classified bacterial genus for each Darwin finch species

Darwin finch species  Bacterial genus  meanRA  sdRA   minRA  maxRA
SGF                   Lactobacillus    49.7%   34.3%  0.2%   95.7%
MGF                   Lactobacillus    27.3%   34.8%  0.1%   84.5%
LGF                   Kocuria          22.5%   26.5%  2.0%   69.7%
CF                    Lactobacillus    77.8%   16.3%  56.5%  98.2%
STF                   Lactobacillus    11.7%   20.3%  0.1%   61.8%
LTF                   Lactobacillus    26.1%   13.2%  14.0%  40.2%
WPF                   Rhodospirillum   15.8%   35.3%  0.0%   78.9%
VF                    Acinetobacter    16.5%   29.9%  1.3%   61.4%
WF                    Diplorickettsia  14.0%   33.4%  0.1%   89.5%

Table S6. Alpha diversity estimates by Darwin’s finch species

Species  Observed RSVs mean  Observed RSVs SE  Chao1 mean  Chao1 SE  PD mean  PD SE
CF       522.17              151.25            701.04      188.83    52.09    10.04
LGF      796.63              112.25            1023.84     110.62    66.04    6.54
SGF      622.46              113.56            853.22      131.87    53.60    6.08
WF       575.00              70.67             729.71      74.16     59.64    4.99
STF      777.50              88.13             962.39      101.93    68.51    6.56
MGF      732.43              141.23            928.88      151.93    59.02    7.29
VF       471.25              73.26             623.35      84.95     58.62    4.80
LTF      475.67              151.15            684.07      170.49    44.06    8.20
WPF      997.20              99.30             1228.33     127.62    80.87    5.58
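For readers unfamiliar with the Chao1 column, the estimator adds a correction for unseen variants to the observed richness, based on how many variants were seen exactly once or twice. The sketch below uses the bias-corrected form of the estimator; which form the software used for Table S6 applied is not stated here, so this is an illustration only, and the counts are hypothetical.

```python
from collections import Counter


def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)              # observed richness (number of variants present)
    freq = Counter(observed)
    f1, f2 = freq[1], freq[2]          # singletons and doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


# Hypothetical read counts per sequence variant in one sample.
example_counts = [10, 5, 1, 1, 1, 2, 2, 0]
estimate = chao1(example_counts)
```

With seven variants present, three singletons, and two doubletons, the estimate exceeds the observed richness by one, mirroring how the Chao1 means in Table S6 sit above the observed RSV means.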

Table S7. Anova on alpha diversity metrics across species

Diversity metric        F value  p value
Observed RSVs           1.58     0.15
Chao1                   1.50     0.18
Phylogenetic diversity  1.67     0.13

Table S8. Permanova tests of weighted UniFrac distances with categorical variables of interest

Variable          F     R2    p
Habitat*          12.1  0.15  0.03
Habitat:Species*  1.6   0.21  0.03
Sex               0.97  0.02  0.41
PCRPlate          0.78  0.01  0.56

* Species is nested within habitat because only the small ground finch is present in both habitats

Table S9. Post hoc pairwise anova with weighted UniFrac distances between Darwin’s finch species

Pairs       F.Model  R2    p-value  p-adjusted
WF vs VF    6.46     0.45  0.005    0.180
WF vs STF   1.95     0.12  0.020    0.720
WF vs SGF   1.75     0.09  0.130    1.000
WF vs LGF   4.66     0.28  0.001    *0.036
WF vs LTF   2.49     0.26  0.052    1.000
WF vs MGF   2.79     0.20  0.013    0.468
WF vs WPF   1.75     0.16  0.025    0.900
WF vs CF    3.49     0.26  0.005    0.180
VF vs STF   5.28     0.31  0.001    *0.036
VF vs SGF   2.43     0.14  0.047    1.000
VF vs LGF   4.28     0.30  0.006    0.216
VF vs LTF   1.88     0.27  0.046    1.000
VF vs MGF   3.07     0.25  0.044    1.000
VF vs WPF   6.37     0.48  0.007    0.252
VF vs CF    1.12     0.12  0.306    1.000
STF vs SGF  1.78     0.08  0.096    1.000
STF vs LGF  4.64     0.22  0.001    *0.036
STF vs LTF  2.23     0.17  0.048    1.000
STF vs MGF  2.71     0.15  0.017    0.612
STF vs WPF  1.65     0.11  0.083    1.000
STF vs CF   3.76     0.21  0.002    0.072
SGF vs LGF  2.04     0.10  0.075    1.000
SGF vs LTF  0.90     0.06  0.398    1.000
SGF vs MGF  1.22     0.06  0.298    1.000
SGF vs WPF  1.85     0.10  0.088    1.000
SGF vs CF   1.40     0.08  0.212    1.000
LGF vs LTF  3.37     0.27  0.023    0.828
LGF vs MGF  0.67     0.05  0.727    1.000
LGF vs WPF  3.73     0.25  0.004    0.144
LGF vs CF   1.88     0.14  0.107    1.000
LTF vs MGF  1.72     0.18  0.155    1.000
LTF vs WPF  2.97     0.33  0.049    1.000
LTF vs CF   1.13     0.14  0.271    1.000
MGF vs WPF  2.27     0.19  0.039    1.000
MGF vs CF   1.22     0.10  0.261    1.000
WPF vs CF   3.20     0.26  0.008    0.288
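The p-adjusted column in Table S9 is consistent with a Bonferroni correction over the 36 pairwise comparisons among nine species (for example, 0.005 × 36 = 0.180), although the correction method is not named in the table. A minimal sketch under that assumption, with illustrative function names:

```python
from math import comb


def bonferroni(p_values, n_tests=None):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


N_SPECIES = 9
n_pairs = comb(N_SPECIES, 2)  # number of unordered species pairs

# First four unadjusted p-values from Table S9 (WF vs VF, STF, SGF, LGF).
raw_p = [0.005, 0.020, 0.130, 0.001]
adjusted = [round(p, 3) for p in bonferroni(raw_p, n_tests=n_pairs)]
```

Under this correction only unadjusted p-values of 0.001 remain below 0.05, which matches the three starred comparisons.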

Supplementary Methods

Morphological measurements

Eight morphological measurements and body mass were taken for all finches sampled: (1) beak-head (beak tip to back of head), (2) beak-naris (beak tip to anterior end of the naris), (3) beak-feather (tip of beak to feather line), (4) beak depth (at the base of the beak), (5) beak width (at the base of the beak), (6) naris diameter (taken from extremes of naris opening), (7) tarsus length, (8) wing length, and (9) body mass. Dial calipers were used to take morphological measurements to the nearest 0.01 mm and Telinga electronic scales were used to measure mass to the nearest 0.01 g.

Amplification conditions

For each fecal DNA sample, triplicate 25 µl PCR reactions were performed containing 12.5 µl master mix, 9.5 µl molecular grade water, 0.5 µl of 10 µM stock for each primer, and 2 µl of DNA template. PCR conditions consisted of initial denaturation at 94°C for 5 min followed by 20 cycles of 98°C for 20 s, 55°C for 15 s, 72°C for 40 s, and a final extension at 72°C for 5 min.
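As a quick arithmetic check on the reaction recipe, the component volumes should sum to the stated 25 µl, and the final primer concentration follows from the dilution relation C1·V1 = C2·V2. The sketch below assumes the primer stocks are 10 µM (the unit prefix is an assumption here) and uses illustrative names throughout.

```python
# Component volumes per reaction, in microlitres (values from the text above).
REACTION_UL = {
    "master_mix": 12.5,
    "water": 9.5,
    "forward_primer": 0.5,
    "reverse_primer": 0.5,
    "dna_template": 2.0,
}


def total_volume_ul(components):
    """Sum of all component volumes."""
    return sum(components.values())


def final_concentration_um(stock_um, volume_ul, total_ul):
    """Dilution relation C1 * V1 = C2 * V2, solved for C2."""
    return stock_um * volume_ul / total_ul


total = total_volume_ul(REACTION_UL)
# Assumption: each primer stock is 10 micromolar.
primer_final = final_concentration_um(10.0, REACTION_UL["forward_primer"], total)
```

Under the 10 µM assumption, each primer ends up at 0.2 µM in the 25 µl reaction.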

All PCR products were purified using 0.66X Aline PCRClean DX (Aline Biosciences, Woburn, MA) to size select for the ~450 bp PCR product. Purified PCR products were visualized and quantified using High Sensitivity D1000 ScreenTape on an Agilent 2200 TapeStation (Agilent, Santa Clara, CA) and pooled in equimolar concentrations for sequencing on a single MiSeq run (Illumina, USA) using v2 chemistry and 2 x 250-bp paired-end reads at the Harvard Biopolymers Facility (Boston, MA).

Chapter 3 Supplementary Material

Supplementary Figures

[Plot for Figure S9: histogram titled Beak length naris; x-axis Beak length naris (mm), 7.0 to 9.0; y-axis Count, 0 to 4]

Figure S9. Putative population assignment of Floreana tree finches based on the beak length-naris measurement.

Individuals to the right of the threshold at 8.2 mm were classified as C. pauper phenotype while individuals to the left of the threshold were classified as C. parvulus phenotype.
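The putative assignment rule in Figure S9 amounts to a single cutoff on one measurement. A minimal sketch follows; the function name is illustrative, and how a bird measuring exactly 8.2 mm was handled is not stated, so the tie rule below is an assumption.

```python
THRESHOLD_MM = 8.2  # beak length-naris threshold from Figure S9


def putative_phenotype(beak_length_naris_mm, threshold=THRESHOLD_MM):
    """Assign a putative phenotype from the beak length-naris measurement.

    Assumption: a value exactly at the threshold is treated as C. parvulus.
    """
    if beak_length_naris_mm > threshold:
        return "C. pauper"
    return "C. parvulus"


# Hypothetical measurements in millimetres.
assignments = [putative_phenotype(x) for x in (7.5, 8.0, 8.6)]
```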

Figure S10. Plot of the mean log likelihood of each model across 10 replicate runs of structure with K=1-4.

The delta K method (Evanno et al. 2005) was applied to 10 replicate runs of structure (Pritchard et al. 2000) for each of K=1-4 with a burn-in of 100,000 and a post burn-in chain length of 500,000. The delta K was calculated using STRUCTURE HARVESTER (Earl and vonHoldt 2011). K=2 had the highest average log likelihood and the highest delta K value.

Figure S11. Genetic cluster assignment based on a membership coefficient threshold of 0.75.

Membership coefficients were averaged across 20 runs of STRUCTURE with K=2. The threshold for genetic cluster assignment was determined by overall performance with simulations. A) Combined dataset of individuals genotyped in this study and those in (Peters et al. 2017). B) Cluster assignment of individuals with microbiome samples from this study.
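With K=2, threshold-based assignment can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the two membership coefficients are taken to sum to one, individuals at or above the threshold for either cluster are assigned to that cluster, and everyone else is treated as admixed. The function name and labels are hypothetical.

```python
def assign_cluster(q_cluster1, threshold=0.75,
                   labels=("cluster 1", "cluster 2", "admixed")):
    """Assign an individual from its averaged membership coefficient for cluster 1.

    Assumptions: K=2, so membership in cluster 2 is 1 - q_cluster1, and a value
    exactly at the threshold counts as assigned.
    """
    q_cluster2 = 1.0 - q_cluster1
    if q_cluster1 >= threshold:
        return labels[0]
    if q_cluster2 >= threshold:
        return labels[1]
    return labels[2]


# Hypothetical averaged membership coefficients.
calls = [assign_cluster(q) for q in (0.90, 0.50, 0.10)]
```

Raising the threshold moves more individuals into the admixed class, which is the trade-off evaluated with simulations when choosing 0.75.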

Figure S12. Principal coordinate analysis plots for the bacterial species with enough coverage to call single nucleotide variants across all three genetic clusters.

None of the bacterial species show clustering based on the genetic cluster of the host individual.

[Plot for Figure S13: x-axis NMDS1 (−1.0 to 1.0); y-axis NMDS2 (−1 to 1); legend (Species): STF, HTF, MTF]

Figure S13. Nonmetric multi-dimensional scaling (NMDS) of metagenomic samples based on the number of reads categorized to each Enzyme Commission number by mi-faser.

Point color corresponds to the genetic cluster of the metagenomic sample. NMDS did not reach convergence after 100 iterations.

Figure S14. KEGG pathways represented by Enzyme Commission numbers within the top 5% of mean coverage across Darwin’s tree finch metagenomic samples.

The counts of EC numbers associated with each pathway are plotted. The top 5% of EC numbers were mapped to KEGG pathways using KEGG Mapper.

[Plot for Figure S14: horizontal bar chart; x-axis Count of E.C. numbers (0 to 40); pathways in plotted order: Metabolic pathways; Biosynthesis of secondary metabolites; Biosynthesis of antibiotics; Microbial metabolism in diverse environments; Glyoxylate and dicarboxylate metabolism; Purine metabolism; Pyruvate metabolism; Carbon fixation pathways in prokaryotes; Citrate cycle (TCA cycle); Glycolysis / Gluconeogenesis; Propanoate metabolism; Cysteine and methionine metabolism; Glycine, serine and threonine metabolism; Valine, leucine and isoleucine degradation; Pyrimidine metabolism; Methane metabolism; Fatty acid biosynthesis; Drug metabolism − other enzymes; Tryptophan metabolism; Arginine biosynthesis; Aminoacyl−tRNA biosynthesis; Pentose phosphate pathway; Butanoate metabolism; C5−Branched dibasic acid metabolism; Nitrogen metabolism; Biotin metabolism; Alanine, aspartate and glutamate metabolism; One carbon pool by folate; Fatty acid degradation; Starch and sucrose metabolism; Carbon fixation in photosynthetic organisms; Atrazine degradation; Phenylalanine metabolism; Terpenoid backbone biosynthesis; Oxidative phosphorylation; Acarbose and validamycin biosynthesis; Lysine biosynthesis; Phenylpropanoid biosynthesis; Phenylalanine, tyrosine and tryptophan biosynthesis; Thiamine metabolism; Pantothenate and CoA biosynthesis; Polyketide sugar unit biosynthesis; Cyanoamino acid metabolism; Nicotinate and nicotinamide metabolism; Sulfur metabolism; Streptomycin biosynthesis; Benzoate degradation; Monobactam biosynthesis; Synthesis and degradation of ketone bodies; Photosynthesis; Lysine degradation; Valine, leucine and isoleucine biosynthesis; Glutathione metabolism; Biosynthesis of ansamycins; Amino sugar and nucleotide sugar metabolism; Porphyrin and chlorophyll metabolism; Glycerolipid metabolism; Taurine and hypotaurine metabolism; Biosynthesis of unsaturated fatty acids; Selenocompound metabolism]

Supplementary Tables

Table S10. PCR primer sequences for multiplex microsatellite genotyping and population genetic statistics calculated on the combined dataset used to assign individuals to population clusters (n=384).

Locus  Core repeat  No. of alleles  Size range  HO    HE    Direction  Primer sequence (5'–3')      5' Dye
Gf01   (AC)23       19              159–191     0.74  0.9   F          TAGCATTTCTATGTAGTGTTATTTTAA  6FAM
                                                            R          TTTATTTATGTTCATATAAACTGCATG
Gf03   (GA)17       12              220–250     0.52  0.72  F          CCAGCTTAAAGCCAGCACTTCC       NED
                                                            R*         ATGCAGTAATCAGTTAAGGATGACAAA
Gf04   (AC)11       5               230–260     0.11  0.18  F*         TTCTGTTTGTAGAGTTCTGGTTT      PET
                                                            R          TTTTCTTATATCTATTGAGAGATGGT
Gf05   (AC)14       6               200–218     0.58  0.72  F          AAACACTGGGAGTGAAGTCT         VIC
                                                            R          AACTATTCTGTGATCCTGTTACAC
Gf06   (AC)13,6     5               175–179     0.09  0.14  F          GCTATTGAGCTAACTAAATAAACAACT  6FAM
                                                            R          CACAAATAGTAATTAAAAGGAAGTACC
Gf07   (AC)22       8               270–310     0.36  0.51  F*         GACTCATCTGTGTGTAACTGGG       NED
                                                            R          GCACTATCTCACAGTGATATCTAAAAT
Gf11   (AC)28       13              168–224     0.43  0.66  F          GTGCTATCAGCGAGGCATTTC        VIC
                                                            R          AGGAGGATTTGGCTGACTGG
Gf12   (AC)17       16              167–193     0.78  0.87  F          AATCCTTCTCGTCCCTCTTGG        6FAM
                                                            R          TTTGAGTGTGCAGCAGTTGG
Gf13   (AC)14       17              149–177     0.59  0.73  F          TCCCCCGTGAAAAGTGGAGC         NED
                                                            R          CAACACAATTGCAATATCGATTCCC

* These primers were redesigned from Petren (1998) to facilitate multiplex fragment analysis (Galligan et al. 2012).

Table S11. Evanno method applied to 10 replicates of structure each for K=1-4 produced by structure harvester (Earl and vonHoldt 2011).

K  Reps  Mean LnP(K)  Stdev LnP(K)  Ln'(K)   |Ln''(K)|  Delta K
1  10    -8416.61     0.06          NA       NA         NA
2  10    -8268.98     13.62         147.63   295.87     21.73
3  10    -8417.22     55.30         -148.24  426.75     7.72
4  10    -8992.21     333.00        -574.99  NA         NA
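The derived columns of Table S11 follow from the mean and standard deviation of LnP(K): Ln'(K) is the difference between consecutive means, |Ln''(K)| is the absolute difference between consecutive Ln'(K) values, and Delta K divides |Ln''(K)| by the standard deviation at K. The sketch below recomputes them from the rounded table values; it is an illustration with hypothetical names, not STRUCTURE HARVESTER itself, and it assumes consecutive integer values of K.

```python
def evanno(mean_lnp, sd_lnp):
    """Return {K: (Ln'(K), |Ln''(K)|, Delta K)} for interior values of K.

    Assumes K takes consecutive integer values. The smallest and largest K
    have no Delta K because the second difference is undefined there.
    """
    ks = sorted(mean_lnp)
    first = {k: mean_lnp[k] - mean_lnp[k - 1] for k in ks[1:]}
    result = {}
    for k in ks[1:-1]:
        second = abs(first[k + 1] - first[k])
        result[k] = (first[k], second, second / sd_lnp[k])
    return result


# Rounded values from Table S11.
MEAN_LNP = {1: -8416.61, 2: -8268.98, 3: -8417.22, 4: -8992.21}
SD_LNP = {1: 0.06, 2: 13.62, 3: 55.30, 4: 333.00}

stats = evanno(MEAN_LNP, SD_LNP)
best_k = max(stats, key=lambda k: stats[k][2])
```

Because the inputs are rounded to two decimals, the recomputed Delta K values agree with the table only to within rounding, but the ranking of K is unchanged.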

Table S12. Overall performance, efficiency, and accuracy for each membership coefficient (qi) threshold determined by 10 simulated datasets.

                        Threshold value for qi
                        0.75        0.80        0.85
Overall performance, %
  C. parvulus           70.1        14.8        9.7
  Hybrid                63.1        69.3        4.9
  C. pauper             88.7        89.0        75.6
Accuracy, % (SE)
  C. parvulus           91.7 (1.7)  100.0 (0)   9.7 (0.1)
  Hybrid                75.2 (1.9)  72.2 (0.7)  57.7 (1.9)
  C. pauper             90.0 (1.1)  90.7 (1.6)  100.0 (0)
Efficiency, % (SE)
  C. parvulus           76.5 (2.7)  14.8 (2.9)  100.0 (0)
  Hybrid                83.9 (2.4)  96.0 (0.7)  8.5 (0.7)
  C. pauper             98.5 (0.4)  98.0 (0.6)  75.6 (2.2)
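The tabulated values are consistent with overall performance being the product of accuracy and efficiency (for example, 91.7% × 76.5% ≈ 70.1%). The definition is not stated in the table itself, so the sketch below treats that relationship as an assumption and simply rechecks the 0.75 column.

```python
def overall_performance(accuracy_pct, efficiency_pct):
    """Assumed definition: product of accuracy and efficiency, in percent."""
    return accuracy_pct * efficiency_pct / 100.0


# (accuracy %, efficiency %, reported overall performance %) at threshold 0.75.
THRESHOLD_075 = {
    "C. parvulus": (91.7, 76.5, 70.1),
    "Hybrid": (75.2, 83.9, 63.1),
    "C. pauper": (90.0, 98.5, 88.7),
}

recomputed = {
    group: overall_performance(acc, eff)
    for group, (acc, eff, _reported) in THRESHOLD_075.items()
}
```

Small differences from the reported values are expected because the accuracy and efficiency inputs are themselves rounded.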

Table S13. Statistical tests on amplicon library size mean and distribution across categorical variables of interest

            Kruskal-Wallis test for    Levene’s test for
            library size mean          library size distribution
Variable    χ²       p-value           F value   p-value
Species     0.34     0.84              0.85      0.44
Sex         0.42     0.52              4.22      0.05
PCRPlate    5.85     0.32              0.17      0.97

Table S14. Input DNA amounts, adapter concentrations, and PCR cycles for metagenomic library preparation

FecalTube  InputDNAng  Bin  Adapter_Conc_uM  PCR_cycles  Species
X210       2.139       2.5  0.75             10          HTF
X200       2.259       2.5  0.75             10          STF
X213       2.28        2.5  0.75             10          STF
X151       2.541       2.5  0.75             10          STF
X209       2.601       2.5  0.75             10          HTF
X166       3.24        2.5  0.75             10          STF
X203       3.48        2.5  0.75             10          MTF
X149       3.96        5    1.5              9           HTF
X134       4.29        5    1.5              9           STF
X188       4.77        5    1.5              9           HTF
X123       4.83        5    1.5              9           STF
X150       4.95        5    1.5              9           STF
X130       5.19        5    1.5              9           STF
X152       5.28        5    1.5              9           STF
X126       5.43        5    1.5              9           HTF
X204       5.85        5    1.5              9           HTF
X216       6.21        5    1.5              9           HTF
X211       6.27        5    1.5              9           MTF
X153       6.57        5    1.5              9           HTF
X167       7.53        10   3                7           MTF
X174       8.16        10   3                7           MTF
X214       13.14       10   3                7           STF
X202       14.46       10   3                7           HTF
X155       15.09       10   3                7           STF
X173       16.5        25   7.5              6           STF
X205       16.59       25   7.5              6           STF
X125       21.39       25   7.5              6           STF
X136       24.39       25   7.5              6           HTF
X137       165.3       100  15               2           HTF

Table S15. Pairwise tests of linkage disequilibrium

Marker1  Marker2  T2      df   P-value
Gf01     Gf03     188.79  228  9.7E-01
Gf01     Gf04     186.45  95   6.5E-08
Gf01     Gf05     156.11  114  5.4E-03
Gf01     Gf06     160.53  95   3.1E-05
Gf01     Gf07     214.98  152  5.9E-04
Gf01     Gf11     251.90  247  4.0E-01
Gf01     Gf12     336.48  304  9.7E-02
Gf01     Gf13     371.22  323  3.3E-02
Gf03     Gf04     20.98   60   1.0E+00
Gf03     Gf05     79.23   72   2.6E-01
Gf03     Gf06     91.71   60   5.3E-03
Gf03     Gf07     143.26  96   1.3E-03
Gf03     Gf11     191.37  156  2.8E-02
Gf03     Gf12     162.51  192  9.4E-01
Gf03     Gf13     163.37  204  9.8E-01
Gf04     Gf05     30.42   30   4.4E-01
Gf04     Gf06     228.68  25   8.4E-35
Gf04     Gf07     26.21   40   9.5E-01
Gf04     Gf11     48.98   65   9.3E-01
Gf04     Gf12     421.50  80   7.5E-48
Gf04     Gf13     72.90   85   8.2E-01
Gf05     Gf06     20.51   30   9.0E-01
Gf05     Gf07     66.59   48   3.9E-02
Gf05     Gf11     132.46  78   1.2E-04
Gf05     Gf12     102.92  96   3.0E-01
Gf05     Gf13     124.12  102  6.7E-02
Gf06     Gf07     116.18  40   2.3E-09
Gf06     Gf11     33.48   65   1.0E+00
Gf06     Gf12     209.81  80   1.4E-13
Gf06     Gf13     214.80  85   3.3E-13
Gf07     Gf11     102.80  104  5.1E-01
Gf07     Gf12     167.32  128  1.1E-02
Gf07     Gf13     102.89  136  9.8E-01
Gf11     Gf12     247.46  208  3.2E-02
Gf11     Gf13     185.86  221  9.6E-01
Gf12     Gf13     296.06  272  1.5E-01

Table S16. Statistical tests of difference in morphological traits between genetic clusters.

               ANOVA              Tukey post hoc p-values
               F value  p-value   HTF-STF  HTF-MTF  MTF-STF
Tarsus length  7.71     0.002     0.224    0.038    0.002
Beak depth     2.8      0.079     0.484    0.332    0.071

               Kruskal-Wallis     Dunn's test adjusted p-values
               χ²       p-value   HTF-STF  HTF-MTF  MTF-STF
Beak-head      6.64     0.04      0.381    0.136    0.016
Beak-feather   9.81     0.01      0.338    0.042    0.003
Beak-naris     11.84    0         0.129    0.055    0.001
Beak width     6.56     0.04      0.335    0.161    0.017
Wing length    6.18     0.05      0.373    0.17     0.021

Table S17. Relative abundance (%) of bacterial phyla across all samples

Phylum               meanRA  sdRA  minRA  maxRA
Firmicutes           47.2    35.5  2.9    98.5
Actinobacteria       28.2    24.6  0.4    85.8
Proteobacteria       22.5    21.8  0.3    77.1
Unclassified         0.8     1.3   0.0    5.4
Chlamydiae           0.3     1.8   0.0    9.5
Chloroflexi          0.3     0.6   0.0    2.8
Acidobacteria        0.2     0.6   0.0    2.8
Planctomycetes       0.2     0.3   0.0    1.2
Deinococcus-Thermus  0.0     0.1   0.0    0.3
Tenericutes          0.0     0.1   0.0    0.5
Cyanobacteria        0.0     0.0   0.0    0.1
Verrucomicrobia      0.0     0.0   0.0    0.2
Bacteroidetes        0.0     0.0   0.0    0.1

Table S18. Relative abundance (%) of bacterial phyla by genetic cluster in Darwin’s tree finches from Floreana

Phylum          STF   HTF   MTF
Firmicutes      42.1  38.6  88.9
Actinobacteria  30.9  32.4  7.5
Proteobacteria  25.0  26.5  2.9
Unclassified    1.1   0.6   0.6
Chloroflexi     0.4   0.4   0.0
Acidobacteria   0.3   0.3   0.0
Planctomycetes  0.2   0.2   0.0
Chlamydiae      0.0   0.9   0.0
Tenericutes     0.0   0.1   0.0

Table S19. Relative abundance (%) of bacterial genera by genetic cluster in Darwin’s tree finches from Floreana

Genus              STF   HTF   MTF
Lactobacillus      35.7  29.6  88.5
Kocuria            11.9  6.1   3.6
Acinetobacter      6.3   5.1   0.0
Methylobacterium   5.9   5.9   1.0
Enterococcus       3.8   5.7   0.0
Salmonella         3.6   0.5   0.0
Nocardioides       2.1   1.6   0.2
Pseudonocardia     1.6   1.1   0.1
Actinomycetospora  1.3   1.5   0.2

Table S20. Ribosomal sequence variants with significant differences in abundance between Floreana tree finch genetic clusters

RSV  baseMean  log2FoldChange  lfcSE  stat   padj      Phylum          Order              Family               Genus             compare
1    31.43     -22.07          3.81   -5.80  6.55E-06  Firmicutes      Clostridiales      Unclassified         Unclassified      MTF-STF
1    31.43     21.20           3.91   5.42   2.21E-06  Firmicutes      Clostridiales      Unclassified         Unclassified      HTF-MTF
2    131.08    -8.19           1.91   -4.30  1.01E-02  Proteobacteria  Enterobacteriales  Enterobacteriaceae   Unclassified      MTF-STF
2    131.08    6.25            1.95   3.20   3.39E-02  Proteobacteria  Enterobacteriales  Enterobacteriaceae   Unclassified      HTF-MTF
3    489.51    11.45           2.09   5.49   3.05E-05  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     MTF-STF
3    489.51    -11.70          2.14   -5.46  2.21E-06  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     HTF-MTF
4    148.65    28.61           4.44   6.45   3.41E-07  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     MTF-STF
4    148.65    -32.08          4.58   -7.01  3.59E-10  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     HTF-MTF
5    1061.30   10.57           1.73   6.12   1.43E-06  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     MTF-STF
5    1061.30   -10.37          1.77   -5.84  3.85E-07  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     HTF-MTF
6    159.15    11.71           2.92   4.02   2.91E-02  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     MTF-STF
6    159.15    -12.40          3.02   -4.11  1.18E-03  Firmicutes      Lactobacillales    Lactobacillaceae     Lactobacillus     HTF-MTF
7    10.45     -7.40           2.41   -3.07  4.47E-02  Firmicutes      Lactobacillales    Unclassified         Unclassified      HTF-MTF
8    65.17     -5.85           1.94   -3.01  4.47E-02  Actinobacteria  Actinomycetales    Corynebacteriaceae   Corynebacterium   HTF-MTF
9    43.73     3.70            1.23   3.00   4.47E-02  Proteobacteria  Rhizobiales        Methylobacteriaceae  Methylobacterium  HTF-MTF
10   2.17      -17.94          2.64   -6.80  3.02E-08  Actinobacteria  Actinomycetales    Nocardiaceae         Williamsia        HTF-STF

Table S21. Bacterial reference genomes with the most coverage across Darwin’s tree finch metagenomic samples.

Bacterial species                           mean  med  se
Curtobacterium_sp._UNCCL17                  12.2  2.5  4.9
Acinetobacter_junii                         11.5  0.0  9.4
Enterococcus_casseliflavus_EC20             11.2  0.0  6.6
Microbacterium_sp._TS-1                     9.0   1.6  4.5
Microbacterium_testaceum_StLB037            8.2   5.9  1.8
Methylobacterium_radiotolerans_JCM_2831     6.4   4.4  1.2
Klebsiella_oxytoca                          5.1   0.0  4.9
Pantoea_sp._YR343                           4.8   0.3  2.0
Bradyrhizobium_sp._DFCI-1                   4.5   3.7  0.7
Lactobacillus_saerimneri_30a                4.5   0.4  2.0
Lactobacillus_salivarius                    4.4   0.6  1.6
Curtobacterium_sp._S6                       4.3   1.1  1.3
Pantoea_sp._Sc1                             4.0   0.0  3.2
Actinomycetospora_chiangmaiensis_DSM_45062  4.0   2.9  0.6

Table S22. Stable isotope (δ13C and δ15N) ratios for Darwin’s tree finch species from Floreana

Species  Habitat  δ13C mean (‰)  δ13C SD (‰)  δ15N mean (‰)  δ15N SD (‰)
STF      H        -26.9          0.6          8.3            1.5
HTF      H        -26.3          1.5          8.0            1.1
MTF      H        -26.5          1.2          8.6            0.5

Supplementary Methods

Morphological measurements

Eight morphological measurements and body mass were taken for all finches sampled: (1) beak-head (beak tip to back of head), (2) beak-naris (beak tip to anterior end of the naris), (3) beak-feather (tip of beak to feather line), (4) beak depth (at the base of the beak), (5) beak width (at the base of the beak), (6) naris diameter (taken from extremes of naris opening), (7) tarsus length, (8) wing length, and (9) body mass. Dial calipers were used to take morphological measurements to the nearest 0.01 mm and Telinga electronic scales were used to measure mass to the nearest 0.01 g.

Chapter 4 Supplementary Material

Supplementary Figures

[Plot for Figure S15: stacked bars of mean relative abundance (0.00 to 0.75) for SGF, MGF, CF, STF, HTF, MTF; legend (Genus): Acinetobacter, Enterococcus, Helicobacter, Kocuria, Lactobacillus, Methylobacterium, Unclassified]

Figure S15. Mean relative abundance of bacterial genera in Darwin’s finch microbiome samples from Floreana.

Only bacterial genera with mean relative abundance greater than 5% for a given finch species are shown.

[Plot for Figure S16: panels for SGF, MGF, CF, STF, HTF, MTF; x-axis Axis1 [37.3%]; y-axis Axis2 [22.3%]; legends: Habitat (Highland, Lowland) and Species (SGF, MGF, CF, STF, HTF, MTF)]

Figure S16. Double principal coordinate analysis of Darwin’s finch microbiome samples faceted by species.

Sample shape and color correspond to habitat and species, respectively.

[Plot for Figure S17: x-axis Axis1 [45.2%]; y-axis Axis2 [21.5%]; legend (Age): adult, nestling]

Figure S17. DPCoA plot of small ground finch microbiome samples from the lowlands with adults vs nestlings.

The three nestling samples are tightly clustered but not differentiable from the adults (PERMANOVA with weighted UniFrac p=0.17).

Figure S18. δ13C and δ15N stable isotope measurements for Darwin’s finch species across Santa Cruz and Floreana islands.

Point color and point shape indicate host species and habitat, respectively. The four species that are present on both islands are plotted. A) Individual δ13C and δ15N values for each finch with gut microbiome samples. B) Mean δ13C and δ15N values for each species and habitat with standard deviation. Three points were the only sample from that habitat and species combination and therefore lack standard deviation error bars: the medium ground finch in the highlands and the small tree finch in the lowlands on Santa Cruz in addition to the cactus finch in the highlands on Floreana.

Supplementary Tables

Table S23. Statistical tests on amplicon library size mean and distribution across categorical variables of interest

            Kruskal-Wallis test for    Levene’s test for
            library size mean          library size distribution
Variable    χ²       p-value           F value   p-value
Species     2.79     0.73              0.40      0.85
Habitat     0.04     0.84              1.40      0.24
Sex         10.46    0.03              0.16      0.96

Table S24. Summary of samples used for inter-island comparison

                                                  Floreana    Santa Cruz
Common Name          Abb.  Scientific Name        H     L     H     L     Total
Small Ground finch   SGF   Geospiza fuliginosa    13    12    8     5     38
Medium Ground finch  MGF   Geospiza fortis              8           6     15
Cactus finch         CF    Geospiza scandens            6           6     12
Small Tree finch     STF   Camarhynchus parvulus  14          9           23
Total                                             27    26    17    17    87

Table S25. Relative abundance of bacterial phyla across all samples

Phylum               meanRA  sdRA   minRA  maxRA
Firmicutes           50.6%   34.7%  1.2%   99.5%
Actinobacteria       26.6%   26.4%  0.1%   96.0%
Proteobacteria       18.7%   18.1%  0.2%   77.1%
Unclassified         2.8%    9.2%   0.0%   55.3%
Chloroflexi          0.5%    0.6%   0.0%   2.8%
Acidobacteria        0.2%    0.5%   0.0%   2.8%
Planctomycetes       0.2%    0.2%   0.0%   1.2%
Chlamydiae           0.1%    1.1%   0.0%   9.5%
Cyanobacteria        0.1%    0.3%   0.0%   2.0%
Verrucomicrobia      0.0%    0.1%   0.0%   0.5%
Tenericutes          0.0%    0.1%   0.0%   0.8%
Deinococcus-Thermus  0.0%    0.0%   0.0%   0.3%
Bacteroidetes        0.0%    0.0%   0.0%   0.2%

Table S26. Relative abundance of bacterial phyla by Darwin finch species on Floreana Island

Phylum               SGF     MGF     CF      STF     HTF     MTF
Firmicutes           49.74%  50.28%  69.35%  42.10%  38.56%  88.86%
Actinobacteria       25.53%  36.01%  10.04%  30.94%  32.37%  7.52%
Proteobacteria       17.37%  11.94%  19.48%  24.96%  26.55%  2.90%
Unclassified         6.25%   0.40%   0.30%   1.07%   0.64%   0.60%
Chloroflexi          0.60%   0.55%   0.28%   0.35%   0.38%   0.02%
Acidobacteria        0.23%   0.37%   0.08%   0.30%   0.27%   0.02%
Planctomycetes       0.14%   0.21%   0.11%   0.18%   0.20%   0.03%
Chlamydiae           0.00%   0.00%   0.00%   0.00%   0.86%   0.00%
Cyanobacteria        0.04%   0.15%   0.29%   0.01%   0.04%   0.04%
Verrucomicrobia      0.02%   0.07%   0.05%   0.02%   0.03%   0.01%
Tenericutes          0.03%   0.00%   0.00%   0.00%   0.06%   0.00%
Deinococcus-Thermus  0.02%   0.01%   0.00%   0.05%   0.03%   0.00%
Bacteroidetes        0.03%   0.00%   0.00%   0.01%   0.03%   0.01%
Spirochaetes         0.00%   0.01%   0.00%   0.00%   0.00%   0.00%

Table S27. The relative abundance (%) of the most abundant bacterial genera across all Darwin finch gut microbiome samples from Floreana.

Phylum          Genus              meanRA  sdRA   minRA  maxRA
Firmicutes      Lactobacillus      43.55   36.51  0.05   99.41
Proteobacteria  Acinetobacter      5.39    12.82  0.00   54.67
Actinobacteria  Kocuria            4.93    12.46  0.00   64.21
Proteobacteria  Methylobacterium   3.21    5.21   0.00   25.91
Firmicutes      Enterococcus       2.98    12.57  0.00   92.11
Actinobacteria  Cellulomonas       2.02    4.64   0.00   31.93
Actinobacteria  Rubrobacter        1.48    10.05  0.00   88.42
Actinobacteria  Curtobacterium     1.39    3.83   0.00   23.93
Actinobacteria  Nocardioides       1.27    1.82   0.00   10.01
Actinobacteria  Solirubrobacter    1.11    1.69   0.00   8.64
Actinobacteria  Actinomycetospora  1.10    1.84   0.00   7.77
Actinobacteria  Pseudonocardia     1.04    2.18   0.00   12.04

Table S28. Relative abundance (%) of the most abundant bacterial genus in each species of Darwin’s finches on Floreana

Genus          Species  meanRA  sdRA   minRA  maxRA
Lactobacillus  SGF      43.71   36.39  0.05   97.90
Lactobacillus  MGF      47.83   40.57  0.71   99.41
Lactobacillus  CF       48.15   35.37  0.23   96.71
Lactobacillus  STF      35.68   36.79  0.10   93.12
Lactobacillus  HTF      29.56   31.32  0.05   87.17
Lactobacillus  MTF      88.54   5.33   82.18  95.23

Table S29. Relative abundance (%) of the second most abundant bacterial genus in each species of Darwin’s finches on Floreana

Genus          Species  meanRA  sdRA  minRA  maxRA
Acinetobacter  SGF      4.9     11.0  0.0    37.8
Kocuria        MGF      9.5     19.0  0.0    64.2
Enterococcus   CF       13.2    34.8  0.0    92.1
Kocuria        STF      11.9    20.5  0.0    62.3
Helicobacter   HTF      6.7     22.3  0.0    74.1
Kocuria        MTF      3.6     3.9   0.0    8.2

Table S30. Relative abundance of bacterial taxa in small ground finch samples from highland and lowland habitats on Floreana

Phylum               Highland  Lowland
Firmicutes           69.5%     39.9%
Proteobacteria       15.0%     18.2%
Actinobacteria       13.8%     38.9%
Unclassified         1.0%      1.3%
Chloroflexi          0.2%      1.0%

Order                Highland  Lowland
Lactobacillales      67.8%     38.6%
Actinomycetales      12.1%     28.6%
Pseudomonadales      8.8%      3.4%
Rhizobiales          4.3%      9.8%
Unclassified         1.3%      2.2%
Bacillales           1.3%      1.1%
Solirubrobacterales  1.2%      2.1%

Genus                Highland  Lowland
Lactobacillus        64.7%     38.4%
Acinetobacter        8.3%      3.2%
Unclassified         4.1%      7.7%
Curtobacterium       2.8%      1.0%
Methylobacterium     2.6%      2.0%
Weissella            1.7%      0.0%
Enterococcus         1.1%      0.1%
Staphylococcus       1.0%      0.1%

Table S31. Relative abundance of bacterial taxa in the small ground finch adults and nestlings from Floreana Island.

Phylum               Adult  Nestling
Firmicutes           39.9%  15.7%
Actinobacteria       38.9%  19.0%
Proteobacteria       18.2%  22.9%
Unclassified         1.3%   41.5%
Chloroflexi          1.0%   0.7%
Acidobacteria        0.3%   0.1%
Planctomycetes       0.2%   0.1%

Class                Adult  Nestling
Bacilli              39.7%  9.6%
Actinobacteria       38.9%  18.9%
Alphaproteobacteria  13.7%  6.3%
Gammaproteobacteria  4.1%   16.2%
Unclassified         1.9%   41.7%

Genus                Adult  Nestling
Lactobacillus        38.4%  0.6%
Rubrobacter          7.8%   0.5%
Unclassified         7.7%   59.2%
Cellulomonas         5.8%   2.6%
Acinetobacter        3.2%   0.1%
Paracoccus           2.3%   1.4%
Aurantimonas         2.0%   0.5%
Methylobacterium     2.0%   0.5%

Table S32. Permanova tests of weighted UniFrac distances with categorical variables of interest

Variable         F     R2    p-value
Habitat          8.24  0.10  0.001
Species          1.20  0.08  0.21
Habitat:Species  1.33  0.03  0.20
Sex              0.97  0.51  0.42
Age+             1.45  0.10  0.17

+ Age was tested using only small ground finch (G. fuliginosa) samples collected in the lowland (Adults=12, Nestling=3)

Table S33. Stable isotope (δ13C and δ15N) ratios by Darwin’s finch species and habitat on Floreana

Species  Habitat  δ13C mean (‰)  δ13C SD (‰)  δ15N mean (‰)  δ15N SD (‰)
SGF      H        -17.3          3.4          8.5            0.7
SGF      L        -20.2          2.3          10.2           2.1
MGF      H        -27.4          1.0          8.7            2.2
MGF      L        -23.1          2.7          9.4            2.4
CF       H        -23.0          *            8.5            *
CF       L        -22.1          1.7          10.4           2.7
STF      H        -26.9          0.6          8.3            1.5
HTF      H        -26.3          1.5          8.0            1.1
MTF      H        -26.5          1.2          8.6            0.5

* Single sample for this species/habitat so standard deviation was not calculated

Table S34. PERMANOVA results for combined dataset across Santa Cruz and Floreana Island with weighted UniFrac distances

Variable                Df  SumsOfSqs  MeanSqs  F.Model  R2    Pr(>F)
Island                  1   0.03       0.03     1.9993   0.02  0.064
Habitat                 1   0.21       0.21     12.4452  0.12  0.001
Island:Habitat          1   0.04       0.04     2.0768   0.02  0.048
Island:Habitat:Species  6   0.12       0.02     1.1826   0.07  0.202
Residuals               77  1.33       0.02              0.77
Total                   86  1.74
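The R2 and F.Model columns of a PERMANOVA table are derived from the partitioned sums of squares: R2 is a term's share of the total sum of squares, and the pseudo-F ratio divides the term's mean square by the residual mean square. The sketch below recomputes both from the rounded values in Table S34. It is an illustration with hypothetical names rather than the software output, and because the inputs are rounded to two decimals the pseudo-F values only approximate those reported.

```python
def permanova_table(terms, residual):
    """terms maps a term name to (df, sum of squares); residual is (df, sum of squares)."""
    res_df, res_ss = residual
    total_ss = sum(ss for _, ss in terms.values()) + res_ss
    mean_sq_residual = res_ss / res_df
    out = {}
    for name, (df, ss) in terms.items():
        out[name] = {
            "R2": ss / total_ss,                   # share of total variation
            "F": (ss / df) / mean_sq_residual,     # pseudo-F ratio
        }
    return out


# Rounded degrees of freedom and sums of squares from Table S34.
TERMS = {
    "Island": (1, 0.03),
    "Habitat": (1, 0.21),
    "Island:Habitat": (1, 0.04),
    "Island:Habitat:Species": (6, 0.12),
}
RESIDUAL = (77, 1.33)

result = permanova_table(TERMS, RESIDUAL)
```

Significance in PERMANOVA comes from permuting sample labels and recomputing the pseudo-F each time, which this sketch does not attempt.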

Table S35. Post hoc pairwise comparisons of island and habitat combinations with weighted UniFrac distances

Pairwise comparison  Df  SumsOfSqs  MeanSqs  F.Model  R2    Pr(>F)
Highland Islands     1   0.11       0.11     3.18     0.07  0.008
Lowland Islands      1   0.04       0.04     0.90     0.02  0.431
Santa Cruz Habitats  1   0.19       0.19     7.31     0.19  0.001
Floreana Habitats    1   0.26       0.26     7.30     0.13  0.001

Table S36. Pairwise Euclidean distances between weighted average foraging patterns in each island/habitat combination for all five food categories (lower triangle) or broad plant vs. insect food categories (upper triangle).

         FL_High  FL_Low  SC_High  SC_Low
FL_High  -        0.84    0.18     0.88
FL_Low   0.71     -       0.67     0.03
SC_High  0.16     0.61    -        0.70
SC_Low   0.73     0.05    0.63     -
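Distances of the kind reported in Table S36 can be computed from vectors of foraging proportions. The sketch below uses hypothetical proportions for two made-up sites, not the study's data. The ordering of the five food items follows the categories named in Figure S5, and collapsing seed, flower, plant, and fruit into a single plant category is an assumption about how the broad comparison was built.

```python
from math import dist

FOOD_ITEMS = ("insect", "seed", "flower", "plant", "fruit")


def pairwise_euclidean(profiles):
    """Euclidean distance between every pair of foraging profiles."""
    names = list(profiles)
    return {
        (a, b): dist(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def collapse_plant_vs_insect(profile):
    """Collapse five food items into (plant, insect); assumed grouping."""
    insect = profile[FOOD_ITEMS.index("insect")]
    return (sum(profile) - insect, insect)


# Hypothetical foraging proportions in FOOD_ITEMS order.
profiles = {
    "site_A": (0.6, 0.1, 0.1, 0.1, 0.1),
    "site_B": (0.1, 0.6, 0.1, 0.1, 0.1),
}
five_category = pairwise_euclidean(profiles)
broad = pairwise_euclidean(
    {name: collapse_plant_vs_insect(p) for name, p in profiles.items()}
)
```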

Supplementary Methods

Morphological measurements

Eight morphological measurements and body mass were taken for all finches sampled: (1) beak-head (beak tip to back of head), (2) beak-naris (beak tip to anterior end of the naris), (3) beak-feather (tip of beak to feather line), (4) beak depth (at the base of the beak), (5) beak width (at the base of the beak), (6) naris diameter (taken from extremes of naris opening), (7) tarsus length, (8) wing length, and (9) body mass. Dial calipers were used to take morphological measurements to the nearest 0.01 mm and Telinga electronic scales were used to measure mass to the nearest 0.01 g.
