University of Connecticut OpenCommons@UConn

Master's Theses University of Connecticut Graduate School

5-5-2018 Comparison of OTUs and ASVs in 73 from Equatorial Guinea Darien Capunitan [email protected]

Recommended Citation Capunitan, Darien, "Comparison of OTUs and ASVs in 73 Bird Species from Equatorial Guinea" (2018). Master's Theses. 1188. https://opencommons.uconn.edu/gs_theses/1188

This work is brought to you for free and open access by the University of Connecticut Graduate School at OpenCommons@UConn. It has been accepted for inclusion in Master's Theses by an authorized administrator of OpenCommons@UConn. For more information, please contact [email protected].

Comparison of OTUs and ASVs in 73 Bird Species from Equatorial Guinea

Darien Capunitan

B.S., University of Connecticut, 2018

A Thesis

Submitted in Partial Fulfillment of the

Requirement for the Degree of

Master of Science

At the

University of Connecticut

2018

Approval Page

Master of Science Thesis

Comparison of OTUs and ASVs in 73 Bird Species from Equatorial Guinea

Presented by

Darien C. Capunitan, B.S.

Major Advisor: ______Sarah Hird

Associate Advisor: ______Johann Gogarten

Associate Advisor: ______Johnathan Klassen

University of Connecticut

2018

ii Acknowledgements

I would like to thank my major advisor, Sarah Hird, for always giving me guidance and allowing me for being one of the first members of her lab and for overall being awesome.

I would also like to thank the members of the Hird lab for always keeping an amazing working environment. Beth, Kirsten, and Josh are the best.

Thanks to Ryan and Oscar for getting these great samples in the first place and adding to the metadata table.

Thanks to the UCONN MARS facility for processing the samples and a special thanks to

Dr. Kendra Maas for answering all my questions.

Thank you to Peter Gogarten and Johnathan Klassen for taking the time to be associate advisors.

Finally I would like to thank my parents for their love and support throughout my existence with a special thanks to Amber and Rodney for always believing in me.

iii

Table of Contents

Approval Page ii

Acknowledgements iii

Table of Contents iv

Abstract vi

Main text 1

Chapter 1: Introduction 1

Chapter 2: Materials and Methods

2.1: Sampling 4

2.2 DNA Extraction and Sequencing 5

2.3 Analyses

2.3.1 Data Processing: Mothur 6

2.3.2 Data Processing: DADA2 6

2.3.3 Data Analysis: Unifrac 7

2.3.4 Data Analysis: Bray-Curtis 7

Chapter 3: Results

3.1 Alpha diversity 8

3.2 Beta diversity 19

Chapter 4: Discussion 30

Chapter 5: Conclusion 31

iv Appendices

Appendix A 33

Appendix B 34

Appendix C 38

Appendix D 39

Appendix E 40

References 41

v

Abstract

As scientists discover more information about the communities of bacteria that live on and inside hosts, "the microbiome", a new avenue for understanding the health of humans and has opened. There are many analysis pipelines for microbiome data processing, and choice of analytical tools can affect the biological results of the analysis. The current analysis climate does not point toward a single most effective protocol, which hinders comparisons across studies. An important step in microbiome data processing is the assignment of reads into groups of similar organisms. The traditional unit for grouping organisms is the species; however resolving sequences at the species level is not always possible. For this reason, the term

“operational taxonomic unit” (OTU) has been used as a proxy for species. Mothur is one popular software package that uses OTUs, as defined by a percent sequence similarity that is set by the user; 97% sequence similarity is frequently cited as the most similar to traditional species.

DADA2 is a newer pipeline that classifies unique sequences as “amplicon sequence variants” or

ASVs. ASVs assume biological sequences are present in the sample and can resolve differences in sequence variants at as low a single nucleotide. We compared OTUs and ASVs from a dataset of bird gastrointestinal tracts from Equatorial Guinea. One hundred and forty two samples were taken from 92 (73 species) and the V4 region of the 16S rRNA gene was used as the microbial fingerprint (515f and 806R primers). DADA2 produced 4245 ASVs against Mothur’s

9332 OTUs. DADA2 found 507 different ranks while Mothur found 932 against the same reference database, Silva version 132. We analyzed alpha and beta diversity in the OTUs and

ASVs using the Phyloseq program for R. The data suggests key differences in these analyses.

vi Procruste analyses (Figures 13 and 16) show significant differences as Mothur data clusters more tightly together. Some samples are also moved substantially across axes.

vii Chapter 1: Introduction

The microbiome is vital for vertebrate health (Cho et al. 2012). The microbes in the gut aid in digestion, gut immunity and organ development (Diaz Heijtz et al., 2011; Al-ASmakh et al. 2014). There are ten times the number of genes in the human microbiome compared to the human genome, which has further implicated the importance of the microbiome for host development and health. The gut microbiome provides a possible pathway for therapeutic relief of diseases because of its effect on the several health related regions including the gut-brain axis

(Schnorr and Bachner 2016). There have already been studies targeting the microbiome for brain, cardiovascular, and several other diseases (Ramezani et al., 2014; Shoemark et al. 2014).

The microbiome has also been shown to have a strong relationship with host behavior as both elements can directly affect one another (Ezenwa 2012). The gut microbiome is densely populated and there is substantial intra- and interspecific diversity among hosts of many taxa based on diet, environment, and overall locality (Wang et al. 2013; Spor et al. 2011). Only with continued sampling can scientists begin to understand the importance of the gut microbiome and its roles in host health and evolution.

Most of our knowledge of gut microbiomes comes from studies on mammals (Nelson et al., 2013; Muegge et al., 2011; Ley et al. 2008; Colston and Jackson 2016). However, mammals have unique adaptations that affect their microbiomes (e.g., vaginal delivery and milk,

Dominguez-Bello et al., 2010; Cho et al., 2012). Although samples from mammals are important microbiome data, the confounding factors that are associated with mammal birth and offspring development are not representative of all life. Furthermore, it has not yet been studied how strong the impact of early life mammalian behavior and physiology is on the gut microbiome and how much can be attributed to outside factors (e.g. diet, environment, location).

1 Birds are a globally distributed class of organisms with great diversity in their diets, behaviors, and ecologies and we know little about their microbiomes. A foundational requirement for comparative studies is the cataloging of microbes from different species’ microbiomes. Sampling diverse, wild species will allow for the description of a “core gut microbiome”. The “core gut microbiome” theoretically consists of species, genera, or families of bacteria that are consistently found among a particular species. Isolating the “core gut microbiome” may illuminate the effect of these bacteria on host health and may be able to distinguish transient bacteria that may overshadow functionally important groups, or at the very least, obscure the true signal. Catalogued “core” species will also help future studies as a point of reference. While current studies have categorized differences between one or a few host species

(Benskin et al., 2010; Kreisinger et al., 2015; Xenoulis et al., 2010), studies using many host species can help to build a foundation for understanding the complex relationship between hosts and their diverse environments and ecologies (Hird et al. 2015). A comprehensive detailing of the possible symbiotic relationship between birds and their gut microbiome can be further understood through large sampling studies.

How one analyses a microbiome can affect the results and interpretation of the data. One important step, performed near the beginning of an analysis, is defining the taxonomic unit. This allows scientists to measure and give meaning to microbial data that may not be assigned to species. One of the more popular and widely used measurements is the operational taxonomic unit (Okal & Sneath 1963), or OTU. OTUs classify sequences that are generally related (Schloss et al., 2005). OTUs are needed for comparisons of qualitative data that, without the OTU association, can otherwise not be quantitative. The reliability and reproducibility of the classification of DNA reads into OTUs is important to properly represent data. The stability of

2 OTUs is of key importance when various clustering methods are applied (He et al. 2015). OTUs have been used extensively through different disciplines, including microbial ecology. There are currently many ways to classify OTUs. “Closed reference” rely on a database and discard sequences that don’t match the database. Closed reference allows an easy way to compare across datasets in which samples are normalized. However, the discarding of data is difficult to support and also creates an overestimation of diversity (Edgar, 2017). "Open reference" OTU picking again rely on a database, however sequences that do not match the reference database are then clustered de novo. Closed and open reference methods can cause significant differences with alpha and beta diversity even with identical data (He et al. 2015). “De novo” methods cluster sequences within their own dataset into groups that are a certain sequence percent similarity;

97% is frequently cited as most representative of "species". This method ensures all of the data is utilized; however since no reference database is used there is an inability to make comparisons across studies. In addition, de novo clustering may take a long time (possibly months) when a large dataset it processed. There are multiple ways to do de novo clustering. OptiClust is the method applied in the popular package Mothur (Westcott et al., 2017). The program Mothur is one of the popular pipelines for microbial community analyses, as it provides all the steps of a microbiome analysis in a single package. Mothur (Schloss 2009) has been cited 8170 times

(Google Scholar 1 March 2018).

Recent calls for improved methods of OTU picking (Callahan et al., 2017) have led to the development of a new method of OTU picking, called "amplicon sequence variants" or ASVs and is implemented in the R package DADA2 (Callahan et al., 2016), currently cited 137 times

(Google Scholar 1 March 2018). ASVs are an analogous term to OTUs, however it refers to a unique sequence that is assigned to a taxonomic group. Each ASV is then calculated in each

3 sample and the recorded presence is noted. Recently there has been a call for the preferential use of ASVs over OTUs citing greater reproducibility and comprehensiveness

(Callahan 2017). There have also been claims that error control is increased by use of ASVs as they can be, “resolved down to the level of single-nucleotide differences over the sequenced gene region” (Callahan 2017). ASVs also allow for an easy way to compare samples across different studies, but they have not been as thoroughly vetted in the literature as OTUs.

Here, we conduct a direct comparison of calling taxonomic units as ASVs (with DADA2) and 97% OTUs (with Mothur) using 92 birds belonging to 73 species from Equatorial Guinea.

We compare OTU/ASV abundance and diversity, taxonomic ranks, and a variety of other variables. We also include similarities and dissimilarities of taxonomy and beta diversity (using

Bray Curtis and Unifrac distances). A Procruste analysis was conducted to quantify the dissimilarities between ASVs and OTUs using their respective PCOA and NMDS plots.

Although quantifying the differences between ASVs and OTUs does not correlate to a preference for one or the other, it will highlight the significance of using different pipelines on “identical” data. These data are also the first representation of the microbiome from these 73 bird species.

Chapter 2: Materials and Methods

2.1 Sampling

All birds were collected from 8 sites in Equatorial Guinea from 5 January 2016 to 2

February 2016 (Table 2). Mist nets were used to catch birds, which were then euthanized with thoracic compression and immediately prepared as museum specimens. Intestinal tracts were tied off, then removed and stored in 100% ethanol within approximately 15 minutes of death. The

Dirección General de Protección y Guardería Forestal and the Universidad Nacional de Guinea

Ecuatorial provided specimen collection and export permits.

4 2.2 DNA Extraction and Sequencing

DNA was extracted from the luminal contents of the intestines using Qiagen Power Fecal kits following the standard protocol. Some intestines were subsampled (samples labeled with suffixes A, B, C, D). Blanks (n=7) were extracted at the same time as the samples as negative

(kit) controls.

Extractions were sequenced at the MARS facility at the University of Connecticut on a

MiSeq. Negative (kit) controls from the PCRs were also sequenced (n=7). PCR replicates were also sequenced for a subset of the samples; these samples are labeled with the suffixes X, Y. A total of 142 samples were sequenced.

Quant-iT PicoGreen kit were used to quantify DNA extractions. Amplification of the V4 region of the 16S rRNA gene was done using 30 ng of extracted DNA as template. The V4 region was amplified using 515F and 806R with Illumina adapters and dual indices (8 basepair golay on 3’ (Caporaso 2012), and 8 basepair on the 5’ (Kozich et al., 2013)). The PCR reaction was incubated at 95˚C for 3.5 minutes, the 30 cycles of 30 s at 95.0°C, 30 s at 50.0°C and 90 s at

72.0°C, followed by final extension as 72.0°C for 10 minutes. PCR products were pooled for quantification and visualization. PCR products were normalized based on the concentration of

DNA from 250-400 bp then pooled. The pooled PCR products were cleaned using the Mag-Bind

RxnPure Plus (Omega Bio-tek) according to the manufacturer’s protocol. The cleaned pool was sequenced on the MiSeq using v2 2x250 base pair kit (Illumina, Inc).

2.3 Analyses

2.3.1 Data Processing: Mothur

Raw sequences were processed and analyzed in Mothur version 1.39.5 (Schloss et al., 2009) according to the standard MiSeq protocol

5 (http://www.mothur.org/wiki/MiSeq_SOP). Quality control, trimming and de-noising were performed as previously outlined (Kozich, et al., 2013). All sequences were aligned to the Silva reference alignment database version 128 and filtered so that they all overlapped with no overhangs. Sequences were clustered into OTUs based on 97% similarity, and up to the genus-level taxonomic affiliation of each OTU was identified according to current taxonomy

(Silva v128). The Mothur script can be found in the Appendix A.

2.3.2 Data Processing: DADA2

Raw sequences were processed through DADA2 pipeline according to the DADA2 pipeline tutorial 1.2 in R (Callahan et al., 2016). The DADA2 R script can be found in the

Appendix B. Filtering was completed according to the DADA2 protocol based on the number of ambiguous bases, a minimum quality score, and the expected errors in a read (Callahan 2016).

The filtered fastq files were dereplicated and output as unique sequences with their corresponding abundance. Denoising and merging were completed according to the DADA2 pipeline tutorial, while error parameters removed chimeras (Callahan 2016). All sequences were aligned to the current Silva reference alignment database (Silva v128), the same as was used for the Mothur analysis.

2.3.3 Data Analysis: Unifrac

Unifrac (Lozupone and Knight, 2005) distance matrices were created between bird gut samples to compare and represent individual gut microbial communities. Distances are calculated based on how much of the branch length is unique or shared in a common phylogenetic tree between the microbial gut communities. Unifrac distances were calculated in

Mothur and DADA2. A DADA2 script (Appendix B) was applied to DADA2’s seqtab.nochim file. R programs DECIPHER and phanghorn were used to create alignment and tree roots

6 respectively (Wright 2015). The tree was then merged with the otu and tax table into a single phyloseq object. Ordinations of weighted and unweighted Unifrac were created with non-metric multidimensional scaling (NMDS) distance matrixes. Mothur tree was created in Mothur under

MiSeq SOP. The stability.tre file was merged with existing Mothur shared, constaxonomy, and mapping files in R with import.mothur command through phyloseq. Again a singular phyloseq object was created and ordinations were calculated with NMDS distance matrixes to create weighted and unweighted Unifrac plots. Scripts can be found in Appendix C.

2.3.4 Data Analysis: Bray-Curtis

Ordinations of Bray-Curtis were calculated for Mothur and DADA2. In phyloseq, each sample was square root-transformed to bacterial phylum where a percentage was calculated to create a pairwise distance matrix. The two-dimensional positions of the samples were more similar the closer they were to one another and more different the farther they were apart.

Wisconsin distances (Bray-Curtis) were calculated from previously created phyloseq objects where an OTU table, taxonomy table, and mapping file were merged together. Ordinations were first calculated and then plotted against the mapping file to visualize trends. Scripts can be found in Appendix C.

Chapter 3: Results

3.1 Alpha diversity

Taxonomy bar charts were first created to note differences among taxonomic ranks according to Table 1. All samples can be seen at the class level for Mothur (Figure 1) and

DADA2 (Figure 2). The most abundant 12 bacterial classes were isolated for better visualization.

Bacilli, Bacterodia, and Mollicutes were most abundant among samples. Differences between

Mothur and DADA2 can be seen at the class level among all samples, however this becomes

7 more evident when the first 8 samples are plotted at the phylum level (Mothur in Figure 3 and

DADA2 in 4). Despite an identical dataset, Mothur and DADA2 assign different relative abundances of bacterial phyla against an identical reference dataset (Silva v128). For example, the third (Eurillas latirostris) and sixth (Cecropis abyssinica) samples from the left showed a substantial increase in Proteobacteria, on between using OTUs (Figure 3) compared to the ASVs

(Figure 4).

Firmicutes and Proteobacteria were the most abundant phyla among samples. Mothur and

DADA2 both detected 29 phyla against Silva version 128. However, at lower taxonomic ranks the two methodologies deviate from one another, as can be seen in Table 1. Mothur has a noticeable increase in classes, orders, and genus compared to DADA2. This becomes evident in the alpha diversity plots of Mothur (Figure 5) and DADA2 (Figure 6). Observed OTUs/ASVs,

Shannon and Simpson indexes produced similar plots (Figure 5). Mothur also has 9332 OTUs compared to 4245 ASVs in DADA2 (Table1). Despite the larger number of OTUs compared to

ASVs, the same trends can be seen in observed OTUs/ASVs alpha diversity test (Figure 6).

Although some discrepancies can be noted, it appears alpha diversity was not affected by which program was used.

8 Table 1: Comparative table of different taxonomy classes, OTUs/ASVs and total sequences between Mothur and DADA2. Both pipelines’ taxonomies are based off the reference database Silva version 128. Taxa Mothur DADA2 Phylum 29 29 Class 75 58 Order 150 102 Genus 932 507 OTUs/ASVs 9332 4245

9 Table 2: Metadata table of all Equatorial Guinea samples. MARS refers to unique sample names assigned by MARS facility, used to track samples through analysis. Samples were collected from 6 different sites and longitudes and latitudes are listed for specificity. Diet assignments of c, o, and h refer to carnivorous, omnivorous and herbivorous respectively.

MARS #SampleID Sex Provincia Locality Elevation Lat Long Day Month Year Diet Order Family Genus Species Common_Name 401 Sylvir401 Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Passeriformes Macrosphenidae Sylvietta virens Green_crombec 402 Alcqua402 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Coraciiformes Alcedinidae Alcedo quadribrachys Shining_blue_kingfisher 403 Eurlat403 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Pycnonotidae Eurillas latirostris Yellow_whiskered_greenbul 404 Blenot404 Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Passeriformes Pycnonotidae Bleda notatus Yellow_lored_bristlebill 405 Estatr405 Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Estrildidae Estrilda atricapilla Black_headed_waxbill 408 Cecaby408 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Hirundinidae Cecropis abyssinica Lesser_striped_swallow 409a Eurvir409A Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Pycnonotidae Eurillas virens Little_greenbul 409b Eurvir409B Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Pycnonotidae Eurillas virens Little_greenbul 410a Hiraet410A Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Passeriformes Hirundinidae Hirundo aethiopica Ethiopian_swallow 410b Hiraet410B Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Passeriformes Hirundinidae Hirundo aethiopica Ethiopian_swallow 411 Vidmac411 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Viduidae Vidua macroura Pin_tailed_whydah 412a Apuaff412A Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Apodiformes Apodidae Apus affinis Little_swift 412b Apuaff412B Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Apodiformes Apodidae Apus affinis Little_swift 412c Apuaff412C Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Apodiformes Apodidae Apus affinis Little_swift 412d Apuaff412D Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Apodiformes Apodidae Apus affinis Little_swift 413a Musinf413A Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 6 1 2016 o Passeriformes Muscicapidae Muscicapa infuscata Sooty_flycatcher 413b Musinf413B Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 6 1 2016 o Passeriformes Muscicapidae Muscicapa infuscata Sooty_flycatcher 413c Musinf413C Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 6 1 2016 o Passeriformes Muscicapidae Muscicapa infuscata Sooty_flycatcher 414x Cambra414X Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Passeriformes Camaroptera brachyura Gree_backed_camaroptera 414y Cambra414Y Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 c Passeriformes Cisticolidae Camaroptera brachyura Gree_backed_camaroptera 417a Estatr417A Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Estrildidae Estrilda atricapilla Black_headed_waxbill 417b Estatr417B Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Estrildidae Estrilda atricapilla Black_headed_waxbill 418a Estatr418A Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Estrildidae Estrilda atricapilla Black_headed_waxbill 418b Estatr418B Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 5 1 2016 o Passeriformes Estrildidae Estrilda atricapilla Black_headed_waxbill 419 Dyacas419 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 6 1 2016 o Passeriformes Platysteiridae Dyaphorophyia castanea Chestnut_wattle_eye 423 Specuc423 Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 6 1 2016 h Passeriformes Estrildidae Spermestes cucullata Bronze_mannikin 427 Spebic427 Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 6 1 2016 h Passeriformes Estrildidae Spermestes bicolor Black_and_white_mannikin 430 Stiery430 Female Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 7 1 2016 c Passeriformes Muscicapidae Stiphrornis erythrothorax Forest_robin 432 Alecas432 Female Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 7 1 2016 c Passeriformes Muscicapidae castanea Fire_crested_alethe 435a Pribai435A Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 c Passeriformes Cisticolidae bairdii Banded_prinia 435b Pribai435B Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 c Passeriformes Cisticolidae Prinia bairdii Banded_prinia 4389b Hirnig439B Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 c Passeriformes Hirundinidae Hirundo nigrita White_bibbed_swallow 438a Hirnig438A Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 c Passeriformes Hirundinidae Hirundo nigrita White_bibbed_swallow 438b Hirnig438B Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 c Passeriformes Hirundinidae Hirundo nigrita White_bibbed_swallow 438c Hirnig438C Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 c Passeriformes Hirundinidae Hirundo nigrita White_bibbed_swallow 439a Hirnig439A Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 c Passeriformes Hirundinidae Hirundo nigrita White_bibbed_swallow 440 Cyaoli440 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 7 1 2016 o Passeriformes Nectariniidae Cyanomitra olivacea Olive_sunbird 443 Estmel443 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 8 1 2016 o Passeriformes Estrildidae Estrilda olivacea Orange_cheeked_waxbill 444 Batruf444 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 8 1 2016 c Passeriformes Cisticolidae Bathmocercus rufus Black_faced_rufous_warbler 445a Musset445A Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 8 1 2016 c Passeriformes Muscicapidae Muscicapa sethsmithi Yellow_footed_flycatcher 445b Musset445B Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 8 1 2016 c Passeriformes Muscicapidae Muscicapa sethsmithi Yellow_footed_flycatcher 446 Neopoe446 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 8 1 2016 c Passeriformes Turdidae Neocossyphus poensis White_tailed_ant_thrush 448a Illful448A Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 8 1 2016 c Passeriformes fulvescens Brown_illadopsis 448b Illful448B Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 8 1 2016 c Passeriformes Pellorneidae Illadopsis fulvescens Brown_illadopsis 448c Illful448C Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 8 1 2016 c Passeriformes Pellorneidae Illadopsis fulvescens Brown_illadopsis 453 Cinbat453 Female Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 9 1 2016 o Passeriformes Nectariniidae Cinnyris batesi Bates_sunbird 454a Neoruf454A Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 9 1 2016 c Passeriformes Turdidae Neocossyphus rufus Red_tailed_ant_thrush 454b Neoruf454B Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 9 1 2016 c Passeriformes Turdidae Neocossyphus rufus Red_tailed_ant_thrush 456 Chapol456 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 9 1 2016 c Passeriformes Muscicapidae Chamaetylas poliocephala Brown_chested_alethe 457 Blesyn457 Female Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 9 1 2016 c Passeriformes Pycnonotidae Bleda syndactyla Red_tailed_bristlebill

10 MARS #SampleID Sex Provincia Locality Elevation Lat Long Day Month Year Diet Order Family Genus Species Common_Name 458a Crical458A Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 9 1 2016 c Passeriformes Pycnonotidae Criniger calurus Red_tailed_greenbul 458b Crical458B Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 9 1 2016 c Passeriformes Pycnonotidae Criniger calurus Red_tailed_greenbul 459 Tervir459 Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 9 1 2016 o Passeriformes Monarchidae Terpsiphone viridis African_paradise_flycatcher 461a Crindu461A Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 9 1 2016 c Passeriformes Pycnonotidae Criniger ndussumensis White_bearded_greenbul 461b Crindu461B Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 9 1 2016 c Passeriformes Pycnonotidae Criniger ndussumensis White_bearded_greenbul 466 Cyaoli466 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 o Passeriformes Nectariniidae Cyanomitra olivacea Olive_sunbird 468 Hylpra468 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 c Passeriformes Cettiidae Hylia prasina Green_hylia 469y Turafe469Y Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 10 1 2016 o Columbiformes Columbidae Turtur afer Blue_spotted_wood_dove 470a Tervir470A Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 10 1 2016 o Passeriformes Monarchidae Terpsiphone viridis African_paradise_flycatcher 470b Tervir470B Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 10 1 2016 o Passeriformes Monarchidae Terpsiphone viridis African_paradise_flycatcher 472a Hylpra472A Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 c Passeriformes Cettiidae Hylia prasina Green_hylia 472b Hylpra472B Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 c Passeriformes Cettiidae Hylia prasina Green_hylia 473 Camniv473 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 c Piciformes Picidae Campethera nivosa Buff_spotted_woodpecker 474 Smiruf474 Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 c Passeriformes Eurylaimidae Smithornis rufolateralis Rufous_sided_broadbill 475x Camchl475X Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 c Passeriformes Cisticolidae Camaroptera chloronota Olive_green_camaroptera 475y Camchl475Y Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 c Passeriformes Cisticolidae Camaroptera chloronota Olive_green_camaroptera 476 Mannit476 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 11 1 2016 h Passeriformes Estrildidae Mandingoa nitidula Green_backed_twinspot 481a Indmac481A Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 11 1 2016 o Passeriformes Indicatoridae Indicator maculatus Spotted_honeyguide 481b Indmac481B Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 11 1 2016 o Passeriformes Indicatoridae Indicator maculatus Spotted_honeyguide 482a Nicchl482A Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 11 1 2016 o Passeriformes Nicatoridae Nicator chloris Western_nicator 485 Acthyp485 Male Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.852 11 1 2016 c Charadriiformes Scolopacidae Actitis hypoleucos Common_sandpiper 492a Crichl492A Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 o Passeriformes Pycnonotidae Criniger chloronotus Eastern_bearded_greenbul 492b Crichl492B Female Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 10 1 2016 o Passeriformes Pycnonotidae Criniger chloronotus Eastern_bearded_greenbul 501ax Tchaus501AX Male Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 13 1 2016 c Passeriformes Malaconotidae Tchagra australis Brown_crowned_tchagra 501ay Tchaus501AY Male Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 13 1 2016 c Passeriformes Malaconotidae Tchagra australis Brown_crowned_tchagra 501bx Tchaus501BX Male Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 13 1 2016 c Passeriformes Malaconotidae Tchagra australis Brown_crowned_tchagra 501by Tchaus501BY Male Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 13 1 2016 c Passeriformes Malaconotidae Tchagra australis Brown_crowned_tchagra 501cx Tchaus501CX Male Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 13 1 2016 c Passeriformes Malaconotidae Tchagra australis Brown_crowned_tchagra 501cy Tchaus501CY Male Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 13 1 2016 c Passeriformes Malaconotidae Tchagra australis Brown_crowned_tchagra 502a Ceroli502A Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 13 1 2016 c Cuculiformes Cuculidae Cercococcyx olivinus Olive_long_tailed_cuckoo 502b Ceroli502B Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 13 1 2016 c Cuculiformes Cuculidae Cercococcyx olivinus Olive_long_tailed_cuckoo 502c Ceroli502C Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 13 1 2016 c Cuculiformes Cuculidae Cercococcyx olivinus Olive_long_tailed_cuckoo 502d Ceroli502D Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 13 1 2016 c Cuculiformes Cuculidae Cercococcyx olivinus Olive_long_tailed_cuckoo 505 Antcol505 Female Wele-Nzas Oyala, vic. Universidad Americana de Guinea Ecuatorial 650 1.615 10.881 13 1 2016 h Passeriformes Nectariniidae Anthodiaeta collaris Collared_sunbird 510 Psanit510 Female Wele-Nzas Oyala, Universidad Americana de Guinea Ecuatorial 650 1.609 10.859 14 1 2016 c Passeriformes Hirundinidae Psalidoprocne nitens Square_tailed_saw_wing 513 Shecyo513 Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 14 1 2016 c Passeriformes Muscicapidae Sheppardia cyornithopsis Lowland_akalat 517a Spehae517A Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 14 1 2016 o Passeriformes Estrildidae Spermophaga haematina Western_bluebill 517b Spehae517B Male Wele-Nzas Oyala, 1.3 km E Universidad Americana de Guinea Ecuatorial 650 1.614 10.863 14 1 2016 o Passeriformes Estrildidae Spermophaga haematina Western_bluebill 523 Phypoe523 Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 o Passeriformes Pycnonotidae Phyllastrephus poensis Cameroon_olive_greenbul 524 Cinrei524 Male Bioko Norte Pico Basile 2670 3.595 8.769 17 1 2016 o Passeriformes Nectariniidae Cinnyris reichenowi Northern_double_collared_sunbird 527 Linoli527 Male Bioko Norte Pico Basile 2670 3.595 8.769 17 1 2016 h Passeriformes Fringillidae Linurgus olivaceus Oriole_finch 530 Dyacha530 Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 c Passeriformes Platysteiridae Dyaphorophyia chalybea Black_necked_wattle_eye 531a Cryrei531A Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 h Passeriformes Estrildidae Cryptospiza reichenovii Red_faced_crimsonwing 531b Cryrei531B Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 h Passeriformes Estrildidae Cryptospiza reichenovii Red_faced_crimsonwing 533 Estast533 Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 o Passeriformes Estrildidae Estrilda astrild Common_waxbill 534a Eurlat534A Female Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 o Passeriformes Pycnonotidae Eurillas latirostris Yellow_whiskered_greenbul 534b Eurlat534B Female Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 o Passeriformes Pycnonotidae Eurillas latirostris Yellow_whiskered_greenbul 535a Chapol535A Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 c Passeriformes Muscicapidae Chamaetylas poliocephala Brown_chested_alethe 535b Chapol535B Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 c Passeriformes Muscicapidae Chamaetylas poliocephala Brown_chested_alethe 536a Terruf536A Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 c Passeriformes Monarchidae Terpsiphone rufiventer Red_bellied_paradise_flycatcher

11

MARS #SampleID Sex Provincia Locality Elevation Lat Long Day Month Year Diet Order Family Genus Species Common_Name 536b Terruf536B Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 c Passeriformes Monarchidae Terpsiphone rufiventer Red_bellied_paradise_flycatcher 537 Sheboc537 Female Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 c Passeriformes Muscicapidae Sheppardia bocagei Bocages_akalat 539 Pseaby539 Male Bioko Norte Pico Basile 2670 3.595 8.769 17 1 2016 o Passeriformes Pseudoalcippe abyssinica African_hill_babbler 544 Cosrob544 Male Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 o Passeriformes Muscicapidae Cossyphicula roberti White_bellied_robin_chat 545 Hylpra545 Male Bioko Sur Moka Research Station 1375 3.361 8.662 19 1 2016 c Passeriformes Cettiidae Hylia prasina Green_hylia 548 Cyaori548 Female Bioko Sur Moka Research Station 1375 3.361 8.662 18 1 2016 o Passeriformes Nectariniidae Cyanomitra oritis Cameroon_sunbird 554 Turpel554 Male Bioko Sur Moka Research Station 1375 3.361 8.662 20 1 2016 o Passeriformes Turdidae Turdus pelios African_thrush 557 Elmalb557 Male Bioko Sur Moka Research Station 1375 3.361 8.662 20 1 2016 c Passeriformes Elminia albiventris White_bellied_crested_flycatcher 561b Zosste561B Female Bioko Sur Moka Research Station 1375 3.361 8.662 20 1 2016 c Passeriformes Zosteropidae Zosterops stenocricotus Forest_white_eye 564 Eurvir564 Male Bioko Sur Moka Research Station 1375 3.361 8.662 20 1 2016 o Passeriformes Pycnonotidae Eurillas virens Little_greenbul 568cx Turtym568CX Male Bioko Sur Moka Research Station 1375 3.361 8.662 21 1 2016 h Columbiformes Columbidae Turtur tympanistra Tambourine_dove 568cy Turtym568CY Male Bioko Sur Moka Research Station 1375 3.361 8.662 21 1 2016 h Columbiformes Columbidae Turtur tympanistra Tambourine_dove 574x Camchl574X Female Bioko Sur Moka Research Station 1375 3.361 8.662 21 1 2016 c Passeriformes Cisticolidae Camaroptera chloronota Olive_green_camaroptera 574y Camchl574Y Female Bioko Sur Moka Research Station 1375 3.361 8.662 21 1 2016 c Passeriformes Cisticolidae Camaroptera chloronota Olive_green_camaroptera 575b Plonig575B Male Bioko Sur Moka Research Station 1375 3.361 8.662 22 1 2016 c Passeriformes Ploceidae Ploceus nigricollis Black_necked_weaver 578 Lanpoe578 Female Bioko Sur Moka Research Station 1375 3.361 8.662 22 1 2016 c Passeriformes Malaconotidae Laniarius poensis Mountain_sooty_boubou 580 Aritep580 Male Bioko Sur Moka Research Station 1375 3.361 8.662 22 1 2016 o Passeriformes Pycnonotidae Arizelocichla tephrolaema Western_greenbul 584 Antsei584 Female Bioko Sur Mirador Moca 1560 3.369 8.665 23 1 2016 h Passeriformes Nectariniidae Anthreptes seimundi Collared_sunbird 591 Nigcan591 Female Bioko Sur Mirador Moca 1560 3.369 8.665 23 1 2016 h Passeriformes Estrildidae Nigrita canicapillus Grey_headed_nigrita 599a Dyacas599A Female Bioko Sur Mirador Moca 1560 3.369 8.665 23 1 2016 o Passeriformes Platysteiridae Dyaphorophyia castanea Chestnut_wattle_eye 599b Dyacas599B Female Bioko Sur Mirador Moca 1560 3.369 8.665 23 1 2016 o Passeriformes Platysteiridae Dyaphorophyia castanea Chestnut_wattle_eye 618 Pollop618 Male Bioko Sur Moka Research Station 1375 3.361 8.662 25 1 2016 c Passeriformes Cisticolidae Poliolais lopezi White_tailed_warbler 620a Pogbil620A Male Bioko Sur Mirador Moca 1560 3.369 8.665 25 1 2016 h Piciformes Lybiidae Pogoniulus bilineatus Yellow_rumped_tinkerbird 620b Pogbil620B Male Bioko Sur Mirador Moca 1560 3.369 8.665 25 1 2016 h Piciformes Lybiidae Pogoniulus bilineatus Yellow_rumped_tinkerbird 625 Phyher625 Male Bioko Sur Mirador Moca 1560 3.369 8.665 25 1 2016 c Passeriformes Phylloscopidae Phylloscopus herberti Black_capped_woodland_warbler 633a Stiery633A Male Bioko Sur ~2 km N Ureka 317 3.272 8.581 26 1 2016 c Passeriformes Muscicapidae Stiphrornis erythrothorax Forest_robin 633b Stiery633B Male Bioko Sur ~2 km N Ureka 317 3.272 8.581 26 1 2016 c Passeriformes Muscicapidae Stiphrornis erythrothorax Forest_robin 634a Alecas634A Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 26 1 2016 c Passeriformes Muscicapidae Alethe castanea Fire_crested_alethe 634b Alecas634B Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 26 1 2016 c Passeriformes Muscicapidae Alethe castanea Fire_crested_alethe 634c Alecas634C Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 26 1 2016 c Passeriformes Muscicapidae Alethe castanea Fire_crested_alethe 634d Alecas634D Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 26 1 2016 c Passeriformes Muscicapidae Alethe castanea Fire_crested_alethe 640a Blenot640A Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 27 1 2016 c Passeriformes Pycnonotidae Bleda notatus Yellow_lored_bristlebill 640b Blenot640B Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 27 1 2016 c Passeriformes Pycnonotidae Bleda notatus Yellow_lored_bristlebill 643a Neopoe643A Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 27 1 2016 c Passeriformes Turdidae Neocossyphus poensis White_tailed_ant_thrush 643b Neopoe643B Female Bioko Sur ~2 km N Ureka 317 3.272 8.581 27 1 2016 c Passeriformes Turdidae Neocossyphus poensis White_tailed_ant_thrush 650 Psaful650 Female Bioko Sur Moka Research Station 1375 3.361 8.662 30 1 2016 h Passeriformes Hirundinidae Psalidoprocne fuliginosa Mountain_saw_wing 653a Corleu653A Male Bioko Sur Moaba camp, ~3 km N Ureka 0 3.235 8.624 30 1 2016 c Coraciiformes Alcedinidae Corythornis leucogaster White_bellied_kingfisher 653b Corleu653B Male Bioko Sur Moaba camp, ~3 km N Ureka 0 3.235 8.624 30 1 2016 c Coraciiformes Alcedinidae Corythornis leucogaster White_bellied_kingfisher 655a Camniv655A Male Bioko Sur ~2 km N Ureka 317 3.272 8.581 29 1 2016 c Piciformes Picidae Campethera nivosa Buff_spotted_woodpecker 655b Camniv655B Male Bioko Sur ~2 km N Ureka 317 3.272 8.581 29 1 2016 c Piciformes Picidae Campethera nivosa Buff_spotted_woodpecker 656 Ploalb656 Male Bioko Sur ~2 km N Ureka 317 3.272 8.581 29 1 2016 o Passeriformes Ploceidae Ploceus albinucha Maxwells_black_weaver 658 Zosbru658 Male Bioko Norte Pico Basile 2670 3.595 8.769 1 2 2016 o Passeriformes Zosteropidae Zosterops brunneus Fernando_Po_speirops

12 Mothur

100%

90%

80%

Mollicutes

70% Alphaproteobacteria Gammaproteobacteria

60% Bacteria_unclassified Epsilonproteobacteria

50% Deltaproteobacteria Clostridia

Betaproteobacteria 40% Bacteroidia

Bacilli 30% Alphaproteobacteria

Actinobacteria 20%

10%

s s is s a a s s a a a is is is is a a a a a a a a a a a r x a ii ii a a a a a a a s i i is s s s si s s a a s s is is is a a r is is a a a is a a a s s is s s s is is is is is is s s s s is s is a a is i s a ii ii d is is a a r r i a ti a is s is s s a a a a is is a i s a a zi s s ti x x a a a a s s is is a r r a a a s n y r tu ill ic n n ic ic r n n n n t t t r r ill ill ill ill e t lo a e d d it it it it it e e fu th th s n n n e fu fu l yl ru ru d s s e n fe d d n n s l t t l tu tu r o tu tu l l l l l l u u u u r n s n n s w u e v v il r r l l te te ge ic r n it io r tu n tr tr t t ll s d lu e e e tu tu r a a e e e e tu tu s s s e e s s h u e h st n e e p p u fi fi fi fi a a a u u n la o r n ir ir r r r r r c c i i n e e e t a t ri n n c si a ri ri si si o ra o o u lo c ra ra ra ra ra ra in in in in la e p ti ti n o e b o o tr st st a a n n n e si r l t e s s o o o n m n il n n p e r r n n n n n n o st st o o c e 0% ir ac o ta ap i ir ir o o ro f f f f sc sc sc y y ap ap ap ap a l ic o a a a ig ig ig ig ig a a ru m m e sc sc sc a ru ru ph c lu lu i e e a a r i i a a iv e n n id la la h u no no t t t t t t v v v v l it o a a e n c ly n n s o o ph ph e e ca i b a o e n co ir i i n n ic e e u p a a o ea ea rb o o a a a a ta ta e e in a a iv iv u n v r ir o c ss v v i i c a a a a u u u ch ch c c c c st cu b th st b b n n n n n liv liv s s s o e e e b s s e a a a v m m liv r u v v r r n t ro ro it u u c le o o s s s s s s li li li li o n h o e va a e e a ir ir e e iv iv o ss ro r a p ve ri v an an ro ro r o la im a st st l n n e th th st st st st o o o o g g g n n in n a ib t n ri y s s th th a s s s s f f f a a ri ri ri ri a u s o a a a o o o o o o o u th th p lv lv lv is u u c d c c e u u o p rt e e p p a la o o n ac ac r o r r u u u u u u o o o o c e it em em p ch li h ch ch a t t c c f f b y p r s i c s p p o o ig p ro e ic a a is ili ili h o o a a a a n n p p li o o a a b ru tt r la a t b lla lla e e m u u u u in in in r r t t t t c c e r c i i d d d d d rc e e s u u u r h h io yn er er n s s ia u n n ia ia r o hl hl a o p lo lo a a a a a a x x x x ta n rn a a s i o c i i d la la io io ru ru a b la ia it du lb o lla hl hl n s h s n c c la b b s r r c c c c a a s s fu c c r r al b ie d s d a a ri ri a a p p p p a a a b b a a a a a s st th e in in n n n n n ra ra e s s u f f f y p p l s g g o s s ra l T o o l l e f c c o m m at hy h h ra ra ra ra ra ra y y y y e c o h h u re s a re re il s s l l i a u l m r a n ri m m c c s iu p s a a a o s s u th th e e e e d d u u eu eu e e s s v a a le a is u u o o a A A A A p p p a a a a a a yi te e ry h Pr Pr u u u u u it it c a a h is is is nn sy sy o a i i ph du du it y ph ph y y th ru a a g r r c s c c g g g g g g cc cc cc cc a ro y a a h s u yi a a tr a a o o e e rd e ic y o u a te u ty ty a a u r e te c yi yi li u u p ry ry h h h h le le h h e l l th th u p yl u ill B ld p E E d d du a a a er er ld ld ld ld h s t ir ir ir ir ir o p p yp s s s i s s p d in in i n H i i H H e s r r in to to i i r r a a a a a a o o o o di p c g ep ri rg h z z s ill ill p p n on a p h H n T ni s E r r r r e ia t p a h h o l l o e e et t et et B B yp p cn is is e e e o S q r ri o n n i ic ic ic t t ri ri ri ri p e rm e le H H H H H m m m a a s p p p C o o s e r r s n m s s p i e e d a a N it e e h h h h h h c c c c o o a ag a r y p i i E r r s s o p ip p a i s u u e e c n la e it p p P iu iu c l le l l s sy o n n p p c r o u t r u u V c c c p p t t t t o e is A o o h ic ic s o o o c c la l C C rp er er o rp rp rn t t n c c ct ig ig c c c c c c o o o o h id i h h st n nu o sp sp u u la la ph ph p lc sy y m p rt rt t t lo a h r r o o n n os is is A A A A s s r r r lo te d E Es ec ir ir s s s o o Es Es Es Es r rm p rn an an t c c co d d d o o ty B e g g an e e am o op op a di di A in in T T T T T T rc rc rc rc t l rd p p a in i r o o E E ty ty i i e a s C l o u u op op P L c th ig r r o o ll rn rn co co p o o am am P s ce C H H u u u ar ar o e S o y y a s s o lla lla lla e e e T i i y T T C h r r n n r r e e e e n sa a o o ll C L o t t e e ps ps h o o E er T T r r ci n N o o g g y o o o o o th th C C o l M M M h p r C C B u u e I I I N N a in in C it a a M I I C C C C C C A P p y ph yp yp a a r r S d C t a a o A h h o o h r r e e lid y y Z A m m ap S h M M N Cr Cr m m m p rm rm h a r r e e u s m m el ap ap P P P h h N N a r r a a y ip am S a a e e e P y C C am am T T e Zo a a iz y y ip ip s Co Co C C D t C C h p p D s C C r D D t t P S Ch S S S Ch Ch P A S S Figure 1: Relative abundance chart of the top 12 bacterial classes in Mothur of all samples. There are a total of 142 samples ) and 75 classes (data not shown) found through the DADA2 pipeline. X-axis refers to bird genus and species name while the legend refers to abundant bacterial classes.

13 DADA2

100%

90%

80%

Mollicutes

70% Methanobacteria

Gammaproteobacteria

60% Erysipelotrichia Epsilonproteobacteria

50% Deltaproteobacteria Clostridia

Betaproteobacteria 40% Bacteroidia

Bacilli 30% Alphaproteobacteria

Actinobacteria 20%

10%

s s is s a a s s a a a is is is is a a a a a a a a a a a r x a ii ii a a a a a a a s i i is s s s si s s a a s s is is is a a r is is a a a is a a a s s is s s s is is is is is is s s s s is s is a a is i s a ii ii d is is a a r r i a ti a is s is s s a a a a is is a i s a a i s s ti x x a a a a s s is is a r r a a a s n y r u ll ic n n ic ic r n n n n t t t r r ll ll ll ll e t lo a e d d it it it it it e e u th th s n n n e u u l yl u u d s s e n fe d d n n s l t t l u u r o u u l l l l l l u u u u r n s n n s w u e v v il r r l l te te e ic r n it io r u n tr tr t t ll s d lu e e ez u u r a a e e e e u u s s s te te s s h u re ch st at pi n re re p p u fi fi fi fi a a a u u pi pi pi pi n la o r n ir ir r r r r r c c uf i i n e e e t uf uf ha t ur ur ri n n c si a ri ri si si o ra o o du at at lo c ot ot ra ra ra ra ra ra in in in in la te p ti ti n o e b o o tr st st ha ha n n ag n e si r l t ot re s s o o o n m un il n n p at at e r r n n n n a t at n n o s s o o c e 0% i a o t a si i i o o ro f f f f sc sc sc hy hy a a a a a ul ic ho a a a ig ig ig ig ig a a r m m e sc sc sc a r r p c l l i e e a a r i i a a iv e n n ti l l h eu n n t t t t t t iv iv iv iv l i o a a e n ac ly n n s o o p p e e c si b a o e en c i ni ni n n ic e ae p a a o e e rb ho ho a a a a t t e e in a a iv iv u n v r ir o c s v v i i c a a a a u u u c c c c c c st c b t st b b n n n n n iv iv s s s o e e e b s s e a a a v m m iv r u v v r r n t ro ro i u u c l o o s s s s s s l l l l o n h o e v a e e a ir ir e e v v o s ro r a p v ri v ro ro r o l im a st st l n n e t t st st st st o o o o g g g n n in n a ib t n ri y s s th th a s s s s f f f a a ri ri ri ri a u s o a a a ol ol u th th p lv lv lv is u u c d c c e u u ol p t e e p p a la o o n c c r o r r u u u u u u o o o o c e it em em p ch li h ch ch a t t c c fi fi b y p r s i c s pa pa o o ig p ro e ic a a is li li h o o a a a a n n p p li o o a a b ru tt r la a t b la la e e m u u u u in in in r r t t t t c c e r c i i do do do do do rc e e s u u u r h h io yn r r n s s a ur n n a a r o l l a a a o p lo lo a a a a a a x x x x ta n n a a s i o c i i d la la io io u u a b la a it u lb o la l l n s h s n c c la bi bi s r r c c c c a a s s fu c c r r al b ie d s d a a il il a a p p p p a a a b b a a a a a s t th e in in n n n n n ra ra e s s u f f f y p p l s ge ge o s s ra li T o o li li e f ch ch o m m t hy h h a a a a a a y y y y e c or h h u re s a re re il s s l l r r i a u li m rd a n il m m ch ch s iu p s a a a o u th th e e e e d d u u eu eu e e s v a a le a is ur u r o o a A A A A p p p a a a a a a yi te es y h r r u u u u u it it c a a h is is is n sy sy o a i i h du du it y h h y y th ru g r r ca c c gr gr gr gr gr gr cc cc cc cc a ro y h s u yi tr a a o o e e rd e ic y o u a te ur ty ty u r e te c yi yi li us us p y y h h h h le le h h e l l th th us p yl u ill B ld p E E d d u a a a er er ld ld ld ld h s er t P P ir ir ir ir ir o p p yp s s s in s s p d in in ip n n H ip ip H H e s ra ra in to to i is r r a a a a a a o o o o di p c ga ga ep ri g h za za s ill ill p p n n a p h H n T i s E r r ra ra e ia t p a h h o l l o er er t t t t B B yp yp cn is is e e e o S q r ri o n n id ic ic ic t t ri ri ri ri p e rm s le H H H H H m m m a a s p p p C o o s e r r s r r m s s p i e e d a a N it e e h h h h h h c c c c o o a a a r y ur p i i E r r s s ho ho p ip p a in s u u e e c n la e it p p P iu iu sc s s le le le le s s o n n p p c er o u t r u u V c c c p p t t t t o m e i A o no h ic ic s o o o c c la l C C rp e e no rp rp rn pt pt n ic ic ct ig ig c c c c c c o o o o h id di h h st n n o sp sp u u la la p p lc sy y m p rt rt pt t lo a h r r o o n n o i i A A A A s s r r r lo t d E s ec ir ir s s s o o s s s s r r p n n t c c o d d d o o y B e g g e e m o a A n n T T T T T T rc rc rc rc t l r p p a n i r o o E E y y i ip e a s C l o u u p P L c th ig r r o o ll n n o o p o o m m P s e E C H H u u u r r E E E E o e S or ya ya a s s oc la la la e e et T i i ya T T Ca h ro ro nd nd ri ri e e e e n sa a o o ll Ci L o t t et et ps ps h o o E er T T ro ro ci n N o o g g y or or oc oc o th th Ca Ca o lc M M M a a h p r C C B u u e Il Il Il N N a in in C it a a M I I C C C C C C A P p y h p p a a r r S d C t a a o A h h o o h r r e e lid y y Z A m m ap S h M M N Cr Cr m m m p rm rm h ap ry ry e e u s m m el ap ap P P P h h N N a r r a a y ip m S a a e e e P y C C m m T T e Zo a a z y y ip ip s o o C C D t ha C C Sh p p D ha ha s C C ri D D t t P C C S C S S C C P A S S

Figure 2: Relative abundance chart of the top 12 bacterial classes in DADA2 of all samples. There are a total of 142 samples ) and 75 classes (data not shown) found through the DADA2 pipeline. X-axis refers to bird genus and species name while the legend refers to abundant bacterial classes.

14 Mothur

100%

90%

80%

70%

Verrucomicrobia

60% Tenericutes

Proteobacteria

50% Firmicutes

Cyanobacteria/Chloroplast

40% Chlamydiae

Bacteroidetes

30% Actinobacteria

20%

10%

0% Sylvietta virens Alcedo Eurillas Bleda notatus Estrilda Cecropis Eurillas virens Eurillas virens quadribrachys latirostris atricapilla abyssinica * * Figure 3: Relative abundance chart of bacterial phylum in Mothur. These are the first 8 samples with the top 8 phylum. X-axis is bird genus and species names with identical species representing samples taken/ dissections performed on the same bird *. Legend refers to abundant bacterial phylum.

15 DADA2

100%

90%

80%

70%

Verrucomicrobia

60% Tenericutes

Proteobacteria

50% Firmicutes

Cyanobacteria/Chloroplast

40% Chlamydiae

Bacteroidetes

30% Actinobacteria

20%

10%

0% Sylvietta virens Alcedo Eurillas Bleda notatus Estrilda Cecropis Eurillas virens Eurillas virens quadribrachys latirostris atricapilla abyssinica * * Figure 4: Relative abundance chart of bacterial phylum in DADA2. These are the first 8 samples with the top 8 phylum. X-axis is bird genus and species names with identical species representing samples taken/ dissections performed on the same bird *. Legend refers to abundant bacterial phylum.

16 Mothur

O b s e r v ed S h a n n o n S im p s o n

1 .0 0

2000

0 .7 5 4

1500

e

r

u

s a

e D ie t

M

y c t

i 0 .5 0 s

r h Diet e v 1000

i o

D

a

h p

l carnivore A 2 herbivore

0 .2 5 500

0 0 0 .0 0

A Ch Cl Cr Cu Cy Pa Pc A Ch Cl Cr Cu Cy Pa Pc A Ch Cl Cr Cu Cy Pa Pc O rd e r_ A b re v DADA2 O b s e r v ed S h a n n o n S im p s o n 1 .0 0

5

600

0 .7 5

4

e

r u

s 3 a

e 400 D ie t

M

y c

t 0 .5 0

i s

r h

e v

i o

D

a

h

p l

A 2

200 0 .2 5

1

0 0 0 .0 0

A Ch Cl Cr Cu Cy Pa Pc A Ch Cl Cr Cu Cy Pa Pc A Ch Cl Cr Cu Cy Pa Pc O rd e r_ A b re v Figure 5: Comparison of estimated number of OTUs (top) to ASVs (bottom) across bird orders with diet category highlighted. Shannon and Simpson indexes also included. A- Apodiformes, Ch-Charadriiformes, Cl-Columbiformes, CrCoraciiformes, Cu-Cuculiformes, Pa-Passerifomres, and Pc-Piciformes

17 Mothur DADA2 O b s e r v ed S h a n n o n O b s e r v ed S im p s o n S h a n n o n S im p s o n 1 .0 0 1 .0 0

5 2000

600

0 .7 5 0 .7 5 4 4 1500 Diet

carnivore

e e

r r u

u herbivore s

s 3

a a e

e 400 D ie t D ie t

M M

y

y c c t

t 0 .5 0 i

i 0 .5 0 omnivore

s s r

r h h

e e

v v i i 1000

o o

D D

a a

h h

p p

l l A A 2 2

200 0 .2 5 0 .2 5 500

1

0 0 0 0 .0 0 0 0 .0 0

A Ch Cl Cr Cu Cy Pa Pc A Ch ClA CrCh CuCl CyCr PaCu PcCy Pa PcA Ch A Cl ChCr ClCu CrCy CuPa CyPc Pa Pc A Ch Cl Cr Cu Cy Pa Pc O rd e r_ A b re v O rd e r_ A b re v Figure 6: Comparison of estimated number of OTUs (left) to ASVs (right) across bird orders with diet category highlighted. A-Apodiformes, Ch-Charadriiformes, Cl-Columbiformes, CrCoraciiformes, Cu-Cuculiformes, Pa-Passerifomres, and Pc-Piciformes

18 3.2 Beta diversity

Bray-Curtis distances were calculated between all pairwise sample comparisons in both

Mothur and DADA2 processed data. NMDS plots of Bray-Curtis were plotted against several variables including bird order, diet, and overall locality (Mothur results Figures 7, 8, 9; DADA2 results Figures 10,11,12). No obvious trends on both the Mothur and DADA2 NMDS plots can be seen through the variables mentioned or others not shown. However, variations can be seen in all Mothur vs DADA2 plots that may be visualized under the variables tested. Procrustes analyses were run to show these variations. Figure 13 shows a Procrustes analysis of the Mothur vs DADA2 NMDS plots where arrows show movement of Mothur data points to DADA2. There is a significant difference between Mothur and DADA2 in which DADA2 points are drawn across a larger axis and various samples are scattered on opposing ends depending on which program was used.

Data included both sample replicates, where individual intestines were sequentially extracted, and PCR replicates from a single extraction. Replicate data showed interesting trends between the two different programs and between the replicates themselves. Sample replicates yielded different bacterial species and amounts (Figure 14, 15), but were largely similar (see *

Figure 3, 4). Figures 14 and 15 highlight the variation of replicates. Some groups of replicates cluster together as seen by the connected black lines, which is to be expected. However there are several instances of replicates that cluster away from its other replicate partners. A Procrustes analysis was run on the replicates-only datasets from Mothur and DADA2 (Figure 16). Again arrows refer to the movement of Mothur points to DADA2. The trends noted in Figure 13 are more evident with the smaller sample size.

19

20

21 Mothur Diet Mothur

0 .2 Diet carnivore herbivore D ie t

2 c S

D omh nivore M N o

0 .0

− 0 .2

− 0 .3 − 0 .2 − 0 .1 0 .0 0 .1 0 .2 0 .3 N M D S 1 Figure 9: NMDS plot of Mothur data with Bray-Curtis distance applied. Figure legend labels are used as reference only and are not meant to show particular trends or groupings. Colored labels of diet labeled c-carnivorous, h- herbivorous, and o-omnivorous refers to diet of particular bird.

22 DDAD a d a 2 DA2 ie t

0 .6

0 .3

Diet D ie t

2 c

S carnivore

D h

M N 0 .0 o herbivore omnivore

− 0 .3

− 0 .4 0 .0 0 .4 0 .8 N M D S 1

Figure 10: NMDS plot of DADA2 data with Bray-Curtis distance applied. Figure legend labels are used as reference only and are not meant to show particular trends or groupings. Colored labels of diet labeled c-carnivorous, h- herbivorous, and o-omnivorous refers to diet of particular bird.

23

24

25

26 Mothur

Figure 14: NMDS plot of Mothur replicate data with Bray-Curtis distance applied. Colored labels of refer to bird order. Colored lines refer to sample replicates where back lines connect dissected samples from the same bird and red lines connect PCR replicates in the same bird.

27 DADA2

Figure 15: NMDS plot of DADA2 replicate data with Bray-Curtis distance applied. Colored labels of refer to bird order. Colored lines refer to sample replicates where back lines connect dissected samples from the same bird and red lines connect PCR replicates in the same bird.

28

29 Chapter 4: Discussion

How scientists analyze data matters. The defining of the taxonomic units is an extremely vital and foremost step. New data analysis pipelines come out regularly as the data science field continues to grow. As the importance of statistical analysis pipelines becomes more evident, there is an ever-developing need to become more standardized in our methods. While new analytical tools may find niches among those who have specific needs, an overpopulation of methods creates a lack of uniformity and adds difficulty to comparing across studies.

Determining which tools work best will allow scientists to more easily share their data and create an environment where other scientists can more readily check others’ statistical analyses and apply similar methods to their own data. Ideally, all microbiome figures, tables, and other types of data representation would be directly comparable.

The differences between Mothur’s 97% OTUs and DADA2’s ASVs were clear (Table 1).

Despite the same number of phyla being detected in both methods, there is significant deviation of taxonomic classes that escalates at lower taxonomic ranks. The Procrustes analysis of the

Mothur and DADA2 NMDS plots shows differences in the plotting of OTUs and ASVs. The microbiome is a growing scientific field, as scientists look for a greater understanding of these unique systems. Data analysis pipelines are key in this field to interpret biological results. The determination of use of OTUs or ASVs could have a larger impact on gut microbiome data than initially anticipated. Reproducibility between runs or entire studies could have significant effects on large studies such as the Human Microbiome Project.

Wild host microbiomes will always be vital for understanding the many factors that shape this complex system. Only through a wild host gut microbiome can one begin to catalogue the bacteria that live there and the features that contribute to its formation. There is difficulty

30 sampling most wild hosts without invasive procedures, however the collaboration with natural history museums can allow scientists to increase the practical use of specimens that are already being collected. Birds play a particular role in the understanding of gut microbiomes. Birds have a symbiotic relationship with microorganisms that can likely be traced back before the Aves lineage was created. From an evolutionary standpoint, the genetic relationship between birds and their gut microbiome could illuminate various questions about microbiomes in other systems.

The global distribution of birds also allows for the research of the effects of diet, climate, and environment on the gut microbiome.

The data used for pipeline comparisons is unpublished data of bird gut samples from

Equatorial Guinea. The samples are diverse among bird species, orders, location, sex, and diet.

The gut sampling process also proved diverse as they were cut at varying sections along the gi tract. There are key differences among samples that are directly related to where the samples were taken along the gi tract. For this reason, sampling location uniformity may be a larger issue than initially anticipated. Standardization at this initial step must be taken into consideration to ensure similar regions are sampled. We have yet to fully understand the impact of sampling different regions of bird microbiomes. Until variation along the gastrointestinal tract has been further described on the micro-scale we must continue sampling similarly across studies.

Chapter 5: Conclusion

The highlight of OTU/ASV differences do not equate to a preference of one over another, but rather displays their differences in analyzing a particular dataset. Our dataset has displayed double the amount of taxonomic units using 97% OTUs compared to ASVs. This highlights the biological differences between these two methods. While it may be true that many tools fulfill a

31 specific scientist’s needs, the differences among them matter. For these reasons, understanding your own dataset and data processing protocol is vital whenever taxonomic units are required.

32 Appendix A: Mothur R Script rm(list=ls()) setwd("/Users/dariencapunitan/Desktop/bird_poop") # set working directory with full file path library(vegan) # call package library(ggplot2) library(dplyr) library(scales) library(grid) library(reshape2) library(Matrix) library(phyloseq) theme_set(theme_bw()) sharedfile = "stability.TWO.shared" taxfile = "stability.TWO.cons.taxonomy" mapfile = "EQ_map_forRCollab.csv" mothur_data <- import_mothur(mothur_shared_file = sharedfile, mothur_constaxonomy_file = taxfile) map <- read.csv(mapfile, row.names = c(1), sep=",", header=TRUE) head(map) map <- sample_data(map) moth_merge <- merge_phyloseq(mothur_data, map) moth_merge colnames(tax_table(moth_merge)) colnames(tax_table(moth_merge)) <- c("Kingdom", "Phylum", "Class", "Order", "Family", "Genus")

33 Appendix B: DADA2 R Script rm(list=ls()) setwd("/Users/dariencapunitan/Desktop/HirdEG_fastq_Dada")

#source("https://bioconductor.org/biocLite.R") #biocLite("dada2")

#biocLite("devtools") #library("devtools") #devtools::install_github("benjjneb/dada2") library(tibble) library(dada2); packageVersion("dada2") library(ShortRead); packageVersion("ShortRead") library(ggplot2); packageVersion("ggplot2") path <- "/Users/dariencapunitan/Desktop/HirdEG_fastq_Dada" fns <- list.files(path) fns fastqs <- fns[grepl(".fastq$", fns)] fastqs <- sort(fastqs) # Sort ensures forward/reverse reads are in same order fnFs <- fastqs[grepl("_R1", fastqs)] # Just the forward read files fnRs <- fastqs[grepl("_R2", fastqs)] # Just the reverse read files sample.names <- sapply(strsplit(fnFs, "_"), `[`, 1) # Specify the full path to the fnFs and fnRs fnFs <- file.path(path, fnFs) fnRs <- file.path(path, fnRs) plotQualityProfile(fnFs[[174]])

34 plotQualityProfile(fnRs[[174]])

# Make directory and filenames for the filtered fastqs filt_path <- file.path(path, "filtered") if(!file_test("-d", filt_path)) dir.create(filt_path) filtFs <- file.path(filt_path, paste0(sample.names, "_F_filt.fastq.gz")) filtRs <- file.path(filt_path, paste0(sample.names, "_R_filt.fastq.gz")) # Filter for(i in seq_along(fnFs)) { fastqPairedFilter(c(fnFs[i], fnRs[i]), c(filtFs[i], filtRs[i]), truncLen=c(240,160), maxN=0, maxEE=c(2,2), truncQ=2, rm.phix=TRUE, compress=TRUE, verbose=TRUE) } derepFs <- derepFastq(filtFs, verbose=TRUE) derepRs <- derepFastq(filtRs, verbose=TRUE) # Name the derep-class objects by the sample names names(derepFs) <- sample.names names(derepRs) <- sample.names dadaFs.lrn <- dada(derepFs, err=NULL, selfConsist = TRUE, multithread=TRUE) errF <- dadaFs.lrn[[1]]$err_out dadaRs.lrn <- dada(derepRs, err=NULL, selfConsist = TRUE, multithread=TRUE) errR <- dadaRs.lrn[[1]]$err_out plotErrors(dadaFs.lrn[[1]], nominalQ=TRUE) dadaFs <- dada(derepFs, err=errF, multithread=TRUE) dadaRs <- dada(derepRs, err=errR, multithread=TRUE)

35 dadaFs[[1]] mergers <- mergePairs(dadaFs, derepFs, dadaRs, derepRs, verbose=TRUE) # Inspect the merger data.frame from the first sample head(mergers[[1]]) seqtab <- makeSequenceTable(mergers[names(mergers) != "Mock"]) dim(seqtab) table(nchar(getSequences(seqtab))) seqtab.nochim <- removeBimeraDenovo(seqtab, verbose=TRUE) dim(seqtab.nochim) sum(seqtab.nochim)/sum(seqtab) taxa <- assignTaxonomy(seqtab.nochim, "silva_nr_v128_train_set.fa") unname(head(taxa)) library(phyloseq); packageVersion("phyloseq") library(ggplot2); packageVersion("ggplot2") library(data.table) library(reshape) theme_set(theme_bw())

#import mapping file similar to mothur import mapfile = "EQ_map_forRCollab.csv" map <- read.csv(mapfile, row.names = c(1), sep=",", header=TRUE) head(map) map <- sample_data(map)

#importing changed taxa file taxfile = "DadaTaxa2.csv"

36 taxa2 <- read.csv(taxfile, row.names = c(1), sep=",", header=TRUE) head(taxa2) taxa2 <- sample_data(taxa2)

# Construct phyloseq object (straightforward from dada2 outputs) ps <- phyloseq(otu_table(seqtab.nochim, taxa_are_rows=FALSE), tax_table(taxa)) ps moth_merge <- merge_phyloseq(ps, map) moth_merge

37

Appendix C: Diversity Calculations #alpha diversity plot_richness(moth_merge, x="Order_Abrev", measures=c("Observed", "Shannon", "Simpson"), color="Diet") + theme_bw() #Beta diversity #NMDS/Bray ord.nmds.bray <- ordinate(moth_merge, method="NMDS", distance="bray") plot_ordination(moth_merge, ord.nmds.bray, color="Sample_Order", title="Bray NMDS X")

#Tree for Unifrac source("https://bioconductor.org/biocLite.R") biocLite("DECIPHER") library(DECIPHER) seqs <- getSequences(seqtab.nochim) names(seqs) <- seqs # This propagates to the tip labels of the tree alignment <- AlignSeqs(DNAStringSet(seqs), anchor=NA,verbose=FALSE)

#construct a neighbor-joining tree, and then fit a Generalized time-reversible with Gamma rate variation #maximum likelihood tree using the neighbor-joining tree as a starting point. library(phangorn) #choose no when from compilation phangAlign <- phyDat(as(alignment, "matrix"), type="DNA") dm <- dist.ml(phangAlign) treeNJ <- NJ(dm) # Note, tip order != sequence order fit = pml(treeNJ, data=phangAlign) fitGTR <- update(fit, k=4, inv=0.2) fitGTR <- optim.pml(fitGTR, model="GTR", optInv=TRUE, optGamma=TRUE, rearrangement = "stochastic", control = pml.control(trace = 0)) detach("package:phangorn", unload=TRUE)

38

Appendix D: Procrustes Analysis pro <- procrustes(Mothur.nmds, Dada.nmds) pro summary(pro) plot(pro)

39

Appendix E: Unifrac Analysis ps_dada <- phyloseq(otu_table(tax_table, taxa_are_rows=FALSE),

sample_data(moth_merge),

tax_table(taxa), phy_tree(fitGTR$tree)) ps_dada ord.pcoamothur.unifrac<- ordinate(ps_dada, method="NMDS", distance="unifrac") plot_ordination(ps_dada, ord.pcoamothur.unifrac, color="Diet", title="Uni Un NMDS")

40

References

Al-Asmakh, M., Stukenborg, J.-B., Reda, A., Anuar, F., Strand, M.-L., Hedin, L., et al. (2014). The gut microbiota and developmental programming of th

Benskin, C. M. H., Rhodes, G., Pickup, R. W., Wilson, K., and Hartley, I. R. (2010). Diversity and temporal stability of bacterial communities in a model bird, the zebra finch. Mol. Ecol. 19, 5531–5544. doi: 10.1111/j.1365- 294X.2010.04892.x

Callahan BJ, McMurdie PJ, Rosen MJ, Han AW, Johnson AJ, Holmes SP. (2016a). DADA2: high-resolution sample inference from Illumina amplicon data. Nat Methods 13: 581–583.

Callahan, B. J., McMurdie, P. J. & Holmes, S. P. Exact sequence variants should replace operational taxonomic units in marker-gene data analysis. ISME J.http://dx.doi.org/10.1038/ismej.2017.119 (2017)

Cho, I. & Blaser, M.J. The human microbiome: at the interface of health and disease. Nat. Rev. Genet. 13, 260–270 (2012).

Colston TJ, Jackson CR. Microbiome evolution along divergent branches of the vertebrate tree of life: what is known and what is unknown. Mol Ecol. 2016;25:3776–3800.

Diaz Heijtz, R., Wang, S., Anuar, F., Qian, Y., Bjorkholm, B., Samuelsson, A., et al. (2011). Normal gut microbiota modulates brain development and behavior. Proc. Natl. Acad. Sci. U.S.A. 108, 3047–3052. doi: 10.1073/pnas.1010529108

Dominguez-Bello, M. G. et al. Delivery mode shapes the acquisition and structure of the initial microbiota across multiple body habitats in newborns. Proc. Natl Acad. Sci. USA 107, 11971–11975 (2010) Edgar, R. C. (2017). Accuracy of microbial community diversity estimated by closed‐ and open‐ reference OTUs. PeerJ, 5, e3889.

Ezenwa VO, Gerardo NM, Inouye DW, Medina M, Xavier JB. Microbiology. Animal behavior and the microbiome. Science(2012) 338:198–9. doi:10.1126/science.1227412

He Y, Caporaso JG, Jiang XT, Sheng HF, Huse SM, Rideout JR, Edgar RC, Kopylo va E, Walters WA, Knight R, Zhou HW. 2015. Stability of operational taxonomic units: an important but neglected property for analyzing microbial diversity. Microbiome 3 Article 20 Hird SM, Sánchez C, Carstens BC, Brumfield RT. Comparative gut microbiota of 59 Neotropical bird species. Front Microbiol. 2015;6:1403. pmid:26733954

41 Kozich, J. J., Westcott, S. L., Baxter, N. T., Highlander, S. K. & Schloss, P. D. Development of a dual-index sequencing strategy and curation pipeline for analyzing amplicon sequence data on the MiSeq Illumina sequencing platform. Appl. Environ. Microbiol.79, 5112–5120 (2013)

Kreisinger J, Čížková D, Kropáčková L, Albrecht T (2015) Cloacal Microbiome Structure in a Long-Distance Migratory Bird Assessed Using Deep 16sRNA Pyrosequencing. PLoS ONE 10(9): e0137401.

Ley RE, Hamady M, Lozupone C et al. (2008) Evolution of mammals and their gut microbes. Science, 320, 1647–1651.

Muegge BD, Kuczynski J, Knights D et al. (2011) Diet drives convergence in gut microbiome functions across mammalian phylogeny and within humans. Science, 332, 970–974.

Nelson TM, Rogers TL, Brown MV (2013) The gut bacterial community of mammals from marine and terrestrial habitats. PLoS ONE, 9, e99562.

Ramezani, A. & Raj, D. S. The gut microbiome, kidney disease, and targeted interventions. J. Am. Soc. Nephrol. 25, 657–670 (2014).

Schnorr, S. L., & Bachner, H. A. (2016). Integrative Therapies in Anxiety Treatment with Special Emphasis on the Gut Microbiome. The Yale Journal of Biology and Medicine, 89(3), 397–422.

Schloss PD, et al. (2009) Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl Environ Microbiol 75:7537–7541

Schloss, P. D. & Handelsman, J. DOTUR, a computer program for defining operational taxonomic units and estimating species richness. Appl. Environ. Microbiol. 71, 1501– 1506 (2005). Shoemark DK, Allen SJ. 2014. The microbiome and disease: reviewing the links between the oral microbiome, aging, and Alzheimer’s disease. Journal of Alzheimer’s Disease 43:725-738

Sokal, R. R. and Sneath, P. H. A. 1963. Principles of numerical taxonomy, San Francisco: W. H. Freeman and Co..

A. Spor, O. Koren, R. Ley. Unravelling the effects of the environment and host genotype on the gut microbiome. Nat. Rev. Microbiol., 9 (2011), pp. 279-290

42 Wang, J., Brelsfoard, C., Wu, Y., and Aksoy, S. (2013). Intercommunity effects on microbiome and GpSGHV density regulation in tsetse flies. J. Invertebr. Pathol. 112(Suppl.) S32–S39. doi: 10.1016/j.jip.2012.03.028

Westcott, S. L. & Schloss, P. D. OptiClust, an improved method for assigning amplicon- based sequence data to operational taxonomic units. mSphere 2, 1–11 (2017).

Wright ES: Using DECIPHER v2.0 to analyze big biological sequence data in R. The R Journal.Page in Press,2016.

Xenoulis, P. G., Gray, P. L., Brightsmith, D., Palculict, B., Hoppes, S., Steiner, J. M., et al. (2010). Molecular characterization of the cloacal microbiota of wild and captive parrots. Vet. Microbiol. 146, 320–325. doi: 10.1016/j.vetmic.2010.05.024

43