The ISME Journal (2014) 8, 2369–2379 & 2014 International Society for Microbial Ecology All rights reserved 1751-7362/14 www.nature.com/ismej ORIGINAL ARTICLE Variation in gut microbial communities and its association with pathogen infection in wild bumble bees (Bombus)

This article has been corrected since Advance Online Publication and a corrigendum is also printed in this issue

Daniel P Cariveau1, J Elijah Powell2,3, Hauke Koch2,3, Rachael Winfree1 and Nancy A Moran2,3 1Department of Ecology, Evolution, and Natural Resources, Rutgers University, New Brunswick, NJ, USA; 2Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT, USA and 3Department of Integrative Biology, University of Texas, Austin, TX, USA

Bacterial gut symbiont communities are critical for the health of many insect species. However, little is known about how microbial communities vary among host species or how they respond to anthropogenic disturbances. Bacterial communities that differ in richness or composition may vary in their ability to provide nutrients or defenses. We used deep sequencing to investigate gut microbiota of three species in the genus Bombus (bumble bees). Bombus are among the most economically and ecologically important non-managed pollinators. Some species have experienced dramatic declines, probably due to pathogens and land-use change. We examined variation within and across bee species and between semi-natural and conventional agricultural habitats. We categorized as ‘core ’ any operational taxonomic units (OTUs) with closest hits to sequences previously found exclusively or primarily in the guts of honey bees and bumble bees (genera Apis and Bombus). Microbial community composition differed among bee species. Richness, defined as number of bacterial OTUs, was highest for B. bimaculatus and B. impatiens. For B. bimaculatus, this was due to high richness of non-core bacteria. We found little effect of habitat on microbial communities. Richness of non-core bacteria was negatively associated with bacterial abundance in individual bees, possibly due to deeper sampling of non-core bacteria in bees with low populations of core bacteria. Infection by the gut parasite Crithidia was negatively associated with abundance of the core bacterium Gilliamella and positively associated with richness of non-core bacteria. Our results indicate that Bombus species have distinctive gut communities, and community-level variation is associated with pathogen infection. The ISME Journal (2014) 8, 2369–2379; doi:10.1038/ismej.2014.68; published online 24 April 2014 Subject Category: Microbial ecology and functional diversity of natural habitats Keywords: Bombus; Gilliamella; gut microbiota; land-use change; pyrosequencing

Introduction communities are better able to resist invasion by pathogens in humans (Berg, 1996). Therefore broad- Diverse communities of bacteria inhabit insect guts scale surveys of gut microbial communities coupled and provide critical functions for their hosts, such as with studies of functional consequences for hosts, as digestion (Warnecke et al., 2007), pathogen defense has been done in human systems (Frank et al., (Dillon et al., 2005) and insecticide resistance 2007), will further the understanding of the role of (Kikuchi et al., 2012). Most work on functional gut microbiota in insect health, ecology and evolu- importance of gut microbiota has focused on one or tion (Hamdi et al., 2011). a few species of bacteria. However, community-level When taking a community approach to under- factors such as composition and richness may also standing the insect microbiome, it is useful to impact insect health and survival through mechan- differentiate between ‘core’ taxa that are repeatedly isms, such as colonization resistance (Dillon et al., found in individuals of a particular host species or 2005). For example, particular bacterial cluster of closely related host species and ‘non-core’ taxa that often occur in non-host-associated envir- onments. Whereas some insect species possess Correspondence: DP Cariveau, Department of Ecology, Evolution, erratic gut communities consisting of environmental and Natural Resources, Rutgers, The State University of New bacteria that vary widely among individuals (Engel Jersey, 14 College Farm Road, 152 ENR, New Brunswick, NJ 08901, USA. [email protected] and Moran, 2013; Wong et al., 2013), other insect Received 30 September 2013; revised 15 March 2014; accepted species have gut bacteria that are completely or 20 March 2014; published online 24 April 2014 largely restricted to the guts of their hosts (Warnecke Variation in Bombus gut microbiota DP Cariveau et al 2370 et al., 2007; Martinson et al., 2011; Anderson et al., species (B. bimaculatus Cresson, B. impatiens 2012; Sudakaran et al., 2012). Dominance by such Cresson and B. griseocollis DeGeer) collected from core bacterial species may be more common in semi-natural and agricultural habitats. We sought to social insects as interactions among host individuals first determine whether microbial communities facilitate transmission (Martinson et al., 2011). Core differ among bee species and, second, whether bacteria may have evolved in close association with microbial communities are influenced by anthropo- hosts and are potentially commensal or beneficial genic change. We focus on agricultural land-use (Berg, 1996; Martinson et al., 2011; Koch et al., change as it represents the dominant form of 2013). However, this assumption has rarely been anthropogenic change globally (Pereira et al., tested. Although they are facultative, environmen- 2010). Work in non-Bombus systems has shown that tally acquired bacteria sometimes provide fitness gut microbial communities can be influenced by benefits to their hosts (Kikuchi et al., 2012; host genotype, diet, geography and environmental McFrederick et al., 2012; Ridley et al., 2012). High factors (Zouache et al., 2010; Huang and Zhang, richness or abundance of non-core bacterial may 2013). However, no studies have examined the effect also reflect dysbioses associated with disease and of land-use change on the gut communities of pathogens (Sartor, 2008; Hamdi et al., 2011). Bombus. Agrochemical inputs can have dramatic In this study, we examine the composition of gut negative consequences for managed and native bees microbial communities in Bombus species (bumble (Whitehorn et al., 2012; Di Prisco et al., 2013; Pettis bees). Bombus are among the most important wild et al., 2013). A recent analysis conducted in our pollinators in natural and agricultural habitats New Jersey study region found that cranberry (Hegland and Totland, 2008; Garibaldi et al., 2013). agriculture (the habitat type examined here) used Furthermore, some Bombus species have undergone more types of fungicides than other crops and that dramatic recent declines. Some of these population fungicide exposure was associated with increased declines may result from land-use change and infection of honey bees (Apis mellifera) by the gut pathogens (Goulson et al., 2008; Grixti et al., 2009; pathogen Nosema (Pettis et al., 2013). Williams and Osborne, 2009; Cameron et al., 2011), Specifically, we investigated three questions: but the mechanisms underlying the declines of some (1) How do gut microbial community metrics such Bombus species remain unknown. The few studies as richness (number of operational taxonomic units of Bombus gut microbiota reveal a simple yet (OTUs)), beta diversity and composition differ distinctive bacteria community (Mohr and Tebbe, among three common Bombus species and between 2006; Martinson et al., 2011; Koch and semi-natural and agricultural habitats? (2) Do core Schmid-Hempel, 2011b). Bombus and Apis share and non-core gut microbiota differ among species many of the same groups of core bacterial taxa, and between semi-natural and agricultural habitats? including species within Alphaproteobacteria, Beta- (3) Do core and non-core gut microbiota have , and Lacto- different associations with (a) the infection of bacillales (Martinson et al.,2011;Kochet al.,2012). Bombus by two important gut parasites: Crithidia However, previous studies of Bombus gut bacteria sp. and Nosema sp., and (b) the overall abundance have mostly been targeted to specific taxa, and the few of gut bacteria of the host? broad sequence-based surveys suggest more variation in Bombus gut communities (Martinson et al.,2011). One critical role of gut bacteria could be pathogen Materials and methods defense, and pathogens are likely an important cause of declines in Bombus populations. Nosema Study design and field data collection bombi, an intracellular microsporidian that initially To assess whether agricultural land-use affects invades midgut epithelial cells, is suspected to Bombus gut microbiota, specimens were collected negatively impact some North American Bombus from six agricultural and six semi-natural sites in species (Cameron et al., 2011), and Crithida bombi,a Burlington County, NJ, USA in May and June 2011 trypanosome that infect Bombus guts, may also lead (Supplementary Figure S1). Agricultural sites were to declines (Brown et al., 2003; Otterstatter and conventionally managed cranberry farms with high Thomson, 2008; but see Cordes et al., 2012). Using fungicide use implicated to have negative conse- experimental manipulations, Koch and Schmid- quences for health (Pettis et al., 2013). Hempel (2011a, 2012) found that gut colonization Semi-natural habitats were abandoned cranberry by core bacteria, including Gilliamella (Gamma- bogs that had not been managed as farms for at least proteobacteria) and Snodgrassella (Betaproteobacteria) 5 years before collection and are reverting to natural species, lowered Crithidia infection in an European wetlands. In all the 12 sites, the dominant flowering bumble bee. Although the mechanism is unknown, plant was cranberry (Vaccinium macrocarpon these bacteria produce a biofilm on the hindgut wall Aiton: Ericaceae); the use of a focal study plant that may impede infection by gut pathogens (Engel limited between-treatment differences in diet that et al., 2012, Martinson et al., 2012). might confound effects of land-use and agrochem- Here we use deep sequencing to examine the gut ical input. Sites were separated by a mean of 19.9 km microbial communities of three native, wild Bombus (4.2–38.4 km) to minimize the probability of

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2371 collecting specimens from the same colony at alternate specimens. Any of the specimens of a different sites. In addition, sites of semi-natural species at a particular site that failed to amplify were habitats were located a mean of 20.2 km (6.1– pulled from the data set (n 7). Pooled amplicon 38.4 km) from agricultural habitats to ensure that was cleaned using Agencourt¼ Ampure beads (Beck- collected bees did not forage in areas of active man Coulter, Brea, CA, USA) and quantified using agriculture. A Mantel test using our primary out- the Quant-it Picogreen dsDNA quantification kit come variable (proportion of core bacterial species) (Invitrogen, Life Technologies, Grand Island, NY, indicated no spatial autocorrelation among sites USA) on a Victor X3 plate reader (Perkin Elmer, (Mantel r 0.19; P 0.11). Waltham, MA, USA). Equimolar amounts of cleaned At least¼ three¼ individual workers from each amplicon were pooled and sent to the University of species were collected at each site. Specimens were Georgia Genomics Facility for pyrosequencing via identified in the field using morphological charac- GS FLX Titanium XL (454 Life Sciences, teristics, placed into vials with 50 ml of 100% Branford, CT, USA). Initially,þ a 1/8 pilot plate was ethanol and held at 4 1C until further processing. sequenced followed by a full plate. Reads were analyzed using tools in QIIME versions 1.3 and 1.4 (Caporaso et al., 2010). A summary of read-proces- DNA preparation sing steps and read-related data is in Supplementary DNA from whole guts was extracted and quantified Table S3, and a mapping file may be found in using protocols described in Moran et al. (2012), and Supplementary Table S4. 1 DNA extracts were normalized top50 ng ml À for Relative abundance of microbiota in individual further procedures. Process controls included DNA samples was visualized by using ‘high abundance’ of both Escherichia coli K-12 and A. mellifera OTUs binned taxonomically (Supplementary Table microbiota (sample AZ125.3); the latter was char- S5 and Supplementary Figure S2). To assign ‘high acterized previously using the same pyrotag abundance’ OTUs to specific taxonomic groups, approach used in Moran et al. (2012). The presence BLASTn (Altschul et al., 1997) searches against the of microbial DNA was verified by PCR using GenBank nr database (accessed 20 September 2012) universal primers: 27F0-HT and 1492R0-HT (Tyson were performed through the National Center for et al., 2004). Any specimens for which DNA failed to Biotechnology Information sequence search func- amplify were replaced by preparations from alter- tion of Geneious (Drummond et al., 2011). High nate specimens. Bombus species identifications abundance OTUs were defined as those constitut- were verified by amplifying and sequencing the ing41% of any one sample. These 76 OTUs (from a Cytochrome Oxidase I gene using LepF1 and LepR1 total of 352) contained 98.95% of total reads. The primers (Hebert et al., 2004). Cleaned Cytochrome top 10 BLAST hits for each OTU were reviewed, and Oxidase I amplicons were sequenced at Yale Science the best BLAST hit was chosen as a representative Hill DNA Analysis Facility and returned sequences assignment for that OTU. Best BLAST hits were were submitted to the Barcode of Life Database chosen based on the highest bit score. When multi- (http://www.boldsystems.org) for taxonomic assign- ple hits had equivalent ranking, sequences derived ment (Ratnasingham and Herbert, 2007). Sequences from bees, floral samples or other insect hosts were from specimens field-identified as B. griseocollis chosen. OTUs were binned at the taxonomic level of contained several polymorphic regions, and taxo- Order for analysis (Supplementary Table S5). nomic assignments for B. griseocollis specimens Previously observed bacterial species found in were based on manually trimmed monomorphic corbiculate bees (for example, Gilliamella apicola segments. Based on these sequences, one specimen and Snodgrassella alvi) are noted down to genus in was incorrectly identified in the field and removed the taxonomic and phylogenetic analyses. These from analysis. Primers and reaction conditions are taxonomical bins were used in order to visualize listed in Supplementary Table S1. microbial community structure within samples (Supplementary Figure S2) while statistical analyses utilized ungrouped but taxonomically assigned PCR, pyrosequencing and pyrotag analysis OTUs. Control samples and one Bombus sample To amplify 16S rRNA gene sequences, primers with few reads (o200) were removed. A total of 99 338GF_TBF and 1492R_TBRn were designed with Bombus specimens were analyzed in the pyrotag 8 bp barcodes in BARCRAWL (Frank, 2009) with data set comprised of 30 B. bimaculatus, 36 B. Titanium adaptors and Lib L keys (Supplementary impatiens and 33 B. griseocollis. Tables S1 and S2). Approximately 50 ng DNA from three individuals per species per site was amplified in triplicate along with a negative control. The Characterization of core vs non-core bacteria triplicate reactions were pooled, and products were Characterization of OTUs as either core or non-core screened on 1% agarose gels. Reactions that failed to was determined by phylogenetic placement for amplify or for which a negative control amplified representative genomic sequences. OTUs were were repeated. DNA from specimens that failed classified as core if they were monophyletic with three attempts at amplification were replaced with other bacteria consistently and primarily associated

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2372 with Apis and Bombus (major groups of corbiculate clustalw/) (Thompson et al.,1994).Weincluded bees). This analysis was performed by retrieving sequences representing the known microsporidian full-length and nearly full-length 16S rRNA parasites of bees and additional close matches in sequences of closely related taxa from GenBank. GenBank. A maximum-likelihood phylogenetic Representative sequences of bee-associated and tree was computed with PhyML (Guindon et al., other closely related taxa from previously published 2010) using a GTR G Imodeland500bootstrap studies of Bombus and Apis microbiota were used replicates. þ þ (Cox-Foster et al., 2007; Martinson et al., 2011; Koch and Schmid-Hempel, 2011a). All sequences were aligned via web submission to Infernal aligner of Quantification of bacterial abundance RDP 2.2 (Wang et al., 2007). The alignments were To determine differences in abundance of bacteria used to reconstruct maximum-likelihood trees with among individual bees, absolute copy numbers of bootstrap support via RAxML v7.4.2 (Stamatakis, 16S rRNA genes in each sample were assessed using 2006) in the CIPRES web portal (Miller et al., 2009). real-time qPCR. DNA samples were diluted to 1:10, A GTRgamma base substitution model was used and 1 ml was amplified in triplicate with 16S rRNA with 1000 iterations. Resulting Newick files were primers 27F and 355R (Castillo et al., 2006). visualized with Figtree 1.4 (http://tree.bio.ed.ac.uk/ Absolute gene copy numbers were determined with software/figtree/). a standard curve using previously described meth- The Lactobacillus tree was inferred by a different ods (Martinson et al., 2012). Primers and reaction route because of the complexity and size of the conditions are listed in Supplementary Table S1. genus and ambiguities within 16S rRNA gene alignments (Claesson et al., 2008). We used an alignment based on a secondary structure model Statistical analysis used in a previous study of host specificity between The total number of bacterial reads per sample was hymenopterans and Lactobacilli (McFrederick et al., rarefied to the smallest number of reads for a given 2013). We obtained the alignment from TreeBASE sample (n 2500). Statistical tests were performed (Morell, 1996), trimmed representative OTU using this¼ subset. sequences to 636 bp (50 bases were ambiguously Chao1 estimates of alpha diversity (number of aligned in initial attempts) and then added them to bacteria OTUs) were calculated in QIIME and were the alignment using the PyNAST tool within QIIME. used to examine differences in bacteria between the This full alignment was used to reconstruct max- three species and habitat by performing analysis of imum-likelihood trees and bin core associates as variance in JMP10 (SAS, Cary, NC, USA). We use the described above. All resulting trees are presented in number of OTUs as our measure of richness. Supplementary Figure S3. Sequences are deposited Rarefaction curves of the Chao1 estimate were in the Sequence Read Archive under BioProject ID constructed in QIIME at 10 subsamplings of every PRJNA217796. 100 reads and were used to verify adequate depth of sampling (Supplementary Figures S4a and b). We used a beta-dispersion test to examine differences in Crithidia and Nosema infection beta diversity between habitats and bee species Infection status with the eukaryotic parasites using a Bray–Curtis dissimilarity matrix (Anderson, Crithidia and Nosema was detected by PCRs 2006). This test determines whether the turnover of specific for either parasite using DNA extracted bacterial communities among individual bees dif- from guts as template. For Crithidia,apartofthe fered among groups (species or habitat). We used an ITS region was amplified using the Crithidia- NMDS (nonmetric multidimensional scaling) to specific primers CrithITS1f and CrithITS1r visualize differences in bacterial community com- (Schmid-Hempel and Tognazzo, 2010). To detect position among species and habitat and assessed infections with Nosema,weusedtheNosema- differences statistically using a PERMANOVA for specific primer pair SSUrRNA-f1 and SSUrRNA- Bray–Curtis distances in community composition r1b (Li et al.,2012)targetingthemicrosporidian among groups (Anderson, 2001). Factors in the small subunit rRNA gene. In addition, all samples PERMANOVA were bee species, habitat and their were screened with the N. bombi-specific primer interaction. All community composition analyses pair N.b.af and N.b.ar (Erler et al.,2012).Primers were done using the vegan package in R (Oksanen and reaction conditions are listed in et al., 2013). Supplementary Table S1. Positive amplification To determine whether core and non-core bacteria and product sizes were verified on a 1.5% agarose varied between habitat types or among bee species, a gel. Products were sequenced directly from positive generalized linear mixed model was applied using PCR reactions with the respective forward primer the lmer package in R (Pinheiro et al., 2013). The for confirmation. Nosema SSU sequences were also response variable was proportion of core bacteria sequenced with the reverse primer and assembled per bee. This parameter indicates how much of the in MacVector (12.7). Nosema sequences were total gut microbiota is composed of core bacterial aligned in ClustalW (http://www.genome.jp/tools/ groups. Bee species, habitat and their interaction

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2373 were fixed effects. Collection site was a random effect. We used a Gaussian distribution as residuals were normally distributed. For this and all following generalized linear mixed models, non-significant interactions and fixed effects were removed from the model (P40.05). To determine which core bacterial OTUs best characterized the gut microbiota com- munities as a function of bee species, habitat and pathogen infection status, we used indicator species analysis in labdsv package in R (Roberts, 2012). Indicator species values are based on how specific and widespread an OTU is within a particular group and are independent of the relative abundances of other bacteria (Dufreˆne and Legendre, 1997). A generalized linear mixed model was used to determine whether core and non-core gut microbiota differ in their association with infection by Crithidia and Nosema. Separate models were conducted for each pathogen. The presence or absence of Crithidia or Nosema was the response variable; therefore, a Figure 1 Results of multivariate dispersion test. Gut commu- binomial error distribution was used. Bee species, nities among B. bimaculatus are more variable than those among absolute number of 16S rRNA gene copies, richness B. impatiens and B. griseocollis. of core bacteria, richness of non-core bacteria and habitat were fixed effects. Site was a random effect. P 0.23; Figure 2b) or the interaction between ¼ Finally, we used a generalized linear mixed model habitat type and species (F2,93 1.29; P 0.22). to determine whether absolute number of 16S rRNA Comparing across bee species, the¼ gut microbiota¼ gene copies (an index of total bacterial counts per bee) of B. bimaculatus had the lowest proportion of core was influenced by gut community richness. The bacteria (0.61±0.07 s.e., Figure 3), followed by response variable was the number of 16S rRNA genes, B. impatiens (0.74±0.09 s.e., Figure 3) with B. and it was log-transformed tocontrolforoverdisper- griseocollis having the highest (0.98±0.09 s.e.; sion. As the residuals were normally distributed Figure 3). For B. bimaculatus, the proportion of core following transformation, we used a Gaussian error bacteria was lower in semi-natural than agricultural structure. Fixed effects were richness of core bacteria, habitats (0.24±0.09 s.e. vs 0.61±0.07 s.e.), but this richness of non-core bacteria, bee species and habitat. proportion did not vary for the other two species (Figure 3). Indicator species analysis restricted to core OTUs Results revealed that different bacterial OTUs dominate gut communities of the three Bombus species. The Overall community-level measures differed among OTUs classified as core in the B. bimaculatus gut Bombus species but not among habitat types. microbiota are best characterized by two Acetobac- B. bimaculatus and B. impatiens (45.0±4.0 s.e., teraceae (Order Rhodospirillales), corresponding to 47.1±3.6 s.e., respectively) had a richer gut micro- taxa previously referred to as ‘Alpha-2.2’ (Mohr and biota than B. griseocollis (33.6±3.8, Supplementary Tebbe, 2006; Cox-Foster et al., 2007; Martinson Figure S4a). Rarefaction curves of Chao1 alpha et al., 2011, Table 1, Supplementary Table S5). We diversity estimates showed that 2500 reads reflected categorized Alpha-2.2 OTUs as core bacteria, a saturated sampling depth (Supplementary Figure because they have been found repeatedly in Apis S4b). Chao1 alpha diversity estimates for Bombus and Bombus guts in previous studies. However, gut microbiota did not differ between semi-natural unlike other core bacteria, the Alpha 2.2 OTUs are and agricultural habitats for any species also common in floral products, including nectar (Supplementary Table S6). B. griseocollis had lower and pollen. Thus, the finding that they are char- alpha diversity than the other two bee species acteristic of B. bimaculatus is consistent with the (Supplementary Table S6). Beta diversity analysis conclusion that this host species is dominated by indicated that variability in gut microbial commu- gut bacteria acquired from environmental sources nities among B. bimaculatus samples was higher (Table 1). In contrast, B. impatiens and B. griseo- than the other two species (beta-dispersion test: collis are best characterized by core bacteria that F2,93 17.52; Po0.001; Figure 1). There was no have only been sampled from the guts of Apis and difference¼ in beta diversity between habitat types Bombus (Table 1, Martinson et al., 2011). (F1,97 0.62; P 0.43). Similarly, microbial commu- We found that gut communities differed between nity composition¼ ¼ differed among species (PERMA- individuals infected and uninfected by pathogens. 2 NOVA R 0.27; F2,93 17.7; Po0.001; Figure 2a) Ten of the 99 specimens we analyzed were infected but was not¼ affected by¼ habitat types (F 1.29; with Crithidia (four impatiens, one griseocollis and 1,93 ¼

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2374

Figure 3 Estimates (means) from mixed model output showing the proportion of gut microbiota made up of core OTUs. Grey bars represent agricultural habitats while white bars represent semi- natural habitats. Error bars represent±s.e.m.

Nosema was found in 11 specimens (five impa- tiens, four griseocollis and two bimaculatus). We found no difference in gut microbial communities between hosts with and without Nosema. Interest- ingly, based on a maximum-likelihood tree using accessions from GenBank and the ones found in this study, most infections are from a novel strain of Nosema not previously represented in GenBank (Supplementary Figure S5), with an additional three individuals infected with N. ceranae. None of the specimens contained N. bombi. Indicator species analysis showed that one OTU (Snodgrassella no. 72) characterized Bombus infected by Nosema. Mixed model analysis of Nosema infection did not Figure 2 Results of NMDS (nonmetric multidimensional scal- reveal a significant relationship for Bombus species, ing) performed on bacterial gut communities, using individual habitat type or proportion of core bacteria. Bombus specimens as the replicates. Effects of (a) Bombus species The absolute number of gut bacteria per bee, and (b) habitat type. measured as the number of bacterial rRNA gene copies, varied among Bombus species. Results from the generalized linear mixed model indicated that the five bimaculatus). Indicator species analysis demon- log copy number was highest in B. impatiens strated that gut microbiota of individuals infected by (14.37±0.21 s.e., Figure 5), followed by B. griseocollis Crithidia were characterized primarily by the (14.3±0.35 s.e., Figure 5) and B. bimaculatus ‘Alpha-2.2’-related OTUs (Table 2), and those not (11.7±0.45 s.e., Figure 5). The average 30-fold differ- infected were characterized by a Gillamella OTU ence in size of gut bacterial communities between (no. 424; Table 2). Furthermore, logistic regression B. impatiens and B. bimaculatus does not correspond analysis revealed that Crithidia infection increased to a difference in worker body size, which is similar for as the richness of non-core bacteria increased the three species. The number of bacterial rRNA gene (b 0.13±0.06 s.e., P 0.024, Figure 4). Bombus copies was negatively associated with richness of species,¼ number of 16S¼ rRNA gene copies, richness non-core OTUs and positively associated with richness of core bacteria and habitat were not significant and of core OTUs (Figure 6). Habitat type was not were removed from the analysis. significant and was removed from the final model.

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2375 Table 1 Results of indicator species analysis showing the association between OTUs and host species

Indicator OTUs Association (GenBank) B. bimaculatus B. griseocollis Ind Val B. impatiens Ind Val P-Value Ind Val

Acetobacteraceae, Alpha-2.2 (418) Floral 0.56 — — 0.003 Acetobacteraceae, Alpha-2.2 (384) Floral 0.44 — — 0.002 Gilliamella (22) Apis mellifera — 0.67 — 0.001 Snodgrassella (167) Bombus bimaculatus — 0.64 — 0.001 Unclass Gammaproteobacteria (258) Bombus terrestris 0.61 — 0.001 Lactobacillaceae (362) Bombus terrestris — 0.49 — 0.001 Lactobacillaceae (4) Bombus terrestris — 0.45 — 0.001 Gilliamella (424) Bombus terrestris — — 0.87 0.001 Gilliamella (342) Bombus impatiens — — 0.77 0.001 Unclass (366) Bombus terrestris — — 0.74 0.001 Snodgrassella (16) Bombus vagans — — 0.48 0.001 Snodgrassella (394) Apis mellifera — — 0.47 0.002 Snodgrassella (447) Bombus vagans — — 0.46 0.001 Snodgrassella (72) Apis mellifera — — 0.45 0.001 Gilliamella (105) Bombus impatiens — — 0.35 0.003

In the indicator OTU column, names represent taxonomic affiliation of OTU based on GenBank BLAST search. Numbers in parentheses are study- specific OTU identifiers. Results are from only those OTUs classified as core based on GenBank searches and having indicator values 40.25 and significant at P o0.05.

Table 2 Results of indicator species analysis showing the association of OTUs with the presence or absence of Crithidia and Nosema pathogens

Pathogen Indicator OTUs Association (GenBank) Pathogen Absent Ind Val Pathogen Present Ind Val P-Value

Crithidia Gilliamella (424) Bombus terrestris 0.70 — 0.040 Crithidia Acetobacteraceae, Alpha-2.2 (418) Pollen — 0.60 0.020 Crithidia Acetobacteraceae, Alpha-2.2 (384) Floral — 0.54 0.002 Crithidia Acetobacteraceae. Alpha-2.2 (355) Apis dorsata — 0.45 0.001 Crithidia Acetobacteraceae. Alpha-2.2 (62) Apis dorsata — 0.36 0.001 Crithidia Acetobacteraceae, Alpha-2.2 (267) Pollen — 0.27 0.013 Nosema Snodgrassella (72) Apis mellifera — 0.50 0.027

In the indicator OTUs column, names represent taxonomic affiliations of OTU based on GenBank BLAST searches. Numbers in parentheses are study-specific OTU identifiers. Results are based on OTUs with indicator values 40.25 and significant at P o0.05.

Discussion deeply, resulting in higher richness estimates due to sampling effects. We used deep sequencing to characterize the Although overall microbiota richness has been composition of gut communities of three Bombus assumed to be important in microbial gut function, species in two habitat types and to test for associa- few studies have differentiated core from non-core tions between core and non-core gut bacteria, bacteria. Experimental work in locusts demon- pathogen infection and overall bacteria abundance. strated that increasing diversity of gut bacteria We defined core bacteria as OTUs corresponding to reduces susceptibility to a pathogen (Dillon et al., species primarily associated with Apis and Bombus 2005). In contrast, Koch et al. (2012) found that hosts in previous studies (Martinson et al., 2011; microbiota richness at the colony level was posi- Moran et al., 2012). OTU richness of core bacteria tively associated with Crithidia infection in bumble had a positive association with absolute numbers of bees. Our work may reconcile these seemingly bacteria. In contrast, richness of non-core bacteria disparate results, as richness of core taxa may be was negatively associated with the number of associated with healthy hosts while richness of non- bacteria per bee and positively associated with core taxa may represent dysbiosis. Crithidia presence. Non-core bacterial taxa may be We found that specific OTUs varied in their less adapted to the bee gut environment and thus association with pathogen infection. In particular, less able to proliferate there, with the result that an OTU corresponding to Gilliamella was positively Bombus individuals with a high proportion of non- associated with bees uninfected by Crithidia. In core bacteria may have lower numbers of gut honey bees, Gilliamella produces a biofilm on the bacteria. However, this mechanistic explanation ileum wall (Martinson et al., 2012; Kwong and remains untested. Also, in hosts with lower propor- Moran, 2013) and may provide a barrier to attach- tions of core bacteria and lower overall numbers of ment or entry of gut pathogens such as Crithidia. bacteria, non-core bacteria will be sampled more Further, Gilliamella had a weak, negative

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2376

Figure 4 Results of logistic regression analysis. X axis represents Figure 6 Results of mixed model analysis showing that the the richness of non-core gut microbiota. Y axis represents the richness of non-core (closed circles and solid line) bacteria OTUs probability of infection by Crithidia. Different size points is negatively associated with total number of 16S rRNA gene represent the number of samples at a given coordinate. copies per bee specimen, whereas richness of core (open circles and dashed line) bacteria OTUs is positively associated.

with Nosema infection. We note that our study cannot determine whether associations are causal. In particular, positive associations could result if transmission occurs through common routes, causing both a gut bacterium and pathogen to be transferred together. Finally, five other core bacterial OTUs, previously referred to as ‘Alpha-2.2’ within the family Acetobacteraceae, had a positive associa- tions with Crithidia infection. Although we categor- ized this group as core based on repeated retrieval from Apis and Bombus (Martinson et al., 2011; Moran et al., 2012), these bacteria are also common in nectar (Jojima et al., 2004) and solitary bees that otherwise lack core bacteria found in Apis and Bombus (Martinson et al., 2011; Koch et al., 2013). Therefore, these Acetobacteraceae OTUs may be repeatedly introduced from nectar, as suggested by studies on honey bees (Vojvodic et al., 2013). We found the largest differences in microbial communities among the three Bombus species. The Figure 5 Means of natural log-transformed numbers of bacterial most dramatic difference involved B. bimaculatus, 16S rRNA gene copies per specimen. Error bars represent±s.e.m. which had the highest variability among individuals and the lowest proportion and richness of core bacteria. In indicator species analysis, B. bimacula- tus was characterized only by the Acetobacteraceae association with Crithidia infection in B. terrestris in OTUs, which are also found in nectar (Jojima et al., Europe (Koch and Schmid-Hempel, 2011a). The 2004; Vojvodic et al., 2013). The distinctness of results presented here provide further evidence B. bimaculatus communities does not reflect phylo- that Gilliamella may confer protective benefits. genetic distance, as it is in the same subgenus as B. We found no evidence of a protective association impatiens (Pyrobombus) while B. griseocollis is in of gut bacteria against infection by Nosema, con- the subgenus Cullumanobombus. Samples of gut sistent with previous studies (Koch and Schmid- microbiota from multiple Bombus species coupled Hempel, 2011a). In fact, one OTU of the core with host traits and local ecological conditions are bacterium Snodgrassella had a positive association needed to uncover mechanisms that drive variation

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2377 within and among species. In A. mellifera workers, by an NSF-Dimensions in Biodiversity Grant No. 1046153 the proportion of core bacteria in individual guts to NAM (PI), J Evans (Co-PI) and RW (Senior Investigator) ranges from 0.95 to 40.99 (Moran et al., 2012), and Swiss National Science Foundation Grants 140157 with one species having similar (B. griseocollis and 147881 to HK. The complete data set used for this mean 0.96) and the other two Bombus species manuscript is available online through Dryad Digital Repository (doi: 10.5061/dryad.3k019). having¼ lower (B. bimaculatus mean 0.48, B. impa- tiens mean 0.76) representation of¼ core taxa, as compared with¼ A. mellifera. Potentially, these differences among species reflect differences in gut References physiology or morphology, or differences in colony life cycle, that impact transmission between hosts. Altschul SF, Madden TL, Scha¨ffer AA, Zhang J, Zhang Z, Despite the fact that bees were sampled from Miller W et al. (1997). Gapped BLAST and PSI-BLAST: farms with high agro-chemical use (Pettis et al., a new generation of protein database search programs. 2013), we found little effect of habitat type on any Nucleic Acids Res 25: 3389–3402. metric of gut microbial communities. High fungicide Anderson KE, Russell JA, Moreau CS, Kautz S, Sullam KE, Hu YI et al. (2012). Highly similar microbial commu- applications are reported to increase risk of Nosema nities are shared among related and trophically similar infection in honey bees (Pettis et al., 2013). How- ant species. Mol Ecol 21: 2282–2296. ever, the only habitat effect evident in our study was Anderson MJ. (2006). Distance-based tests for homogeneity that B. bimaculatus had a greater proportion of core of multivariate dispersions. Biometrics 62: 245–253. bacteria when collected from farms. We may have Anderson MJ. (2001). A new method for non-parametric failed to detect a larger effect of habitat on gut multivariate analysis of variance. Austral Ecol 26:32–46. communities, because cultivated cranberry is a Berg RD. (1996). The indigenous gastrointestinal micro- temporary resource with bloom lasting for approxi- flora. Trends Microbiol 4: 430–435. mately 6 weeks, whereas the Bombus species we Brown MJF, Schmid-Hempel R, Schmid-Hempel P. (2003). studied are active for up to 6 months per year. Strong context-dependent virulence in a host-parasite system: reconciling genetic evidence with theory. Potentially, these core gut bacteria are evolutionarily J Anim Ecol 72: 994–1002. conserved in host lineages and little affected by Cameron SA, Lozier JD, Strange JP, Koch JB, Cordes N, local environment. In a study of 35 Bombus species Solter LF et al. (2011). Patterns of widespread decline collected world-wide, Koch et al. (2013) found that in North American bumble bees. Proc Natl Acad Sci specific strains of core bacteria in the Snodgrassella USA 108: 662–667. group were consistently associated with particular Caporaso JG, Bittinger K, Bushman FD, DeSantis TZ, phylogenetic groups within Bombus, independently Andersen GL, Knight R. (2010). PyNAST: a flexible of geographic location. However, that study exam- tool for aligning sequences to a template alignment. ined only presence/absence and did not exclude an Bioinformatics 26: 266–267. effect of environmental conditions on relative or Cariveau DP, Williams NM, Benjamin FE, Winfree R. (2013). Response diversity to land use occurs but does absolute abundance of particular bacterial strains. not consistently stabilise ecosystem services provided The health of wild Bombus species is important to by native pollinators. Ecol Lett 16: 903–911. understand as they are among the most important Castillo M, Martin-Orue SM, Manzanilla EG, Badiola I, pollinators for many native plants as well as a Martin M, Gasa J. (2006). Quantification of total number of crops (Hegland and Totland, 2008; bacteria, enterobacteria and lactobacilli populations Garibaldi et al., 2013). For example, native bees are in pig digesta by real-time PCR. Vet Microbiol 114: important pollinators at the cranberry farms studied 165–170. here, as Bombus provide470% of total pollination Claesson MJ, van Sinderen D, O’Toole PW. (2008). from native, wild bees (Cariveau et al., 2013). Lactobacillus phylogenomics - towards a reclassifi- Several Bombus species have recently undergone cation of the genus. Int J Syst Evol Microbiol 58: 2945–2954. steep population declines, with pathogen infection Cordes N, Huang W-F, Strange JP, Cameron SA, Griswold TL, being a potential cause (Otterstatter and Thomson, Lozier JD et al. (2012). Interspecific geographic 2008; Cameron et al., 2011). Our results suggest that distribution and variation of the pathogens Nosema the community structure of gut microbiota may be bombi and Crithidia species in United States bumble important factors in, or indicators for, the health of bee populations. J Invertebr Pathol 109: 209–216. wild Bombus. Cox-Foster DL, Conlan S, Holmes EC, Palacios G, Evans JD, Moran NA et al. (2007). A metagenomic survey of microbes in honey bee colony collapse disorder. Conflict of Interest Science 318: 283–287. Di Prisco G, Cavaliere V, Annoscia D, Varricchio P, Caprio E, The authors declare no conflict of interest. Nazzi F et al. (2013). Neonicotinoid clothianidin adversely affects insect immunity and promotes Acknowledgements replication of a viral pathogen in honey bees. Proc Natl Acad Sci USA 110: 18466–18471. We thank L Fenner and Z Portman for help in collecting Dillon RJ, Vennard CT, Buckling A, Charnley AK. (2005). Bombus, J Klein for DNA extractions and P Degnan for Diversity of locust gut bacteria protects against assistance with bioinformatics. This research was funded pathogen invasion. Ecol Lett 8: 1291–1298.

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2378 Drummond AJ, Ashton B, Buxton S, Cheung M, Cooper A, intestinal parasite. Proc Natl Acad Sci USA 108: Duran C et al. (2011). Geneious v5.4 website. Available 19288–19292. fromhttp://www.geneious.comAccessed 5 Jan 2012. Koch H, Schmid-Hempel P. (2011b). Bacterial commu- Dufreˆne M, Legendre P. (1997). Species assemblages and nities in Central European : low diversity indicator species: the need for a flexible asymmetrical and high specificity. Microb Ecol 62: 121–133. approach. Ecol Monogr 67: 345–366. Koch H, Schmid-Hempel P. (2012). Gut microbiota instead Engel P, Moran NA. (2013). The gut microbiota of insects— of host genotype drive the specificity in the interaction diversity in structure and function. FEMS Microbiol of a natural host-parasite system. Ecol Lett 15: Rev 37: 699–735. 1095–1103. Engel P, Martinson VG, Moran NA. (2012). Functional Koch H, Abrol DP, Li J, Schmid-Hempel P. (2013). diversity within the simple gut microbiota of the Diversity and evolutionary patterns of bacterial gut honey bee. Proc Natl Acad Sci USA 109: 11002–11007. associates of corbiculate bees. Mol Ecol 22: 2028–2044. Erler S, Lommatzsch S, Lattorff HMG. (2012). Comparative Koch H, Cisarovsky G, Schmid-Hempel P. (2012). analysis of detection limits and specificity of Ecological effects on gut bacterial communities in molecular diagnostic markers for three pathogens wild colonies. J Anim Ecol 81: 1202–1210. (Microsporidia, Nosema spp.) in the key pollinators Kwong WK, Moran NA. (2013). Cultivation and characte- Apis mellifera and Bombus terrestris. Parasitol Res rization of the gut symbionts of honey bees and 110: 1403–1410. bumble bees: Snodgrassella alvi gen. nov., sp. nov., a Frank DN. (2009). BARCRAWL and BARTAB: software member of the Neisseriaceae family of the Beta- tools for the design and implementation of barcoded proteobacteria; and Gilliamella apicola gen. nov., sp. primers for highly multiplexed DNA sequencing. nov., a member of Orbaceae fam. nov., Orbales ord. BMC Bioinformatics 10: 362. nov., a sister taxon to the Enterobacteriales order of the Frank DN, St Amand AL, Feldman RA, Boedeker EC, Gammaproteobacteria. Int J Syst Evol Microbiol 63: Harpaz N, Pace NR. (2007). Molecular-phylogenetic 2008–2018. characterization of microbial community imbalances Li J, Chen W, Wu J, Peng W, An JD, Schmid-Hempel P et al. in human inflammatory bowel diseases. Proc Natl (2012). Diversity of Nosema associated with bumble- Acad Sci USA 104: 13780–13785. bees (Bombus spp.) from China. Int J Parasitol 42: Garibaldi LA, Steffan-Dewenter I, Winfree R, Aizen MA, 49–61. Bommarco R, Cunningham SA et al. (2013). Wild Martinson VG, Danforth BN, Minckley RL, Rueppell O, pollinators enhance fruit set of crops regardless of Tingek S, Moran NA. (2011). A simple and distinctive honey bee abundance. Science 339: 1608–1611. microbiota associated with honey bees and bumble Goulson D, Lye GC, Darvill B. (2008). Decline and bees. Mol Ecol 20: 619–628. conservation of bumble bees. Annu Rev Entomol 53: Martinson VG, Moy J, Moran NA. (2012). Establishment 191–208. of characteristic gut bacteria during development of Grixti JC, Wong LT, Cameron SA, Favret C. (2009). Decline the honeybee worker. Appl Environ Microbiol. 78: of bumble bees (Bombus) in the North American Midwest. Biol Conserv 142: 75–84. 2830–2840. Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, McFrederick QS, Cannone JJ, Gutell RR, Kellner K, Gascuel O. (2010). New algorithms and methods to Plowes RM, Mueller UG. (2013). Specificity between estimate maximum-likelihood phylogenies: assessing lactobacilli and hymenopteran hosts is the exception the performance of PhyML 3.0. Syst Biol 59: 307–321. rather than the rule. Appl Environ Microbiol 79: Hamdi C, Balloi A, Essanaa J, Crotti E, Gonella E, Raddadi N 1803–1812. et al. (2011). Gut microbiome dysbiosis and honeybee McFrederick QS, Wcislo WT, Taylor DR, Ishak HD, Dowd SE, health. J Appl Entomol 135:524–533. Mueller UG. (2012). Environment or kin: whence do bees Hebert PDN, Penton EH, Burns JM, Janzen DH, Hallwachs W. obtain acidophilic bacteria? Mol Ecol 21:1754–1768. (2004). Ten species in one: DNA barcoding reveals Miller MA, Holder MT, Vos R, Midford PE, Liebowitz T, cryptic species in the neotropical skipper butterfly Chan L et al. (eds). (2009). The CIPRES Portals. CIPRES. Astraptes fulgerator. Proc Natl Acad Sci USA 101: URL http://www.phylo.org/sub_sections/portal. 14812–14817. Mohr KI, Tebbe CC. (2006). Diversity and phylotype Hegland SJ, Totland Ø. (2008). Is the magnitude of pollen consistency of bacteria in the guts of three bee species limitation in a plant community affected by pollinator (Apoidea) at an oilseed rape field. Environ Microbiol 8: visitation and plant species specialisation levels? 258–272. Oikos 117: 883–891. Moran NA, Hansen AK, Powell JE, Sabree ZL. (2012). Huang S, Zhang H. (2013). The impact of environmental Distinctive gut microbiota of honey bees assessed heterogeneity and life stage on the hindgut microbiota using deep sampling from individual worker bees. of Holotrichia parallela larvae (Coleoptera: Scarabaei- PLoS One 7: e36393. dae). PLoS One 8: e57169. Morell V. (1996). TreeBASE: The roots of phylogeny. Jojima Y, Mihara Y, Suzuki S, Yokozeki K, Yamanaka S, Science 273: 569. Fodu R. (2004). Saccharibacter floricola gen. nov., sp. Oksanen J, Blanchet FG, Kindt R, Legendre P, Minchin PR, nov., a novel osmophilic acetic acid bacterium O’Hara RB et al. (2013). Vegan: community ecology isolated from pollen. Int J Syst Evol Microbiol 54: package. R package version 2.0-7. 2263–2267. Otterstatter MC, Thomson JD. (2008). Does pathogen Kikuchi Y, Hayatsu M, Hosokawa T, Nagayama A, Tago K, spillover from commercially reared bumble bees Fukatsu T. (2012). Symbiont-mediated insecticide threaten wild pollinators? PLoS One 3: e2771. resistance. Proc Natl Acad Sci USA 109: 8618–8622. Pettis JS, Lichtenberg AM, Andree M, Stitzinger J, Rose R, Koch H, Schmid-Hempel P. (2011a). Socially transmitted vanEngelsdorp D. (2013). Crop pollination exposes gut microbiota protect bumble bees against an honey bees to pesticides which alters their

The ISME Journal Variation in Bombus gut microbiota DP Cariveau et al 2379 susceptibility to the gut pathogen Nosema ceranae. apterus (Hemiptera, Pyrrhocoridae). Mol Ecol 21: PLoS One 8: e70182. 6134–6151. Pereira HM, Leadley PW, Proenca V, Alkemade R, Thompson JD, Higgins DG, Gibson TJ. (1994). CLUSTAL Scharlemann JPW, Fernandez-Manjarres JF et al. W: improving the sensitivity of progressive multiple (2010). Scenarios for global biodiversity in the 21st sequence alignment through sequence weighting, century. Science 330: 1496–1501. position-specific gap penalties and weight matrix Pettis JS, Lichtenberg EM, Andree M, Stitzinger J, Rose R, choice. Nucleic Acids Res 22: 4673–4680. vanEngelsdorp D. (2013). Crop pollination exposes Tyson GW, Chapman J, Hugenholtz P, Allen EE, Ram RJ, honey bees to pesticides which alters their suscept- Richardson PM et al. (2004). Community structure ibility to the gut pathogen Nosema ceranae. PLoS One and metabolism through reconstruction of microbial 8: e70182. genomes from the environment. Nature 428: 37–43. Pinheiro J, Bates D, DebRoy S, Sarkar D and the R Vojvodic S, Rehan SM, Anderson KE. (2013). Microbial Development Core Team. (2013). nlme: Linear and gut diversity of Africanized and European honey bee Nonlinear Mixed Effects Models. R Package version larval instars. PLoS One 8: e72106. 3.1-109. Wang Q, Garrity GM, Tiedje JM, Cole JR. (2007). Naı¨ve Ratnasingham S, Herbert PD. (2007). BOLD: The Barcode Bayesian classifier for rapid assignment of rRNA of Life Data System (www.barcodinglife.org). Mol Ecol sequences into the new bacterial . Appl Notes 7: 355–364. Environ Microbiol 73: 5261–5267. Ridley EV, Wong AC-N, Westmiller S, Douglas AE. (2012). Warnecke F, Luginbu¨ hl P, Ivanova N, Ghassemian M, Impact of the resident microbiota on the nutritional Richardson TH, Stege JT et al. (2007). Metagenomic phenotype of Drosophila melanogaster. PLoS One 7: and functional analysis of hindgut microbiota of e36765. a wood-feeding higher termite. Nature 450: 560–565. Roberts DW. (2012). labdsv: Ordination and Multivariate Whitehorn PR, O’Connor S, Wackers FL, Goulson D. Analysis for Ecology. R package version 1: 5–0. (2012). Neonicotinoid pesticide reduces bumble bee Sartor RB. (2008). Therapeutic correction of bacterial colony growth and queen production. Science 336: dysbiosis discovered by molecular techniques. 351–352. Proc Natl Acad Sci USA 105: 16413–16414. Williams PH, Osborne JL. (2009). Bumblebee vulnerability Schmid-Hempel R, Tognazzo M. (2010). Molecular and conservation world-wide. Apidologie 40: divergence defines two distinct lineages of Crithidia 367–387. bombi (Trypanosomatidae), parasites of bumblebees. Wong AC-N, Chaston JM, Douglas AE. (2013). The J Eukaryot Microbiol 57: 337–345. inconstant gut microbiota of Drosophila species Stamatakis A. (2006). RAxML-VI-HPC: maximum like- revealed by 16S rRNA gene analysis. ISME J 7: lihood-based phylogenetic analyses with thousands 1922–1932. of taxa and mixed models. Bioinformatics 22: Zouache K, Raharimalala FN, Raquin V, Tran-Van V, 2688–2690. Raveloson LHR, Ravelonandro P et al. (2010). Bacterial Sudakaran S, Salem H, Kost C, Kaltenpoth M. (2012). diversity of field-caught mosquitoes, Aedes albopictus Geographical and ecological stability of the symbiotic and Aedes aegypti, from different geographic regions mid-gut microbiota in European firebugs, Pyrrhocoris of Madagascar. FEMS Microbiol Ecol 75: 377–389.

Supplementary Information accompanies this paper on The ISME Journal website (http://www.nature.com/ismej)

The ISME Journal Legends for Supplementary Tables and Figures

SUPPLEMENTARY TABLES

Supplementary Table S1: Table of PCR primers and protocols

Supplementary Table S2: 454 barcoded primers used in this study

Supplementary Table S3: Summary of Pyrotag processing and read preprocessing steps.

Supplementary Table S4: Metadata mapping file used for Qiime processing and analysis of 454 reads

Supplementary Table S5: Binned taxonomies. Lineages represent increasing specificity for each operational taxonomic unit (OTU)

Supplementary Table S6: Summary table of statistical test results based on Chao 1 estimates

SUPPLEMENTARY FIGURES

Supplementary Figure S1: Map of collection sites. Filled circles represent active, conventional cranberry farms and open circles represent abandoned bogs

Supplementary Figure S2: Bombus sp. sample summary

Supplementary Figure S3: Phylogenetic trees used to designate core and non-core bacteria: a) Gammaproteobacteria: Gilliamella apicola, b) Betaproteobacteria: Snodgrassella alvi, c) Alphaproteobacteria overview, d) Lactobacillus(Firm-4, Firm-5)

Supplementary Figure S4: Chao 1 estimates comparing alpha diversity rarefaction curves by: a) species, b) habitat

Supplementary Figure S5: Phylogenetic tree of Nosema strains from present study and closely related taxa. Sequences from this study marked in red (novel Nosema phylotype), or blue (Nosema ceranae). Labels include Genbank accession numbers, specimen labels, and host species. Tree includes all putatively novel Nosema taxa found in Chinese Bombus by Li et al. (2012).

! Supplementary Figure S1

● ● ● ● ● ● ●● ● ●● ●

Supplemental Figure S5!

JN872255_Nosem

JN872253_Nosema_sp.A_Bombus

KF640596_TBR101_B.impatiens

JN872258_Nosema_sp.A_Bombus KF640605_TBR79_B.griseocollis

KF640603_TBR93_B.griseocollis

a_sp.

KF640598_TBR96_B. JN872266_Nosema_ceranaeKF640602_TBR44_B.griseocollis A_Bom KF640601_TBR69_B.impatiens

U26533_Nosema_ceranaeKF640594_TBR98_B.griseocollis

impatiens bus KF640595_TBR64_B.impatiens 99 53 JN872252_Nosema_sp.B_Bombus JN872239_Nosema_sp.B_Bombus 56 94 56 EU219086_ JN872259_Nosema_sp.B_Bombus KF640605_TBR82_B.bimaculatus 67 62

76 aculatus HQ399665_Nosema_sp._Pieris 99 96 KF640600_TBR91_B.griseocollism Nosema_thomsoni

AY 100 KF640599_TBR106_B.impatiens 008373

JN872279_Nosema_sp.C_Bombus _Nosema_bombi JN87222 D85501_Nosema_sp._Bombyx_mori 9_Nosema_sp.D_ JN8 KF640597_TBR7_B.bi Bombus 72293_Nosema JN87222 3_No _Vairimorpha_lymantria U11047_Nosema_vespula sem AF033316_Nosema_portugal a_sp.D_Bomb

_sp us .C_ AF033315 Bombus DQ272237_Vairimorpha_disparis U26534_Nosema_ap

is

0.02 Supplementary Table S1: Table of PCR primers and protocols Purpose Forward Reverse Cycling Protocol Chemistry Reference 94°C 5 min x1 94°C 30 sec PCR reactions were conducted in 20 60-50°C (-1°C/cycle) 30 sec x10 μL reactions of 0.8 U Taq DNA 72°C 1 min Polymerase (New England BioLabs), 16S QC 27F'-HT (5'-AGRGTTTGATYMTGGCTCAG-3') 1492R'-HT (5'-GGYTACCTTGTTACGACTT- 3') 400 μM dNTPs (5prime), 1x PCR (Tyson et al., 2004) 94°C 30 sec buffer (New England BioLabs), 0.5 50°C 30 sec x30 μM concentrations of each primer and 72°C 1 min ~50ng DNA. 72°C 10 min x1

94°C 4 min x1 94°C 30 sec PCR reactions were conducted in 20 57-47°C (-1°C/cycle) 30 sec x10 μL reactions of 0.8 U Taq DNA 72°C 1 min Polymerase (New England BioLabs), Cytochrome Oxidase I ID LepF1 (5'-ATTCAACCAATCATAAAGATAT-3') LepR1 (5'-TAAACTTCTGGATGTCCAAAAA-3') 94°C 30 sec 400 μM dNTPs (5prime), 1x PCR (Hebert et al., 2004) buffer (New England BioLabs), 0.5 47°C 30 sec x30 μM concentrations of each primer and 72°C 1 min ~50ng DNA. 72°C 10 min x1

95°C 5 min x1 Approximately 50 ng of each sample (5'cctatcccctgtgtgccttggcagtctcag TC (5'ccatctcatccctgcgtgtctccgactca gNNNNN was amplified in triplicate in 20 ul 338GF_TBF 1492R_TBRn 98°C 20 sec ACTCCTACGGGAGGCAGCAG-3') NNNTCTACGGYTACCTTGTTACGACTT-3') reactions with 0.4 U KAPA HiFi HotStart 16S 454 barcoding 62°C 15 sec x30 DNA Polymerase (Kappa Biosystems), 1x This study GC Buffer 0.3 μM each primer, 0.3 mM 72°C 30 min Sequences in italics are the Titanium adapters, the underlined text is the Lib-L key, ‘NNNNNNNN’ is a variable 8 base barcode segment, the 2 each dNTP (Kappa Biosystems), 2.5 mM base pair linkers are denoted in bold and the sequences in all caps are the universal primers 72°C 5 min x1 Mg2+, and 1M betaine (Sigma).

95°C 10 min x1

95°C 15 sec PCR reactions were conducted in 10 65-60°C (-1°C/cycle) 15 sec x5 μL reactions of 5 μl 2x Roche Fast Start 16S quantitative PCR 27F (5’-AGAGTTTGATCCTGGCTCAG-3’) 355R (5’-CTGCTGCCTCCCGTAGGAGT-3’) 68°C 20 sec SYBR Master mix, 300 nM (Castillo et al., 2006) 95°C 15 sec concentrations of each primer and ~50ng DNA. 60°C 15 sec x35 68°C 20 sec

95˚C 5 min x1 PCR reactions were conducted in 12.5 μL 95˚C 1 min reactions of 0.5 U Taq DNA Polymerase (New England BioLabs), 250 μM dNTPs Nosema SSU rRNA gene SSUrRNA-f1 (5‘-CACCAGGTTGATTCTGCC-3‘) SSUrRNA-r1b (5‘-TGTTCGTCCAGTCAGGGTCGTCA-3‘) 64˚C 30 sec x40 (5prime), 1x PCR buffer (New England (Li et al. 2012) 72˚C 30 sec BioLabs), 0.5 μM concentrations of each 72˚C 5 min x1 primer and 1.25µl template.

95˚C 5 min x1 PCR reactions were conducted in 12.5 μL 95˚C 30 sec reactions of 0.5 U Taq DNA Polymerase Nosema bombi SSU rRNA (New England BioLabs), 250 μM dNTPs N.b.af (5‘-TGCGGCTTAATTTGACTC-3‘) N.b.ar (5‘-GGGTAATGACATACAAACAAAC-3‘) 55˚C 30 sec x40 (Erler et al. 2012) gene (5prime), 1x PCR buffer (New England 72˚C 30 sec BioLabs), 0.5 μM concentrations of each 72˚C 5 min x1 primer and 1.25µl template.

95˚C 5 min x1 PCR reactions were conducted in 12.5 μL 95˚C 30 sec reactions of 0.5 U Taq DNA Polymerase (New England BioLabs), 250 μM dNTPs (Schmid-Hempel & Crithidia ITS CrithITS1f (5‘-GGAAACCACGGAATCACATAGACC-3‘) CrithITS1r (5‘-AGGAAGCCAAGTCATCCATCGC-3‘) 55˚C 30 sec x40 (5prime), 1x PCR buffer (New England Tognazzo 2010) 72˚C 1 min BioLabs), 0.5 μM concentrations of each 72˚C 5 min x1 primer and 1.25µl template. Supplementary Table S2: 454 barcoded primers used in this study Common Forward 338GF_TBF 5'- cctatcccctgtgtgccttggcagtctcagTCACTCCTACGGGAGGCAGCAG -3

Barcoded reverse TBR1 5' ccatctcatccctgcgtgtctccgactcag ACGTACGT TC tacggYtaccttgttacgactt 3' TBR2 5' ccatctcatccctgcgtgtctccgactcag TATACGTC TC tacggYtaccttgttacgactt 3' TBR3 5' ccatctcatccctgcgtgtctccgactcag TACAGTAC TC tacggYtaccttgttacgactt 3' TBR4 5' ccatctcatccctgcgtgtctccgactcag TAGTACAC TC tacggYtaccttgttacgactt 3' TBR5 5' ccatctcatccctgcgtgtctccgactcag ACTACGAC TC tacggYtaccttgttacgactt 3' TBR6 5' ccatctcatccctgcgtgtctccgactcag ACGTAGAC TC tacggYtaccttgttacgactt 3' TBR7 5' ccatctcatccctgcgtgtctccgactcag AGTAGTAC TC tacggYtaccttgttacgactt 3' TBR8 5' ccatctcatccctgcgtgtctccgactcag TATAGACG TC tacggYtaccttgttacgactt 3' TBR9 5' ccatctcatccctgcgtgtctccgactcag TACTACTG TC tacggYtaccttgttacgactt 3' TBR10 5' ccatctcatccctgcgtgtctccgactcag TACTCTAG TC tacggYtaccttgttacgactt 3' TBR11 5' ccatctcatccctgcgtgtctccgactcag TACGTATG TC tacggYtaccttgttacgactt 3' TBR12 5' ccatctcatccctgcgtgtctccgactcag TCTACTAG TC tacggYtaccttgttacgactt 3' TBR13 5' ccatctcatccctgcgtgtctccgactcag TGTGTACG TC tacggYtaccttgttacgactt 3' TBR14 5' ccatctcatccctgcgtgtctccgactcag TGACGTAG TC tacggYtaccttgttacgactt 3' TBR15 5' ccatctcatccctgcgtgtctccgactcag ACTATACG TC tacggYtaccttgttacgactt 3' TBR16 5' ccatctcatccctgcgtgtctccgactcag ACGACTCG TC tacggYtaccttgttacgactt 3' TBR17 5' ccatctcatccctgcgtgtctccgactcag TATCGCGT TC tacggYtaccttgttacgactt 3' TBR18 5' ccatctcatccctgcgtgtctccgactcag TACTGTCT TC tacggYtaccttgttacgactt 3' TBR19 5' ccatctcatccctgcgtgtctccgactcag TACACACT TC tacggYtaccttgttacgactt 3' TBR20 5' ccatctcatccctgcgtgtctccgactcag TACGACAT TC tacggYtaccttgttacgactt 3' TBR21 5' ccatctcatccctgcgtgtctccgactcag TAGTCTGT TC tacggYtaccttgttacgactt 3' TBR22 5' ccatctcatccctgcgtgtctccgactcag TAGCTAGT TC tacggYtaccttgttacgactt 3' TBR23 5' ccatctcatccctgcgtgtctccgactcag TCTAGTCT TC tacggYtaccttgttacgactt 3' TBR24 5' ccatctcatccctgcgtgtctccgactcag TCTCTACT TC tacggYtaccttgttacgactt 3' TBR25 5' ccatctcatccctgcgtgtctccgactcag TCTGACGT TC tacggYtaccttgttacgactt 3' TBR26 5' ccatctcatccctgcgtgtctccgactcag TCACGAGT TC tacggYtaccttgttacgactt 3' TBR27 5' ccatctcatccctgcgtgtctccgactcag TGTACTGT TC tacggYtaccttgttacgactt 3' TBR28 5' ccatctcatccctgcgtgtctccgactcag TGCTACGT TC tacggYtaccttgttacgactt 3' TBR29 5' ccatctcatccctgcgtgtctccgactcag ATACTCGT TC tacggYtaccttgttacgactt 3' TBR30 5' ccatctcatccctgcgtgtctccgactcag ACTACAGT TC tacggYtaccttgttacgactt 3' TBR31 5' ccatctcatccctgcgtgtctccgactcag ACGTCACT TC tacggYtaccttgttacgactt 3' TBR32 5' ccatctcatccctgcgtgtctccgactcag ACGTGTAT TC tacggYtaccttgttacgactt 3' TBR33 5' ccatctcatccctgcgtgtctccgactcag AGTATCGT TC tacggYtaccttgttacgactt 3' TBR34 5' ccatctcatccctgcgtgtctccgactcag CGTATACT TC tacggYtaccttgttacgactt 3' TBR35 5' ccatctcatccctgcgtgtctccgactcag CGAGTAGT TC tacggYtaccttgttacgactt 3' TBR36 5' ccatctcatccctgcgtgtctccgactcag TACTAGCA TC tacggYtaccttgttacgactt 3' TBR37 5' ccatctcatccctgcgtgtctccgactcag TACTGCGA TC tacggYtaccttgttacgactt 3' TBR38 5' ccatctcatccctgcgtgtctccgactcag TACGCTCA TC tacggYtaccttgttacgactt 3' TBR39 5' ccatctcatccctgcgtgtctccgactcag TAGCACGA TC tacggYtaccttgttacgactt 3' TBR40 5' ccatctcatccctgcgtgtctccgactcag TCTACACA TC tacggYtaccttgttacgactt 3' TBR41 5' ccatctcatccctgcgtgtctccgactcag TCTCTCGA TC tacggYtaccttgttacgactt 3' TBR42 5' ccatctcatccctgcgtgtctccgactcag TCTCAGTA TC tacggYtaccttgttacgactt 3' TBR43 5' ccatctcatccctgcgtgtctccgactcag TCAGTAGA TC tacggYtaccttgttacgactt 3' TBR44 5' ccatctcatccctgcgtgtctccgactcag TCAGACTA TC tacggYtaccttgttacgactt 3' TBR45 5' ccatctcatccctgcgtgtctccgactcag TGTAGCTA TC tacggYtaccttgttacgactt 3' TBR46 5' ccatctcatccctgcgtgtctccgactcag TGTCGACA TC tacggYtaccttgttacgactt 3' TBR47 5' ccatctcatccctgcgtgtctccgactcag TGTGCGTA TC tacggYtaccttgttacgactt 3' TBR48 5' ccatctcatccctgcgtgtctccgactcag ATACGTGA TC tacggYtaccttgttacgactt 3' TBR49 5' ccatctcatccctgcgtgtctccgactcag ATAGCGTA TC tacggYtaccttgttacgactt 3' TBR50 5' ccatctcatccctgcgtgtctccgactcag ATCGTCTA TC tacggYtaccttgttacgactt 3' TBR51 5' ccatctcatccctgcgtgtctccgactcag ACTAGTGA TC tacggYtaccttgttacgactt 3' TBR52 5' ccatctcatccctgcgtgtctccgactcag ACTCGCTA TC tacggYtaccttgttacgactt 3' TBR53 5' ccatctcatccctgcgtgtctccgactcag ACGTATCA TC tacggYtaccttgttacgactt 3' TBR54 5' ccatctcatccctgcgtgtctccgactcag ACGCTATA TC tacggYtaccttgttacgactt 3' TBR55 5' ccatctcatccctgcgtgtctccgactcag AGTACTCA TC tacggYtaccttgttacgactt 3' TBR56 5' ccatctcatccctgcgtgtctccgactcag AGTGTAGA TC tacggYtaccttgttacgactt 3' TBR57 5' ccatctcatccctgcgtgtctccgactcag CTACTGTA TC tacggYtaccttgttacgactt 3' TBR58 5' ccatctcatccctgcgtgtctccgactcag CACGACGA TC tacggYtaccttgttacgactt 3' TBR59 5' ccatctcatccctgcgtgtctccgactcag CGTACAGA TC tacggYtaccttgttacgactt 3' TBR60 5' ccatctcatccctgcgtgtctccgactcag CGTCTCTA TC tacggYtaccttgttacgactt 3' TBR61 5' ccatctcatccctgcgtgtctccgactcag CGACTACA TC tacggYtaccttgttacgactt 3' TBR62 5' ccatctcatccctgcgtgtctccgactcag GTCTACGA TC tacggYtaccttgttacgactt 3' TBR63 5' ccatctcatccctgcgtgtctccgactcag GACGAGTA TC tacggYtaccttgttacgactt 3' TBR64 5' ccatctcatccctgcgtgtctccgactcag TATATCGC TC tacggYtaccttgttacgactt 3' TBR65 5' ccatctcatccctgcgtgtctccgactcag TATCTGAC TC tacggYtaccttgttacgactt 3' TBR66 5' ccatctcatccctgcgtgtctccgactcag TATCGATC TC tacggYtaccttgttacgactt 3' TBR67 5' ccatctcatccctgcgtgtctccgactcag TATGACTC TC tacggYtaccttgttacgactt 3' TBR68 5' ccatctcatccctgcgtgtctccgactcag TATGCTAC TC tacggYtaccttgttacgactt 3' TBR69 5' ccatctcatccctgcgtgtctccgactcag TACTCATC TC tacggYtaccttgttacgactt 3' TBR70 5' ccatctcatccctgcgtgtctccgactcag TCTAGAGC TC tacggYtaccttgttacgactt 3' TBR71 5' ccatctcatccctgcgtgtctccgactcag TCTCGCAC TC tacggYtaccttgttacgactt 3' TBR72 5' ccatctcatccctgcgtgtctccgactcag TCTGTATC TC tacggYtaccttgttacgactt 3' TBR73 5' ccatctcatccctgcgtgtctccgactcag TCAGCGAC TC tacggYtaccttgttacgactt 3' TBR74 5' ccatctcatccctgcgtgtctccgactcag TGTCTAGC TC tacggYtaccttgttacgactt 3' TBR75 5' ccatctcatccctgcgtgtctccgactcag TGATACTC TC tacggYtaccttgttacgactt 3' TBR76 5' ccatctcatccctgcgtgtctccgactcag TGATCTAC TC tacggYtaccttgttacgactt 3' TBR77 5' ccatctcatccctgcgtgtctccgactcag TGAGTCAC TC tacggYtaccttgttacgactt 3' TBR78 5' ccatctcatccctgcgtgtctccgactcag TGCTAGAC TC tacggYtaccttgttacgactt 3' TBR79 5' ccatctcatccctgcgtgtctccgactcag ATACACTC TC tacggYtaccttgttacgactt 3' TBR80 5' ccatctcatccctgcgtgtctccgactcag ATAGTAGC TC tacggYtaccttgttacgactt 3' TBR81 5' ccatctcatccctgcgtgtctccgactcag ATCACTAC TC tacggYtaccttgttacgactt 3' TBR82 5' ccatctcatccctgcgtgtctccgactcag ACTATCTC TC tacggYtaccttgttacgactt 3' TBR83 5' ccatctcatccctgcgtgtctccgactcag ACGTCTGC TC tacggYtaccttgttacgactt 3' TBR84 5' ccatctcatccctgcgtgtctccgactcag ACGTGCTC TC tacggYtaccttgttacgactt 3' TBR85 5' ccatctcatccctgcgtgtctccgactcag ACGATAGC TC tacggYtaccttgttacgactt 3' TBR86 5' ccatctcatccctgcgtgtctccgactcag AGTGACGC TC tacggYtaccttgttacgactt 3' TBR87 5' ccatctcatccctgcgtgtctccgactcag AGAGTGTC TC tacggYtaccttgttacgactt 3' TBR88 5' ccatctcatccctgcgtgtctccgactcag CTAGATAC TC tacggYtaccttgttacgactt 3' TBR89 5' ccatctcatccctgcgtgtctccgactcag CTCTCGTC TC tacggYtaccttgttacgactt 3' TBR90 5' ccatctcatccctgcgtgtctccgactcag CGACAGTC TC tacggYtaccttgttacgactt 3' TBR91 5' ccatctcatccctgcgtgtctccgactcag CGCTGTAC TC tacggYtaccttgttacgactt 3' TBR92 5' ccatctcatccctgcgtgtctccgactcag GTCTATAC TC tacggYtaccttgttacgactt 3' TBR93 5' ccatctcatccctgcgtgtctccgactcag GCACGTAC TC tacggYtaccttgttacgactt 3' TBR94 5' ccatctcatccctgcgtgtctccgactcag TATCTCTG TC tacggYtaccttgttacgactt 3' TBR95 5' ccatctcatccctgcgtgtctccgactcag TACATCAG TC tacggYtaccttgttacgactt 3' TBR96 5' ccatctcatccctgcgtgtctccgactcag TAGTGATG TC tacggYtaccttgttacgactt 3' TBR97 5' ccatctcatccctgcgtgtctccgactcag TCTATGTG TC tacggYtaccttgttacgactt 3' TBR98 5' ccatctcatccctgcgtgtctccgactcag TCTGTCAG TC tacggYtaccttgttacgactt 3' TBR99 5' ccatctcatccctgcgtgtctccgactcag TCATACAG TC tacggYtaccttgttacgactt 3' TBR100 5' ccatctcatccctgcgtgtctccgactcag TCATCGTG TC tacggYtaccttgttacgactt 3' TBR101 5' ccatctcatccctgcgtgtctccgactcag TCACATCG TC tacggYtaccttgttacgactt 3' TBR102 5' ccatctcatccctgcgtgtctccgactcag TCAGTGCG TC tacggYtaccttgttacgactt 3' TBR103 5' ccatctcatccctgcgtgtctccgactcag TGTGATAG TC tacggYtaccttgttacgactt 3' TBR104 5' ccatctcatccctgcgtgtctccgactcag TGATAGCG TC tacggYtaccttgttacgactt 3' TBR105 5' ccatctcatccctgcgtgtctccgactcag TGACTATG TC tacggYtaccttgttacgactt 3' TBR106 5' ccatctcatccctgcgtgtctccgactcag ATATCACG TC tacggYtaccttgttacgactt 3' TBR107 5' ccatctcatccctgcgtgtctccgactcag ATACAGCG TC tacggYtaccttgttacgactt 3' TBR108 5' ccatctcatccctgcgtgtctccgactcag ATCTATCG TC tacggYtaccttgttacgactt 3' TBR109 5' ccatctcatccctgcgtgtctccgactcag ATCGACAG TC tacggYtaccttgttacgactt 3' TBR110 5' ccatctcatccctgcgtgtctccgactcag ACTGATCG TC tacggYtaccttgttacgactt 3' Supplementary Table S3: Summary of pyrotag processing and read preprocessing steps. Pyrotag processing summary

Combined reads step reads avg length raw reads 1 143,699 564 bp reads 1 passing quality 126,417 raw reads 2 1,027,453 674 bp reads 2 passing quality 673,791 combined quality reads 800,208 858 bp

7845.18 average reads per sample (206-19,738) final filtered reads (see step 792,520 4 below)

average final reads per 7769.8 sample (185-19,717)

OTUs step OTUs number of reads

initial OTUs 523 800208

filtered OTUs (post removal of singletons, 18S, chloroplast, chimeric 352 792520 sequences and alignment failures)

Read preprocessing steps performed in Qiime 1.3-1.4

.sff and qual files from 1/8 plate and full plate downloaded from step 1 Georgia Genomics Facility to Yale OMEGA cluster. Files decompressed

step 2 Qiime utilized to perform denoising of each plate. Resultant reads then concatenated

OTU clustering at 0.97, RDP taxonomy assigned at >0.8 (via RDP step 3 classifier 2.2), representative sequences aligned (with PyNAST), lanemask applied

Singletons, plastid, 18S, chimeras screened (via ChimeraSlayer from Reference: Haas BJ, Gevers D, Earl AM, Feldgarden M, Ward DV, Giannoukos G, et al.2011. Chimeric 16S rRNA sequence formation and step 4 Haas et al. 2011) and removed from representative sequences. OTU table produced, reads aligned , reconstruction of tree detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Research 21:494-504.

For alpha and beta analysis, Apis and K12 process controls were removed as was a single Bombus sample with low read coverage step 5 (TBR10) and table was subsampled without replacement at a depth of 2500 sequences

Resultant OTU table and representative sequences used to perform operations detailed in manuscript (e.g. Alpha and Beta diversity, BLAST based binning of OTUs to taxonomic groups, creation of abundance profiles, reconstruction of trees, statistical analysis etc.) Supplementary Table S4: Metadata mapping file used for Qiime processing and analysis of 454 reads #SampleID BarcodeSequence LinkerPrimerSequence Condition TBR1 ACGTACGT TCTACGGYTACCTTGTTACGACTT FARM TBR2 TATACGTC TCTACGGYTACCTTGTTACGACTT FARM TBR3 TACAGTAC TCTACGGYTACCTTGTTACGACTT FARM TBR4 TAGTACAC TCTACGGYTACCTTGTTACGACTT FARM TBR5 ACTACGAC TCTACGGYTACCTTGTTACGACTT FARM TBR6 ACGTAGAC TCTACGGYTACCTTGTTACGACTT FARM TBR7 AGTAGTAC TCTACGGYTACCTTGTTACGACTT BOG TBR8 TATAGACG TCTACGGYTACCTTGTTACGACTT BOG TBR9 TACTACTG TCTACGGYTACCTTGTTACGACTT FARM TBR10 TACTCTAG TCTACGGYTACCTTGTTACGACTT FARM TBR11 TACGTATG TCTACGGYTACCTTGTTACGACTT FARM TBR12 TCTACTAG TCTACGGYTACCTTGTTACGACTT FARM TBR13 TGTGTACG TCTACGGYTACCTTGTTACGACTT FARM TBR14 TGACGTAG TCTACGGYTACCTTGTTACGACTT FARM TBR15 ACTATACG TCTACGGYTACCTTGTTACGACTT BOG TBR16 ACGACTCG TCTACGGYTACCTTGTTACGACTT BOG TBR17 TATCGCGT TCTACGGYTACCTTGTTACGACTT FARM TBR18 TACTGTCT TCTACGGYTACCTTGTTACGACTT FARM TBR19 TACACACT TCTACGGYTACCTTGTTACGACTT FARM TBR20 TACGACAT TCTACGGYTACCTTGTTACGACTT FARM TBR21 TAGTCTGT TCTACGGYTACCTTGTTACGACTT FARM TBR22 TAGCTAGT TCTACGGYTACCTTGTTACGACTT FARM TBR23 TCTAGTCT TCTACGGYTACCTTGTTACGACTT BOG TBR24 TCTCTACT TCTACGGYTACCTTGTTACGACTT BOG TBR25 TCTGACGT TCTACGGYTACCTTGTTACGACTT FARM TBR26 TCACGAGT TCTACGGYTACCTTGTTACGACTT FARM TBR27 TGTACTGT TCTACGGYTACCTTGTTACGACTT FARM TBR28 TGCTACGT TCTACGGYTACCTTGTTACGACTT FARM TBR29 ATACTCGT TCTACGGYTACCTTGTTACGACTT FARM TBR30 ACTACAGT TCTACGGYTACCTTGTTACGACTT FARM TBR31 ACGTCACT TCTACGGYTACCTTGTTACGACTT BOG TBR32 ACGTGTAT TCTACGGYTACCTTGTTACGACTT BOG TBR33 AGTATCGT TCTACGGYTACCTTGTTACGACTT FARM TBR34 CGTATACT TCTACGGYTACCTTGTTACGACTT FARM TBR35 CGAGTAGT TCTACGGYTACCTTGTTACGACTT FARM TBR36 TACTAGCA TCTACGGYTACCTTGTTACGACTT FARM TBR37 TACTGCGA TCTACGGYTACCTTGTTACGACTT FARM TBR38 TACGCTCA TCTACGGYTACCTTGTTACGACTT FARM TBR39 TAGCACGA TCTACGGYTACCTTGTTACGACTT BOG TBR40 TCTACACA TCTACGGYTACCTTGTTACGACTT FARM TBR41 TCTCTCGA TCTACGGYTACCTTGTTACGACTT FARM TBR42 TCTCAGTA TCTACGGYTACCTTGTTACGACTT FARM TBR43 TCAGTAGA TCTACGGYTACCTTGTTACGACTT FARM TBR44 TCAGACTA TCTACGGYTACCTTGTTACGACTT FARM TBR45 TGTAGCTA TCTACGGYTACCTTGTTACGACTT FARM TBR46 TGTCGACA TCTACGGYTACCTTGTTACGACTT FARM TBR47 TGTGCGTA TCTACGGYTACCTTGTTACGACTT BOG TBR48 ATACGTGA TCTACGGYTACCTTGTTACGACTT BOG TBR49 ATAGCGTA TCTACGGYTACCTTGTTACGACTT FARM TBR50 ATCGTCTA TCTACGGYTACCTTGTTACGACTT FARM TBR51 ACTAGTGA TCTACGGYTACCTTGTTACGACTT FARM TBR52 ACTCGCTA TCTACGGYTACCTTGTTACGACTT FARM TBR53 ACGTATCA TCTACGGYTACCTTGTTACGACTT FARM TBR54 ACGCTATA TCTACGGYTACCTTGTTACGACTT FARM TBR55 AGTACTCA TCTACGGYTACCTTGTTACGACTT BOG TBR56 AGTGTAGA TCTACGGYTACCTTGTTACGACTT BOG TBR57 CTACTGTA TCTACGGYTACCTTGTTACGACTT FARM TBR58 CACGACGA TCTACGGYTACCTTGTTACGACTT FARM TBR59 CGTACAGA TCTACGGYTACCTTGTTACGACTT FARM TBR60 CGTCTCTA TCTACGGYTACCTTGTTACGACTT FARM TBR61 CGACTACA TCTACGGYTACCTTGTTACGACTT FARM TBR62 GTCTACGA TCTACGGYTACCTTGTTACGACTT FARM TBR63 GACGAGTA TCTACGGYTACCTTGTTACGACTT BOG TBR64 TATATCGC TCTACGGYTACCTTGTTACGACTT BOG TBR65 TATCTGAC TCTACGGYTACCTTGTTACGACTT FARM TBR66 TATCGATC TCTACGGYTACCTTGTTACGACTT FARM TBR67 TATGACTC TCTACGGYTACCTTGTTACGACTT FARM TBR68 TATGCTAC TCTACGGYTACCTTGTTACGACTT FARM TBR69 TACTCATC TCTACGGYTACCTTGTTACGACTT FARM TBR70 TCTAGAGC TCTACGGYTACCTTGTTACGACTT FARM TBR71 TCTCGCAC TCTACGGYTACCTTGTTACGACTT BOG TBR72 TCTGTATC TCTACGGYTACCTTGTTACGACTT BOG TBR73 TCAGCGAC TCTACGGYTACCTTGTTACGACTT BOG TBR74 TGTCTAGC TCTACGGYTACCTTGTTACGACTT BOG TBR75 TGATACTC TCTACGGYTACCTTGTTACGACTT BOG TBR76 TGATCTAC TCTACGGYTACCTTGTTACGACTT BOG TBR77 TGAGTCAC TCTACGGYTACCTTGTTACGACTT BOG TBR78 TGCTAGAC TCTACGGYTACCTTGTTACGACTT BOG TBR79 ATACACTC TCTACGGYTACCTTGTTACGACTT BOG TBR80 ATAGTAGC TCTACGGYTACCTTGTTACGACTT BOG TBR81 ATCACTAC TCTACGGYTACCTTGTTACGACTT BOG TBR82 ACTATCTC TCTACGGYTACCTTGTTACGACTT BOG TBR83 ACGTCTGC TCTACGGYTACCTTGTTACGACTT BOG TBR84 ACGTGCTC TCTACGGYTACCTTGTTACGACTT BOG TBR85 ACGATAGC TCTACGGYTACCTTGTTACGACTT BOG TBR86 AGTGACGC TCTACGGYTACCTTGTTACGACTT BOG TBR87 AGAGTGTC TCTACGGYTACCTTGTTACGACTT BOG TBR88 CTAGATAC TCTACGGYTACCTTGTTACGACTT BOG TBR89 CTCTCGTC TCTACGGYTACCTTGTTACGACTT BOG TBR90 CGACAGTC TCTACGGYTACCTTGTTACGACTT BOG TBR91 CGCTGTAC TCTACGGYTACCTTGTTACGACTT BOG TBR92 GTCTATAC TCTACGGYTACCTTGTTACGACTT BOG TBR93 GCACGTAC TCTACGGYTACCTTGTTACGACTT BOG TBR94 TATCTCTG TCTACGGYTACCTTGTTACGACTT BOG TBR95 TACATCAG TCTACGGYTACCTTGTTACGACTT BOG TBR96 TAGTGATG TCTACGGYTACCTTGTTACGACTT BOG TBR97 TCTATGTG TCTACGGYTACCTTGTTACGACTT BOG TBR98 TCTGTCAG TCTACGGYTACCTTGTTACGACTT BOG TBR99 TCATACAG TCTACGGYTACCTTGTTACGACTT BOG TBR100 TCATCGTG TCTACGGYTACCTTGTTACGACTT BOG TBR101 TCACATCG TCTACGGYTACCTTGTTACGACTT BOG TBR102 TCAGTGCG TCTACGGYTACCTTGTTACGACTT BOG TBR103 TGTGATAG TCTACGGYTACCTTGTTACGACTT BOG TBR104 TGATAGCG TCTACGGYTACCTTGTTACGACTT BOG TBR105 TGACTATG TCTACGGYTACCTTGTTACGACTT BOG TBR106 ATATCACG TCTACGGYTACCTTGTTACGACTT BOG TBR107 ATACAGCG TCTACGGYTACCTTGTTACGACTT BOG TBR108 ATCTATCG TCTACGGYTACCTTGTTACGACTT BOG TBR109 ATCGACAG TCTACGGYTACCTTGTTACGACTT NA TBR110 ACTGATCG TCTACGGYTACCTTGTTACGACTT NA Supplementary Table S4: Metadata mapping file used for Qiime processing and analysis of 454 reads Description bimaculatis_BU.2 bimaculatis_CU.1 bimaculatis_HA.1 bimaculatis_JJ.1 bimaculatis_MO.2 bimaculatis_WH.1 bimaculatis_NB1.7 bimaculatis_NB2.1 bimaculatis_BU.3 bimaculatis_CU.7 bimaculatis_HA.2 bimaculatis_JJ.2 bimaculatis_MO.3 bimaculatis_WH.2 bimaculatis_NB1.5 bimaculatis_NB2.2 bimaculatis_BU.4 bimaculatis_CU.3 bimaculatis_HA.3 bimaculatis_JJ.3 bimaculatis_MO.4 bimaculatis_WH.7 bimaculatis_NB1.4 bimaculatis_NB2.3 griseocollis_BU.2 griseocollis_CU.1 griseocollis_HA.1 griseocollis_JJ.1 griseocollis_MO.1 griseocollis_WH.1 griseocollis_NB1.1 griseocollis_NB2.4 griseocollis_BU.3 griseocollis_CU.2 griseocollis_HA.2 griseocollis_JJ.2 griseocollis_MO.2 griseocollis_WH.2 griseocollis_NB1.2 griseocollis_NB2.4 griseocollis_BU.6 griseocollis_CU.3 griseocollis_HA.3 griseocollis_JJ.3 griseocollis_MO.4 griseocollis_WH.3 griseocollis_NB1.3 griseocollis_NB2.6 impatiens_BU.1 impatiens_CU.1 impatiens_HA.1 impatiens_JJ.1 impatiens_MO.1 impatiens_WH.1 impatiens_NB1.6 impatiens_NB2.2 impatiens_BU.2 impatiens_CU.2 impatiens_HA.2 impatiens_JJ.2 impatiens_MO.2 impatiens_WH.2 impatiens_NB1.7 impatiens_NB2.3 impatiens_BU.4 impatiens_CU.3 impatiens_HA.3 impatiens_JJ.3 impatiens_MO.3 impatiens_WH.3 impatiens_NB1.9 impatiens_NB2.5 bimaculatis_NB3.1 bimaculatis_NB4.1 bimaculatis_NB5.1 bimaculatis_NB6.1 griseocollis_NB3.1 griseocollis_NB4.1 griseocollis_NB5.1 bimaculatis_NB3.2 bimaculatis_NB4.2 bimaculatis_NB5.2 bimaculatis_NB6.2 griseocollis_NB3.2 griseocollis_NB4.2 griseocollis_NB5.2 bimaculatis_NB3.3 bimaculatis_NB4.3 bimaculatis_NB5.3 bimaculatis_NB6.3 griseocollis_NB3.3 griseocollis_NB4.3 griseocollis_NB5.3 impatiens_NB3.4 impatiens_NB4.1 impatiens_NB5.1 impatiens_NB6.4 griseocollis_NB6.1 impatiens_NB3.6 impatiens_NB4.3 impatiens_NB5.2 impatiens_NB6.5 griseocollis_NB6.3 impatiens_NB3.7 impatiens_NB4.7 impatiens_NB5.3 impatiens_NB6.8 griseocollis_NB6.8 A_mellifera125.3 K12 Supplementary Table S5 – Binned taxonomies. Lineages represent increasing specificity for each operational taxonomic unit (OTU ). Percent Identity GenBank Number of Top OTU of OTU to Top Lineage 1 BLASTn Hit BLASTn Hit 363 EF221579.1 99% Bacteria 304 JN695023.1 98% Bacteria 490 JN695023.1 95% Bacteria 33 AB491757.1 95% Bacteria 305 AB491757.1 95% Bacteria 334 NR_042540.1 97% Bacteria 366 JQ388908.1 94% Bacteria 313 AB689757.1 94% Bacteria 517 GQ915078.1 96% Bacteria 276 HE590767.1 96% Bacteria 4 JQ388898.1 96% Bacteria 34 HM534813.1 94% Bacteria 93 AB523780.1 96% Bacteria 110 AB572592.1 97% Bacteria 131 JQ009345.1 95% Bacteria 155 HM534813.1 96% Bacteria 237 HM046568.1 97% Bacteria 277 HM046580.1 94% Bacteria 287 HM218308.1 97% Bacteria 344 JQ009345.1 96% Bacteria 362 JQ388898.1 95% Bacteria 396 HM046580.1 98% Bacteria 484 JQ009340.1 95% Bacteria 133 AB690198.1 97% Bacteria 415 AB680247.1 97% Bacteria 46 AB600029.1 97% Bacteria 375 HM439416.1 98% Bacteria 388 GU086439.1 96% Bacteria 62 HM108484.1 97% Bacteria 267 NR_024819.1 94% Bacteria 281 AB682004.1 98% Bacteria 294 NR_024819.1 94% Bacteria 325 AB682007.1 99% Bacteria 355 HM108484.1 97% Bacteria 384 AB513363.1 96% Bacteria 418 AB513363.1 96% Bacteria 115 JQ901428.1 97% Bacteria 264 AY752956.1 98% Bacteria 278 JQ901428.1 99% Bacteria 285 JN872505.1 99% Bacteria 16 JQ746649.1 95% Bacteria 72 JQ746651.1 96% Bacteria 167 JQ746647.1 97% Bacteria 394 JQ746651.1 96% Bacteria 447 JQ746649.1 95% Bacteria 22 AY370191 95% Bacteria 105 HM108542.1 98% Bacteria 342 HM108563.1 96% Bacteria 424 HM108563.1 93% Bacteria 65 EU560860.1 97% Bacteria 92 EF088378.1 96% Bacteria 161 JX162063.1 96% Bacteria 187 FJ193279.1 98% Bacteria 270 AB626126.1 97% Bacteria 310 HE978270.1 98% Bacteria 356 JX162063.1 98% Bacteria 398 AF130895.1 98% Bacteria 425 HM196339.1 97% Bacteria 445 JN583795.1 97% Bacteria 491 EU029106.1 97% Bacteria 224 AB522704.1 97% Bacteria 229 AB681768.1 96% Bacteria 389 AB681768.1 94% Bacteria 7 AF321239.1 96% Bacteria 50 JQ771134.1 96% Bacteria 225 JQ771134.1 95% Bacteria 329 JQ771134.1 92% Bacteria 460 JX307680.1 98% Bacteria 258 HM215025.1 96% Bacteria 374 EU627172.1 98% Bacteria 462 JQ291604.1 93% Bacteria 139 JN805706.1 98% Bacteria 158 HQ671828.1 83% Bacteria 162 EU448814.1 95% Bacteria 45 JN990104.1 98% Bacteria 85 JQ770003.1 98% Bacteria Supplementary Table S5 – Binned taxonomies. Lineages represent increasing specificity for each operational taxonomic unit (OTU ).

Lineage 2 Lineage 3 Lineage 4

Actinobacteria Acidimicrobidae Acidimicrobiales Actinobacteridae Actinomycetales Actinobacteria Actinobacteridae Actinomycetales Actinobacteria Actinobacteridae Bifidobacteriales Actinobacteria Actinobacteridae Bifidobacteriales Bacteroidetes Flavobacteria Flavobacteriales Bacteroidetes Unclassified Bacteroidetes Bacilli Bacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Firmicutes Bacilli Lactobacillales Proteobacteria Alphaproteobacteria Caulobacterales Proteobacteria Alphaproteobacteria Rhizobiales Proteobacteria Alphaproteobacteria Rhizobiales Proteobacteria Alphaproteobacteria Rhizobiales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Alphaproteobacteria Rhodospirillales Proteobacteria Betaproteobacteria Burkholderiales Proteobacteria Betaproteobacteria Burkholderiales Proteobacteria Betaproteobacteria Burkholderiales Proteobacteria Betaproteobacteria Burkholderiales Proteobacteria Betaproteobacteria Neisseriales Proteobacteria Betaproteobacteria Neisseriales Proteobacteria Betaproteobacteria Neisseriales Proteobacteria Betaproteobacteria Neisseriales Proteobacteria Betaproteobacteria Neisseriales Proteobacteria Gammaproteobacteria Orbales Proteobacteria Gammaproteobacteria Orbales Proteobacteria Gammaproteobacteria Orbales Proteobacteria Gammaproteobacteria Orbales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Enterobacteriales Proteobacteria Gammaproteobacteria Legionellales Proteobacteria Gammaproteobacteria Oceanospirillales Proteobacteria Gammaproteobacteria Oceanospirillales Proteobacteria Gammaproteobacteria Pseudomonadales Proteobacteria Gammaproteobacteria Pseudomonadales Proteobacteria Gammaproteobacteria Pseudomonadales Proteobacteria Gammaproteobacteria Pseudomonadales Proteobacteria Gammaproteobacteria Pseudomonadales Proteobacteria Gammaproteobacteria Unclassified Gammaproteobacteria Proteobacteria Gammaproteobacteria Xanthomonadales Proteobacteria Gammaproteobacteria Xanthomonadales Proteobacteria Proteobacteria Proteobacteria Uncultured bacterium Uncultured bacterium Lineage 5 Lineage 6

Paenibacillaceae Enterococcaceae Fructobacillus Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Lactobacillaceae Leuconostoc

Acetobacteraceae Acetobacteraceae Acetobacteraceae Acetobacteraceae Acetobacteraceae Acetobacteraceae Acetobacteraceae Acetobacteraceae Burkholderiales Burkholderiales Burkholderiales Burkholderiales Neisseriaceae Snodgrassella Neisseriaceae Snodgrassella Neisseriaceae Snodgrassella Neisseriaceae Snodgrassella Neisseriaceae Snodgrassella Orbaceae Gilliamella Orbaceae Gilliamella Orbaceae Gilliamella Orbaceae Gilliamella

Unclassified Gammaproteobacteria Relationship to Association (GenBank) Notes Corbiculate Hosts Environmental inconstant Environmental inconstant Environmental inconstant Mammalian inconstant Mammalian inconstant Mammalian inconstant Bombus terrestris core Environmental inconstant Insect inconstant Plant inconstant Bombus terrestris Firm-5 core Apis mellifera Firm-4 core Floral inconstant Floral inconstant Apis or Floral inconstant Apis mellifera Firm-4 core Apis cerana Firm-5 core Apis cerana Firm-5 core Environmental inconstant Apis or Floral inconstant Bombus terrestris Firm-5 core Apis cerana Firm-5 core Apis or Floral inconstant Type Strain inconstant Type Strain inconstant Plant inconstant Environmental inconstant Plant inconstant Apis dorsata Alpha-2.2 core Pollen Alpha-2.2 core Type Strain inconstant Pollen Alpha-2.2 core Type Strain inconstant Apis dorsata Alpha-2.2 core Floral Alpha-2.2 core Floral Alpha-2.2 core Environmental inconstant Plant inconstant Environmental inconstant Floral inconstant Bombus vagans core Apis mellifera core Bombus bimaculatus core Apis mellifera core Bombus vagans core Apis mellifera core Bombus impatiens core Bombus impatiens core Bombus terrestris core Insect inconstant Type Strain inconstant Environmental inconstant Environmental inconstant Type Strain inconstant Environmental inconstant Environmental inconstant Type Strain inconstant Plant inconstant Plant inconstant Insect inconstant Insect inconstant Type Strain inconstant Type Strain inconstant Environmental inconstant Floral inconstant Floral inconstant Floral inconstant Environmental inconstant Bombus terrestris core Plant inconstant Insect inconstant Environmental inconstant Environmental inconstant Plant inconstant Plant inconstant Environmental inconstant Supplementary Table S6: Summary table of statistical test results based on Chao 1 estimates Analysis of Variance for Chao 1 Estimator of Alpha Diversity Among Bee Species

Source Degree of Freedom Sum of Squaures Mean Square F Ratio Prob > F Species 2 3538 1769 3.71 0.028 Error 96 45723 476 C. Total 98 49261

Means for One Way Anova N Mean SE bimaculatus 30 45 4 griseocollis 33 33.6 3.8 impatiens 36 47.1 3.6

t-tests for difference in Chao 1 estimates of alpha diversity between natural bog and conventional agricultural habitats for individual bee species

B. bimaculatus t-ratio 0.06 DF 28 Prob > |t| 0.95

N Means SE Bog 15 44.7 8.4 Farm 15 45.3 8.4

B. griseocollis t-ratio -0.51 DF 31 Prob > |t| 0.61

N Means SE Bog 17 35.6 5.7 Farm 16 31.4 5.9

B. impatiens t-ratio -0.59 DF 34 Prob > |t| 0.56

N Means SE Bog 18 48.6 3.6 Farm 18 45.6 3.6 Legends for Supplementary Tables and Figures

SUPPLEMENTARY TABLES

Supplementary Table S1: Table of PCR primers and protocols

Supplementary Table S2: 454 barcoded primers used in this study

Supplementary Table S3: Summary of Pyrotag processing and read preprocessing steps.

Supplementary Table S4: Metadata mapping file used for Qiime processing and analysis of 454 reads

Supplementary Table S5: Binned taxonomies. Lineages represent increasing specificity for each operational taxonomic unit (OTU)

Supplementary Table S6: Summary table of statistical test results based on Chao 1 estimates

SUPPLEMENTARY FIGURES

Supplementary Figure S1: Map of collection sites. Filled circles represent active, conventional cranberry farms and open circles represent abandoned bogs

Supplementary Figure S2: Bombus sp. sample summary

Supplementary Figure S3: Phylogenetic trees used to designate core and non-core bacteria: a) Gammaproteobacteria: Gilliamella apicola, b) Betaproteobacteria: Snodgrassella alvi, c) Alphaproteobacteria overview, d) Lactobacillus(Firm-4, Firm-5)

Supplementary Figure S4: Chao 1 estimates comparing alpha diversity rarefaction curves by: a) species, b) habitat

Supplementary Figure S5: Phylogenetic tree of Nosema strains from present study and closely related taxa. Sequences from this study marked in red (novel Nosema phylotype), or blue (Nosema ceranae). Labels include Genbank accession numbers, specimen labels, and host species. Tree includes all putatively novel Nosema taxa found in Chinese Bombus by Li et al. (2012).