The Pennsylvania State University

The Graduate School

College of Agricultural Sciences

POLLINATION SERVICES, COLONY ABUNDANCES AND

POPULATION GENETICS OF

A Thesis in

Entomology

by

Carlene Miller McGrady

© 2018 Carlene Miller McGrady

Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science

August 2018

The thesis of Carlene Miller McGrady was reviewed and approved* by the following:

Shelby J Fleischer
Professor of Entomology
Thesis Advisor

Margarita M López-Uribe
Assistant Professor of Entomology

Maryann Frazier
Senior Extension Associate

James P Strange
Research Entomologist

Gary W Felton
Professor of Entomology
Head of the Department of Entomology

*Signatures are on file in the Graduate School

ABSTRACT

Although recent studies suggest that native bees likely supply sufficient pollination services in Cucurbita agroecosystems, commercial pumpkin growers in Pennsylvania spend thousands of dollars renting managed honey bees to ensure adequate pollination. Here, we evaluate the ability of native bee populations to respond to increasing floral resources in a mass-flowering crop such as pumpkins, and the effect of temporal and spatial variables on pollination services supplied by native bees. We catalogued a surprisingly large community comprising 37 species of native bees foraging in commercial pumpkin fields. Honey bees (Apis mellifera L.), squash bees (Peponapis pruinosa Say) and bumble bees (Bombus spp., primarily B. impatiens Cresson) were the most active pollinator taxa, contributing over 95% of all pollination services. For the most active taxa, we then examined the effect of distance from field edge on flower sex preferences using chi-squared tests and on visitation rates using regression. While visitation rates were not significantly affected by distance from field edge, A. mellifera and Bombus preference for female flowers decreased as distance from field edge increased. We also evaluated the effect of field size, day of year and floral density on visitation rates using regression. Bombus visitation rates decreased as field size increased. Both A. mellifera and Bombus spp. visitation rates exhibited a curvilinear response as the growing season progressed, and both responded positively to increasing floral density. We synthesized existing literature to estimate minimum “pollination thresholds” per taxon and calculated that A. mellifera, Bombus and P. pruinosa were providing 10x, 12.75x and 1.8x the necessary pollination services, respectively. The relationship between visitation rates and pumpkin yield metrics was examined with principal components and correlation analysis for each year separately. Bombus spp. visitation rates positively influenced seed set and pumpkin weight in some years. P. pruinosa visitation rates positively influenced fruit per square meter in some years. A. mellifera visitation rates were never positively associated with any yield metric. This study provides strong evidence that native pollinators are sufficient for pumpkin pollination services in most settings, but managed pollinators should be considered for larger fields (> 3 - 4 ha), depending on configuration. These results have important implications for pollination management decisions and further highlight the importance of monitoring and conserving native pollinator populations.

To evaluate the reliability of pollination services provided by wild B. impatiens, we estimated the abundance of Bombus impatiens colonies providing foragers for pollination services using a genetic technique known as microsatellite analysis. Microsatellite analysis is an important genetic tool, and previous studies have published guidelines for optimizing multiplexes and checklists for monitoring potentially biased loci. In this study, we propose a standardized workflow for evaluating microsatellite loci for 5 common issues and demonstrate the workflow by trialing 14 non-species-specific loci for use in Bombus impatiens, an ecologically and economically important pollinator. We examined the DNA of > 6000 B. impatiens individuals collected from 30 sites over 4 years. We evaluated each locus for evidence of allelic drift, monomorphism, frequency of null alleles, Hardy-Weinberg disequilibrium and linkage disequilibrium. During this process, we propose a new method to visualize and account for allelic drift, which enabled us to efficiently eliminate one locus, BL15, from our multiplex. We found that BT28 was an extensively monomorphic locus. Including the monomorphic locus predictably decreased overall genetic diversity, but it did not alter patterns of genetic diversity between sites. Furthermore, monomorphic loci did not substantially impact the ability to identify genetically related foragers. Five loci exhibited isolated instances of null alleles in fewer than 10% of sites. One locus, BTMS0081, exhibited universal deviation from Hardy-Weinberg equilibrium, but the deviation was driven by only 2 sites. Several locus pairs were universally linked, but each linkage was driven by only 1 or 2 sites, and including linked pairs had little impact on subsequent results. Implementing this systematic workflow will promote standardized methods to evaluate the extent of potentially biased loci and report the severity of the impacts on subsequent analyses. Furthermore, we can now provide a rigorously tested and thoroughly optimized multiplex of 11 microsatellite loci for use in Bombus impatiens (and potentially other Bombus species), saving financial resources and research hours for future researchers.

We analyzed the genotypes generated from this optimized multiplex to test hypotheses about the abundance of B. impatiens colonies providing foragers to pumpkin fields and about population genetics. Current studies assume that conserving and promoting wild bumble bee colony abundance will result in increased economic and ecological benefits in the form of stable or increased pollination services, but the relationship between colony abundance and pollination services is understudied, particularly in agroecosystems. Bombus impatiens, the common eastern bumble bee, is an important pollinator in the eastern United States, and recent studies propose that the agriculture industry integrate these native bees into its pollination management strategies. However, studies regarding B. impatiens population abundance and genetic status are limited. We used microsatellite analysis and statistical models to estimate the number of B. impatiens colonies providing foragers to 30 commercial pumpkin fields in Pennsylvania and found foragers from 543.7 ± 21.7 SE (range of 291 to 891) colonies per field. Average colony abundance per field was not affected by year (n = 4) or geographic region (n = 3), indicating a temporally and spatially stable population of native pollinators. We used our large sample size to evaluate the influence of low levels of polygamy on estimating colony abundance, and showed that monogamy is a reasonable and conservative assumption for estimating colony abundance of B. impatiens. We tested for evidence of genetic differentiation using G-statistics and analysis of molecular variance and evaluated genetic diversity using expected heterozygosity and allelic richness. We confirmed previous assumptions that B. impatiens is a single, panmictic population throughout our study region of 5,000 square km and is characterized by relatively high genetic diversity, indicating a genetically resilient population with the potential to respond to selective pressures in the future. We also measured Bombus visitation rates to pumpkin flowers in a subset of 24 fields and found 0.3 ± 0.05 SE bee visits per flower per 45 seconds. We examined the relationship between colony abundance, on a per-field and per-hectare basis, and visitation rates as a metric of the ecosystem service of pollination. We found that colony abundance per hectare accounted for 23% of the variation in visitation rates, indicating that wild bumble bee colony abundance, mediated by field size, positively impacts pollination services in agroecosystems. We use these relationships to discuss the influence of a mass-flowering crop on colony-level abundances of a wild, native, eusocial species.

TABLE OF CONTENTS

LIST OF FIGURES ...... VIII
LIST OF TABLES ...... XII
ACKNOWLEDGMENTS ...... XVI
DEDICATION ...... XVIII

CHAPTER 1: INTRODUCTION TO BOMBUS IMPATIENS AND CUCURBITA AGROECOSYSTEMS ...... 1

Introduction ...... 1
  Bombus impatiens eusociality and genetic system ...... 2
  Bombus impatiens Populations ...... 3
  Study System ...... 11
Thesis Objectives ...... 15
References ...... 17

CHAPTER 2: POLLINATION SERVICES IN CUCURBITA AGROECOSYSTEMS ...... 22

Introduction ...... 22
  Pumpkin Pollinators: ...... 22
  Agricultural Objectives: ...... 24
  Pumpkin Yield: ...... 25
Methods ...... 26
  Study site configuration ...... 26
  Sampling procedures ...... 26
  Analysis ...... 28
Results ...... 30
  Community Composition ...... 30
  Pollinator Activity Distribution ...... 30
  Flower Sex Foraging Preferences ...... 31
  Spatial, Temporal and Floral Resource effects on Visitation rates ...... 31
  Pollination Thresholds ...... 32
  Yield Metrics ...... 33
  Visitation Rates & Yield Metrics ...... 33
Discussion ...... 35
References ...... 43
Appendices ...... 57

CHAPTER 3: STANDARDIZING MICROSATELLITE GENOTYPING: ...... 64

Introduction / Background ...... 64
Methods ...... 67
  DNA Extraction ...... 67
  PCR Amplification and Sequencing ...... 67
  Visualizing Allelic Drift and Setting Bins using Geneious ...... 68
  Amplification Success ...... 68
  Identifying Related Individuals ...... 69
  Testing non-species-specific loci for significant issues ...... 69
  Loci Diversity ...... 71
Results ...... 72
  Visualizing Allelic Drift and Setting Bins using Geneious ...... 72
  Amplification Success and Loci Diversity ...... 72
  Testing non-species-specific loci for significant issues ...... 72
  Monomorphic Loci ...... 72
  Null Alleles ...... 73
  Hardy-Weinberg Equilibrium (HWE) ...... 73
  Linkage Disequilibrium (LD) ...... 73
Discussion ...... 74
References ...... 76
Appendices ...... 84

CHAPTER 4: WILD BUMBLE BEE COLONY ABUNDANCE, MEDIATED BY FIELD SIZE, PREDICTS POLLINATION SERVICES IN AGROECOSYSTEMS ...... 113

Introduction / Background ...... 113
Methods ...... 116
  Data Collection ...... 116
  Molecular Methods ...... 116
  Colony Abundance ...... 117
  Estimated Total Colony Abundance ...... 117
  Population Genetics ...... 118
  Genetic Diversity and Inbreeding ...... 119
  Statistical Analysis ...... 119
Results ...... 120
  Effects of Mating Assumptions on Colony Analysis ...... 120
  Estimates of Total Colony Abundance ...... 121
  Effect of nectar-rich Mass Flowering Crop Field Area ...... 121
  Colony Abundance and Pollination services ...... 122
  Population Structuring ...... 122
  Genetic Variation ...... 122
Discussion ...... 123
References ...... 127
Appendices ...... 138

CHAPTER 5: THESIS DISCUSSION ...... 151

References ...... 155

LIST OF FIGURES

2 Figure 1. Haplodiploidy results in unusually highly related offspring

4 Figure 2. Genetic methods and statistical models can be used to estimate total colony abundance represented by a collection of foragers. Each circle represents an individual bee and colors represent unique genotypes.

9 Figure 3. Male and female pumpkin flowers. Nectaries are shaded red. Representations of pores are outlined in the floor of the male flower.

11 Figure 4. Eastern Pennsylvania, including county lines. The three study regions are labeled: Centre (pink), Columbia (green) and Lancaster (gold). Study fields are indicated by black circles.

12 Figure 5. Columbia region is outlined in green on the small map of Pennsylvania, below, and enlarged. We conducted research from a total of 19 fields in the Columbia Region, 17 of which were in Columbia County. Field 2 was in Northumberland County and Field 25 was in Lycoming County. Colors indicate the type(s) of data collected at each field: only colony abundance data, only ecosystem service data, and both types of data

13 Figure 6. Lancaster region is outlined in gold on the small map of Pennsylvania, below, and enlarged. We conducted research from a total of 7 fields in the Lancaster Region, all of which were in Lancaster County. All fields are blue indicating both colony abundance data and ecosystem service data was collected at all fields.

14 Figure 7. Centre Region is outlined in salmon on the small map of Pennsylvania, below, and enlarged. We conducted research in a total of 7 fields in the Centre Region, all of which were in Centre County. All fields are yellow, indicating that only colony abundance data was collected at all fields.

46 Figure 8. Sampling diagram. Visitation, floral density and yield measures were collected along transects spaced 0, 25, 50 and 100m from the field edge (orange). Transects were between 80 – 100m long (orange arrow) with at least 50m of field on either side (green arrow).

46 Figure 9. Dominance distribution of pollinator visits to pumpkin flowers. See Table 1 for bee species included in each morpho-taxa

48 Figure 10. For each distance from field edge, (A) 0m, (B) 25m, (C) 50m and (D) 100m, the distribution of male and female flowers observed (left of the black line) is compared with the distribution of male and female flower visits for Apis mellifera, Bombus spp. and Peponapis pruinosa (right of the black line). Male flowers observed and male flower visits are in blue, while female flowers observed and female flower visits are in red. After Bonferroni corrections, the proportion of female flower visits is significantly higher than the proportion of female flowers observed when P < 0.004, indicated by an *.

49 Figure 11. Bee taxa and Flower sex significantly affected mean visitation rates to pumpkin flowers. Error bars are one standard error from the mean

50 Figure 12. Male flower floral density per square meter positively affected A. mellifera visitation rates to (A) female and (B) male flowers, as well as (C) Bombus spp. visitation rates to male flowers. Each point represents a single transect. The x-axis is uniform for all graphs. The y-axis is uniform per flower sex. The shaded region represents a 95% CI surrounding the regression line of fit.

51 Figure 13. Pumpkin field area negatively affected Bombus spp. visitation rates in commercial pumpkin fields. Each point represents the mean visitation rate for a given field area with error bars indicating 1 standard error. The shaded region (blue) represents a 95% CI surrounding the regression line of fit.

51 Figure 14. Visitation rates from (A) A. mellifera and (B) Bombus spp. exhibited a curvilinear response throughout the pumpkin floral bloom period. Data is summarized as a mean visitation rate for each day, surrounded by error bars indicating 1 standard error. The shaded region (blue) represents a 95% CI surrounding the regression line of fit.

52 Figure 15. Distance from field edge has a weak, negative relationship with visitation rates, significant only at alpha = 0.1. Visitation rates are summarized for 4 distances as boxplots. The dotted line represents the non-significant line of fit for the regression.

54 Figure 16. Seed set affects weight per C. pepo cv ‘Gladiator’ pumpkin. Each point represents a single pumpkin. C. pepo cv ‘Cannonball’ pumpkins are displayed (gray) but not included in analysis. Line of fit is surrounded by a shaded region representing 95% confidence intervals.

54 Figure 17. Principal components analyses (PCA) examining associations between mean visitation rates averaged across all sample dates per transect and average yield metrics per transect for (A) 2013, (B) 2014, and (C) 2015 separately. Visitation rates within each PCA are labeled with 2 letters to indicate the specific bee taxa: A. mellifera (A), Bombus spp. (B), and P. pruinosa (P) visits to male (M) or female (F) flowers. Fruit per m2 (fruit/m2) was measured in 2014 and 2015, pumpkin weight (weight) was measured in all years and seed set (Seeds) was measured in 2013 and 2014. Associated variables between visitation rate of a bee taxa and a yield metric, supported with a significant correlation, are depicted in blue, grouped by solid or dashed lines.

55 Figure 18. Average seed set per pumpkin was positively affected by B. impatiens visitation rates to male flowers in 2013. Each point represents a single transect. Line of fit is surrounded by a shaded region representing a 95% confidence interval.

55 Figure 19. Pumpkin weight was positively affected by (A) B. impatiens visitation rate to male flowers in 2013 and (B) B. impatiens visitation rate to female flowers in 2014. Each point represents a single transect. Line of fit is surrounded by a shaded region representing a 95% confidence interval.

56 Figure 20. Fruit per m2 was positively affected by P. pruinosa visitation rates to (A) male flowers and (B) female flowers in 2014. Each point represents a single transect. Line of fit is surrounded by a shaded region representing a 95% confidence interval.

62 Figure E21. Effect of A. mellifera visitation rates on native bee visitation rates to male flowers. The x-axis depicts A. mellifera visitation rates and is the same for both figures. The y-axes depict (A) P. pruinosa and (B) Bombus spp. visitation rates and are scaled independently for each figure. Points represent a single transect, averaged across sampling dates. Significance is indicated by a line of fit surrounded by a shaded region representing the 95% confidence interval.

63 Figure F22. Overall distribution of male and female flowers observed compared with the distribution of male and female flowers visited by Bombus spp., Apis mellifera, and Peponapis pruinosa. * indicates when the proportion of visits by a bee taxon to female flowers is significantly higher than the proportion of female flowers observed.

79 Figure 23. A standardized workflow for evaluating microsatellite loci, particularly when using non-species-specific loci for the first time. Actions or processes (blue squares) provide results (circles) which are used to make decisions (diamonds) about discarding or retaining loci for subsequent analyses.

81 Figure 24. Visualizing cluster patterns of allele peaks and setting bins for 3 loci using Geneious. Colors represent the dye used to label primers during PCR. The thin, dark lines represent the midpoint of a peak for a single individual. The broad shaded regions represent the bin for a single allele. Peaks for (A) locus BT10 appear early in bins early in the range and exhibit drift, appearing late in bins late in the range. Peaks for (B) locus BL11 are diverse for each allele, but still exhibit clear clustering and can be binned. Peaks for (C) locus BL15 vary considerably without clear patterns, resulting in many peaks falling outside of bins.

83 Figure 25. Comparing results for several analyses when including (green) or excluding (gray) monomorphic locus BT28. Excluding BT28 resulted in a small, but significant increase for genetic diversity when measured with both (A) expected heterozygosity (including 0.648 ± 0.001SE; excluding 0.711 ± 0.001SE) and (B) rarified allelic richness (including 12.52 ± 0.04SE; excluding 13.57 ± 0.04SE). Excluding BT28 did not significantly affect (C) average full-sibship families detected by Colony (including: 163.4 ± 2.9SE; excluding: 160.9 ± 2.8SE). Excluding BT28 revealed 2 sites that significantly deviated from Hardy-Weinberg equilibrium and 1 additional site with significant inbreeding. Whiskers depict range with mid-line representing the mean. Significance represented by bolded P-value. Sample size is 60 per test, 30 sites in each group.

91 Figure I26. Necessary settings for visualizing allelic drift using Geneious.

101 Figure N27. Allele frequency distributions for 11 loci (A-K) based on the 6,222 B. impatiens genotypes examined in this study. For each locus, N denotes the number of foragers successfully genotyped, AN denotes total number of alleles, and HE denotes the expected heterozygosity based on Nei’s 1978 estimate of gene diversity. Alleles missing from the x-axis indicate no individuals were found to have that allele. Colors indicate the fluorescent dye used during PCR.

133 Figure 28. Mating assumptions have a significant effect on mean full-sibship families identified by Colony. Error bars depict standard error.

133 Figure 29. Average mates per queen has a significant effect on polygamous detected colony numbers. The x-axis depicts hypothetical mates per queen, ranging from 1 – 1.22 in increments of 0.02. Each point represents a mean of all 30 fields, surrounded by error bars depicting 95% confidence intervals. Values for 1.02 – 1.22 mates per queen are based on the number of full-sibship families identified by Colony under polygamous assumptions (black). The shaded region (gray) depicts a 95% CI surrounding the regression line of fit. Included for comparison, the value for 1 mate per queen is based on the number of full-sibship families identified by Colony under monogamous assumptions (red) and is not part of the regression analysis. The monogamous mean (red solid line) is approximately equal to the polygamous mean at 1.12 mates per queen. The 95% CI for the monogamous mean (red dotted line) excludes 1.02 and 1.22. Estimated mates per queen for B. impatiens (green line) are taken from previous studies (Cnaani et al, 2002; Payne et al, 2003).

134 Figure 30. Mean Colony abundance per field was stable across (A) 4 successive years and (B) 3 geographic regions. Error bars depict standard error.

135 Figure 31. Field area did not affect B. impatiens (A) total colony abundance per field, but had a negative relationship with both (B) colony abundance per hectare and (C) mean visitation rate. The x-axis (field area) is uniform for all graphs and each point represents a single commercial pumpkin field. Error bars in (A) represent the 95% confidence intervals (CI) of estimated colony abundances supplied by Capwire. The shaded region (blue) depicts the 95% CI for the line of fit for significant relationships. For (B), fields less than 1 ha (n = 4) were excluded.

136 Figure 32. Visitation rates have no relationship with (A) colony abundance per field, but they are positively affected by (B) colony abundance per hectare. Each point represents a single pumpkin field and is colored according to field size, depicted in the legend. The blue line of fit, indicating significance (F1, 19 = 5.3, P = 0.0322, R2 = 0.23) is surrounded by a shaded region depicting a 95% confidence interval. The dotted line indicates the line of fit when including an outlier for colony abundance per hectare, depicted as an unfilled circle (F1, 19 = 4.5, P = 0.046, R2 = 0.19)

149 Figure Z33. The smaller circle represents the total number of collected B. impatiens foragers, distinguishing between individuals removed from all analyses for failing to amplify in 5 or more loci (gray) and individuals included in subsequent analysis (dark green). The larger circle represents those 6,222 individuals included in subsequent analysis, categorizing individuals based on the number of loci that failed to amplify.

150 Figure AB34. Comparison between model mean estimates of colony abundance using ANOVA.

150 Figure AB35. Colony estimates provided by Capwire, ranked low to high for all 30 sites. Each site has two estimates, provided by the two different models, TIRM (light blue) and ECM (dark blue). The error bars represent the 95% confidence interval.

LIST OF TABLES

12 Table 1. The types of data collected per spatially referenced field in the Columbia Region for 2012 – 2015.

13 Table 2. The types of data collected per spatially referenced field in the Lancaster Region for 2013 – 2015.

14 Table 3. The type of data collected per spatially referenced field in the Centre Region for 2013 – 2015.

47 Table 4. Comprehensive list of all bee species collected from C. p. pepo cv ‘Gladiator’ flowers including the taxonomic resolution (Taxa), number of specimens collected (N), morpho-taxa terminology used during visitation observations (Morpho-taxa) and the years in (Year) and fields at (Field) which species were collected.

49 Table 5. Overall regression model results testing the effect of Bee taxa, Flower Sex, Field area, Day of year, Distance-from-field-edge, and male flower floral density per square meter on visitation rates (bee visits / flower / 45s) in commercial Pumpkin agroecosystems. Estimates included for continuous variables only.

52 Table 6. Necessary number of pollinator visits per taxon to achieve optimal Cucurbita yield, synthesized from multiple studies. PT: necessary number of total pollen grains deposited per stigma to adequately fertilize ovarioles. FO: necessary number of fertilized ovarioles to achieve optimal fruit set and/or fruit weight, as indicated by seed set. PV: average number of pollen grains deposited per pollinator visit. VT: necessary visits per taxon to achieve optimal yield. – indicates data type not collected for that study.

53 Table 7. On average, pollinator visitation rates to female flowers in this study exceed estimated pollination thresholds. Total visits (green) and visitation rate (gray) are compared between ‘pollination services: estimated thresholds’ and ‘pollination services: observed in this study’ to determine if pumpkins are receiving ‘sufficient pollination services’ in PA commercial agroecosystems.

53 Table 8. Overall and yearly means ± SE for pumpkin yield metrics. The overall range is presented in parentheses. Yearly means labeled with different letters per row are significantly different according to 1-way ANOVA. – indicates when data were not collected for a given year.

57 Table A9. Sampling support for visitation measures, floral density measures and yield measures. Site: unique number used to identify each sampling site to protect identity of grower collaborators. Region: the geographical location of the specific sampling site, with C for Columbia County and L for Lancaster County. T (m): the transect located X meters from the field edge. NVR: number of visitation rate samples taken per transect. NFD: number of floral density measures taken per transect. NW: number of pumpkins weighed per transect. NC: number of pumpkin circumferences measured per transect. NL: number of pumpkin lengths measured per transect. NSS: number of pumpkins from which seeds were harvested per transect.

60 Table B10. Overall mixed model results testing the fixed effect of Bee taxa, Flower Sex, Field area, Day of year, Distance-from-field-edge, and male flower floral density per square meter on visitation rates (bee visits / flower / 45s) in commercial Pumpkin agroecosystems when including year, county and field as random effects. Estimates included for continuous variables only.

61 Table C11. Estimates of total pollinator visits to male flowers for the lifetime of the flower derived from visitation rates in this study.

61 Table D12. Box-cox transformations for visitation rates

80 Table 13. 14 microsatellite loci from Bombus spp. and their characteristics in Bombus impatiens. All loci were isolated from B. terrestris unless otherwise specified in parentheses under the locus name. Primers labeled with the same fluorescent dye color (Dye) do not have overlapping size ranges. The number of nucleotides in the repeat motifs (tri-, di-) are reported for this study, while the motif repeat sequence is reported from the references. Size ranges, measured in number of base pairs (bp), reflect the distribution of alleles found in this study based on sample sizes in parentheses. Loci that were trialed and ultimately discarded are included in gray. During PCR, all primers had an annealing temperature of 55 °C.

82 Table 14. Summary of universal issues and diversity metrics for eleven non-species-specific microsatellite loci when used in Bombus impatiens. In row 1, thresholds for universal significance are indicated below the name of each test. The % amplification success (AMP) is based on 6,222 successfully genotyped foragers followed by (sample size). The number of sites with significant null alleles (Null) is based on 4,902 unrelated individuals evaluated per site. Universal significance of monomorphic loci (Mono), deviations from Hardy-Weinberg Equilibrium (HWE), and linkage disequilibrium (LD) are based on 4,902 unrelated, pooled individuals. Monomorphic loci are indicated by reporting the minor allele frequency (maf). Values reported for HWE are p-values based on Chi-squared tests with the number of sites driving universal deviations in parentheses. Values for LD are the linked loci with the number of sites driving universal significant linkages in parentheses. Gray LD values indicate linked loci already reported earlier in the table. Total number of alleles (AN), observed heterozygosity (HO) and expected heterozygosity based on Nei’s 1978 estimate of gene diversity (HE) are based on 6,222 pooled individuals.

94 Table K15. Error rates used by Colony when identifying related foragers

95 Table L16. Binning information for each locus including locus name, dye color (6-FAM, VIC, NED, PET), allele length (allele) and the minimum (min) and maximum (max) for each allele’s bin.

97 Table M17. Percent of individuals that successfully amplified at each locus, displayed by site (n = 30) and summarized across all samples (overall). 100% amplification success is indicated by . Shading depicts a gradient of amplification success from 0 – 100% with the darkest colors indicating the least success.

102 Table O18. Results of common genetic tests performed with (11 loci) and without monomorphic locus BT28 (Ex BT28). Individuals successfully genotyped (NG) were first sorted into full-sibship families (NF) using Colony. Genetic diversity is measured with expected heterozygosity (HE) and allelic richness (AR). Values of NF, HE, and AR when excluding BT28 are colored when values decrease (red) or increase (green). Deviations from Hardy-Weinberg Equilibrium (HWE) are significant at p < 0.002 after Bonferroni corrections. Inbreeding (FIS) is significant when positive 95% confidence intervals do not overlap 0. Significance is bolded and changes in significance when excluding BT28 have *. Tests for HE, AR, HWE and FIS were run with sample sizes listed under NF, 11 loci.

103 Table P19. For each locus, the total observed homozygotes (TOBS) significantly exceeds the total expected homozygotes (TEXP). In all instances, the source of error (Error) is identified as stuttering. Equations from Chakraborty 1992 and Brookfield 1996 are used to provide estimates of null allele frequencies (Oosterhout, Chakraborty, Brookfield 1, Brookfield 2)

103 Table Q20. Testing each locus for deviations from Hardy-Weinberg Equilibrium across 4,902 foragers.

103 Table Q21. Summary comparison of limited test results when including and excluding BTMS0081.

104 Table R22. Each locus pair included below exhibited highly significant linkage disequilibrium for the universal dataset. The site column indicates the unique sites driving the universal linkage disequilibrium. P-value indicates the probability per site that the two loci were not linked given the observed results. Universal P-Value indicates the probability that two loci were not linked for the universal dataset after removing the site in parentheses.

Table 23. Bombus impatiens colony abundances and visitation rates were measured for a total of 30 fields, sampled across 4 years (Y) and 3 regions (R). Full-sibship families detected by Colony under monogamous assumptions (FS families Mono) is the average across three separate runs (numbers in parentheses indicate when separate runs detected different numbers of full-sibship families) and is the same value as detected colony numbers used in subsequent analysis. FS families poly indicates full-sibship families detected by Colony under polygamous assumptions. Colonies per field is estimated total colony abundance per field provided by Capwire; Colonies per hectare is the total colony abundance per field ÷ field area. Visitation rate is the mean bee visits per pumpkin flower per 45s for each field and is based on ~120 observations averaged per transect, and then 4 transects averaged per field.

Table 24. Overall estimates of genetic differentiation between sites or regions using 3 G-statistics for each year separately. 95% confidence intervals are in ( ) and always include or overlap zero, indicating no significance.

Table 25. Sources of genetic variation partitioned between individuals, sites and regions using AMOVA for each year independently.

Table 26. Genetic diversity metrics for the panmictic population each year. N1pc is the number of foragers reduced to a single representative per colony per site. AN is the total number of alleles from all loci. AP is the number of private alleles found only in that year. AR is the allelic richness calculated from a rarefied sample size of 100 individuals per year. HE is the expected heterozygosity calculated from the smallest sample size of 293 samples per year. FIS is the inbreeding coefficient, expressed as a 95% confidence interval with significance, represented in bold, when intervals do not overlap 0.

Table T27. Sampling support for colony abundance and pollination activity. Field: unique number and initials to identify each field, shortened to protect identity of grower collaborators. Region: the geographical location of each field which could be Columbia (Co), Lancaster (L) or Centre (Ce). DateC: date ~200 B. impatiens individuals were collected. NIC: number of individuals collected. DaysC-VR: days between forager collection date and visitation rate observation date. Transect (m): distance from field edge at which visitation rates were recorded; NVR: number of visitation rate samples taken per transect.

Table U28. Multiplex of 11 microsatellite loci from Bombus spp. used to genotype Bombus impatiens. All loci were isolated from B. terrestris unless otherwise specified in parentheses under the locus name. Primers labeled with the same fluorescent dye color (Dye) do not have overlapping size ranges. The number of nucleotides in the repeat motifs (tri-, di-) are reported for this study, while the motif repeat sequence is reported from the references. Size ranges, measured in number of base pairs (bp), reflect the distribution of alleles found in this study based on sample sizes in parentheses. During PCR, all primers had an annealing temperature of 55 °C.

Table W29. Pairwise genetic differentiation for site pairs within each year (2013, 2014, 2015). Maximum value across all years (bold) was less than 0.0069. Cells are colored according to value with the higher values shaded darker.

Table W30. Pairwise genetic differentiation for regional pairs within each year (2013, 2014, 2015). Maximum value across all years (bold) was less than 0.003. Cells are colored according to value with the higher values shaded darker.

ACKNOWLEDGMENTS

I would first like to acknowledge several funding sources, without which this thesis could not have been completed. This material is based upon the work that is supported by the National Institute of Food and Agriculture, U. S. Department of Agriculture, under award number 2012-51181-20105: Developing Sustainable Pollination Strategies for U. S. Specialty Crops. The Huck Institute of Life Science at the Pennsylvania State University provided supplementary funding to complete molecular work. The Sigma Xi Research Society provided funding in the form of a graduate student Grant-in-Aid of research award. Findings and conclusions in this thesis do not necessarily reflect the views of the above funding agencies. I am indebted to many researchers and organizations for providing the support critical to my development as a professional scientist while pursuing my Master’s degree at Penn State. I would like to thank all the collaborators participating in the nation-wide Integrated Crop Pollination Project, particularly the Project Director, Rufus Isaacs and the Cucurbit Team Leader, Neal Williams. Through the ICP network, I connected with some of the best bee researchers in the country and had access to amazing resources. Thank you ALL for your contributions to my thesis work, with special recognition to Jason Gibbs, Sam Droege and Rob Jean for providing expert identification for bee species reported in Table 4, Chapter 2 and Emily May (The Xerces Society for Invertebrate Conservation), who created the bee icons used in several figures in Chapter 2. The USDA ARS Logan Bee lab was instrumental in kick-starting my adventure into molecular and genetic research. I have unending gratitude for Amber Tripodi, Joyce Knoblett and Jonathan Koch for responding to my endless questions and off-the-wall ideas! 
Christina Grozinger provided a home for my wet lab work in the Center for Pollinator Research at Penn State and many of the Grozinger Lab members enhanced my research through compassionate and intelligent critique including Mali Döke, Etya Amsalem, and Maggie Douglas, and by sharing skills including Doug Sponsler, who created the maps included in Chapter 1. A very special thank you to Gabe Villar for helping me through my incredibly humble beginnings with DNA extraction and PCR; I would have been a disaster without your care and wisdom. Many thanks to the Fleischer lab and particularly Kristal Watrous who is the best lab manager I have ever had the privilege to learn from and who will always be my ‘scientific mama.’ I’m lucky to have been raised in a lab setting with Kevin Rice, Maggie Lewis, Dana Roberts, and Erin Treanore. I’d like to offer sincere gratitude to all the people who collected field data for my thesis, particularly fellow graduate student Ryan Reynolds, my amazing undergraduate employees Hannah Balko, Catherine Galleher, and Nick Krause and members of the Biddinger Lab. I want to thank the entire Penn State Entomology department, led by the esteemed Gary Felton, for creating such an open and supportive atmosphere. I am so thankful for the amazing friends that have been there to struggle, succeed, laugh, cry, canoe, craft, write, present and research right alongside me, especially Carolyn Trietch, Elizabeth Rowen, Tyler Jones, Anne Jones, Shelley Whitehead and Kevin Cloonan. I am beyond grateful for my amazing committee Jamie Strange, Margarita López-Uribe and Maryann Frazier. Thank you so much for the time and energy you invested in me – only with your help have I become the scientist I am today. I can barely find the words to express my gratitude to my advisor Shelby Fleischer. Shelby is full of unending wisdom, insight, patience, compassion, and humor – all of which he supplied freely during my time at Penn State. 
He is unflappable in the face of chaos – which I certainly brought to his life and lab – I am sorry for the cardiac stress! I could not have asked for a more wonderful advisor to see me through this thesis. Thank you!! I would like to thank my parents, Kevin and Lisa, as well as my siblings, Tricia, Mimi and Dusty – I am so lucky to have an incredible family that has been here for every trial and tribulation since 1989 and can finally share in this great victory! I never could have done it if you all did not consistently fortify my soul. And lastly, I would be lost without my amazing husband, Stephen McGrady. If you had not fed me, sheltered me, comforted me, loved me, challenged me and celebrated me, this master’s degree would have crumbled into dust. You are my rock and my source of strength. Thank you, my dearest love, with all my heart!

DEDICATION

This thesis is dedicated in loving memory to my grandfather, Laney Funderburk. So much of who I am was inspired by my Laney – my love of travel, delight in the natural world, desire to build community, and ability to play a mean game of gin rummy! He shared his whole soul with me – including a passion for tickly little sand fleas, the first I ever loved. I am a better scientist and a better person because he loved me.

Chapter 1: Introduction to Bombus impatiens and Cucurbita agroecosystems

INTRODUCTION

Insect-vectored pollination is a symbiotic relationship wherein a plant provides an insect with nutritional resources (pollen and/or nectar) as the insect unwittingly transfers pollen from stamen to pistil, resulting in the necessary fertilization for plant propagation. This process is so critical that certain insect taxa are commonly referred to as “Pollinators,” simply defined by their functional role in this relationship. Various bee taxa are among the most recognizable pollinators because of their relationship with plants that are commonly cultivated for human use in managed agroecosystems. As it turns out, bees and humans share an affinity for snacking on products from the same plants; humans typically consume the fruit, while bees fuel up on pollen and nectar. In fact, more than 75% of the top 115 crops grown worldwide benefit from pollinators, which accounts for 35% of the food produced worldwide (Klein et al, 2007).

With their fuzzy bodies and bumbling flight, bumble bees can be quite efficient pollinators, depositing 3 – 75x the amount of pollen per visit compared with other bees (Pfister et al, 2017; Artz & Nault, 2011; Thomas & Goodell, 2001). Furthermore, bumble bees are often found foraging in weather conditions that keep other pollinators at home, active even on cold, windy, or rainy days (Tuell & Isaacs, 2010). Commercially produced bumble bees have proved beneficial to crop production in many agroecosystems, including several varieties of apples (Thomas & Goodell, 2011), peaches (Zhang et al, 2015) and raspberries (Lye et al, 2011).

However, recent studies have found managed pollinators may not be necessary when healthy wild bumble bee populations exist. The common eastern bumble bee, Bombus impatiens Cresson, is a prolific native bumble bee species active in managed and natural ecosystems throughout the eastern United States. A recent population genetics study in Massachusetts found that wild B. impatiens were providing the majority of pollination services to cranberries, even in fields stocked with commercial hives (Suni et al, 2017). Petersen et al, 2013 found that adding managed B. impatiens colonies did not increase visitation frequency in Cucurbita agroecosystems in New York. While these studies provide promising results for the efficacy of native B. impatiens pollinators in certain cropping systems, more work should be done to identify factors influencing bumble bee foraging activity. Furthermore, we understand little regarding the population dynamics and the potential for these native eusocial pollinator populations to continue providing essential ecosystem services in the future.

BOMBUS IMPATIENS EUSOCIALITY AND GENETIC SYSTEM

All bumble bees are primitively eusocial, meaning that at least part of every individual’s life cycle is spent living in a colony group characterized by reproductive division of labor, cooperative brood care and overlapping generations. Each spring, bumble bee colonies are founded by a single reproductive queen, and all offspring are related siblings produced by this one queen (but see drifting workers: Birmingham et al, 2004; Lopez-Vaamonde et al, 2004; and nest usurpation: Alford, 1975). Throughout the floral blooming season, colonies grow larger, with wild Bombus impatiens colonies reported to contain from 25 to > 450 workers, among the largest of any bumble bee species (Plath, 1934). If colonies persist and reach a ‘switch point,’ new queens (gynes) and males are produced. Eventually the founding queen will die and the colony will collapse as no new eggs are laid. The new gynes leave the natal colony and mate with males from other colonies before all males die and the gynes overwinter underground as solitary insects. The phenology of this colony cycle is different for each species; Bombus impatiens queens begin emerging in early April and reproductives may be produced as early as June, with all bees dead or overwintering (gynes) by early October (Colla et al, 2011).

Like all hymenopterans, bumble bees are haplodiploid (Figure 1). Haplodiploidy is described eloquently elsewhere (Goulson, 2010), but in short, males are produced from unfertilized eggs containing a single set of genes and are haploid organisms. Females, produced from fertilized eggs, are diploid like the majority of organisms. Therefore, when a single queen mates with a single male to produce offspring, the female offspring are sisters with an unusually high relatedness coefficient, r = 0.75. Because most bumble bee queens mate singly (Owen & Whidden, 2013; Schmidt-Hempel & Schmidt-Hempel, 2000), not only are all workers in a colony related, but they are usually full sisters. There is evidence that 20 - 35% of B. impatiens queens mate with up to 3 males (Payne, et al 2003; Cnaani, et al 2002). In such cases, there is typically an unequal contribution of patrilines with the majority of offspring belonging to a single male (Payne, et al 2003).

Figure 1. Haplodiploidy results in unusually highly related offspring. Panel labels: Haplodiploid Organisms; unfertilized eggs become males with only 1 copy of nuclear DNA; fertilized eggs become females with 2 copies of nuclear DNA; (1/2 x 100%) + (1/2 x 50%) = .75 related. Graphic by C. McGrady.
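The relatedness value of 0.75 for full sisters can be written out as a short worked equation. This is a sketch only; the symbols r_pat and r_mat are introduced here for illustration and do not appear in the original figure. Half of a daughter's nuclear genome comes from the haploid father, who passes the same single set of genes to every daughter, and half comes from the diploid queen, whose daughters share on average half of her alleles.

```latex
r_{\text{sisters}}
  = \tfrac{1}{2}\, r_{\text{pat}} + \tfrac{1}{2}\, r_{\text{mat}}
  = \tfrac{1}{2}(1) + \tfrac{1}{2}\left(\tfrac{1}{2}\right)
  = 0.75
```

By the same reasoning, half-sisters that share only the queen have r = 1/2 x 1/2 = 0.25, which is why multiple mating lowers average relatedness among workers within a colony.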

BOMBUS IMPATIENS POPULATIONS

Previous studies based on individual foragers have shown that B. impatiens is unlikely to be in decline in various regions throughout its range (Colla et al, 2012; Cameron et al, 2011; Grixti et al, 2008). However, Beckham et al, 2016 were unable to find B. impatiens foragers in areas of Texas where historical samples were collected. This finding indicates a probable reduction at the southern periphery of its geographic range, possibly driven by global climate change. Cold-adapted bumble bees have been shown to disappear from areas as temperatures warm (Kerr et al, 2015). Even a stable, common species such as B. impatiens can be impacted by changing environmental conditions, indicating a need for monitoring and assessment of current population status.

While studies based on individual foragers are valuable for determining changes in range, they are probably not the most effective evaluation of social species populations (Gieb et al, 2015). By definition, populations are comprised of reproductive individuals, hence many censuses of wildlife populations only measure the adult females. For social organisms such as B. impatiens, the colony as a unit produces the next generation (new queens and males) and therefore the breeding population is reduced to the number of colonies (Crozier & Pamilo, 1996; Crozier, 1979). Ultimately, the availability of bumble bee foragers for critical pollination services is largely dependent on the size and number of colonies producing those foragers. Directly measuring the abundance of bumble bee colonies can be quite tricky, as they are notoriously difficult to locate for census purposes. Researchers in previous studies have attempted to find colonies by painstakingly scrutinizing small areas (O’Connor et al, 2012; Osborne et al, 2008) or training dogs to sniff out underground nests (Waters et al, 2011; O’Connor et al, 2012), yet neither approach is easy to implement at a broad scale. Fortunately, because foragers from the same colony are closely related, genetic techniques have been developed to estimate bumble bee colony abundances from a collection of foragers.

Genetically-derived Colony Abundance: Researchers can collect random foragers from a given site, examine their genetic profiles and match up the full sisters, grouping them into full-sibship families (FS families). The FS families can then be grouped into unique colonies based on mating assumptions. Because most bumble bee species are monogamous, all previous studies have designated each FS family as a unique colony. For polygamous species like Bombus impatiens, multiple FS families could be grouped into the same colony wherein each FS family is the product of different paternal DNA and therefore foragers within the same colony are half-sisters, only sharing maternal DNA from the queen. In this way, one can infer the number of unique colonies sending foragers to a given site (previous studies often use the term ‘detected colony numbers’ to refer to unique colonies inferred from full-sibship families).

Practical constraints and conservation ethics make it impossible to exhaustively sample all foragers at a given location. Therefore, it is likely that foragers representing some colonies will not be collected and thus ‘detected colony numbers’ will be an underestimation of the true number of colonies. One can statistically infer the number of colonies represented by 0 foragers (i.e. unsampled colonies) by examining the distribution of detected colonies which contain 1, 2, 3, … k foragers. Adding the number of unsampled colonies to the number of detected colonies will provide an estimate of total colony abundances per site (Figure 2). Previous studies have used several methods to model the distribution of detected colonies, starting with Darvill et al, 2004, which found that a Poisson distribution best fit the distribution of detected B. terrestris and B. pascuorum colonies in the United Kingdom.
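The idea of inferring unsampled colonies from the distribution of foragers per detected colony can be illustrated with a small sketch. It assumes a zero-truncated Poisson fitted by matching the mean number of foragers per detected colony; this fitting choice, the function name `estimate_total_colonies` and the toy colony labels are illustrative assumptions, not the procedure used by Darvill et al, 2004 or in this thesis.

```python
import math
from collections import Counter


def estimate_total_colonies(colony_labels):
    """Return (detected, estimated_total) colonies for one site.

    colony_labels: one hypothetical colony ID per genotyped forager,
    e.g. the full-sibship family assigned by sibship software.
    """
    if not colony_labels:
        raise ValueError("at least one genotyped forager is required")

    foragers_per_colony = Counter(colony_labels)
    detected = len(foragers_per_colony)
    mean_size = len(colony_labels) / detected  # mean foragers per detected colony

    if mean_size <= 1.0:
        # Every detected colony was sampled exactly once: the fit is unbounded.
        return detected, math.inf

    # Zero-truncated Poisson: mean_size = lam / (1 - exp(-lam)).
    # Solve for lam by bisection; the root lies in (0, mean_size).
    low, high = 1e-9, mean_size
    for _ in range(200):
        mid = (low + high) / 2
        if mid / -math.expm1(-mid) < mean_size:
            low = mid
        else:
            high = mid
    lam = (low + high) / 2

    # P(colony contributes 0 foragers) = exp(-lam), so scale up the detected count.
    estimated_total = detected / -math.expm1(-lam)
    return detected, estimated_total


# Toy data: 7 foragers assigned to 4 colonies (A x2, B x1, C x3, D x1).
detected, total = estimate_total_colonies(["A", "A", "B", "C", "C", "C", "D"])
print(detected, round(total, 1))
```

The difference between the estimated total and the detected count is the estimated number of unsampled colonies. When nearly every colony is represented by a single forager, the estimate becomes very large and unstable, which is one practical reason sample size per site matters.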

Figure 2. Genetic methods and statistical models can be used to estimate total colony abundance represented by a collection of foragers. Each circle represents an individual bee and colors represent unique genotypes.

Subsequent studies used a Poisson distribution to estimate colony abundances (Rao & Strange, 2012) until Goulson et al, 2010, which was the first to employ Capwire (Miller et al, 2005) to estimate colony abundances. Capwire is software with origins in DNA mark-recapture studies which uses the number of times an individual’s DNA is non-lethally sampled to estimate population size. Collected bumble bee foragers are treated as non-lethal DNA samples of their individual colony to estimate the total colony abundance per site. Capwire provides two estimates of colony abundance per site using different models: the Even Capture Model (ECM) and the Two Innate Rate Model (TIRM). ECM assumes each colony has an equal probability of being sampled and has been shown to provide similar total colony abundances as the Poisson method (Goulson et al, 2010). TIRM assumes capture probability heterogeneity, which may more closely align with biological reality as some colonies are more likely to be sampled than others due to (1) unequal distribution of colonies in a landscape and (2) the fact that colonies vary in size and therefore numbers of foragers available for sampling. The TIRM model provides higher estimates of total colony abundance per site, but model selection does not affect the pattern of relative colony abundance between sample sites (Goulson et al, 2010). Currently, all studies using Capwire to provide bumble bee colony estimates have selected the TIRM model: Goulson et al, 2010 found it was the model of best fit for the majority of their sites (based on a likelihood ratio test), Jha & Kremen, 2012 elected to remain consistent with Goulson et al, 2010, and Wood et al, 2015 agreed with biological assumptions inherent in the TIRM approach.

Previous studies throughout Europe and the western US have estimated total colony abundance per site for several bumble bee species, ranging from an average of 20.4 colonies per site (B. terrestris – Darvill et al, 2004) to 630 colonies per site (B. terrestris - Wood et al, 2015). While peer-reviewed studies reporting estimates of ‘total colony abundances’ are lacking for bumble bee species in the eastern United States, two population genetic studies have mentioned ‘detected colony numbers’ for B. impatiens. Lozier et al, 2009 found between 10 and 18 colonies detected from 13 to 21 genotyped foragers collected from sites in Illinois and Suni et al, 2017 found between 5 and 34 colonies detected from 5 to 49 genotyped foragers collected from sites in Massachusetts. Sidhu, 2012 reported 82 to 167 colonies detected from 162 to 209 genotyped foragers. However, no estimates of total colony abundances (which include unsampled colonies) were provided for B. impatiens in any study. Furthermore, no studies have examined recent trends in colony abundance across multiple regions or years. An evaluation of colony numbers can provide insights regarding the status of B. impatiens populations at a time when monitoring populations of native pollinators has become a global priority; even the US government has published a national pollinator protection plan.
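The equal-capture assumption described for ECM can be sketched as a simple urn (occupancy) model: if every one of N colonies is equally likely to supply each sampled forager, the likelihood of seeing T distinct colonies among s foragers is proportional to N! / ((N - T)! N^s). The code below maximizes that expression over N. It is an illustration of the assumption only, not Capwire's implementation, and it does not cover TIRM; the names `ecm_log_likelihood`, `ecm_estimate` and `max_colonies` are hypothetical.

```python
import math


def ecm_log_likelihood(n_colonies, n_samples, n_detected):
    """Log-likelihood (up to a constant) of n_colonies under equal capture.

    Terms: log N! - log (N - T)! - s * log N, with T = n_detected, s = n_samples.
    """
    return (
        math.lgamma(n_colonies + 1)
        - math.lgamma(n_colonies - n_detected + 1)
        - n_samples * math.log(n_colonies)
    )


def ecm_estimate(n_samples, n_detected, max_colonies=100_000):
    """Most likely total number of colonies, searched up to max_colonies."""
    if n_detected >= n_samples:
        # No colony was sampled twice, so the data place no upper bound on N.
        return math.inf
    return max(
        range(n_detected, max_colonies + 1),
        key=lambda n: ecm_log_likelihood(n, n_samples, n_detected),
    )


# Toy data: 7 genotyped foragers assigned to 4 detected colonies.
print(ecm_estimate(n_samples=7, n_detected=4))
```

Because the search is capped at `max_colonies`, an estimate equal to the cap should be read as "at least this many" rather than as a point estimate.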

Genotyping B. impatiens foragers with Microsatellites: To use genetic methods to estimate total colony abundances, one must begin by generating a genetic profile for each collected forager. The most commonly used method for genotyping foragers in colony abundance studies is microsatellite analysis. A detailed explanation of microsatellites and their potential ecological uses can be found in several published reviews (Vieira et al, 2016; Woodard et al, 2015; Dudgeon et al, 2012; Guichoux et al, 2011; Selkoe & Toonen, 2006), but in essence, microsatellites are sequences of DNA comprised of tandem repeats of 1-10 base pair motifs surrounded by species-specific flanking regions of DNA. The number of repeat motifs differs between individuals, resulting in multiple alleles of different lengths at each microsatellite locus. Microsatellites are present throughout the genome, but those found in the non-coding regions of nuclear DNA tend to be selectively neutral and follow Mendelian inheritance, making them appropriate molecular markers for ecological studies reflecting recent time spans. I will use microsatellite analysis to determine the genotypes of B. impatiens foragers, which can then be used for colony abundance studies (mentioned above), as well as population genetic analyses.

Microsatellites for use in bumble bees were developed starting in the mid-1990s (Estoup et al, 1996; Funk et al, 2006; Stolle et al, 2009). Microsatellites have been frequently used for population genetic studies. Woodard et al, 2015 found that from 1984 – 2014, 68% of population genetics studies on bumble bees utilized microsatellite markers. Their popularity is based on the relatively easy sample prep and wet lab processing as well as the high information content per locus (Selkoe et al, 2006). Estimates of colony numbers based on microsatellite data were explored starting in the 2000s in Europe (Chapman et al, 2003; Darvill et al, 2004; Knight et al, 2005).

It is important to note that when microsatellite loci are first isolated for use in specific species, the primary literature characterizes the way each locus behaves in a given species, including the original species from which the locus was derived and at times, several other related species. The same locus can behave differently between bumble bee species, with alleles occurring at different lengths, diversity, frequencies or possibly not occurring at all if the binding regions are not conserved across species. Because none of the bumble bee microsatellites have been developed from B. impatiens, current studies use non-species-specific loci, most without published information detailing exact parameters for each locus. Prior knowledge is often only available through informal communication between researchers with personal or professional connections, which can make microsatellite analysis inaccessible to researchers new to the field. Often, each research group conducts an independent trial-and-error process to verify which loci amplify well and contain appropriate genetic information to measure the metric of interest for each study. This time-consuming process can delay research results and be needlessly expensive, as loci primers must be purchased in bulk from suppliers before they can be verified as useful genetic markers. Additionally, when different research groups trial and ultimately use different loci, it can be difficult to directly compare findings across studies. The scientific community would benefit from an established microsatellite analysis protocol with extensive details regarding appropriate loci for use in B. impatiens.

Population Genetics: As global climate conditions change and landcover is altered to suit anthropogenic needs, a species’ survival will depend, at least in part, on its genetic capacity to respond to selective pressures. Therefore, studies evaluating populations’ genetic variation have become increasingly important in conservation research. Genetic variation has several main components, including genetic differentiation and genetic diversity (Lowe et al, 2004).

Bumble bee species characterized by a single genetically-connected breeding population throughout their range appear stable, as opposed to declining species characterized by fragmented and genetically isolated populations. Common bumble bee species in Europe do not exhibit much genetic differentiation between populations, unless separated by substantial topographical barriers such as large bodies of water or mountain ranges (Estoup et al, 1996; Shao et al, 2004; Pirounakis et al, 1998). By contrast, rare and declining species, like B. muscorum and B. sylvarum, exhibited significant differentiation between populations, even those separated by just 3 km (Darvill et al, 2006; Ellis et al, 2006). Lozier et al, 2011 examined the population genetics of multiple bumble bee species in the US at a broad scale, including B. impatiens. From an analysis based on 596 individuals from 33 sites across the United States, B. impatiens was found to be a panmictic population with very little differentiation across sites, although there was a marginally significant relationship between genetic differentiation and geographic distance. Lozier et al, 2011 did uncover significant genetic differentiation and significant differences in genetic diversity when examining B. impatiens collected from small coastal islands (Galveston Island, TX; Dauphin Island, AL), suggesting that (1) environmental barriers to gene flow can create isolated and potentially vulnerable B. impatiens populations and (2) fine-scale analysis can reveal results masked in global or continent-wide studies. The most recent study concerning B. impatiens population genetics was conducted by Suni et al, 2017, who analyzed 628 individuals collected from 22 sites in Massachusetts. They found that foraging bees from both agricultural and natural areas in New England are a single genetically homogeneous population with no significant overall genetic differentiation, a finding in keeping with other bumble bee studies showing low differentiation even across large areas. However, previous studies have not conducted fine-scale evaluations of B. impatiens populations in Pennsylvania.

High genetic diversity is another hallmark of stable bumble bee populations, compared with declining species defined by lower levels of genetic diversity (Darvill et al, 2006, 2010; Ellis et al, 2006; Lozier & Cameron, 2009; Charman et al, 2010; Cameron et al, 2011; Lozier et al, 2011). While it is somewhat difficult to compare metrics of diversity across studies due to variations in tools used to quantify diversity, general comparisons can be made. Goulson, 2010 presents a chart summarizing studies from 1996-2008 reporting that common bumble bee species, like B. terrestris and B. pascuorum, have higher levels of genetic diversity (allelic richness and expected heterozygosity) than declining species. Lozier et al, 2011 also found that B. impatiens had higher genetic diversity than relatively rare Bombus species included in their study.

When populations are not large enough to maintain gene flow and genetic diversity with each subsequent generation, patterns of genetic differentiation and diversity will begin to shift, with increasing differentiation between localized groups and decreasing diversity. Lozier et al, 2009 examined the population genetics of B. impatiens populations in Illinois, comparing historical and modern populations as well as contrasting with B. pensylvanicus, a declining species in the region (Grixti et al, 2009). They found no significant differentiation or differences in genetic diversity between or among historical and modern B. impatiens populations, concluding that “population structure … in B. impatiens has increased by only a small and insignificant amount [since the late 1960s].” These findings suggest that B. impatiens populations have been stable and well connected in Illinois for the past 40+ years, a trend that is likely to continue in the future. However, the authors caution that it can take time for impacts of environmental conditions to show up in genetic data and recommend continued monitoring of the genetic status of both common and rare bumble bees in the face of continued shifts in global land use. Additionally, this study does not necessarily describe the conditions of B. impatiens populations in other regions, including Pennsylvania. A single, genetically undifferentiated breeding population characterized by high genetic diversity maintained across successive years would be a strong indicator of a resilient B. impatiens population likely to persist for the foreseeable future.

While B. impatiens is likely a stable and possibly expanding bumble bee species, population dynamics studies are lacking for this economically and ecologically valuable pollinator. It is critical to capture baseline data of common species for long-term monitoring as global climate conditions and land use patterns affect native pollinators, particularly those with the potential to replace or supplement pollination services provided by managed pollinators. An evaluation of genetic variability in tandem with colony abundance data will provide valuable insights regarding the reliability of native bee populations to supply pollination services now and in the future.

Cucurbita Agroecosystems: Many species in the plant family Cucurbitacea are monecious plants with separate male (staminate) and female (pistillate) flowers blooming on the same plant. Male flowers produce a large, heavy pollen that must be vectored by to the sticky pistil in female flowers. Within Cucurbitacea is the genus Cucurbita, which includes commonly cultivated crops like squash, and gourds and pumpkins (Cucurbita pepo pepo). C. p. pepo flowers reflect ultra- violet light to attract pollinators (Hurd et al. 1971), which utilize pumpkin pollen, nectar or both as nutritional resources. The male flowers produce both pollen and nectar, while the female flowers produce only nectar (Figure 3). The male flowers have a single, stalk like anther. On the floor of the male flower, there are small pores through which nectaries can be accessed. In comparison, the sticky pistil of the female flower is surrounded by a broad, trough-like nectary, allowing for easier access to nectaries which can accommodate more nectar-collecting bees simultaneously (CMM per. obs.). On average, female flowers produce up to 3x the amount of nectar as male flowers and female nectar can have higher sugar concentrations (Artz et al, 2011). Male flowers are produced on stalk-like stems Figure 3. Male and Female Pumpkin flowers. Nectaries are shaded red. Representation of pores are outlines in the floor of that protrude from the plant, often the male flower. visible above the leaves. Female flowers are located basally on the plant, or else along vines creeping on the ground, such that any maturing fruit will be resting on the ground as they grow. Flowers are large, vivid yellow blooms and are only open for a single day. They open early (~6:00 ESTD) and senesce before midday, sometimes as early as 9:30am depending on temperature (Hurd et al, 1971; Tepedino 1981). Plant propagation begins after pollination, when pollen grains (male gametes) fertilize female gametes within ovarioles. 
Barring complications, the resulting zygote develops into a viable seed and thus ‘seed set’ has occurred. Because insects are vectoring pollen from male to female flowers, multiple studies have demonstrated that when pollinator visits are intentionally 9 limited, there is a strong, positive relationship between bee visits and seed set for Cucurbita pepo (Artz & Nault, 2011). Even without sufficient pollination, pumpkin plant ovaries begin to enlarge and attempt to create a fruit; however, the plant will abort fruit with insufficient seed set. If an adequately pollinated fruit is retained by the plant for a requisite number of days, ‘fruit set’ occurs. Fruit set is a measure of how many female flowers initiate viable fruit that have the potential to mature and result in mature fruit. Much can happen between fruit set and the production of a mature pumpkin – damage from insects and other herbivores, lack of plant nutrients (soil nutrients, water, sunlight) and disease. Additionally, the plant itself may reduce fruit set to concentrate its energies on producing fewer fruit, focused on fruit produced under conditions of high pollen competition during the fertilization process (Winsor et al, 1987). Originally domesticated in Mexico between 5-10,000 years ago (Smith, 1997; Whitaker & Cutler, 1965) pumpkins are now grown world-wide for both consumption and ornamentation. Demand for pumpkins has more than doubled since the 1980s. In the US, pumpkin production was worth >$185 million in 2017 (USDA NASS, 2018) and 70% of all pumpkins produced nation-wide are grown in just 7 states, including Pennsylvania. In commercial pumpkin agroecosystems, production objectives depend largely on (1) the end use of pumpkin and (2) retail strategy. In the US, pumpkins are often referred to as either ‘Pie pumpkins’ or ‘Face pumpkins.’ Pie pumpkins are grown and subsequently processed or canned for consumption, often eaten in pies, pastries, breads, muffins, and other baked goods. 
The canned product may also be used in soups and stews. Production is often concentrated near the processing plant. Face pumpkins are grown for ornamentation, often carved with faces during autumnal festivities. Face pumpkins reach consumers through two main retail strategies: direct market and wholesale. Direct market pumpkins are produced for Pick-your-own-Pumpkin-Patch operations and farm stands, where harvest occurs multiple times throughout October. In a wholesale system, pumpkins are harvested earlier (in early September), packed into standardized bins, and shipped to large retailers nationwide. Each field is typically harvested once, with agricultural objectives emphasizing synchronous production and fruit maturation in large quantities early in the fall season. Because of the need for a relatively early harvest date, wholesale fields are often planted, and therefore bloom, earlier than direct market fields. Because this culturally and economically valuable crop is completely reliant on insect-vectored pollination, commercial pumpkin fields are often supplemented with commercially produced honey bees (Apis mellifera) to ensure a profitable yield. This can be an expensive endeavor for growers: in 2017, the cost of renting honey bees for pumpkin production in the Mid-Atlantic was $76.20 per hive (USDA NASS, 2017), with extension guidelines recommending 1 hive per acre (Canon, 2011; Orzolek et al, 2012). With almost 5,000 acres of pumpkins grown in PA in 2017, the cost of pollination management through honey bee rental alone could amount to nearly $400,000 annually. However, there is mounting evidence that throughout the Mid-Atlantic, native pollinators may be supplying sufficient pollination services to commercially produced pumpkins.
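The rental cost estimate above is simple arithmetic, and a minimal sketch of it follows. It uses the rental price ($76.20 per hive) and the stocking guideline (1 hive per acre) from the text. The acreage is rounded up to 5,000 as an assumption, and the function and variable names are illustrative only.

```python
def rental_cost(acres, hives_per_acre, price_per_hive):
    """Return the total seasonal hive rental cost in US dollars."""
    return acres * hives_per_acre * price_per_hive


# Inputs taken from the text. "Almost 5,000 acres" is rounded up to 5,000
# (an assumption), so this is an upper-bound illustration, not an exact figure.
cost = rental_cost(acres=5000, hives_per_acre=1, price_per_hive=76.20)
print(f"Estimated statewide rental cost: ${cost:,.0f}")
```

Under these assumptions the estimate comes to about $381,000, which is consistent with the "nearly $400,000" stated above.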

STUDY SYSTEM

Research for this thesis was conducted in wholesale, commercial pumpkin fields in Pennsylvania, in collaboration with professional growers (Figure 4). Because of the need for a relatively early harvest date, wholesale fields are often planted earlier, may therefore bloom earlier (July), and often have fewer cultivars than direct market fields. Additionally, wholesale fields tend to be larger in order to produce enough pumpkins to fill wholesale orders. Each field is typically harvested once, with agricultural objectives emphasizing synchronous production and fruit maturation in large quantities early in the fall season. From 2013 to 2015, pollination services were measured in 24 commercial pumpkin fields (see Chapter 2). From 2012 to 2015, B. impatiens foragers were collected from 30 commercial pumpkin fields for colony abundance and population genetics studies (see Chapters 3 and 4). Data collection overlapped in 21 fields, allowing for comparative studies between colony abundances and pollination services. Commercial pumpkin fields were located in 3 distinct regions in Pennsylvania: Columbia, Lancaster and Centre (Figure 5, Figure 6, Figure 7). Fields are labeled by number in Table 1, Table 2, and Table 3, and these numbers are used consistently throughout the thesis.

Figure 4. Eastern Pennsylvania, including county lines. The three study regions are labeled: Centre (pink), Columbia (green) and Lancaster (gold). Study fields are indicated by black circles.

Figure 5. The Columbia Region is outlined in green on the small map of Pennsylvania, below, and enlarged. We conducted research in a total of 19 fields in the Columbia Region, 17 of which were in Columbia County. Field 2 was in Northumberland County and Field 25 was in Lycoming County. Colors indicate the type(s) of data collected at each field: only colony abundance data, only ecosystem service data, or both types of data.

Table 1. The types of data collected per spatially referenced field in the Columbia Region for 2012 – 2015.

Year   Field  Latitude  Longitude   Colony Abundance  Ecosystem Services  Both
2012   1      41.22508  -76.417
2012   2      40.94936  -76.6528
2013   3      41.06224  -76.33317
2013   4      41.1817   -76.38639
2013   5      41.15359  -76.36739
2013   6      41.2207   -76.43759
2014   12     41.0427   -76.36185
2014   13     41.0477   -76.37685
2014   14     41.20586  -76.40469
2014   15     41.25839  -76.4295
2014   31     41.05151  -76.33006
2014   33     41.04632  -76.36515
2015   20     41.22594  -76.40208
2015   21     41.22773  -76.41379
2015   22     41.0987   -76.38083
2015   23     41.18999  -76.69755
2015   24     41.16095  -76.37449
2015   25     41.1008   -76.36939
2015   32     41.0595   -76.33831
Total  19                           16                17                  14

Figure 6. The Lancaster Region is outlined in gold on the small map of Pennsylvania, below, and enlarged. We conducted research in a total of 7 fields in the Lancaster Region, all of which were in Lancaster County. All fields are blue, indicating that both colony abundance data and ecosystem service data were collected at every field.

Table 2. The types of data collected per spatially referenced field in the Lancaster Region for 2013 – 2015.

Year   Field  Latitude  Longitude   Colony Abundance  Ecosystem Services  Both
2013   7      39.85678  -76.30309
2013   8      39.9531   -76.2246
2014   16     39.873    -76.28297
2014   17     39.87689  -76.18526
2015   26     39.97195  -76.19864
2015   27     39.87282  -76.28159
2015   28     39.86787  -76.1773
Total  7                            7                 7                   7

Figure 7. The Centre Region is outlined in salmon on the small map of Pennsylvania, below, and enlarged. We conducted research in a total of 7 fields in the Centre Region, all of which were in Centre County. All fields are yellow, indicating that only colony abundance data were collected at every field.

Table 3. The type of data collected per spatially referenced field in the Centre Region for 2013 – 2015.

Year   Field  Latitude  Longitude   Colony Abundance  Ecosystem Services  Both
2013   9      40.78646  -78.02143
2013   10     40.70915  -77.94985
2013   11     40.76346  -77.88334
2014   18     40.71515  -77.95412
2014   19     40.76346  -77.88334
2015   29     40.71339  -77.95648
2015   30     40.76346  -77.88334
Total  7                            7                 0                   0

THESIS OBJECTIVES

This thesis contributes to the scientific literature examining the pollination services and population status of pollinators active in commercial agroecosystems. These studies focus on the factors influencing pollination activity in wholesale pumpkin fields in one of the top pumpkin-producing regions of the United States. I begin by evaluating factors influencing individual bees as they forage in pumpkin flowers. The value of a pollinator depends on both (1) the ecology of the bee supplying pollination services and (2) the pollination demands of a given agroecosystem; therefore, I intentionally considered factors relevant to commercial pumpkin growers. I then measured Bombus impatiens colony abundances and genetic variation to better understand the connection between pollinator populations and ecosystem services, as well as the potential of native bee populations to continue providing pollination services.

Chapter two is an evaluation of pollinators and their visitation rates to pumpkin flowers. I determined the community composition and dominance distribution of pollinators in 24 commercial, wholesale, ‘Face-pumpkin’ agroecosystems in Pennsylvania. Out of 37 species identified, B. impatiens was the most active, supplying 54% of all visits recorded. I measured visitation rates of the most common pollinators to pumpkin flowers and explored sources of variation in visitation rates, including flower sex, temporal dynamics across the growing season, spatial dynamics across larger pumpkin fields, and floral density. I found that Bombus impatiens visitation rates respond positively to increasing floral resources and increase throughout most of the season. However, visitation rates in general decreased slightly as pumpkin flowers occurred farther from the field edge, and Bombus impatiens visitation rates in particular declined as field sizes increased.
I also calculated a minimum ‘pollination threshold’ for adequate pumpkin fertilization and compared it with the mean pollination services supplied by each of the most active pollinators in my study. On average, Bombus impatiens alone supplied almost 13x the required pollination services, and Peponapis pruinosa supplied close to 2x the required services. I also explored potential relationships between pollinator visitation rates and various pumpkin yield metrics, including seed set per pumpkin, pumpkin weight and pumpkins per square meter. In some years, B. impatiens visitation rates had a positive relationship with seed set and pumpkin weight. I conclude that in many cases, B. impatiens are supplying sufficient pollination services for commercial pumpkin fields in PA. However, the size and configuration of particular fields could render native pollinators less effective, and managed pollinators might be beneficial in larger fields (> 3 to 4 hectares, or 7 to 10 acres). Future studies should consider mapping pollination services throughout the entirety of a pumpkin field to better predict when supplementation is necessary.

To ensure that my conclusions in chapter four are based on high-quality genetic information, chapter three focuses on improving methods for microsatellite analysis when generating genetic data. Previous studies have published guides for designing multiplexes and suggest which fundamental issues to evaluate. However, there is no well-defined process for deciding when to exclude certain loci from genetic analysis when potential issues are found, and thus loci are excluded arbitrarily, based on criteria that change from study to study. I developed a conceptual workflow emphasizing a systematic and repeatable process for objectively evaluating microsatellite loci, with clearly defined thresholds for eliminating problematic loci. I then demonstrated the use of this workflow by evaluating 14 microsatellite loci isolated from various Bombus spp.
for use specifically in Bombus impatiens. While modeling the workflow, I present a novel method to address allelic drift, a common issue that has been the focus of multiple previous studies. Furthermore, I systematically evaluate the effect of monomorphic loci on several key genetic analyses, an issue that has been understudied. I identified one monomorphic locus, BT28, and concluded that including monomorphic loci does not substantially impact the results of multiple genetic tests. Ultimately, I provide an optimized multiplex of 11 non-species-specific microsatellite loci that can be confidently used in Bombus impatiens.

In chapter four, I used genetic methods to estimate the abundance of B. impatiens colonies providing foragers for pollination in 30 commercial pumpkin fields spread across 4 years and 3 regions in central Pennsylvania. Because B. impatiens queens may exhibit some degree of polygamy, I evaluated the effect of mating assumptions (monogamy vs. polygamy) on the analysis and interpretation of colony abundance data. I found that assuming monogamy when a species actually practices polygamy results in a conservative estimate of sampled colonies when queens mate with fewer than 1.13 males on average. If queens mate with more than 1.13 males on average, assuming monogamy results in an inaccurately high number of sampled colonies. Coupling genetic methods and statistical models, I estimated approximately 544 colonies supplying foragers to a single pumpkin field. I also found that these relatively high colony abundances were stable across successive years and across nearby regions. I also evaluated patterns of population structuring and genetic diversity to better understand the genetic potential of B. impatiens to respond to selective pressures in the future. I found that B. impatiens form a single, panmictic population across all study regions, characterized by relatively high genetic diversity.
I also found that neither gene flow nor diversity was decreasing across successive years. A subset of 21 fields with visitation rate measures from chapter 2 overlapped with fields with colony abundance measures, so I examined the relationships between pumpkin field size, colony abundance per field, colony abundance per hectare and pollination services. I found that as colony abundance per hectare increased (due to decreasing field sizes), visitation rates increased. I conclude that pollination services provided by a population of eusocial B. impatiens are reliable given the current environmental conditions and, furthermore, that populations are likely to persist in the future. However, pollination services are diluted as field sizes increase due to the finite number of colonies within foraging distance of a given field.

Chapter 5 serves as a final discussion, revisiting the most important findings presented throughout research chapters 2, 3, and 4 and connecting major themes. I highlight portions of this thesis with the potential to make the most significant contributions to the scientific literature.
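The dilution effect described for chapter four can be illustrated with a short sketch. It assumes, as a deliberate simplification, that the number of colonies within foraging distance stays fixed at the approximate per-field estimate of 544 regardless of field size. The field sizes and function name are hypothetical and are not taken from the thesis data.

```python
def colonies_per_hectare(colonies, field_hectares):
    """Colony density when a fixed colony pool serves a field of the given size."""
    if field_hectares <= 0:
        raise ValueError("field_hectares must be positive")
    return colonies / field_hectares


COLONIES = 544  # approximate per-field estimate reported in the text

# Hypothetical field sizes, chosen only to show the trend.
for hectares in (1, 2, 4, 8):
    density = colonies_per_hectare(COLONIES, hectares)
    print(f"{hectares} ha -> {density:.1f} colonies per hectare")
```

Under this assumption, each doubling of field size halves the colony density, which is the sense in which a finite pool of colonies is spread more thinly across a larger field.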

REFERENCES

Alford, D.V. (1975). Bumblebees. Davis-Poynter Ltd., London.
Artz, D. R., Hsu, C. L., & Nault, B. A. (2011). Influence of Honey Bee, Apis mellifera, Hives and Field Size on Foraging Activity of Native Bee Species in Pumpkin Fields. Environmental Entomology, 40(5), 1144-1158. doi:10.1603/en10218
Artz, D. R., & Nault, B. A. (2011). Performance of Apis mellifera, Bombus impatiens, and Peponapis pruinosa (Hymenoptera: Apidae) as Pollinators of Pumpkin. Journal of Economic Entomology, 104(4), 1153-1161. doi:10.1603/ec10431
Beckham, J., Warriner, M., Atkinson, S., & Kennedy, J. (2016). The Persistence of Bumble Bees (Hymenoptera: Apidae) in Northeastern Texas. Proceedings of The Entomological Society of Washington, 118(4), 481-497. http://dx.doi.org/10.4289/0013-8797.118.4.481
Birmingham, A.L., & Winston, M.L. (2004). Orientation and drifting behaviour of bumblebees (Hymenoptera: Apidae) in commercial tomato greenhouses. Canadian Journal of Zoology, 82, 52-59.
Cameron, S., Lozier, J., Strange, J., Koch, J., Cordes, N., Solter, L., & Griswold, T. (2011). Patterns of widespread decline in North American bumble bees. Proceedings of The National Academy of Sciences, 108(2), 662-667. http://dx.doi.org/10.1073/pnas.1014743108
Canon, D. (2011). Bee Colony Pollination rental prices, eastern US with comparison to west coast. Mid-Atlantic Apiculture Research and Extension Consortium. http://agdev.anr.udel.edu/maarec/wp-content/uploads/2011/02/Pollination-rentals-PNWEAST.pdf
Chapman, R., Wang, J., & Bourke, A. (2003). Genetic analysis of spatial foraging patterns and resource sharing in bumble bee pollinators. Molecular Ecology, 12(10), 2801-2808. http://dx.doi.org/10.1046/j.1365-294x.2003.01957.x

Charman, T., Sears, J., Green, R., & Bourke, A. (2010). Conservation genetics, foraging distance and nest density of the scarce Great Yellow Bumblebee (Bombus distinguendus). Molecular Ecology, 19(13), 2661-2674. http://dx.doi.org/10.1111/j.1365-294x.2010.04697.x
Cnaani, J., Schmid-Hempel, R., & Schmidt, J. (2002). Colony development, larval development and worker reproduction in Bombus impatiens Cresson. Insectes Sociaux, 49(2), 164-170. http://dx.doi.org/10.1007/s00040-002-8297-8
Colla, S., Richardson, L., & Williams, P. (2011). Bumble bees of the eastern United States. Washington, D.C.: U.S. Dept. of Agriculture Pollinator Partnership.
Crozier, R.H., & Pamilo, P. (1996). Evolution of Social Insect Colonies: Sex Allocation and Kin Selection. Oxford University Press, New York.
Crozier, R.H. (1979). Genetics of sociality. In: H. R. Herman (ed.) Social Insects Vol. I, Academic Press, pp. 223-286.
Darvill, B., Ellis, J., Lye, G., & Goulson, D. (2006). Population structure and inbreeding in a rare and declining bumblebee, Bombus muscorum (Hymenoptera: Apidae). Molecular Ecology, 15(3), 601-611. http://dx.doi.org/10.1111/j.1365-294x.2006.02797.x
Darvill, B., Knight, M., & Goulson, D. (2004). Use of genetic markers to quantify bumblebee foraging range and nest density. Oikos, 107(3), 471-478. http://dx.doi.org/10.1111/j.0030-1299.2004.13510.x
Dudgeon, C. L., Blower, D. C., Broderick, D., Giles, J. L., Holmes, B. J., Kashiwagi, T., . . . Ovenden, J. R. (2012). A review of the application of molecular genetics for fisheries management and conservation of sharks and rays. Journal of Fish Biology, 80(5), 1789-1843. doi:10.1111/j.1095-8649.2012.03265.x
Ellis, J., Knight, M., Darvill, B., & Goulson, D. (2006). Extremely low effective population sizes, genetic structuring and reduced genetic diversity in a threatened bumblebee species, Bombus sylvarum (Hymenoptera: Apidae). Molecular Ecology, 15(14), 4375-4386. http://dx.doi.org/10.1111/j.1365-294x.2006.03121.x
Estoup, A., Solignac, M., Cornuet, J., Goudet, J., & Scholl, A. (1996). Genetic differentiation of continental and island populations of Bombus terrestris (Hymenoptera: Apidae) in Europe. Molecular Ecology, 5(1), 19-31. http://dx.doi.org/10.1111/j.1365-294x.1996.tb00288.x
Funk, C.R., Schmid-Hempel, R., & Schmid-Hempel, P. (2006). Microsatellite loci for Bombus spp. Molecular Ecology Notes, 6(1), 83-86. http://dx.doi.org/10.1111/j.1471-8286.2005.01147.x
Geib, J., Strange, J., & Galen, C. (2015). Bumble bee nest abundance, foraging distance, and host-plant reproduction: implications for management and conservation. Ecological Applications, 25(3), 768-778. http://dx.doi.org/10.1890/14-0151.1
Goulson, D. (2010). Bumblebees behaviour, ecology, and conservation (2nd ed.). Oxford: Oxford Univ. Press.
Goulson, D., Lepais, O., O’Connor, S., Osborne, J., Sanderson, R., & Cussans, J. et al. (2010). Effects of land use at a landscape scale on bumblebee nest density and survival. Journal of Applied Ecology, 47(6), 1207-1215. http://dx.doi.org/10.1111/j.1365-2664.2010.01872.x

Grixti, J., Wong, L., Cameron, S., & Favret, C. (2009). Decline of bumble bees (Bombus) in the North American Midwest. Biological Conservation, 142(1), 75-84. http://dx.doi.org/10.1016/j.biocon.2008.09.027
Guichoux, E., Lagache, L., Wagner, S., Chaumeil, P., Léger, P., Lepais, O., . . . Petit, R. J. (2011). Current trends in microsatellite genotyping. Molecular Ecology Resources, 11(4), 591-611. doi:10.1111/j.1755-0998.2011.03014.x
Hurd, P.D., Linsley, E.G., & Whitaker, T.W. (1971). Squash and gourd bees (Peponapis, Xenoglossa) and the origin of the cultivated Cucurbita. Evolution, 25, 218-234. doi:10.2307/2406514
Jha, S., & Kremen, C. (2012). Resource diversity and landscape-level homogeneity drive native bee foraging. Proceedings of The National Academy of Sciences, 110(2), 555-558. http://dx.doi.org/10.1073/pnas.1208682110
Kerr, J.T., Pindar, A., Galpern, P., Packer, L., Potts, S.G., Roberts, S.M., Rasmont, P., Schweiger, O., Colla, S.R., Richardson, L.L., Wagner, D.L., Gall, L.F., Sikes, D.S., & Pantoja, A. (2015). Climate change impacts on bumblebees converge across continents. Science, 349(6244), 177-180. doi:10.1126/science.aaa7031
Klein, A.M., Vaissiere, B.E., Cane, J.H., Steffan-Dewenter, I., Cunningham, S.A., Kremen, C., & Tscharntke, T. (2007). Importance of pollinators in changing landscapes for world crops. Proceedings of the Royal Society B, 274, 303-313. doi:10.1098/rspb.2006.3721
Knight, M., Martin, A., Bishop, S., Osborne, J., Hale, R., Sanderson, R., & Goulson, D. (2005). An interspecific comparison of foraging range and nest density of four bumblebee (Bombus) species. Molecular Ecology, 14(6), 1811-1820. http://dx.doi.org/10.1111/j.1365-294x.2005.02540.x
Lopez-Vaamonde, C., Koning, J. W., Brown, R. M., Jordan, W. C., & Bourke, A. F. G. (2004). Social parasitism by male-producing reproductive workers in a eusocial insect. Nature, 430, 557-560. doi:10.1038/nature02769
Lowe, A., Harris, S., & Ashton, P. (2004). Ecological genetics: Design, analysis, and application. Malden, MA, USA: Blackwell Pub.
Lozier, J., & Cameron, S. (2009). Comparative genetic analyses of historical and contemporary collections highlight contrasting demographic histories for the bumble bees Bombus pensylvanicus and B. impatiens in Illinois. Molecular Ecology, 18(9), 1875-1886. http://dx.doi.org/10.1111/j.1365-294x.2009.04160.x
Lozier, J., Strange, J., Stewart, I., & Cameron, S. (2011). Patterns of range-wide genetic variation in six North American bumble bee (Apidae: Bombus) species. Molecular Ecology, 20(23), 4870-4888. http://dx.doi.org/10.1111/j.1365-294x.2011.05314.x
Lye, G. C., Jennings, S. N., Osborne, J. L., & Goulson, D. (2011). Impacts of the Use of Nonnative Commercial Bumble Bees for Pollinator Supplementation in Raspberry. Journal of Economic Entomology, 104(1), 107-114. doi:10.1603/ec10092
Miller, C., Joyce, P., & Waits, L. (2005). A new method for estimating the size of small populations from genetic mark-recapture data. Molecular Ecology, 14(7), 1991-2005. http://dx.doi.org/10.1111/j.1365-294x.2005.02577.x
O’Connor, S., Park, K.J., & Goulson, D. (2012). Humans versus dogs; a comparison of methods for the detection of bumblebee nests. Apidologie 51, 204-211.
Orzolek, M. D., Elkner, T. E., Lamont Jr., W. J., Kime, L. F., & Harper, J. K. (2012). Pumpkin Production. Penn State Extension Agricultural Alternatives. Retrieved from http://extension.psu.edu/business/ag-alternatives/horticulture/melons-and-pumpkins/pumpkin-production
Osborne, J.L., Martin, A.P., Shortall, C.R., Todd, A.D., Goulson, D., Knight, M.E., Hale, R.J., & Sanderson, R.A. (2008). Quantifying and comparing bumblebee nest densities in gardens and countryside habitats. Journal of Applied Ecology, 45, 784-792.
Owen, R., & Whidden, T. (2013). Monandry and polyandry in three species of North American bumble bees (Bombus) determined using microsatellite DNA markers. Canadian Journal of Zoology, 91(7), 523-528. http://dx.doi.org/10.1139/cjz-2012-0288
Payne, C., Laverty, T., & Lachance, M. (2003). The frequency of multiple paternity in bumble bee (Bombus) colonies based on microsatellite DNA at the B10 locus. Insectes Sociaux, 50(4), 375-378. http://dx.doi.org/10.1007/s00040-003-0692-2
Pfister, S. C., Eckerter, P. W., Schirmel, J., Cresswell, J. E., & Entling, M. H. (2017). Sensitivity of commercial pumpkin yield to potential decline among different groups of pollinating bees. Royal Society Open Science, 4(5), 170102. doi:10.1098/rsos.170102
Pirounakis, K., Koulianos, S., & Schmid-Hempel, P. (1998). Genetic variation among European populations of Bombus pascuorum (Hymenoptera: Apidae). Eur J Entomol, 95, 27-33.
Plath, O. E. (1934). Bumblebees and their ways. New York: The Macmillan Company.
Rao, S., & Strange, J. (2012). Bumble Bee (Hymenoptera: Apidae) Foraging Distance and Colony Density Associated With a Late-Season Mass Flowering Crop. Environmental Entomology, 41(4), 905-915. http://dx.doi.org/10.1603/en11316
Schmid-Hempel, R., & Schmid-Hempel, P. (2000). Female mating frequencies in Bombus spp. from Central Europe. Insectes Sociaux, 47(1), 36-41. http://dx.doi.org/10.1007/s000400050006
Selkoe, K., & Toonen, R. (2006). Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecology Letters, 9(5), 615-629. http://dx.doi.org/10.1111/j.1461-0248.2006.00889.x
Shao, Z.Y., Mao, H.X., Fu, W.J., Ono, M., Wang, D.-S., Bonizzoni, M., & Zhang, Y.-P. (2004). Genetic Structure of Asian Populations of Bombus ignitus (Hymenoptera: Apidae). Journal of Heredity, 95(1), 46-52. https://doi.org/10.1093/jhered/esh008
Sidhu, C. (2012). Farm-level and landscape-level effects on Cucurbit pollinators on small farms in diversified agroecosystems. Ph.D. dissertation. Pennsylvania State University.
Smith, B.D. (1997). The initial domestication of Cucurbita pepo in the Americas 10000 years ago. Science, 276, 932-934. doi:10.1126/science.276.5314.932
Stolle, E., Rohde, M., Vautrin, D., Solignac, M., Schmid-Hempel, P., Schmid-Hempel, R., & Moritz, R. (2009). Novel microsatellite DNA loci for Bombus terrestris (Linnaeus, 1758). Molecular Ecology Resources, 9(5), 1345-1352. http://dx.doi.org/10.1111/j.1755-0998.2009.02610.x
Suni, S., Scott, Z., Averill, A., & Whiteley, A. (2017). Population genetics of wild and managed pollinators: implications for crop pollination and the genetic integrity of wild bees. Conservation Genetics, 18(3), 667-677. http://dx.doi.org/10.1007/s10592-017-0955-5
Tepedino, V.J. (1981). The pollination efficiency of the squash bee (Peponapis pruinosa) and the honey bees (Apis mellifera) on summer squash (Cucurbita pepo). Journal of the Kansas Entomological Society, 54, 359-377.
Thomson, J. D., & Goodell, K. (2002). Pollen removal and deposition by honeybee and bumblebee visitors to apple and almond flowers. Journal of Applied Ecology, 38(5), 1032-1044. doi:10.1046/j.1365-2664.2001.00657.x
Tuell, J.K., Fielder, A.K., Landis, D., & Isaacs, R. (2008). Visitation by wild and managed bees (Hymenoptera: Apoidea) to eastern U.S. native plants for use in conservation programs. Environmental Entomology, 37(3), 707-718.
United States Department of Agriculture, National Agriculture Statistics Service. (2017). Quick Stats. https://quickstats.nass.usda.gov/
United States Department of Agriculture, National Agriculture Statistics Service. (2017). Cost of Pollination. http://usda.mannlib.cornell.edu/usda/current/CostPoll/CostPoll-12-21-2017.pdf
Vieira, M., Santini, L., Diniz, A., & Munhoz, C. (2016). Microsatellite markers: what they mean and why they are so useful. Genetics And Molecular Biology, 39(3), 312-328. http://dx.doi.org/10.1590/1678-4685-gmb-2016-0027
Waters, J., O’Connor, S., Park, K., & Goulson, D. (2011). Testing a detection dog to locate bumblebee colonies and estimate nest density. Apidologie, 42(2), 200-205. http://dx.doi.org/10.1051/apido/2010056
Whitaker, T. W., & Cutler, H. C. (1965). Cucurbits and cultures in the Americas. Economic Botany, 19(4), 344-349.
Winsor, J. A., Davis, L. E., & Stephenson, A. G. (1987). The relationship between pollen load and fruit maturation and the effect of pollen load on offspring vigor in Cucurbita pepo. The American Naturalist, 129(5), 643-656.
Woodard, S., Lozier, J., Goulson, D., Williams, P., Strange, J., & Jha, S. (2015). Molecular tools and bumble bees: revealing hidden details of ecology and evolution in a model system. Molecular Ecology, 24(12), 2916-2936. http://dx.doi.org/10.1111/mec.13198
Wood, T., Holland, J., Hughes, W., & Goulson, D. (2015). Targeted agri-environment schemes significantly improve the population size of common farmland bumblebee species. Molecular Ecology, 24(8), 1668-1680. http://dx.doi.org/10.1111/mec.13144
Zhang, H., Huang, J., Williams, P. H., Vaissière, B. E., Zhou, Z., Gai, Q., . . . An, J. (2015). Managed Bumblebees Outperform Honeybees in Increasing Peach Fruit Set in China: Different Limiting Processes with Different Pollinators. Plos One, 10(3). doi:10.1371/journal.pone.0121143

Chapter 2: Pollination Services in Cucurbita Agroecosystems

INTRODUCTION

Commercially produced pumpkins (Cucurbita pepo pepo) are among the most pollinator-dependent crops worldwide (Klein et al, 2007). Like many other species in the plant family Cucurbitaceae, pumpkins are monoecious, with separate male (staminate) and female (pistillate) flowers blooming on the same plant. Male flowers produce a large, heavy pollen that must be vectored by insects to the sticky pistil in female flowers. When insect pollinators were excluded from female flowers, C. pepo plants yielded no fruit (Artz et al, 2011; Hoehn et al, 2008). Originally domesticated in Mexico between 5,000 and 10,000 years ago (Smith, 1997), pumpkins and other Cucurbitaceae (including cucumbers, melons and squash) are now grown worldwide for both consumption and ornamentation. Demand for pumpkins in particular has more than doubled since the 1980s. In the US, pumpkin production was worth >$185 million in 2017 (USDA NASS, 2018) and 70% of all pumpkins produced nationwide are grown in just 7 states, including Pennsylvania. Among the crops reported from ≈4,000 diversified farms in PA, pumpkins rank among the top 3 to 5 (out of >20) species, and collectively the PA pumpkin industry was worth over $13 million in 2017 (USDA NASS, 2018). Because this culturally and economically valuable crop is completely reliant on insect-vectored pollination, commercial pumpkin fields are often supplemented with commercially produced honey bees (Apis mellifera) or bumble bees (most commonly Bombus impatiens) to ensure a profitable yield. This can be an expensive endeavor for growers: in 2017, the cost of renting honey bees for pumpkin production in the Mid-Atlantic was $76.20 per hive (USDA NASS, 2017; this cost is higher in other cropping systems), with extension guidelines recommending 1 hive per acre (Canon, 2011; Orzolek et al, 2012). With almost 5,000 acres of pumpkins grown in PA in 2017, the cost of pollination management alone could amount to nearly $400,000 annually.
With these expenses in mind, growers are eager for cost-saving alternatives, including pollination provided by native bee populations.

PUMPKIN POLLINATORS: Studies published in the past decade have highlighted several wild bee species foraging in pumpkin flowers worldwide. Wild species include the solitary Cucurbita specialists in the genera Peponapis and Xenoglossa (commonly referred to as ‘squash bees’), solitary generalists in the genera Halictus and Lasioglossum (commonly referred to as ‘sweat bees’), and eusocial generalists in the genus Bombus (commonly referred to as ‘bumble bees’) (Pfister et al, 2017; Phillips & Gardiner, 2015; Petersen et al, 2013; Artz et al, 2011; Artz & Nault, 2011; Cane et al, 2011; Julier & Roulston, 2009). However, both the community of pollinators and the relative abundance of pollinating visits by each bee taxon are unknown for one of the nation’s top pumpkin-producing states: Pennsylvania. The potential value of a pollinator depends not only on its visitation abundance, but also on its foraging preferences and pollination efficiency. Foraging preferences, likely dictated by the resource needs of each bee taxon, will affect how frequently male and female flowers are visited. While only male pumpkin flowers produce pollen, both male and female flowers produce nectar. Bees seeking pollen, such as female squash bees, which collect pumpkin pollen to provision their nests (Hurd et al, 1971), may be more likely to visit male flowers. Alternatively, bees seeking nectar may preferentially visit female flowers because of their large nectaries, as was the case for honey bees foraging in New York pumpkins (Artz & Nault, 2011, but see Pfister et al, 2017). Flower sex preferences are unknown for common pollinators in Pennsylvania. Furthermore, a single visit to a female flower is not equally valuable across all pollinator species due to differences in both pollen acquisition and deposition.
Previous studies provide a wide range of Cucurbita pollen deposition rates per visit per bee taxon, but even the most conservative studies estimate that Bombus spp. deposit the most pumpkin pollen per visit, from 3x that of A. mellifera and P. pruinosa to 75x that of halictids and other small bees (Pfister et al, 2017; Artz & Nault, 2011). Furthermore, the number of deposited pollen grains needed for adequate pollination varies across Cucurbita species. Minimum pollination requirements, i.e. ‘pollination thresholds,’ have been reported for various Cucurbita cultivars worldwide. Phillips & Gardiner (2015) compared observed pollen grain deposition to the estimated required pollen deposition to determine which pollinators were providing sufficient pollination for C. p. pepo cv ‘Gladiator’ in Ohio. Pfister et al (2017) used pollinator efficiency in tandem with plant fertilization requirements to calculate the minimum number of visits needed to achieve adequate pollination of Cucurbita maxima cultivar ‘Hokkaido’ in Germany. However, ‘pollination thresholds’ based on visitation rates remain undefined for the most active pumpkin pollinators in the US. Furthermore, no studies have compared native bee visitation rates with estimated ‘pollination thresholds’ to determine whether native bees are supplying sufficient pollination services.
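The threshold logic described above, combining a plant's pollen requirement with a pollinator's per-visit deposition, can be sketched as follows. This is a minimal illustration rather than the calculation used in any cited study: the function names are hypothetical, and the input numbers are placeholders chosen for easy arithmetic, not measured values.

```python
import math


def min_visits(grains_required, grains_per_visit):
    """Smallest whole number of visits that delivers the required pollen."""
    if grains_per_visit <= 0:
        raise ValueError("grains_per_visit must be positive")
    return math.ceil(grains_required / grains_per_visit)


def service_ratio(observed_visits, threshold_visits):
    """Observed visits expressed as a multiple of the threshold."""
    return observed_visits / threshold_visits


# Placeholder inputs for illustration only; they are NOT measured values.
threshold = min_visits(grains_required=1000, grains_per_visit=250)
print(threshold)                     # 4
print(service_ratio(10, threshold))  # 2.5
```

A ratio above 1 would indicate that a taxon alone meets the threshold. Because the deposition rate enters as a divisor, a taxon that deposits more pollen per visit needs proportionally fewer visits to reach the same threshold.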

AGRICULTURAL OBJECTIVES: In addition to pollinator attributes (visitation abundance, foraging preferences, pollination efficiency), production objectives influence pollination needs. In commercial pumpkin agroecosystems, production objectives depend largely on (1) the end use of the pumpkin and (2) retail strategy. In the US, pumpkins are often referred to as either ‘Pie pumpkins’ or ‘Face pumpkins.’ Pie pumpkins are processed for consumption, whereas Face pumpkins are grown for ornamentation, often carved with faces during autumnal festivities. In Pennsylvania, >99% of pumpkins produced in 2016 were ‘Face pumpkins’ (USDA NASS, 2018). Face pumpkins reach consumers through two main retail strategies: direct market and wholesale. Direct market pumpkins are produced for Pick-your-own-Pumpkin-Patch operations and farm stands, where harvest occurs multiple times throughout October. In a wholesale system, pumpkins are harvested earlier (in early September), packed into standardized bins, and shipped to large retailers nationwide. Each field is typically harvested once, with agricultural objectives emphasizing synchronous production and fruit maturation in large quantities early in the fall season. Because of the need for a relatively early harvest date, wholesale fields are often planted, and therefore bloom, earlier than direct market fields. Therefore, the timing of pollination activity is critical. General bee activity and the availability of pollinating foragers fluctuate across the growing season depending on farm management and the life history traits of different bee species. Temporal dynamics of species-specific visitation rates to pumpkin flowers across the growing season are currently unknown. Because wholesale growers need large quantities of pumpkins to fill wholesale orders, wholesale fields tend to be larger (> 2 acres) than direct market fields. Therefore, spatial dynamics of pollination activity should be considered.
Because foraging ranges and strategies differ among bee taxa, bees may move through a patch of resources (i.e. fields of pumpkin flowers) differently. It is currently unknown if species-specific pollination activity is equally distributed across large pumpkin fields or concentrated at field edges. Additionally, larger fields will inevitably contain more flowers than smaller fields. Pumpkin plants produce large yellow-orange flowers that stand out against a backdrop of dark leafy green. Because of their location high up on the plant, male flowers have the potential to serve as bright advertisements to attract passing pollinators. A higher density of male flowers would create a more concentrated floral display and potentially attract greater forager abundances. However, if pollinator populations are limited, increased floral resources may actually dilute pollination services as the set number of foragers disperse among the larger numbers of flowers. Pollinator response to floral density is unknown in Pennsylvania pumpkins. Furthermore, it is unknown if density of the more visible male flowers will affect visitation rates to female flowers.

PUMPKIN YIELD: Ultimately, the shared economic goal of growers and biological aim of pumpkin plants is to produce pumpkin fruit. The first step in plant propagation is adequate seed set. Because insects vector pollen from male to female flowers, multiple studies have demonstrated that when pollinator visits are intentionally limited, there is a strong, positive relationship between bee visits and seed set for Cucurbita pepo (Artz & Nault, 2011; Xie et al, 2016). However, seed set is only one component of fruit set, fruit retention and agriculturally relevant yield. Fruit set is a measure of how many female flowers initiate viable fruit that have the potential to mature and result in harvestable yield.
Many factors can intervene between fruit set and harvestable yield, including damage from insects and other herbivores, lack of plant resources (soil nutrients, water, sunlight) and disease, all of which can result in low yields. Additionally, the plant itself may reduce fruit set to concentrate its energies on producing fewer fruit, focused on fruit produced under conditions of high pollen competition during the fertilization process (Winsor et al, 1987). Just as humans thin young apple fruit, the pumpkin plant will retain only a subset of fruit. Pollination activity has been linked to these metrics (seed set, fruit set and harvested yield) in pumpkins. Artz & Nault, 2011 demonstrated strong, positive relationships between visits from specific bee species and the percentage of female flowers that set fruit. Furthermore, increased pollination activity in some settings can be linked with increased yield (Artz & Nault, 2011, but see Petersen et al, 2013). It is unknown if increasing pollination activity from specific bee species is correlated with harvestable yield in commercial pumpkin agroecosystems in Pennsylvania. Sufficient seed set is not only required for fruit set, but for some crops, additional seed set beyond the bare minimum has a positive relationship with fruit appearance and longevity. For pumpkins, seed set is correlated with pumpkin weight. Weight is not a direct production objective for Face-pumpkins; however, weight is likely correlated with circumference and length. Pumpkin shape is very important for Face-pumpkins, particularly in a wholesale setting. When pumpkins are harvested in a wholesale setting, a predetermined number of pumpkins are packed into standardized bins and shipped to retailers. Growers are aiming for a particular pumpkin diameter in order to fit the specified number of pumpkins in the specified size container; pumpkins that are too small or too large are not harvested. The relationship between seed set and weight is unreported for C. p.
pepo cv ‘Gladiator’ pumpkins and, furthermore, the relationship between pumpkin weight, circumference and length is also unreported. The number of bee visits has been linked with increasing pumpkin weight (Artz & Nault, 2011), but pollination activity has not been shown to influence pumpkin weight in a commercial setting (Petersen et al, 2013). It is unknown if there is a link between pollination activity and pumpkin weight in commercial settings in Pennsylvania.

In this study, we will determine the community composition and dominance distribution of pollinators in commercial wholesale Face-pumpkin agroecosystems in Pennsylvania. We will measure visitation rates of the most common pollinators and determine if native visitors are supplying sufficient visits per female flower to achieve optimal fertilization. We will explore sources of variation impacting pollinator visitation rates, including flower sex, temporal dynamics across the growing season, spatial dynamics across larger pumpkin fields, and floral density. Finally, we will explore potential relationships between pollinator visitation rates and pumpkin yield.

METHODS

STUDY SITE CONFIGURATION This study was conducted during pumpkin bloom (July 16th – Aug 22nd) in 2 regions (Lancaster County, and Columbia and adjacent counties) in Pennsylvania in 2013, 2014 and 2015. Twenty-four commercial pumpkin fields (2013, n = 6; 2014, n = 8; 2015, n = 10) were sampled, ranging from 1.28 to 12.7 ha (mean 6.25 ± 0.63 SE). Within each field, four transects, each 80 to 100 m in length, were designated. These transects were parallel to the field edge and placed at distances of 0m, 25m, 50m, and 100m from the field edge (Figure 8). In most cases, the field edge was adjacent to unmanaged or forested habitat. All plants were the ‘Gladiator’ cultivar in 22 of the 24 fields. Several ‘Cannonball’ cultivar plants were in one field (field 7) in 2013, but only in a few plots per transect. In 2015, one field (field 21) was planted with the ‘Giant’ cultivar, and that field was excluded from yield analyses because that cultivar had a large effect on pumpkin weight and pumpkins per square meter.

SAMPLING PROCEDURES Pollination Activity: Observers visited the majority of fields (18 of 24) on 2 dates during pumpkin bloom to observe bee visits to pumpkin flowers. Four fields were sampled only once and two fields were sampled 3 times (sampling support detailed in Appendix A). Sampling was conducted between 0630 and 1200 hours EDST when weather conditions were favorable for bee activity (>15.5 °C with low wind speeds). The sampling unit was the transect, and along each transect, ≈60 independent measures (Appendix A) were taken and subsequently averaged. For each measure, bee visits to designated pumpkin flowers within a 1 m² area were observed for 45 seconds. Designated flowers were defined as the available flowers for which an observer could confidently keep track of bee visits: across the entire study, 3.47 ± 0.02 SE flowers were observed per 45-second measure. Flower number, flower sex, and visits per bee morpho-taxa were recorded. Because not all observers could reliably identify bees to species in the field, bees were recorded as 1 of 9 morpho-categories: Honey Bee (Apis mellifera), Bumble bee (Bombus spp.), Squash bee (Peponapis pruinosa), Large Black Bee, Small Black bee, Large Striped Bee, Small Striped Bee, Green Bee, and Other (see Table 4 for the species associated with each morpho-taxa). A ‘visit’ was defined as any instance in which a bee came in contact with the reproductive portions of the flower (either stamen or pistil). These measures provided a rate of taxa-specific pollinator visits per flower sex per 45 seconds.

Bee Pollinator Survey: During visitation measures, observers collected representative examples of each morpho-taxa that were actively foraging on both stamens and pistils. Glass or plastic 20 ml scintillation vials were placed over an actively foraging bee and shaken gently to disturb the bee. Once the bee flew to the top of the vial, vials were capped and placed on ice before being transferred to a -20 °C freezer to kill and preserve specimens. To determine the community composition of pumpkin pollinators, collected specimens were pinned and identified to species with assistance from experts including Jason Gibbs, Sam Droege and Robert Jean. The species list was then compared with the number of visits contributed by each morpho-taxa to determine the dominance distribution of pollination activity. The taxa comprising > 95% of the visits were included in subsequent analyses.

Floral Density: In 22 of the 24 fields, floral density measures were taken for each transect on each sampling date after bee visitation observations were completed. The sampling unit was the transect and along each transect, 10 to 60 (detailed in Appendix A) floral density measures were taken and subsequently averaged. For each density measure, the number of male and female flowers in 1 m² was recorded.

Yield Metrics: Eighteen of the 24 fields were visited once after pumpkin maturation to collect yield metrics in advance of commercial harvesting (Aug 28th – Sept 26th). Five types of data were collected: pumpkin weight, circumference, length, seed set, and fruit per square meter. The sampling unit was the transect and for each metric, multiple measures were taken per transect and subsequently averaged. Along each transect, randomly selected pumpkins were weighed and the circumference was measured at the roundest part of the pumpkin (n = 5 per transect in 2013; n = 20 per transect in 2014 & 2015). Of the weighed pumpkins, 5 were cut open to measure the length from stem to calyx, and seeds were collected in 2013 and 2014. Seeds were washed and placed in a seed drier for 3 days before hand-counting. Additionally, in 2014 and 2015, fruit per square meter was measured 20 times along each transect by counting the number of mature pumpkins in random 1 square meter plots.

ANALYSIS JMP® Pro, Version 13.0.0 (SAS Institute 2007) was used to complete all following analyses. Significance was assessed at alpha = 0.05 unless otherwise specified. Regressions were completed using “Fit Model” with model personality “Standard Least Squares” and emphasis “Effect Leverage.” Analyses of variance (ANOVAs) were completed using “Fit Y by X.” Visitation rate data were normalized using natural-log, ln(x + 0.01), transformations, based on results from Box-Cox Y tests. Untransformed data are presented in all figures.
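
The transformation described above can be illustrated with a minimal Python sketch. This is not the JMP workflow used in the study; the function names and the example rates are hypothetical, and only the 0.01 offset comes from the text. The offset matters because transects with zero visits would otherwise have an undefined logarithm.

```python
import math

OFFSET = 0.01  # constant added before taking the log, as stated in the text


def log_transform(rate):
    """Natural-log transform of a visitation rate, ln(rate + 0.01)."""
    return math.log(rate + OFFSET)


def back_transform(value):
    """Invert the transform to return to the original scale."""
    return math.exp(value) - OFFSET


# Hypothetical transect means (visits per flower per 45 s); a zero is
# included on purpose to show that the offset keeps the log defined.
rates = [0.0, 0.05, 0.5, 1.2]
transformed = [log_transform(r) for r in rates]
```

Because figures in the text present untransformed data, a back-transform such as `back_transform` would only be needed when converting model predictions to the original scale.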

Flower Sex Foraging Preferences: To test for flower sex foraging preferences, the distribution of male and female flowers observed was compared to the distribution of male and female flower visits from each bee taxa independently. We considered spatial dynamics of flower sex preference by examining preferences for each distance from field edge (0m, 25m, 50m & 100m) independently. When there is no difference between flowers observed and flower visits, no preference exists. Comparisons were made using a Likelihood Ratio Chi-square test based on total visits per bee taxa, summed across all sampling dates for each transect, implemented with the contingency analysis function in JMP® Pro. Because there were 12 independent tests (3 bee taxa x 4 distances from field edge), significance was assessed at alpha = 0.004 after Bonferroni correction (0.05 / 12 ≈ 0.004).
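
As an illustration of the comparison described above, the following Python sketch computes a likelihood-ratio chi-square (G) statistic for a 2 x 2 table of flowers observed versus visits, split by flower sex. It is a sketch under stated assumptions, not the JMP contingency analysis itself: the function name and the example counts are hypothetical, and the p-value uses the closed form for 1 degree of freedom.

```python
import math

# Bonferroni-corrected alpha for 12 tests (3 bee taxa x 4 distances), per the text
BONFERRONI_ALPHA = 0.05 / 12


def g_test_2x2(flowers_male, flowers_female, visits_male, visits_female):
    """Likelihood-ratio chi-square for a 2x2 table.

    Rows are (flowers observed, visits); columns are (male, female).
    Returns (G, p) with p based on 1 degree of freedom.
    """
    table = [[flowers_male, flowers_female], [visits_male, visits_female]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if observed > 0:  # a zero cell contributes nothing to G
                g += 2.0 * observed * math.log(observed / expected)
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # For 1 df, the chi-square upper tail equals erfc(sqrt(G / 2))
    p = math.erfc(math.sqrt(g / 2.0))
    return g, p


# Hypothetical counts: 5% of flowers are female but 20% of visits go to them
g_stat, p_value = g_test_2x2(950, 50, 400, 100)
shows_preference = p_value < BONFERRONI_ALPHA
```

With these made-up counts, visits are skewed toward female flowers relative to their availability, so the test rejects the no-preference hypothesis at the corrected alpha.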

Factors influencing visitation rates: We used an overall model to examine the effect of categorical variables (bee taxa and flower sex) and continuous variables (field area, distance from field edge, day of year, and male flower floral density per square meter) on visitation rates. We included 2- and 3-way interactions between categorical variables and each continuous variable. We used an overall regression model examining only the fixed effects as well as an overall mixed model including year, region and field as random effects. Both overall models provided similar results. Because the overall regression model provides an R-squared value indicating the amount of variation explained by the model, we report results for the regression model in the text, and also report the mixed model results as an Appendix. Removing non-significant terms from the overall regression model only increased the value of the F-statistic and therefore we report the model only with significant terms. Significant 2-way interactions between categorical variables (bee taxa and flower sex) were examined with a 2-way ANOVA. When 3-way interactions were significant, visitation rates were partitioned by flower sex and the effect of continuous variables was evaluated with regression for each bee taxa separately. When 2-way interactions involving continuous variables were significant, visitation rates were first combined across the non-significant categorical variable and then partitioned by the significant categorical variable to examine the effect of continuous variables on each subset of data. When significant continuous variables did not interact with any categorical variable, visitation rates were combined across bee taxa and flower sex to examine the effect of continuous variables with regression.

Pollination Thresholds: We synthesized literature to determine the number of required visits per female flower lifetime for optimal Cucurbita pollination for the most active bee taxa in our study. To be as cautious as possible, we used the highest number of visits required to achieve pollination reported for each taxa. Required visits were converted to visitation rates (visits per flower per 45 seconds) to reflect the unit of visitation rates measured in our study. Because pumpkin flowers are open a minimum of ~4 hours on a single day (Tepedino, 1981), pollinators have a minimum of 14,400 seconds (4 hours x 3600 seconds per hour) to deliver the maximum necessary visits within a female flower’s lifetime. Therefore, the “visitation rate threshold” per taxa can be calculated as:

$$\text{Threshold visitation rate (visits/flower/45 sec)} = \frac{\text{Necessary visits per flower} \times 45\ \text{sec}}{14{,}400\ \text{sec}}$$

To determine if current visitation rates meet or exceed pollination thresholds, we compared mean female flower visitation rates from this study with the calculated “visitation rate threshold” for each bee taxa.
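
The conversion from required visits to a visitation rate threshold can be worked through in a short Python sketch. The required-visit values (16, 8 and 16) are those reported in the Results; the function and variable names are hypothetical, and the observed rate in the usage note is a made-up value for illustration only.

```python
FLOWER_OPEN_SECONDS = 4 * 3600  # minimum time a flower is open: 14,400 s
OBSERVATION_SECONDS = 45        # length of one visitation measure

# Maximum required visits per female flower lifetime, as reported in the Results
REQUIRED_VISITS = {
    "Apis mellifera": 16,
    "Bombus impatiens": 8,
    "Peponapis pruinosa": 16,
}


def threshold_rate(required_visits):
    """Convert required visits per flower lifetime to visits/flower/45 s."""
    return required_visits * OBSERVATION_SECONDS / FLOWER_OPEN_SECONDS


def services_multiple(observed_rate, required_visits):
    """How many times the observed rate exceeds the threshold rate."""
    return observed_rate / threshold_rate(required_visits)


thresholds = {taxon: threshold_rate(v) for taxon, v in REQUIRED_VISITS.items()}
```

Under these assumptions, 16 required visits correspond to a threshold of 0.05 visits per flower per 45 seconds, and 8 required visits to 0.025. A hypothetical observed rate of 0.5 visits per flower per 45 seconds against the 16-visit threshold would give `services_multiple(0.5, 16)` of about 10.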

Yield: The mean, standard error, and range of each yield metric are reported before averaging by transect. One-way analyses of variance (ANOVAs) and pairwise comparisons of means using Tukey tests were completed using “Fit Y by X” in JMP® Pro to examine the effect of year on each yield metric. The relationship between seed set and pumpkin weight was evaluated using regression. Pearson’s correlation was used to measure the strength of linear relationships between the weight, circumference and length per pumpkin using “Correlations Multivariate” in JMP Pro with significance reported at alpha = 0.01.

Relationships between Visitation rates and Yield metrics: To test for relationships between yield metrics and visitation rates, yield metrics were averaged by transect. Because not all yield metrics were collected in all years, we examined associations between yield metrics and visitation rates for each year separately with principal components analysis (PCA) using the REML method, and multivariate correlation analysis, both implemented in JMP Pro. Visitation rates were separated by bee taxa and flower sex. Visually associated yield metrics and visitation rates supported by significant correlations were further examined with simple linear regression.

RESULTS

COMMUNITY COMPOSITION From a total of 844 collected specimens, 37 bee species were identified from 15 genera within 4 families (Table 4). The majority (78%) belonged to 3 species from the Apidae family: Bombus impatiens (n = 349, 41%), Peponapis pruinosa (n = 164, 19%), and Apis mellifera (n = 147, 17%). While the majority of Bombus specimens were B. impatiens (94%), 5 additional species were encountered. To maintain accuracy, the term “Bombus spp.” is used in subsequent analyses. The majority of other collected pollinators were small green or black sweat bees. Most green sweat bees were a single species, Augochlora pura, while the small black bees were a mix of species, many from the Lasioglossum genus (Table 4). Species from Diptera and Coleoptera were also found in pumpkin flowers, but were not consistently collected by observers, and are not included in any analyses.

POLLINATOR ACTIVITY DISTRIBUTION Over the course of the study, 10,436 visitation measures were taken (60 measures per transect x 4 transects per field x 2 dates per field x 24 fields, less missing data) for a total observation time of ≈130 hours (45 sec x 10,436 measures). After calculating an average per transect for each date for each field, analyses were performed on a sample size of 182 transects (4 transects per date x 2 dates per field x 24 fields, less missing data). Between 68 and 553 male flowers were observed per transect (189.9 ± 7.12 SE) and between 0 and 53 female flowers were observed per transect (8.9 ± 0.65 SE). We recorded 14,152 bee visits to pumpkin flowers. Three taxa were responsible for 97% of all visits: Bombus spp. (n = 7690, 54%), A. mellifera (n = 3482, 25%) and P. pruinosa (n = 2577, 18%) (Figure 9). Small black and green sweat bees combined were responsible for 2.3% of all visits (n = 332), with all other visitors providing just 0.5% of pollination activity. Because A. mellifera, Bombus spp. and P. pruinosa were the most common pollinators in this study, subsequent analyses focus primarily on these three taxa. Occasionally, all taxa are included when analyzing “All Pollinators.”
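
The totals quoted above can be cross-checked with simple arithmetic. The Python sketch below uses only counts reported in this paragraph; the variable names are hypothetical.

```python
MEASURES = 10_436         # visitation measures taken
SECONDS_PER_MEASURE = 45  # duration of each measure
TOTAL_VISITS = 14_152     # all recorded bee visits

# Visits recorded for the three dominant taxa
VISITS = {
    "Bombus spp.": 7_690,
    "Apis mellifera": 3_482,
    "Peponapis pruinosa": 2_577,
}

# Total observation time in hours
observation_hours = MEASURES * SECONDS_PER_MEASURE / 3600

# Percentage of all visits contributed by each taxon
shares = {taxon: 100 * n / TOTAL_VISITS for taxon, n in VISITS.items()}
combined_share = sum(shares.values())
```

The computed observation time is 130.45 hours, and the three taxa together account for roughly 97% of visits, matching the figures in the text.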

FLOWER SEX FORAGING PREFERENCES In total, 36,192 flowers were observed, 95.51% of which were male flowers (n = 34,566) and 4.49% of which were female flowers (n = 1,626). Distance from field edge did not influence the distribution of male and female flowers observed (χ² = 1.37, P = 0.719). The proportion of A. mellifera visits to female flowers was significantly greater than the proportion of female flowers observed for every distance from field edge (Figure 10, 0m: χ² = 349.1, P < 0.0001; 25m: χ² = 281.38, P < 0.0001; 50m: χ² = 173.1, P < 0.0001; 100m: χ² = 111.8, P < 0.0001), but as distance from edge increased, the proportion of A. mellifera female flower visits decreased from 22.5% at 0m to 14.9% at 100m from the edge. The proportion of Bombus spp. visits to female flowers also decreased as distance from field edge increased (0m: 9%, 100m: 4.5%) and female flower visits were only significantly greater than female flowers observed at 0m, 25m and 50m from field edge (Figure 10, 0m: χ² = 57.5, P < 0.0001; 25m: χ² = 12.7, P = 0.0004; 50m: χ² = 12.3, P < 0.0005; 100m: χ² = 0.003, P = 0.9585). The distribution of male and female flower visits for P. pruinosa never differed from the distribution of male and female flowers observed (Figure 10, 0m: χ² = 0.04, P = 0.8361; 25m: χ² = 2.1, P = 0.1461; 50m: χ² = 0.003, P = 0.9579; 100m: χ² = 0.45, P = 0.4847).

SPATIAL, TEMPORAL AND FLORAL RESOURCE EFFECTS ON VISITATION RATES Bee taxa, flower sex, field area, distance from field edge, day of year, and male flower floral density per square meter were all significant factors in predicting visitation rates either independently or when interacting with other factors (Table 5, F(18, 956) = 21.86, P < 0.001, R² = 0.29) (mixed model results can be viewed in Appendix B).

Because of the significant interaction between Bee taxa and Flower sex in the overall model (Table 5), we performed a 2-way ANOVA on visitation rates and found significant effects of bee taxa, flower gender and the interaction term (Figure 11, F(5, 1029) = 25.8, P < 0.0001, R² = 0.11).

Because of the significant 3-way interaction between Flower Sex, Bee Taxa and male flower Floral density (Table 5), we examined the effects of floral density on visitation rates for each bee taxa to each flower gender separately. P. pruinosa visitation rates to male and female flowers and Bombus spp. visitation rates to female flowers were independent from floral density (P > 0.73, 0.38, 0.48, respectively). However, male flower floral density had a positive relationship with A. mellifera visitation rates to both female (Figure 12.A, F(1, 151) = 33.48, P < 0.0001, R² = 0.18) and male flowers (Figure 12.B, F(1, 170) = 42.16, P < 0.0001, R² = 0.19), as well as Bombus spp. visits to male flowers (Figure 12.C, F(1, 170) = 35.63, P < 0.0001, R² = 0.09).

Because of the significant 2-way interaction between Bee taxa and Field area (Table 5), we combined visitation rates across flower sex and examined the effect of field area on visitation rates for each bee taxa separately. A. mellifera and P. pruinosa visitation rates were independent from field area (P > 0.28, 0.88, respectively). However, field area had a negative effect on Bombus spp. visitation rates (Figure 13, F(1, 180) = 7.23, P = 0.0079, R² = 0.04).

Because of the significant 2-way interaction between Bee taxa and day of year (Table 5), we combined visitation rates across flower sex and examined the effect of day of year on visitation rates for each bee taxa separately. P. pruinosa visitation rates were independent from day of year (P > 0.52). As the season progressed, A. mellifera and Bombus spp. visitation rates both exhibited a curvilinear response with significant quadratic terms (A. mellifera: Figure 14.A, F(2, 179) = 18.89, P < 0.0001, R² = 0.17; Bombus spp.: Figure 14.B, F(2, 179) = 47.3, P < 0.0001, R² = 0.35).

Because distance from field edge was significant in the overall model and did not interact with either categorical variable (Table 5), we combined visitation rates across Flower sex and Bee taxa, which decreased the sample size from 1035 in the overall model to 182. With the reduced sample size, distance from field edge has a weak, negative relationship with visitation rates, only significant at alpha = 0.1 (F(1, 180) = 2.72, P = 0.1, R² = 0.02).

POLLINATION THRESHOLDS Literature synthesis revealed a range of values (both within and among specific taxa) for necessary pollinator visits to achieve adequate Cucurbita yield (Table 6). We used the maximum visits published to set the most conservative “Pollination Threshold” for each species: 16 required visits for A. mellifera, 8 required visits for B. impatiens and 16 for P. pruinosa (Table 3). In this study, each female flower received a mean total of ≈282.5 visits from all pollinators (Table 7; male flower visits detailed in Appendix C). Each species independently provided 1.7x to 12.75x of required pollination services, exceeding “pollination thresholds” (Table 7).

YIELD METRICS Weight and circumference were measured for 1,141 pumpkins (5 or 20 pumpkins per transect x 4 transects per field x 18 fields in 2013, 2014, and 2015 + 1 additional pumpkin). After calculating an average weight and circumference per pumpkin for each transect in each field, analyses were performed on a sample size of 72 transects. Length and seed set were measured for 250 pumpkins (5 pumpkins per transect x 4 transects per field x 13 fields in 2013 and 2014, less isolated incidents of missing data). After calculating an average length and seed set per pumpkin for each transect for each field, analyses were performed on a sample size of 52 transects. Over the course of the study, 1038 fruit per m² measures were taken (20 measures per transect x 4 transects per field x 13 fields in 2014 and 2015, less isolated incidents of missing data). After calculating an average fruit per m² for each transect for each field, analyses were performed on a sample size of 52 transects. See Appendix A for a full summary of the number of measures taken per yield metric.

The overall mean, standard error, and range as well as yearly means and standard errors for each yield metric are reported in Table 8. There were no differences in yearly means for fruit per m² (F(2, 1036) = 0.01, P = 0.94, R² < 0.001), seed set (F(1, 248) = 3.2, P = 0.07, R² = 0.01) or length (F(1, 247) = 0.26, P = 0.61, R² = 0.00). Year did have an effect on pumpkin weight (F(2, 1138) = 87.4, P < 0.0001, R² = 0.13) and circumference (F(2, 1138) = 87.9, P < 0.0001, R² = 0.13).

For C. p. pepo cv ‘Gladiator’ pumpkins, weight was affected by seed set (Figure 16, F(1, 242) = 68.6, P < 0.0001, R² = 0.22). The positive relationship previously identified between seed set and weight for individual pumpkins (Figure 16) meant that mean seed set and mean fruit weight per transect were visually associated in 2013 (Figure 17A) and 2014 (Figure 17B), supported with correlation (2013: r = 0.72, P = 0.0003; 2014: r = 0.5, P = 0.003). Weight was also strongly correlated with circumference (r = 0.92, P < 0.0001) and length (r = 0.78, P < 0.0001). Length was also strongly correlated with circumference (r = 0.75, P < 0.0001). Because of the strong correlations among pumpkin weight, length and circumference, only pumpkin weight was selected for analysis with visitation rate, along with fruit per m² and seed set.

VISITATION RATES & YIELD METRICS For each year, the first two principal components accounted for 65.5%, 54.2%, and 63.3% of the variation, respectively (Figure 17).

Seed set was measured in the first two years. In 2013, seed set was aligned with Bombus spp. visits to male flowers (Figure 17A), supported with correlation (r = 0.75, P = 0.0002), and Bombus spp. visits to male flowers explained 56% of the variation in seed set that year (Figure 18, F(1, 18) = 22.5, P = 0.0002, R² = 0.56). Although A. mellifera visits to female flowers appeared aligned with seed set in 2014 (Figure 17B), the vectors were short and their correlation was not significant (P > 0.92). Seed set was not significantly associated with any other visitation rates in 2013 or 2014 (P > 0.19).

Pumpkin weight was measured all three years. In 2013, pumpkin weight was aligned with Bombus spp. visits to male flowers (Figure 17A), supported by correlation (r = 0.49, P = 0.03); Bombus spp. visits to male flowers explained 24% of the variation in pumpkin weight that year (Figure 19A, F(1, 18) = 5.6, P = 0.029, R² = 0.24). Surprisingly, in 2013 the pumpkin weight vector was in opposition to P. pruinosa visits to male flowers (Figure 17A), supported by negative correlation significant at P = 0.054 (r = -0.4611). Independently, P. pruinosa visits to male flowers explained 21% of the variation in pumpkin weight (F(1, 18) = 4.86, P = 0.041, R² = 0.21). In 2014, pumpkin weight was closely aligned with Bombus spp. visits to female flowers (Figure 17B), supported by correlation (r = 0.57, P = 0.0007), which explained 32% of the variation in pumpkin weight that year (Figure 19B, F(1, 30) = 14.4, P = 0.0007, R² = 0.32). Surprisingly, in 2014 the pumpkin weight vector was completely opposite A. mellifera visits to male flowers (Figure 17B), supported by a negative correlation (r = -0.37, P = 0.0348). Independently, A. mellifera visits to male flowers explained 13.6% of the variation in pumpkin weight in 2014 (F(1, 30) = 4.7, P = 0.0378, R² = 0.136). Other visitation rates were not significantly correlated with pumpkin weight for any year (P > 0.13).

Fruit per m² was measured in the last two years. In 2014, P. pruinosa visitation rates to both male and female flowers were associated with fruit per m² (Figure 17B), each supported by correlation (male flower: r = 0.64, P < 0.0001; female flower: r = 0.43, P = 0.015). P. pruinosa visits to male flowers had a stronger positive relationship with fruit per m² and accounted for 40% of the variation in fruit per m² (Figure 20A, F(1, 30) = 20.3, P < 0.0001, R² = 0.4) while P. pruinosa visits to female flowers accounted for 18% of the variation in fruit per m² (Figure 20B, F(1, 30) = 6.7, P = 0.015, R² = 0.18). Fruit per m² was not significantly associated with any other visitation rates in 2013, 2014 or 2015 (P > 0.07).
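
In a simple linear regression with a single predictor, the coefficient of determination equals the square of Pearson's correlation coefficient, which is why each correlation above is paired with a matching "variation explained" value. The Python sketch below checks that relationship for the r and R² values reported in this section; the labels, names and the 0.01 rounding tolerance are assumptions made for illustration.

```python
# (label, Pearson r, reported regression R^2), all taken from the text above
REPORTED = [
    ("2013 seed set vs Bombus visits to male flowers", 0.75, 0.56),
    ("2013 weight vs Bombus visits to male flowers", 0.49, 0.24),
    ("2013 weight vs P. pruinosa visits to male flowers", -0.4611, 0.21),
    ("2014 weight vs Bombus visits to female flowers", 0.57, 0.32),
    ("2014 weight vs A. mellifera visits to male flowers", -0.37, 0.136),
    ("2014 fruit per m2 vs P. pruinosa visits to male flowers", 0.64, 0.40),
    ("2014 fruit per m2 vs P. pruinosa visits to female flowers", 0.43, 0.18),
]

TOLERANCE = 0.01  # assumed allowance for rounding in the reported values


def r_squared_gap(r, reported_r2):
    """Absolute difference between r squared and the reported R^2."""
    return abs(r * r - reported_r2)


gaps = {label: r_squared_gap(r, r2) for label, r, r2 in REPORTED}
all_consistent = all(gap <= TOLERANCE for gap in gaps.values())
```

The sign of r is lost when squaring, so the negative associations (P. pruinosa in 2013 and A. mellifera in 2014) must be read from the correlation, not from R².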

DISCUSSION

POLLINATOR COMMUNITY COMPOSITION Commercial pumpkin fields in Pennsylvania support a surprisingly high diversity of bee species (n = 37; Table 4) for Cucurbita agroecosystems, although most were not contributing substantially to pollination services based on the highly skewed dominance distribution (Figure 9). Similar to previous studies conducted in Cucurbita agroecosystems in the US, the 3 most abundant pollinators found in our study were A. mellifera, B. impatiens and P. pruinosa (Artz & Nault, 2011; Julier & Roulston 2009; Petersen et al, 2013; Phillips & Gardiner, 2015). These 3 species were also among the largest in body size, and thus expected to transfer more pollen per visit than the smaller species such as the many Lasioglossum that we documented. Most previous studies have reported only a single Bombus species, B. impatiens, whereas we collected individuals representing 5 additional species, albeit in low quantities. Several of the Bombus species collected are considered ‘uncommon’, including B. fervidus and B. terricola, the latter of which is thought to be in decline throughout its range (Colla et al, 2011). We also collected Triepeolus remigatus, which is a known kleptoparasite of P. pruinosa. Within an ecological context, the presence of kleptoparasites indicates that primary consumer populations are robust enough to support a tertiary trophic level. In our case, the presence of T. remigatus suggests robust P. pruinosa populations.

FLOWER SEX PREFERENCES Similar to previous studies, we found that the proportion of A. mellifera visits to female flowers was 3x to 5x higher than the proportion of female flowers observed, suggesting a strong preference for female flowers (Figure 10.A-D). This supports the idea that A. mellifera are only nectar-collecting, as reported in Artz & Nault, 2011. The proportion of Bombus spp. visits to female flowers was up to twice as high as the proportion of female flowers observed, also suggesting a preference for female flowers, varying with distance into the field. Nectar and pollen collecting behaviors were not specifically documented in our study; nevertheless, most observers agreed that both A. mellifera and Bombus spp. foragers appeared to be primarily nectar collecting when visiting pumpkin flowers. It is possible that Bombus spp. were collecting pumpkin pollen as well, although Bombus spp. foragers were occasionally observed brushing pollen from their bodies, leaving behind bright bursts of orange pollen on the dark green pumpkin leaves. A disinterest in pumpkin pollen could be due to a lack of appropriate macronutrient ratios. Vaudo et al, 2016 reported B. impatiens foraging for pollen with an optimal protein to lipid ratio of 5:1 and Treanor, 2017 reported pumpkin pollen contains only a 1.45:1 protein to lipid ratio. In microcolony experiments, B. impatiens workers fed only pumpkin pollen lost a significantly greater amount of weight compared with workers fed other diets (Treanor, 2017). Certain species of bees might also avoid pumpkin pollen due to secondary plant compounds. Plants in the Cucurbitaceae family, including C. p. pepo, contain cucurbitacins: defensive plant toxins found in the vegetative material that protect the plant from herbivory. Preliminary work from the Irwin lab at NCSU is finding that secondary plant compounds are found in higher concentration in pollen than nectar and at times can equal levels found in vegetative material (Heiling, unpublished). Unlike the other two taxa, P. pruinosa did not exhibit a preference for either flower sex (Figure 10). Given A. mellifera and Bombus spp. preference for female flowers, it is possible that P. pruinosa were excluded from female flowers by the presence of other bees. Xie et al, 2016 found that bees foraging in Cucurbita pepo L. took longer to enter a flower after it was visited by individuals of a different species, compared with individuals of the same species. Artz & Nault, 2011 reported that P. pruinosa avoided entering flowers occupied by other bee species, which could be evidence of competition. Flower sex preferences may exist for P. pruinosa, depending on the sex of the P. pruinosa forager. Unlike the other 2 taxa, P. pruinosa is a solitary pollinator and both male and female individuals were found foraging in pumpkin flowers. With no nests to provision, males are unlikely to be using pollen resources and therefore may demonstrate a preference for the larger nectar rewards in female flowers. However, because female P.
pruinosa use both nectar and pollen resources from Cucurbita plants, the lack of flower sex preference is not entirely unexpected. Future studies should consider P. pruinosa male and female foragers separately when evaluating flower sex preference.
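The flower sex preference comparison can be illustrated with a small worked example. The sketch below runs a chi-squared goodness-of-fit test (1 degree of freedom) comparing visits to female and male flowers against the share of female flowers available, and applies a Bonferroni-corrected threshold. The counts are hypothetical, and the assumption of 12 tests (3 taxa x 4 distances) is ours, chosen because it reproduces the P < 0.004 threshold reported with Figure 10; it is not necessarily the exact procedure used in this study.

```python
import math


def chi_square_gof_two_categories(observed_female, observed_male, expected_female_share):
    """Chi-squared goodness-of-fit test for two categories (1 degree of freedom).

    Compares observed visits to female/male flowers with the visits expected
    if bees visited flowers in proportion to their availability.
    """
    total = observed_female + observed_male
    expected_female = total * expected_female_share
    expected_male = total - expected_female
    statistic = ((observed_female - expected_female) ** 2 / expected_female
                 + (observed_male - expected_male) ** 2 / expected_male)
    # With 1 degree of freedom, the chi-squared upper tail probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts for one taxon at one distance (not data from this study).
female_share_available = 0.10   # 10% of observed flowers were female
visits_to_female = 40
visits_to_male = 160

# Assumption: 3 taxa x 4 distances = 12 tests, so alpha = 0.05 / 12 (about 0.004).
n_tests = 12
alpha = 0.05 / n_tests

statistic, p_value = chi_square_gof_two_categories(
    visits_to_female, visits_to_male, female_share_available)

# How many times higher the share of female visits is than the share of female flowers.
preference_ratio = (visits_to_female / (visits_to_female + visits_to_male)) / female_share_available
significant = p_value < alpha
```

In this hypothetical case the share of visits to female flowers is twice the share of female flowers available, and the test statistic is large enough that the difference remains significant after the correction.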

SPATIAL DYNAMICS OF POLLINATION SERVICES

Overall, we found that visitation rates decreased slightly as distance from the field edge increased; however, the relationship was not strong (Figure 15). A different sampling design may reveal spatial dynamics that we could not discern with our sampling. Precision agriculture methods can create spatially referenced probability maps to predict where pest damage might exceed economic thresholds within a given field (Fleischer et al, 1999). Using similar techniques, future studies could map pollinator activity throughout entire fields to gain a better understanding of taxa-specific movement through large floral resources and predict where pollination services might be lacking.

However, for the first time, we did find that the A. mellifera and Bombus spp. preference for female flowers was affected by spatial dynamics: as distance from field edge increased, preference for female flowers decreased for both pollinators (Figure 10). This could be due in large part to plant structure and flower placement. Female flowers, located close to the ground, can be obscured by leafy vegetation. Pumpkin plants tend to get more lush and vegetative as distance from field edge increases, which could conceal female flowers and make female flower foraging more energy intensive for the pollinator. If male flowers, which occur in much higher numbers, provided adequate nectar resources, it could have been too high a fitness cost for bees to exert additional effort seeking out female flowers. We noticed Bombus spp. foragers flying awkwardly through dense foliage, often bumping into spiky pumpkin stems when trying to reach flowers amongst thick vegetation. Even with these spatial dynamics at play, female flowers 100 m from the edge were still visited at a relatively high rate of 0.7 ± 0.11 SE bees per flower per 45s. If this decreasing trend in visitation rates continues at distances greater than 100 m from the edge, eventually there could be a negative effect on production objectives in certain field layouts. Any square field larger than 4 ha (200 m L x 200 m W) or circular field larger than 3.14 ha (100 m radius) could begin to experience yield issues towards the center. It is interesting to note, however, that cultivation practices in Pennsylvania often follow contours in hilly landscapes, resulting in a large edge-to-area ratio. This agricultural practice, typically implemented by farmers for soil conservation goals, may be helping ensure pollination services in our agroecosystems.
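The field-size figures above follow from simple geometry: they are the areas of the square and circular fields whose center lies exactly 100 m from the nearest edge. The sketch below reproduces that arithmetic; the function names are ours, and the strip example is a hypothetical illustration of why long, narrow contour plantings keep every plant close to an edge.

```python
import math

SQ_M_PER_HECTARE = 10_000


def square_field_area_ha(max_edge_distance_m):
    """Area of a square field whose center is max_edge_distance_m from each edge."""
    side_m = 2 * max_edge_distance_m
    return side_m * side_m / SQ_M_PER_HECTARE


def circular_field_area_ha(max_edge_distance_m):
    """Area of a circular field whose center is max_edge_distance_m from the edge."""
    return math.pi * max_edge_distance_m ** 2 / SQ_M_PER_HECTARE


def strip_max_edge_distance_m(width_m):
    """In a long, narrow strip the farthest point from an edge is half the width,
    no matter how long the strip is."""
    return width_m / 2


square_ha = square_field_area_ha(100)    # 200 m x 200 m field
circle_ha = circular_field_area_ha(100)  # 100 m radius field

# Hypothetical contour strip: 50 m wide, so no plant is more than 25 m from an edge.
strip_distance_m = strip_max_edge_distance_m(50)
```

Under these assumptions the square field works out to 4 ha and the circular field to about 3.14 ha, while a contour strip can cover a comparable area without any plant being far from an edge.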

TEMPORAL DYNAMICS OF POLLINATION SERVICES

Temporal dynamics across the season affected visitation rates differently for each bee taxon. A. mellifera visitation rates peaked mid-season and were highest during the timeframe in which growers typically rent commercial hives (Figure 14A). This suggests that most A. mellifera foragers were from managed colonies, as opposed to feral honey bee populations, which may have contributed to A. mellifera visitation rates in similar studies (Petersen et al, 2013). In our case, increases in A. mellifera visitation rates were due to increases in the number of A. mellifera colonies at or near a given field.

Bombus spp. visitation rates, in contrast, increased throughout the season (Figure 14B), exhibiting a pattern similar to previous studies (Julier & Roulston, 2009). Unlike A. mellifera, an increase in Bombus spp. visitation rates was not due to increasing colony numbers because our grower collaborators did not stock commercial bumble bees. Instead, all Bombus spp. visits were supplied by wild populations. The greatest number of Bombus colonies will be in early spring, when over-wintering queens first emerge and found colonies. Over time, colonies will fail due to lack of resources, parasitism, predation or disease, and thus colony numbers inevitably decrease throughout the season (Goulson, 2010). However, the colonies that do persist grow in size as the queen continuously lays eggs and additional workers emerge. Colonies of B. impatiens, the most common Bombus species encountered in our study, are estimated to contain between 25 – 450 workers, the largest of which were reported later in the season (Plath, 1934). Therefore, we believe that Bombus spp. visitation rates increased throughout the season due to the increasing size, rather than number, of colonies. If growers hope to rely on native Bombus spp. for pollination services, this could be worrying, depending on the number of colonies nesting within foraging distance. If Bombus foragers originate from a few large colonies, pollination services could be vulnerable to the loss of a few key colonies. Future studies should estimate the abundance of common Bombus spp. colonies in this region (see Chapter 4) to better understand the reliability of native pollinators.

Seasonal patterns were not discernible for P. pruinosa given the noise in our data. Future studies could sample more frequently throughout a season to ascertain whether seasonal dynamics impact P. pruinosa visitation rates.

POLLINATOR RESPONSE TO FLORAL RESOURCES

Both A. mellifera and Bombus spp. visitation rates increased with increasing male flower floral density (Figure 12). This suggests that C. p. pepo blooms act as a mass floral resource and attract bee foragers. Furthermore, it is interesting that male flower floral density also increased A. mellifera visitation rates to female flowers (Figure 12). The idea of ‘spillover’ has often been studied in plant-pollinator interactions, where increased visitation rates to an attractive resource may cause visiting insects to “spill over” into surrounding resources that would also benefit from increased visitation. In agricultural settings, planting strips of wildflowers next to crops with the goal of increasing crop pollination has been met with varying success. It is possible that pumpkin plants are employing a similar “spill-over” strategy to increase visitation to female flowers with large displays of male flowers. This strategy could be advantageous to the overall health of the pumpkin plant, as certain plant pathogens may enter plants through exposed nectaries in female flowers. Limiting the number of vulnerable female flowers, while still attracting pollinators with male flowers, could be one way the plant defends itself while still receiving adequate pollination. Higher male flower floral density may also be a fitness strategy for the plant: distributing pollen from many male flowers is a less resource-intensive way to pass on genetic material than producing fruit.

Increased visitation rates in response to increasing floral densities also suggest that pollinator populations in our current agroecosystems are large enough to keep up and even increase visitation rates in the face of additional flowers, as opposed to becoming diluted. This is an exciting possibility for the native Bombus spp. populations and their potential to provide necessary levels of pollination services.

P. pruinosa response to floral density was inconsistent, depending on the time of season. It could be that when P. pruinosa emerge from last year’s fields, they are drawn to fields with more flowers during their search for a new home early in the season. After that time, they are situated in their home territories and visit flowers regardless of density in their chosen fields, rather than being attracted to new fields based on increasing floral resources. More research is needed to define the phenological distribution of Peponapis host-finding behavior, and how that corresponds to horticultural practices that define attractive floral or other Cucurbita cues. This information has direct potential application for helping growers encourage Peponapis colonization near their rotated crops.

IMPLICATIONS FOR AGRICULTURAL PRODUCTION

Based on previous studies, we determined that a female Cucurbita flower needs ~13.3 visits to achieve adequate pollination, given the pollination efficiency of our most active bee taxa (Table 6, Table 7). From our observation data, we estimate that each female flower receives ~280 visits in a single morning (Table 7), almost 20x what is required! We estimated that a single female flower was visited ~150 times by A. mellifera, ~102 times by Bombus spp., and ~27 times by P. pruinosa (Table 7). This suggests that native pollinators combined are providing almost 15x the necessary pollination services for commercial pumpkins in Pennsylvania!

Excessive pollination services in Cucurbita agroecosystems have been reported previously. Phillips & Gardiner, 2015 found that female pumpkin flowers received double the necessary pollen grains for adequate ‘Gladiator’ pumpkin pollination in Ohio, most of which was deposited before 0800 EST. Pfister et al, 2017 used modeling to determine that only 11% and 7% of Bombus spp. and A. mellifera pollination activity, respectively, was necessary to adequately pollinate Cucurbita maxima cv ‘Hokkaido’ pumpkins in Germany. Julier & Roulston, 2009 reported 5.5 P. pruinosa and 3.1 B. impatiens foragers in Cucurbita flowers every minute. Additional previous studies have reported a single female flower receiving > 100 A. mellifera visits, ~19 Bombus spp. visits, and ~5.5 P. pruinosa visits in a lifetime (Artz & Nault, 2011; Pfister et al, 2017).

Even so, pollination services supplied by native bees in our study appear to be greater than in other studies in surrounding areas. These differences could be artificial, simply a result of variation in the way different studies measured pollination services. For example, in our study, a single pollinator could supply multiple visits if it left and re-landed on the same flower. However, we believe our results represent actual differences in visitation rates, with our system experiencing substantially more visits (~5x) from native bees than other systems because of larger native bee populations. The pumpkin cultivar used in our study, ‘Gladiator’, produces a larger pumpkin than the cultivars used in other studies (‘Mystic’ and ‘Hokkaido’) and therefore may also have larger flowers, able to accommodate more bees simultaneously or supply more resources per flower. Furthermore, Artz & Nault, 2011 specifically mention management practices used in their setting that are known to have adverse effects on P. pruinosa populations. Pennsylvania, on the other hand, is one of the leading states in no-till pumpkin agriculture (along with leading in no-till of many other crops), which has the potential to support much larger P. pruinosa populations. Although all studies report intense native bee activity, the idea that native P. pruinosa and Bombus spp. populations are more abundant in Pennsylvania should encourage efforts to conduct context-specific research, even in closely related systems. Furthermore, future studies could examine additional factors, including landscape composition and configuration as well as management practices, to explore reasons for stronger native populations in some regions compared with others.

Our yield data support the hypothesis that, on average, current pollination services are sufficient to meet agricultural production objectives. First of all, we appear to be achieving adequate pollen deposition for viable seed set. Seed set ranged from 330 – 850 seeds per pumpkin (Table 8, Figure 16), which exceeds open pollination seed set averages reported in the adjacent state of New York (Artz et al, 2011). Growers in our system are aiming for 1 – 2 pumpkins per square meter (C. McGrady, pers. comm.). Our results indicate that, on average, pumpkin plants produced closer to 2 pumpkins per square meter (1.74 ± 0.03), and sometimes as many as 5 (Table 8). The number of pumpkins per square meter was positively affected by P. pruinosa visitation rates to male flowers, but only in 2013 (Figure 20). Because P. pruinosa are solitary Cucurbita specialists, visitation rates are likely an accurate measure of population abundances. Therefore, our results could suggest that as P. pruinosa populations increase, so does the number of female flowers that receive at least adequate pollination to produce viable fruit.

Not only were our growers achieving adequate numbers of pumpkins, but they were also achieving pumpkins of sufficient size and weight. When growing ‘Gladiator’ pumpkins for the wholesale ‘Face-pumpkin’ market, growers are aiming for a diameter of 10 – 12 inches (25 – 30 cm), which requires a circumference of 78.5 – 94.5 cm. Pumpkins in our study met this particular agricultural objective with an average circumference of 87.5 ± 0.33 cm per pumpkin (Table 8). Circumference was strongly correlated with pumpkin weight (Table 8), which was positively affected by Bombus spp. visits to male flowers in 2013 and female flowers in 2014 (Figure 19). A causal relationship between visitation rates and pumpkin weight would be mediated by increased seed set, as seen in 2013. In 2014, when there was no relationship between Bombus spp. visitation rates and seed set, it is possible that the relationship with pumpkin weight is a correlation, rather than a causal relationship. Both pumpkin weight and Bombus foragers may simultaneously be responding to horticultural conditions. While pumpkin weight is in part due to seed (Figure 16), the plant must also have adequate resources to grow large fruit. Therefore, when horticultural conditions are sufficient to provide plants with resources to grow larger fruit, the plants may also produce better floral resources, namely more nectar. Artz et al, 2011 demonstrated that in better horticultural conditions, pumpkin plants produced greater nectar volumes and more sugary nectar. If Bombus spp. respond to this improved floral reward, then both Bombus visits and pumpkin weight would respond positively to horticultural conditions. It is possible that increasing Bombus visitation rates independently results in heavier pumpkins, but more studies are needed to understand whether relationships between native bee visitation rates and yield metrics are causal or merely correlated through a shared response to horticultural conditions.

Seed set, fruit per square meter and pumpkin weight were not related to visitation rates from managed bees, A. mellifera, in our study. We can anecdotally report that one of our grower collaborators reduced his use of managed pollinators by half based on our preliminary results, decreasing honey bee stocking rates from 1 hive per acre to 0.5 hives per acre, and saw no negative effects on yield. This is consistent with findings from previous studies; Petersen et al, 2013 found no increases in visitation rates or yield when stocking pumpkin fields with managed bees in New York, and Julier & Roulston, 2009 found native bee pollination activity was sufficient to pollinate C. pepo in Maryland and northern Virginia.

SUMMARY / CONCLUSIONS

Managed A. mellifera foragers accounted for roughly half of the total female flower visitation rates, with an estimated visit every 1m 36s (Table 7). Bombus spp. accounted for one third of the total female flower visitation rates, with an estimated visit every 2m 21s (Table 7). P. pruinosa accounted for roughly 10% of total female flower visitation rates, with a visit estimated every 8m 52s. These might seem like high visitation rates, but they are in keeping with previous studies in the region suggesting native pollinators are sufficient for pumpkin pollination.

Managed A. mellifera pollinators demonstrate a strong preference for female flowers and respond positively to increasing floral resources. However, A. mellifera visitation rates were not related to pumpkin yield for any year. Native Bombus spp. pollinators demonstrate a preference for female flowers, visit more intensely as the season progresses and respond positively to increasing floral resources. In some years, Bombus spp. visitation rates had a positive effect on seed set (2013) and pumpkin weight (2013, 2014). Native P. pruinosa visitation rates had a positive effect on fruit per square meter in 2013. A. mellifera visitation rates alone might suggest they would be the most important pollinator for pumpkins, but when weighted by their pollination efficiency, Bombus spp. (which deposit 3 – 6x as many pollen grains per visit as A. mellifera) are likely providing more valuable pollination services with fewer visits, a theme reported in other studies (Artz & Nault, 2011) and supported by our yield results.

Our results suggest that native bee populations can supply sufficient pollination services in commercial Cucurbita agroecosystems in Pennsylvania in some circumstances; on average, Bombus spp. and P. pruinosa are providing 12.75x and 1.8x, respectively, the necessary pollination services. Given the average visits per female flower from native bees (Table 7), it is likely that renting commercial honey bee hives is superfluous in this system. However, native bee foraging dynamics are influenced by spatial factors. Bombus spp. visitation rates decreased as fields got larger (Figure 13) and their preference for female flowers dropped as distance from field edge increased (Figure 10), suggesting that pollination services from native pollinators may be limited in larger fields, depending on field configuration.

Commercial growers can test the sufficiency of native pollinators in their own pumpkin fields by surveying for adequate pollination in ‘real-time.’ During the critical pollination time period (~55 days before harvest, depending on cultivar), growers can observe bee visits in female flowers. Conservatively assuming a 4-hour bloom, if female flowers receive at least 1 Bombus spp. visit every 30 minutes OR 1 P. pruinosa visit every 16 minutes, growers can likely expect an adequate yield, or at least one that is not limited by lack of pollination. However, managed pollinators may not be available instantly if pollination services from native bees are lacking, and stocking fields with rented A. mellifera colonies or purchased B. impatiens colonies may function as insurance against low visitation rates by native pollinators.

Future studies should be aimed at understanding the abundance and resilience of native bee populations in this region to offer growers a better sense of security when it comes to relying solely on native pollinators for their livelihood. Accordingly, the recently developed Pollinator Protection Plan for Pennsylvania (http://ento.psu.edu/pollinators/research/the-pennsylvania-pollinator-protection-plan-p4) includes a policy recommendation to “Establish an affordable lab service at PA Department of Agriculture or Penn State University that beekeepers and growers (in Pennsylvania and nationally) could utilize to evaluate bee health”. The work presented in this thesis contributes to the rationale for this policy recommendation, and the work in Chapter 4 focusing on Bombus impatiens colony abundance and genetic diversity contributes to the methods used for this envisioned lab service.
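The grower rule of thumb in the summary can be checked with a short calculation. The sketch below converts visit intervals into visits over the 4-hour bloom window and compares them with the approximate per-flower visit counts quoted in the text (Table 7). Treating the 30-minute and 16-minute intervals as the per-taxon pollination thresholds is our reading of the rule, not a definition stated in the thesis, and the function names are illustrative.

```python
BLOOM_MINUTES = 4 * 60  # conservative 4-hour bloom window, as assumed in the text


def visits_per_bloom(minutes_between_visits, bloom_minutes=BLOOM_MINUTES):
    """Number of visits a flower receives if one bee arrives every given interval."""
    return bloom_minutes / minutes_between_visits


def interval_min_sec(visits, bloom_minutes=BLOOM_MINUTES):
    """Average time between visits, as (minutes, seconds), for a visit count per bloom."""
    seconds = round(bloom_minutes * 60 / visits)
    return divmod(seconds, 60)


# Assumption: the rule-of-thumb intervals correspond to per-taxon visit thresholds.
threshold_visits = {
    "Bombus spp.": visits_per_bloom(30),   # one visit every 30 minutes
    "P. pruinosa": visits_per_bloom(16),   # one visit every 16 minutes
}

# Approximate visits per female flower per morning quoted in the text (Table 7).
observed_visits = {"Bombus spp.": 102, "P. pruinosa": 27}

# How many times the assumed threshold each taxon supplies.
surplus = {taxon: observed_visits[taxon] / threshold_visits[taxon]
           for taxon in threshold_visits}

apis_interval = interval_min_sec(150)    # (1, 36): one A. mellifera visit every 1m 36s
bombus_interval = interval_min_sec(102)  # (2, 21): one Bombus spp. visit every 2m 21s
```

Under these assumptions the rule implies thresholds of 8 Bombus spp. visits or 15 P. pruinosa visits per flower, and the observed counts exceed them by factors of roughly 12.75 and 1.8.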

REFERENCES

Artz, D. R., Hsu, C. L., & Nault, B. A. (2011). Influence of Honey Bee, Apis mellifera, Hives and Field Size on Foraging Activity of Native Bee Species in Pumpkin Fields. Environmental Entomology, 40(5), 1144-1158. doi:10.1603/en10218
Artz, D. R., & Nault, B. A. (2011). Performance of Apis mellifera, Bombus impatiens, and Peponapis pruinosa (Hymenoptera: Apidae) as Pollinators of Pumpkin. Journal of Economic Entomology, 104(4), 1153-1161. doi:10.1603/ec10431
Cane, J. H., Sampson, B. J., & Miller, S. A. (2011). Pollination Value of Male Bees: The Specialist Bee Peponapis pruinosa (Apidae) at Summer Squash (Cucurbita pepo). Environmental Entomology, 40(3), 614-620. doi:10.1603/en10084
Canon, D. (2011). Bee Colony Pollination rental prices, eastern US with comparison to west coast. Mid-Atlantic Apiculture Research and Extension Consortium. http://agdev.anr.udel.edu/maarec/wp-content/uploads/2011/02/Pollination-rentals-PNWEAST.pdf
Colla, S., Richardson, L., & Williams, P. (2011). Bumble bees of the eastern United States. Washington, D.C.: U.S. Dept. of Agriculture Pollinator Partnership.
Fleischer, S. J., Blom, P. E., & Weisz, R. (1999). Sampling in Precision IPM: When the Objective Is a Map. Phytopathology, 89(11), 1112-1118. doi:10.1094/phyto.1999.89.11.1112
Goulson, D., Lepais, O., O’Connor, S., Osborne, J., Sanderson, R., & Cussans, J. et al. (2010). Effects of land use at a landscape scale on bumblebee nest density and survival. Journal of Applied Ecology, 47(6), 1207-1215. http://dx.doi.org/10.1111/j.1365-2664.2010.01872.x
Hurd, P. D., Linsley, E. G., & Whitaker, T. W. (1971). Squash and gourd bees (Peponapis, Xenoglossa) and the origin of the cultivated Cucurbita. Evolution, 25, 218-234. doi:10.2307/2406514
Hoehn, P., Tscharntke, T., Tylianakis, J. M., & Steffan-Dewenter, I. (2008). Functional group diversity of bee pollinators increases crop yield. Proceedings of the Royal Society B: Biological Sciences, 275(1648), 2283-2291. doi:10.1098/rspb.2008.0405
Julier, H. E., & Roulston, T. H. (2009). Wild Bee Abundance and Pollination Service in Cultivated Pumpkins: Farm Management, Nesting Behavior and Landscape Effects. Journal of Economic Entomology, 102(2), 563-573. doi:10.1603/029.102.0214
Klein, A. M., Vaissiere, B. E., Cane, J. H., Steffan-Dewenter, I., Cunningham, S. A., Kremen, C., & Tscharntke, T. (2007). Importance of pollinators in changing landscapes for world crops. Proceedings of the Royal Society B, 274, 303-313. doi:10.1098/rspb.2006.3721
Nicodemo, D., Couto, R. H., Malheiros, E. B., & Jong, D. D. (2009). Honey bee as an effective pollinating agent of pumpkin. Scientia Agricola, 66(4), 476-480. doi:10.1590/s0103-90162009000400007
Orzolek, M. D., Elkner, T. E., Lamont Jr., W. J., Kime, L. F., & Harper, J. K. (2012). Pumpkin Production. Penn State Extension Agricultural Alternatives. Retrieved from http://extension.psu.edu/business/ag-alternatives/horticulture/melons-and-pumpkins/pumpkin-production
Petersen, J. D., Huseth, A. S., & Nault, B. A. (2014). Evaluating Pollination Deficits in Pumpkin Production in New York. Environmental Entomology, 43(5), 1247-1253. doi:10.1603/en14085
Petersen, J. D., Reiners, S., & Nault, B. A. (2013). Pollination Services Provided by Bees in Pumpkin Fields Supplemented with Either Apis mellifera or Bombus impatiens or Not Supplemented. PLoS ONE, 8(7). doi:10.1371/journal.pone.0069819
Pfister, S. C., Eckerter, P. W., Schirmel, J., Cresswell, J. E., & Entling, M. H. (2017). Sensitivity of commercial pumpkin yield to potential decline among different groups of pollinating bees. Royal Society Open Science, 4(5), 170102. doi:10.1098/rsos.170102
Phillips, B. W., & Gardiner, M. M. (2015). Use of video surveillance to measure the influences of habitat management and landscape composition on pollinator visitation and pollen deposition in pumpkin (Cucurbita pepo) agroecosystems. PeerJ, 3. doi:10.7717/peerj.1342
Plath, O. E. (1934). Bumblebees and their ways. New York: The Macmillan Company.
Smith, B. D. (1997). The initial domestication of Cucurbita pepo in the Americas 10000 years ago. Science, 276, 932-934. doi:10.1126/science.276.5314.932
Tepedino, V. J. (1981). The pollination efficiency of the squash bee (Peponapis pruinosa) and the honey bees (Apis mellifera) on summer squash (Cucurbita pepo). Journal of the Kansas Entomological Society, 54, 359-377.
The Pennsylvania Pollinator Protection Plan (P4) (Center for Pollinator Research). (n.d.). Retrieved from http://ento.psu.edu/pollinators/research/the-pennsylvania-pollinator-protection-plan-p4
Treanor, E. (2017). Supporting Bombus and other bees in Cucurbita agroecosystems. M.S. thesis. Pennsylvania State University.
United States Department of Agriculture, National Agriculture Statistics Service. (2017). Quick Stats. https://quickstats.nass.usda.gov/
United States Department of Agriculture, National Agriculture Statistics Service. (2017). Cost of Pollination. http://usda.mannlib.cornell.edu/usda/current/CostPoll/CostPoll-12-21-2017.pdf
Vaudo, A. D., Patch, H. M., Mortensen, D. A., Tooker, J. F., & Grozinger, C. M. (2016). Macronutrient ratios in pollen shape bumble bee (Bombus impatiens) foraging strategies and floral preferences. Proceedings of the National Academy of Sciences, 113(28). doi:10.1073/pnas.1606101113
Vidal, M. D., Jong, D. D., Wien, H. C., & Morse, R. A. (2010). Pollination and fruit set in pumpkin (Cucurbita pepo) by honey bees. Revista Brasileira de Botânica, 33(1), 106-113. doi:10.1590/s0100-84042010000100010
Winsor, J. A., Davis, L. E., & Stephenson, A. G. (1987). The relationship between pollen load and fruit maturation and the effect of pollen load on offspring vigor in Cucurbita pepo. The American Naturalist, 129(5), 643-656.
Xie, Z., Pan, D., Teichroew, J., & An, J. (2016). The Potential Influence of Bumble Bee Visitation on Foraging Behaviors and Assemblages of Honey Bees on Squash Flowers in Highland Agricultural Ecosystems. PLoS ONE, 11(1). doi:10.1371/journal.pone.0144590


Figure 8. Sampling diagram. Visitation, floral density and yield measures were collected along transects spaced 0, 25, 50 and 100m from the field edge (orange). Transects were between 80 – 100m long (orange arrow) with at least 50m of field on either side (green arrow).

[Figure 9 x-axis categories: Bombus spp., Apis mellifera, Peponapis pruinosa, Small Black Bees, Green Bees, Other Bees]

Figure 9. Dominance distribution of pollinator visits to pumpkin flowers. See Table 1 for bee species included in each morpho-taxa

Table 4. Comprehensive list of all bee species collected from C. p. pepo cv ‘Gladiator’ flowers, including the taxonomic resolution (Taxa), number of specimens collected (N), morpho-taxa terminology used during visitation observations (Morpho-taxa), and the years in (Year) and fields at (Field) which species were collected.

Taxa | N | Morpho-taxa | Year | Field
Total: 4 families, 15 genera, 37 species | 844 | 9 | 3 | 30
APIDAE (7 genera, 13 species) | 700 | | |
Apis mellifera Linnaeus, 1758 | 147 | A. mellifera | 2013, 14, 15 | All*
Bombus bimaculatus Cresson, 1863 | 4 | Bombus spp | 2013, 14, 15 | 22, 14, 3
Bombus fervidus Fabricius, 1798 | 5 | Bombus spp | 2013 | 8
Bombus griseocollis De Geer, 1773 | 9 | Bombus spp | 2013, 15 | 8, 22, 23, 32
Bombus impatiens Cresson, 1863 | 349 | Bombus spp | 2013, 14, 15 | All
Bombus terricola Kirby, 1837 | 3 | Bombus spp | 2014 | 14
Bombus vagans Smith, 1854 | 2 | Bombus spp | 2013 | 4, 6
Ceratina calcarata Robertson, 1900 | 1 | Small Black | 2013 | 7
Ceratina dupla Say, 1837 | 1 | Small Black | 2013 | 7
Melissodes bimaculatus Lepeletier, 1825 | 10 | Large Black Bee | 2013, 14 | 3, 7, 8, 13, 33
Peponapis pruinosa Say, 1837 | 164 | P. pruinosa | 2013, 14, 15 | All*
Triepeolus remigatus Fabricius, 1804 | 4 | Large Striped | 2013 | 7
Xylocopa virginica Linnaeus, 1771 | 1 | Other | 2015 | 33
HALICTIDAE (6 genera, 22 species) | 141 | | |
Agapostemon virescens Fabricius, 1775 | 4 | Green | 2013 | 6, 7
Augochlora pura Say, 1837 | 60 | Green | 2013, 14, 15 | All except 5, 8, 17, 23, 24
Augochlorella aurata Smith, 1853 | 10 | Green | 2013, 14, 15 | 5, 7, 15, 22, 23
Augochloropsis metallica Fabricius, 1793 | 1 | Green | 2013 | 6
Halictus ligatus Say, 1837 | 1 | Small Striped | 2013 | 3
Halictus rubicundus Christ, 1791 | 1 | Small Striped | 2013 | 5
Lasioglossum albipenne Robertson, 1890 | 1 | Small Black | 2014 | 15
Lasioglossum bruneri Crawford, 1902 | 2 | Small Black | 2013, 15 | 5, 33
Lasioglossum ephialtum Gibbs, 2010 | 4 | Small Black | 2015 | 22
Lasioglossum hitchensi Gibbs, 2012 | 4 | Small Black | 2013, 14, 15 | 7, 14, 22
Lasioglossum illinoense Robertson, 1892 | 1 | Small Black | 2013 | 7
Lasioglossum imitatum Smith, 1853 | 5 | Small Black | 2015 | 23, 32, 33
Lasioglossum laevissimum Smith, 1853 | 1 | Small Black | 2014 | 13
Lasioglossum lineatulum Crawford, 1906 | 1 | Small Black | 2014 | 14
Lasioglossum obscurum Robertson, 1892 | 1 | Small Black | 2014 | 12
Lasioglossum paradmirandum Knerer and Atwood, 1966 | 4 | Small Black | 2015 | 22, 23
Lasioglossum pilosum Smith, 1853 | 16 | Small Black | 2013, 14, 15 | 3, 4, 6, 13, 21, 23, 24, 32
Lasioglossum truncatum Robertson, 1901 | 2 | Small Black | 2014 | 13, 33
Lasioglossum versans Lovell, 1905 | 1 | Small Black | 2013 | 3
Lasioglossum versatum Robertson, 1902 | 7 | Small Black | 2013, 14, 15 | 3, 6, 15, 22, 31
Lasioglossum weemsi Mitchell, 1960 | 6 | Small Black | 2013, 14, 15 | 4, 15, 22, 23, 25
Lasioglossum zephyrum Smith, 1853 | 8 | Small Black | 2013, 14 | 7, 8, 12, 31
COLLETIDAE (1 genus, 1 species) | 2 | | |
Hylaeus annulatus Linnaeus, 1758 | 2 | Small Black | 2013 | 6
MEGACHILIDAE (1 genus, 1 species) | 1 | | |
Megachile brevis Say, 1837 | 1 | Large Striped | 2013 | 5

*not collected at every field, but reliable visitation data indicates species presence in all fields


Figure 10. For each distance from field edge, (A) 0m, (B) 25m, (C) 50m and (D) 100m, the distribution of male and female flowers observed (left of the black line) is compared with the distribution of male and female flower visits for Apis mellifera, Bombus spp. and Peponapis pruinosa (right of the black line). Male flowers observed and male flower visits are in blue, while female flowers observed and female flower visits are in red. After Bonferroni correction, the proportion of female flower visits is significantly higher than the proportion of female flowers observed when P < 0.004, indicated by an *.

Table 5. Overall regression model results testing the effect of Bee taxa, Flower sex, Field area, Day of year, Distance-from-field-edge, and male flower floral density per square meter on visitation rates (bee visits / flower / 45s) in commercial pumpkin agroecosystems. Estimates included for continuous variables only.

Source / Variable | DF | Estimate | F | P | R2
Overall Model | 18 | | 21.86 | < 0.0001 | 0.29
Bee taxa | 2 | | 51.36 | < 0.0001 |
Flower sex^ | 1 | | 0.01 | 0.943 |
Field area^ | 1 | -0.012 | 0.64 | 0.4255 |
Distance-from-field-edge | 1 | -0.005 | 14.64 | 0.0001 |
Day of year | 1 | 0.017 | 13.96 | 0.0002 |
Male flower floral density per m2 | 1 | 0.19 | 59.39 | < 0.0001 |
Bee taxa*Flower sex | 2 | | 23.91 | < 0.0001 |
Bee taxa*Field area | 2 | | 4.47 | 0.0117 |
Bee taxa*Day of year | 2 | | 56.58 | < 0.0001 |
Bee taxa*Male flower floral density per m2 | 2 | | 17.51 | < 0.0001 |
Flower sex*Male flower floral density per m2^ | 1 | | 0.01 | 0.9315 |
Flower sex*Bee taxa*Male flower floral density per m2 | 2 | | 4.06 | 0.0175 |

^Non-significant factors are significant in higher level interaction terms

[Figure 11 plot: x-axis categories Apis mellifera, Bombus spp., Peponapis pruinosa; P < 0.0001, R2 = 0.11, n = 1035]

Figure 11. Bee taxa and Flower sex significantly affected mean visitation rates to pumpkin flowers. Error bars are one standard error from the mean

[Figure 12 panels: (A) A. mellifera visits to female flowers, P < 0.001, R2 = 0.18, n = 153 flowers; (B) A. mellifera visits to male flowers, P < 0.001, R2 = 0.19, n = 172 flowers; (C) Bombus spp. visits to male flowers, P < 0.001, R2 = 0.09, n = 172 flowers. y-axis: Visitation Rates (visits per flower per 45 seconds); x-axis: Male Flower Floral Density per m2]

Figure 12. Male flower floral density per square meter positively affected A. mellifera visitation rates to (A) female and (B) male flowers, as well as (C) Bombus spp. visitation rates to male flowers. Each point represents a single transect. The x-axis is uniform for all graphs. The y-axis is uniform per flower sex. The shaded region represents a 95% CI surrounding the regression line of fit.

[Figure 13 plot: P = 0.0079, R2 = 0.04, n = 182. y-axis: Visitation Rate (visits per flower per 45 sec); x-axis: Field Area (hectares)]

Figure 13. Pumpkin field area negatively affected Bombus spp. visitation rates in commercial pumpkin fields. Each point represents the mean visitation rate for a given field area with error bars indicating 1 standard error. The shaded region (blue) represents a 95% CI surrounding the regression line of fit.

[Figure 14 panels: (A) A. mellifera, P = 0.0079, R2 = 0.04, n = 182; (B) Bombus spp., P = 0.0079, R2 = 0.04, n = 182. y-axis: Visitation Rate (visits per flower per 45 sec); x-axis: Day of Year]

Figure 14. Visitation rates from (A) A. mellifera and (B) Bombus spp. exhibited a curvilinear response throughout the pumpkin floral bloom period. Data are summarized as a mean visitation rate for each day, surrounded by error bars indicating 1 standard error. The shaded region (blue) represents a 95% CI surrounding the regression line of fit.

[Figure 15 plot. Y-axis: visitation rate (visits per flower per 45 sec); x-axis: distance from field edge (meters). P = 0.1, R2 = 0.02, n = 182]

Figure 15. Distance from field edge has a weak, negative relationship with visitation rates, significant only at alpha = 0.1. Visitation rates are summarized for 4 distances as boxplots. The dotted line represents the non-significant line of fit for the regression.

Table 6. Necessary number of pollinator visits per taxa to achieve optimal Cucurbita yield, synthesized from multiple studies. PT: necessary number of total pollen grains deposited per stigma to adequately fertilize ovarioles. FO: necessary number of fertilized ovarioles to achieve optimal fruit set and/or fruit weight, as indicated by seed set. PV: average number of pollen grains deposited per pollinator visit. VT: necessary visits per taxa to achieve optimal yield. – indicates data type not collected for that study.

Pollinator Taxa Citation Type of Cucurbita PT FO PV VT Nicodemo C. maxima - - - 16 et al, 2009 cv ‘Exposição’ Vidal et al, 2010 C. pepo cv ‘Howden’ 1253 + 484 53 – 250 12 Apis mellifera 582 ± 752 (empirical) Pfister, 2017 C. maxima cv ‘hokkaido’ 2500 500 11 260 (modeled) Artz & Nault, 2011 C. pepo cv ‘Mystic’ ~ 1,360 400 – 500 70 > 8 Bombus terrestris, 3368 ± 2473 (empirical) Pfister, 2017 C. maxima cv ‘hokkaido’ 2500 500 2 lucorum & cryptarum 864 (modeled) Bombus impatiens Artz & Nault, 2011 C. pepo cv ‘Mystic’ ~ 1,360 400 – 500 170 4 – 8 Variety: ‘Crookneck squash’ Cane et al, 2011 - ~ 225 - 7 Peponapis pruinosa ‘straightneck yellow squash’ ‘zucchini’ Artz & Nault, 2011 C. pepo ‘Mystic’ ~ 1,360 400 - 500 ~70 > 8 45 ± 76 (empirical) Lasioglossum Pfister, 2017 C. maxima ‘hokkaido’ 2500 500 123 16 (modeled)

Table 7. On average, pollinator visitation rates to female flowers in this study exceed estimated pollination thresholds. Total visits (green) and visitation rate (gray) are compared between ‘pollination services: estimated thresholds’ and ‘pollination services: observed in this study’ to determine if pumpkins are receiving ‘sufficient pollination services’ in PA commercial agroecosystems.

Pollinator Species | Estimated Threshold: Total Visits [a] | Estimated Threshold: Visitation Rate [b] | Observed in this study: Total Visits [c] | Observed in this study: Visitation Rate [d] | Sufficient Pollination Services?
A. mellifera | 16 | 0.05 (1 every 15 min) | ~150 | 0.467 ± 0.05 (1 every 1m 36s) | Yes: Observed is ~9.3x threshold
Bombus spp | 8 | 0.025 (1 every 30 min) | ~102 | 0.32 ± 0.04 (1 every 2m 21s) | Yes: Observed is ~12.75x threshold
P. pruinosa | 16 [e] | 0.05 (1 every 15 min) | ~27 | 0.0848 ± 0.02 (1 every 8m 52s) | Yes: Observed is ~1.7x threshold
‘All pollinators’ combined | 13.3 [f] | 0.042 (1 every 19 min) | ~282.5 | 0.882 ± 0.06 [g] (1 every 51 sec) | Yes: Observed is ~21.2x threshold

[a] Required visits to each female flower for adequate pollination based on previous studies, summarized in Table 6
[b] Required visitation rates (visits / female flower / 45 seconds) to achieve required visits [a] within a 4 hour flower lifespan
[c] Inferred number of actual bee visits to each female flower assuming consistent visitation rates [d] across a 4 hour flower lifespan
[d] Mean ± SE female flower visitation rate (visits / female flower / 45 seconds) observed during this study
[e] While there is no clear consensus from the literature regarding necessary visits for P. pruinosa, Artz & Nault, 2011 found that P. pruinosa and A. mellifera deposited similar amounts of pollen per visit; therefore, we used the A. mellifera threshold for P. pruinosa
[f] 13.3 is the mean of the three species listed
[g] Includes ‘other’ bees in addition to A. mellifera, Bombus spp and P. pruinosa
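The conversions between total visits and visitation rates in Table 7 follow from the 4 hour flower lifespan and the 45 second observation window. The short Python sketch below reproduces that arithmetic using values copied from the table; the function names are illustrative and are not part of the original analysis, which was not necessarily scripted this way.

```python
# Sketch of the arithmetic behind Table 7 (function names are hypothetical).
FLOWER_LIFESPAN_S = 4 * 60 * 60   # 4 hour flower lifespan, in seconds
OBSERVATION_S = 45                # one observation window, in seconds
WINDOWS_PER_FLOWER = FLOWER_LIFESPAN_S // OBSERVATION_S  # 320 windows

def required_rate(required_visits):
    """Visits per flower per 45 s needed to reach a visit threshold."""
    return required_visits / WINDOWS_PER_FLOWER

def inferred_total_visits(observed_rate):
    """Visits per flower lifetime, assuming a constant observed rate."""
    return observed_rate * WINDOWS_PER_FLOWER

def seconds_between_visits(rate):
    """Average waiting time between visits for a given rate per 45 s."""
    return OBSERVATION_S / rate

# Thresholds (Table 6 synthesis) and observed mean rates (Table 7)
thresholds = {"A. mellifera": 16, "Bombus spp": 8, "P. pruinosa": 16}
observed = {"A. mellifera": 0.467, "Bombus spp": 0.32, "P. pruinosa": 0.0848}

for taxon, needed in thresholds.items():
    total = inferred_total_visits(observed[taxon])
    print(taxon, round(required_rate(needed), 3), round(total), round(total / needed, 1))
```

Small differences from the tabled values (for example 149 versus ~150 visits for A. mellifera) come from rounding at different steps.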

Table 8. Overall and yearly means ± SE for pumpkin yield metrics. The overall range is presented in (). Yearly means labeled with different letters per row are significantly different according to 1-way ANOVA. – indicates when data were not collected for a given year.

Yield Metric | Overall | 2013 | 2014 | 2015
Fruit per m2 (n = 1038) | 1.74 ± 0.02 (1 – 5) | - | 1.74 ± 0.03 | 1.74 ± 0.03
Weight (kg) (n = 1141) | 6.6 ± 0.06 (1.5 – 13) | 6.8 ± 0.2 (A) | 7.21 ± 0.08 (A) | 5.5 ± 0.1 (B)
Circumference (cm) (n = 1141) | 87.5 ± 0.33 (45.7 – 116.8) | 86.5 ± 1.1 (B) | 91.1 ± 0.4 (A) | 82.1 ± 0.53 (C)
Seed Set (n = 250) | 505.8 ± 6.8 (144 – 817) | 490.9 ± 10.7 | 515.8 ± 8.7 | -
Length (cm) (n = 249) | 25.7 ± 0.2 (13.5 – 33) | 24.8 ± 0.3 | 25.6 ± 0.27 | -

[Figure 16 plot. P < 0.001, R2 = 0.22, n = 244]

Figure 16. Seed set affects weight per C. pepo cv ‘Gladiator’ pumpkin. Each point represents a single pumpkin. C. pepo cv ‘Cannonball’ pumpkins are displayed (gray) but not included in analysis. Line of fit is surrounded by a shaded region representing 95% confidence intervals.


Figure 17. Principal components analyses (PCA) examining associations between mean visitation rates averaged across all sample dates per transect and average yield metrics per transect for (A) 2013, (B) 2014, and (C) 2015 separately. Visitation rates within each PCA are labeled with 2 letters to indicate the specific bee taxa: A. mellifera (A), Bombus spp. (B), and P. pruinosa (P) visits to male (M) or female (F) flowers. Fruit per m2 (fruit/m2) was measured in 2014 and 2015, pumpkin weight (weight) was measured in all years and seed set (Seeds) was measured in 2013 and 2014. Associated variables between visitation rate of a bee taxa and a yield metric, supported with a significant correlation, are depicted in blue, grouped by solid or dashed lines.

[Figure 18 plot, 2013. Y-axis: mean seed set per transect; x-axis: Bombus spp. visitation rate to male flowers (bee visits / flower / 45 sec). R2 = 0.56, P < 0.001, n = 20]

Figure 18. Average seed set per pumpkin was positively affected by B. impatiens visitation rates to male flowers in 2013. Each point represents a single transect. Line of fit is surrounded by a shaded region representing a 95% confidence interval.

[Figure 19 panels. Y-axis: mean pumpkin weight (kg) per transect. (A) 2013, x-axis: Bombus spp. visitation rate to male flowers (bee visits / flower / 45 sec): R2 = 0.24, P = 0.03, n = 20. (B) 2014, x-axis: Bombus spp. visitation rate to female flowers (bee visits / flower / 45 sec): R2 = 0.32, P < 0.001, n = 32]

Figure 19. Pumpkin weight was positively affected by (A) B. impatiens visitation rate to male flowers in 2013 and (B) B. impatiens visitation rate to female flowers in 2014. Each point represents a single transect. Line of fit is surrounded by a shaded region representing a 95% confidence interval.

[Figure 20 panels, 2014. Y-axis: mean fruit per m2 per transect. (A) x-axis: P. pruinosa visitation rate to male flowers (bee visits / flower / 45 sec): R2 = 0.4, P < 0.001, n = 32. (B) x-axis: P. pruinosa visitation rate to female flowers (bee visits / flower / 45 sec): R2 = 0.18, P = 0.015, n = 32]

Figure 20. Fruit per m2 was positively affected by P. pruinosa visitation rates to (A) male flowers and (B) female flowers in 2014. Each point represents a single transect. Line of fit is surrounded by a shaded region representing a 95% confidence interval.

APPENDICES

APPENDIX A

For visitation measures, 18 of the 24 fields were visited twice. In 2013, 1 field was visited once & 1 field was visited three times. In 2015, 3 fields were visited once while 1 field was visited three times.

Table A9. Sampling support for Visitation Measures, Floral Density measures and Yield measures. Site: unique number used to identify each sampling site to protect identity of grower collaborators. Region: the geographical location of the specific sampling site with C for Columbia County and L for Lancaster county. T (m): the transect located X meters from the field edge. NVR: number of visitation rate samples taken per transect. NFD: number of floral density measures taken per transect. NW: number of pumpkins weighed per transect. NC: number of pumpkin circumferences measured per transect. NL: number of pumpkin lengths measured per transect. NSS: number of pumpkins from which seeds were harvested per transect. Visitation & Floral Density Measures Yield Measures Year Field Region Sampling Date T (m) NVR NFD Sampling Date T (m) NW NC NL NSS NFm2 2013 3 C 7/24/2013 0 60 60 8/30/2013 0 5 5 5 5 - 25 60 60 25 5 5 5 5 - 50 60 60 50 5 5 5 5 - 100 60 60 100 5 5 5 5 - 8/20/2013 0 45 60 25 37 60 50 33 60 100 45 60 2013 4 C 7/25/2013 0 60 60 8/29/2013 0 61 61 5 5 - 25 60 60 25 5 5 5 5 - 50 60 60 50 5 5 5 5 - 100 60 60 100 5 5 5 5 - 8/15/2013 0 59 60 25 50 60 50 58 60 100 56 60 2013 5 C 7/26/2013 0 60 60 8/29/2013 0 5 5 5 5 - 25 59 60 25 5 5 5 5 - 50 60 60 50 5 5 5 5 - 100 60 60 100 5 5 5 5 - 8/16/2013 0 57 60 25 54 60 50 57 60 100 56 60 8/21/2013 0 52 60 25 59 60 50 58 60 100 58 60 2013 6 C 8/14/2013 0 60 60 No data 0 - - - - - 25 59 60 25 - - - - - 50 58 60 50 - - - - - 100 60 60 100 - - - - - 8/22/2013 0 59 60 25 60 60 50 60 60 100 60 60 2013 7 L 7/18/2013 0 30 30 9/4/2013 0 5 5 5 5 - 25 30 30 25 5 5 5 5 - 50 30 30 50 5 5 5 5 - 100 30 30 100 5 5 5 5 - 8/5/2013 0 60 60 25 60 60 50 60 60 100 60 60

57 2013 8 L 8/6/2013 0 60 60 9/5/2013 0 5 5 5 5 - 25 60 60 25 5 5 5 5 - 50 60 60 50 5 5 5 5 - 100 60 60 100 5 5 5 5 - 2014 12 C 7/22/2014 0 60 10 9/6/2014 0 20 20 5 5 20 25 60 107 25 20 20 5 5 20 50 60 10 50 20 20 5 5 20 100 60 10 100 20 20 5 5 20 8/7/2014 0 60 10 25 60 10 50 60 10 100 60 10 2014 13 C 7/22/2014 0 60 10 9/5/2014 0 20 20 5 5 20 25 60 10 25 20 20 5 5 20 50 60 10 50 20 20 5 5 20 100 60 10 100 20 20 5 5 20 8/11/2014 0 60 10 25 60 10 50 60 10 100 60 10 2014 14 C 7/18/2014 0 59 10 9/6/2014 0 20 20 5 5 20 25 60 10 25 20 20 5 5 20 50 60 10 50 20 20 5 5 20 100 60 10 100 20 20 5 5 20 8/1/2014 0 60 10 25 60 10 50 60 10 100 60 10 2014 15 C 7/21/2014 0 60 10 9/12/2014 0 20 20 5 5 20 25 60 10 25 20 20 5 5 20 50 60 10 50 20 20 5 5 20 100 60 10 100 20 20 5 5 20 7/31/2014 0 60 10 25 60 10 50 60 10 100 60 10 2014 31 C 7/18/2014 0 60 10 9/12/2014 0 20 20 2 22 20 25 60 10 25 20 20 1 12 20 50 60 10 50 20 20 2 22 20 100 60 10 100 20 20 5 5 20 8/1/2014 0 60 10 25 60 10 50 60 10 100 60 10 2014 33 C 7/17/2014 0 60 10 9/4/2014 0 20 20 5 5 20 25 60 10 25 20 20 5 5 20 50 60 10 50 20 20 5 5 20 100 60 107 100 20 20 5 5 20 7/31/2014 0 60 10 25 60 10 50 60 10 100 60 10 2014 16 L 7/28/2014 0 60 10 9/26/2014 0 20 20 5 5 20 25 60 10 25 20 20 5 5 183 50 60 10 50 20 20 5 5 20 100 60 10 100 20 20 5 5 20 8/8/2014 0 60 10 25 60 10 50 60 10 100 60 10


2014 17 L 7/28/2014 0 60 10 9/26/2014 0 20 20 5 5 20 25 60 10 25 20 20 5 5 20 50 60 10 50 20 20 5 5 20 100 60 10 100 20 20 5 5 20 8/8/2014 0 60 107 25 60 10 50 60 10 100 60 10 2015 20 C 7/24/2015 0 60 10 9/9/20154 0 20 20 - - 20 25 60 10 8/30/2015 25 20 20 - - 20 50 60 10 50 20 20 - - 20 100 60 10 100 20 20 - - 20 7/31/2015 0 60 10 25 60 10 50 60 10 100 60 10 2015 21 C 7/17/2015 0 60 10 8/28/20155 0 20 20 - - 20 25 60 10 25 20 20 - - 20 50 59 10 50 20 20 - - 20 100 60 10 100 20 20 - - 20 7/23/2015 0 60 10 25 60 10 50 60 10 100 60 10 2015 22 C 7/16/2015 0 60 10 8/29/2015 0 20 20 - - 20 25 60 10 25 20 20 - - 20 50 60 10 50 20 20 - - 20 100 60 10 100 20 20 - - 20 7/23/2015 0 60 10 25 60 10 50 60 10 100 60 10 8/1/2015 0 60 10 25 60 10 50 60 10 100 60 10 2015 23 C 7/22/2015 0 60 10 8/29/2015 0 20 20 - - 20 25 60 10 25 20 20 - - 20 50 60 10 50 20 20 - - 20 100 60 10 100 20 20 - - 20 7/31/2015 0 60 10 25 60 10 50 60 10 100 60 10 2015 24 C 7/22/2015 0 49 10 9/9/2015 0 20 20 - - 20 25 60 10 25 20 20 - - 20 50 60 10 50 20 20 - - 20 100 60 10 100 20 20 - - 20 8/2/2015 0 60 10 25 60 10 50 60 10 100 60 10 2015 25 C 7/23/2015 0 50 10 9/10/2015 0 20 20 - - 20 25 50 10 25 20 20 - - 20 50 50 10 50 20 20 - - 20 100 50 10 100 20 20 - - 20 7/31/2015 0 60 10 25 60 10 50 60 10 100 60 10


2015 32 C 7/22/2015 0 60 10 9/9/20156 0 20 20 - - 20 25 60 10 25 - - - - - 50 60 10 50 - - - - - 100 60 10 100 - - - - - 8/1/2015 0 60 10 25 60 10 50 60 10 100 60 10 2015 26 L 7/28/2015 0 30 - No data 0 - - - - - 25 30 - 25 - - - - - 2015 27 L 7/29/2015 0 60 10 No data 0 - - - - - 25 90 208 25 - - - - - 50 60 10 50 - - - - - 100 30 - 100 - - - - - 2015 28 L 8/5/2015 0 30 - No data 0 - - - - - 25 30 - 25 - - - - - 50 30 - 50 - - - - - 100 30 - 100 - - - - -

Table A9 footnotes (markers appear as a trailing superscript digit on the affected entry):
1 The original pumpkin weighed and measured for circumference was rotten inside, so an additional pumpkin was measured and its seeds harvested
2 Seeds from some pumpkins were misplaced during processing, so none were included to maintain accuracy
3 Several fruit per m2 measures are inexplicably missing
4 The 0 transect was shaded and thus pumpkins matured slower, so they were measured later upon reaching maturity
5 Site 21 was planted with the ‘Giant’ cultivar and thus yield metrics were not used for analysis
6 Site 32 yield metrics were only collected for 1 transect and thus the entire field was excluded
7 An untrained observer counted closed blossoms from previous days; floral density measures were too high and were statistically treated as outliers
8 Density measures were inadvertently measured twice for the 25m transect and not measured for the 100m transect

APPENDIX B

Table B10. Overall mixed model results testing the fixed effects of Bee taxa, Flower sex, Field area, Day of year, Distance-from-field-edge, and male flower floral density per square meter on visitation rates (bee visits / flower / 45s) in commercial pumpkin agroecosystems when including year, county and field as random effects. Estimates included for continuous variables only.

Source / Variable | DF | Estimate | F | P | AIC
Overall Model | 956 | | | | 3322.26
Random effects
Year | | -0.0019 | - | < 0.0001 |
County | | | - | < 0.0001 |
Field | | | - | < 0.0001 |
Fixed effects
Bee taxa | 2 | | 60.5 | < 0.0001 |
Flower sex^ | 1 | | 0.05 | 0.8315 |
Field area^ | 1 | -0.014 | 0.11 | 0.741 |
Distance-from-field-edge | 1 | -0.004 | 14.33 | 0.0002 |
Day of year | 1 | 0.023 | 6.12 | 0.0135 |
Male flower floral density per m2 | 1 | 0.17 | 34.18 | < 0.0001 |
Bee taxa*Flower sex | 2 | | 28.16 | < 0.0001 |
Bee taxa*Field area | 2 | | 5.26 | 0.0053 |
Bee taxa*Day of year | 2 | | 66.65 | < 0.0001 |
Bee taxa*Male flower floral density per m2 | 2 | | 20.62 | < 0.0001 |
Flower sex*Male flower floral density per m2^ | 1 | | 0.02 | 0.8866 |
Flower sex*Bee taxa*Male flower floral density per m2 | 2 | | 4.78 | 0.0085 |
^Non-significant factors are significant in higher level interaction terms

APPENDIX C

Table C11. Estimates of total pollinator visits to male flowers for the lifetime of the flower derived from visitation rates in this study.

Pollinator Taxa | Male flower visitation rate [a] | Total visits per male flower lifetime [b]
A. mellifera | 0.0933 ± 0.01 (1 visit every 7m 44s) | ≈31
Bombus spp | 0.214 ± 0.02 (1 visit every 3m 31s) | ≈68
P. pruinosa | 0.0821 ± 0.01 (1 visit every 9m 14s) | ≈26
‘All Pollinators’ combined | 0.405 ± 0.03 [c] (1 visit every 1m 50s) | ≈130

[a] Mean ± SE male flower visitation rate (visits / male flower / 45 seconds) observed during this study
[b] Inferred number of actual bee visits to each male flower assuming consistent visitation rates [a] across a 4 hour flower lifespan
[c] Includes ‘other’ bees in addition to A. mellifera, Bombus spp and P. pruinosa

APPENDIX D

Visitation rate data were non-normally distributed, so Box-Cox transformations (Box & Cox, 1964) were used to estimate the best transformation for each combination of bee taxa and flower sex in JMP® Pro. Transformed data distributions were checked to confirm that the transformations approached normality. In one instance, the suggested transformation was not used. When analyzing P. pruinosa visits to female flowers in mixed models and regressions, results did not differ between 1/(√(Y)) transformed data and Log(Y) transformed data. Therefore, Log(Y) transformed data were used for consistency with other models.

Table D12. Box-Cox transformations for visitation rates

 | All pollinators | Apis mellifera | Bombus spp. | Peponapis pruinosa
Female Flowers: λ | 0.39 ≈ 0.5 | 0.12 ≈ 0 | -0.12 ≈ 0 | -0.989 ≈ -1
Female Flowers: transformation | √(Y) when λ = 0.5 | Log(Y) when λ = 0 | Log(Y) when λ = 0 | 1/(√(Y)) when λ = -1*
Male Flowers: λ | 0.125 ≈ 0 | -0.034 ≈ 0 | 0.116 ≈ 0 | -0.244 ≈ 0
Male Flowers: transformation | Log(Y) when λ = 0 | Log(Y) when λ = 0 | Log(Y) when λ = 0 | Log(Y) when λ = 0
*Log(Y) was used
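For readers unfamiliar with the Box-Cox family, the following Python sketch shows the standard one-parameter transform and how an estimated λ can be rounded to a convenient value, as in Table D12. It is not the JMP® Pro procedure used in this study; the function names and the candidate set of convenient λ values are assumptions chosen for illustration. In the standard family, λ = 0.5 corresponds to a square root, λ = 0 to a log, λ = -0.5 to 1/√Y and λ = -1 to the reciprocal 1/Y.

```python
import math

def box_cox(y, lam):
    """One-parameter Box-Cox transform of a positive value (Box & Cox, 1964)."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def nearest_convenient_lambda(lam, candidates=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Round an estimated lambda to the closest value in an assumed candidate set."""
    return min(candidates, key=lambda c: abs(c - lam))

# Estimated lambdas for female flowers, copied from Table D12
for estimate in (0.39, 0.12, -0.12, -0.989):
    print(estimate, "->", nearest_convenient_lambda(estimate))
```

Because the transform requires positive values, zero visitation rates would need an offset before transformation; how zeros were handled is not stated here, so that step is left out of the sketch.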

Reference Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society, Series B, 26, 211-252.

APPENDIX E

To examine the effect of managed pollinators (A. mellifera) on the most common native bee taxa, regression was used to compare visitation rates to male flowers, averaged across all sampling dates. As A. mellifera visitation rates to male flowers increased, P. pruinosa visitation rates decreased (F1, 92 = 5.7, P = 0.02, R2 = 0.06, n = 94, Figure E21). There was no discernible relationship between A. mellifera and Bombus spp. visitation rates (F1, 92 = 1.2, P = 0.21, R2 = 0.02, n = 94).

[Figure E21 panels. X-axis: Apis mellifera visitation rate (bee/flower/45s); y-axes: (A) Peponapis pruinosa and (B) Bombus spp visitation rate (bee/flower/45s)]

Figure E21. Effect of A. mellifera visitation rates on native bee visitation rates to male flowers. The x-axis depicts A. mellifera visitation rates and is the same for both panels. The y-axes depict (A) P. pruinosa and (B) Bombus spp visitation rates and are scaled independently for each panel. Points represent a single transect, averaged across sampling dates. A significant relationship is indicated by a line of fit surrounded by a shaded region representing the 95% confidence interval.

APPENDIX F

While the distribution of P. pruinosa visits to male and female flowers did not differ from the distribution of observed male and female flowers, the proportion of A. mellifera and B. impatiens visits to female flowers was significantly greater than the proportion of female flowers observed (Figure F22; likelihood ratio χ2 test; A. mellifera: χ2 = 901.54, P < 0.0001; Bombus spp: χ2 = 57.268, P < 0.0001; P. pruinosa: χ2 = 0.831, P = 0.3619).

[Figure F22 plot. Groups: flowers observed, and visits observed for Bombus spp (*), A. mellifera (*) and P. pruinosa (ns)]

Figure F22. Overall distribution of male and female flowers observed compared with the distribution of male and female flowers visited by Bombus spp., Apis mellifera, and Peponapis pruinosa. * indicates that the proportion of visits to female flowers by a bee taxon is significantly higher than the proportion of female flowers observed.

Chapter 3: Standardizing microsatellite genotyping: evaluating cross-specific amplification, visualizing allelic drift, and evaluating effects of monomorphic loci in the eastern bumble bee Bombus impatiens

INTRODUCTION / BACKGROUND

While new genotyping and sequencing techniques are gaining in popularity, microsatellites remain a useful molecular marker for a host of research objectives due to their relatively easy sample preparation and high information content (Guichoux et al, 2011). A detailed explanation of microsatellites and their potential ecological uses can be found in several published reviews (Vieira et al, 2016; Woodard et al, 2015; Dudgeon et al, 2012; Guichoux et al, 2011; Selkoe & Toonen, 2006). In summary, microsatellites are sequences of DNA composed of tandem repeats of 1-10 base pair motifs surrounded by species-specific flanking regions of DNA. The number of repeat motifs differs between individuals, resulting in multiple alleles of different lengths at each microsatellite locus. The fragments are labeled with a fluorescent dye and drawn through capillaries using electrophoresis, and specific lengths are distinguished with lasers. Microsatellites are present throughout the genome, but those found in the non-coding regions of nuclear DNA tend to be selectively neutral and follow Mendelian inheritance, making them appropriate molecular markers for ecological studies reflecting recent time spans.

Multiple publications have called for standardized evaluation of microsatellite loci and consistent reporting of loci characteristics and multiplex primer designs (Morin et al, 2010; Guichoux et al, 2011). Such standardization will enable comparisons across studies. Furthermore, consistent reporting of useful methodological data facilitates reproducibility and saves researchers resources and time. Guichoux et al, 2011 details a useful step-by-step process for designing multiplexes to maximize genetic information while minimizing cost. As with most analyses, microsatellites can have limitations that should be addressed during each study. Selkoe & Toonen, 2006 provides a detailed checklist of possible issues to test for, as well as corrective measures to reduce issues.
However, when problems are identified and can’t be corrected, the process for deciding when to include or discard loci is subjective. Evaluation practices and tolerance for effects on subsequent results vary from study to study. Here, I propose a systematic workflow for evaluating loci when working with large sample sizes, particularly when testing non-species-specific loci for use in a novel species (Figure 23). Most importantly, I define clear thresholds to objectively determine which loci are included, discarded or require further evaluation. I demonstrate this workflow by testing 14 microsatellite loci developed from other bumble bee species for use in the common eastern bumble bee, Bombus impatiens.

One of the first issues addressed in the evaluation workflow is a technical issue arising during fragment analysis which causes variability in fragment size (Figure 23, step 3). Theoretically, allele fragments should be precisely separated by multiples of a given base pair (bp) motif repeat: 2 or 3 or more bp, depending on the locus. However, chemical or physical impurities during the capillary electrophoresis process can lead to fragments separated by slightly more or less than a specific motif repeat. Amos et al, 2007 found that alleles with a 2 base pair motif repeat could be separated by 1.77 – 2.23 base pairs. In order to capture this potential diversity, allele peaks are ‘binned’ whereby a small range of fragment lengths are grouped as a single designated allele. For example, fragments with a length of 122.8 – 123.2 could be ‘binned’ as allele 123. However, when fragments are consistently separated by too little or too much throughout the entire allelic range for a given locus, an entire base pair or more could be lost, leading to incorrect identification of alleles. This pattern of fragments appearing progressively earlier or later in bins throughout a locus’ range has been referred to as ‘electrophoretic migration’ (Morin et al, 2010) or ‘allelic drift’ (Guichoux et al, 2011).
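The effect of allelic drift on fixed-width bins can be illustrated with a small Python sketch. The fragment cluster positions below are hypothetical (a dinucleotide locus whose clusters are spaced 1.8 bp apart, within the range reported by Amos et al, 2007), and the helper names are invented for this example; this is not the Geneious-based method proposed in this chapter.

```python
def assign_bin(length_bp, bin_centers, half_width=0.2):
    """Return the index of the bin containing a fragment, or None if unbinned."""
    for index, center in enumerate(bin_centers):
        if abs(length_bp - center) <= half_width:
            return index
    return None

def mean_spacing(cluster_means):
    """Average distance in bp between consecutive fragment clusters."""
    gaps = [b - a for a, b in zip(cluster_means, cluster_means[1:])]
    return sum(gaps) / len(gaps)

def drift_adjusted_bins(first_center, n_alleles, spacing):
    """Bin centers that follow the observed spacing instead of the nominal motif."""
    return [first_center + i * spacing for i in range(n_alleles)]

# Hypothetical cluster means for six alleles of a 2 bp motif locus
clusters = [123.0, 124.8, 126.6, 128.4, 130.2, 132.0]
nominal_bins = [123 + 2 * i for i in range(6)]           # 123, 125, ..., 133
adjusted_bins = drift_adjusted_bins(123.0, 6, mean_spacing(clusters))

print(assign_bin(130.2, nominal_bins))   # falls between nominal bins
print(assign_bin(130.2, adjusted_bins))  # captured once bins follow the drift
```

With nominal 2 bp bins the fifth cluster falls between bins, whereas bins spaced by the observed mean gap recover it, which is the kind of adjustment that visual inspection of fragment clusters is meant to support.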
Multiple groups have designed methods to address this issue, ranging from something as simple as exporting fragment calls to a spreadsheet and plotting the frequency of size distribution (Jayashree et al, 2006), to creating stand-alone software programs like Allelobin (Idury & Cardon, 1997) and Flexibin (Amos et al, 2007), to designing integrated packages in R (‘MsatAllele’; Alberto, 2009) or Excel macros (‘Autobin’; Guichoux et al, 2010). However, these methods can be time consuming and bring their own additional errors as datasets are reformatted and transferred between software programs. Here, I propose a simpler method to account for allelic drift using functions built into Geneious, existing software already used to score genotypes. This new method allows users to visualize fragment clusters and adjust bins without having to involve additional software or reformat data.

In the proposed workflow, several additional common issues are examined including null alleles, deviations from Hardy-Weinberg equilibrium and linkage disequilibrium, all of which are well-defined in Selkoe and Toonen, 2006. Briefly, null alleles are alleles that fail to amplify due to mutations in the flanking region or PCR issues like large allele dropout and stuttering. Null alleles are detected when the observed number of homozygotes significantly exceeds the expected number of homozygotes based on the occurrence of heterozygotes in a given population. A locus conforming to Hardy-Weinberg Equilibrium (HWE) has the genotype frequencies expected in an ideal population that is subject to random mating, no mutations, no drift and no migrations, resulting in a random union of gametes. Deviations from HWE are detected by looking for both heterozygote excess and heterozygote deficiency. Loci should also be independent from one another, and genotypic linkage disequilibrium (LD) is a violation of the hypothesis that genotypes at one locus are independent from genotypes at another locus.
Testing for linkage between each pair of loci in a multiplex design is therefore important.

The workflow also addresses biases associated with monomorphic loci, a potential problem that has garnered less attention than the above issues when evaluating microsatellite loci. A monomorphic locus is one with low heterogeneity such that most individuals in a given population have identical genotypes for that locus. The general consensus appears to be that monomorphic loci are simply uninformative and do not add value or cause problems for most analyses, other than wasting computing resources. However, Roesti et al, 2012 found that including monomorphic loci artificially decreases values of genomic differentiation because all individuals have one locus in common. I will test for and evaluate the effect of monomorphic loci on several other common analyses employed in conservation genetics studies including genetic diversity, population HWE and levels of inbreeding.

Monomorphic loci may also impact identification of genetically-related individuals. This is a crucial step when studying eusocial haplo-diploid organisms, including bees. Eusocial haplo-diploid offspring produced by the same mother and father (i.e. members of the same full-sibship family) are highly related with very similar genotypes. Many studies performing genetic tests include only one individual from each full-sibship family to avoid biasing results with multiple copies of similar genotypes. Furthermore, identifying related individuals is the foundation of most colony abundance studies, which are used to understand population dynamics of pollinators crucial to the agricultural industry and success of native ecosystems (Carvell et al, 2017; Geib et al, 2015; Rao & Strange, 2012; Goulson et al, 2010; Darvill et al, 2004).
Because individuals automatically match for monomorphic loci, unrelated individuals could be incorrectly assigned to the same full-sibship family, resulting in a decrease of full-sibship families and the exclusion of novel genotypes in subsequent genetic analyses, but this has never been empirically evaluated.

The aim of this study is to provide a conceptual workflow emphasizing a systematic and repeatable process for objectively evaluating microsatellite loci with clearly defined thresholds for eliminating problematic loci. I also present a new method to address allelic drift, as well as systematically evaluate the effect of monomorphic loci on several key genetic analyses. For a narrower audience, I provide an optimized multiplex of non-species-specific microsatellite loci that can be confidently used in Bombus impatiens.

METHODS

DNA EXTRACTION (Figure 23, Step 1 and 2)

We collected 6,323 Bombus impatiens foragers from 30 sites (210.8 ± 3.3 SE per site) in Pennsylvania. Foragers were pinned and are currently stored in the Fleischer lab in the Agriculture Sciences and Industries building at Penn State. The right mesothoracic leg was removed from each forager and placed into 150 µl Chelex® 100 (5%, in Milli-Q H2O) with 5 µl of 10 mg/mL Proteinase K. Each sample was heated in a Mastercycler® pro thermocycler for 60 minutes at 55°C, 15 minutes at 99°C, 1 minute at 37°C, and 15 minutes at 99°C. Extracted DNA samples were stored short term (up to 3 weeks) in a 4°C refrigerator, long term (2 years) in a -20°C freezer and indefinitely in a -80°C freezer (details in Appendix G).

PCR AMPLIFICATION AND SEQUENCING (Figure 23, Step 3)

Genotypes were obtained for each worker using a multiplex PCR reaction with approximately 1.2µl bee DNA, 2µl PCR buffer (5X Colorless GoTaq® Flexi Buffer), 0.2µl BSA (0.1mg/ml), 0.08µl of 5u/µl Taq polymerase (GoTaq® Flexi DNA Polymerase), 0.6µl of 10mM dNTPs, 0.56µl of 25mM MgCl2, 2.8µl of molecular grade H2O and 11 primers. Unlabeled reverse primers and forward primers labeled with FAM (blue), VIC (green), PET (red) or NED (yellow) were first rehydrated and then diluted to a working solution of 10µM. After working to optimize our primer sets, the final master mix had a total volume of 10.7µl and included the following primers: BTMS0066, B124, Btern01, BT28, BTMS0062, BTMS0073, BT10, BL11, BT30, B96, and BTMS0081 (Table 13). Three additional primers were trialed and discarded for various issues throughout the evaluation process described in this paper: BTMS0074, BL15, and BTMS0083. Primers labeled with the same fluorescent dye color do not have overlapping size ranges. Size ranges were initially taken from previous studies but have been updated to reflect the distribution of alleles found in this study. The number of nucleotides in the repeat motifs are reported for this study, while the motif repeat sequence is reported from the references.

Polymerase chain reactions were performed using Mastercycler® pro thermocyclers with a 3 minute 30 second initial denaturation at 95°C, followed by 30 cycles of 30 seconds at 95°C, 1 minute 15 seconds at an annealing temperature of 55°C and 45 seconds at 72°C, then a final 15 minute extension at 72°C with a hold temperature at 15°C. Amplified PCR products were sized on an ABI3730XL® DNA Analyzer (Applied Biosystems) with GeneScan LIZ 500 internal size standard (Applied Biosystems) at the Penn State Genomics Core Facility – University Park, PA (details in Appendix H).
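As a bookkeeping aid, the per-reaction volumes and cycling times listed above can be tallied in a few lines of Python. This is only a sketch: the individual primer volumes are not given in the text, so they are omitted, and the assumption that the 10.7 µl total includes the template DNA, the 10% pipetting overage, and all function names are illustrative choices rather than details of the protocol.

```python
# Per-reaction volumes in microliters, copied from the text (primers omitted).
NON_PRIMER_UL = {
    "bee DNA (template)": 1.2,
    "5X PCR buffer": 2.0,
    "BSA (0.1 mg/ml)": 0.2,
    "Taq polymerase (5 u/ul)": 0.08,
    "10 mM dNTPs": 0.6,
    "25 mM MgCl2": 0.56,
    "molecular grade H2O": 2.8,
}
TOTAL_REACTION_UL = 10.7  # assumed here to include the template DNA

def primer_volume_remaining(total=TOTAL_REACTION_UL, parts=NON_PRIMER_UL):
    """Volume left for all primers combined, under the assumption above."""
    return total - sum(parts.values())

def scale_master_mix(n_reactions, overage=0.10, parts=NON_PRIMER_UL):
    """Reagent volumes for a batch; template is excluded because it is added per well."""
    factor = n_reactions * (1 + overage)  # overage is an assumed safety margin
    return {name: round(volume * factor, 2)
            for name, volume in parts.items() if "template" not in name}

def cycling_seconds(cycles=30):
    """Programmed time for the PCR profile, ignoring ramp times and the final hold."""
    initial = 3 * 60 + 30          # initial denaturation
    per_cycle = 30 + 75 + 45       # denature + anneal + extend
    final_extension = 15 * 60
    return initial + cycles * per_cycle + final_extension

print(round(primer_volume_remaining(), 2), "ul remaining for primers")
print(scale_master_mix(96))
print(cycling_seconds() / 60, "minutes of programmed cycling")
```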

VISUALIZING ALLELIC DRIFT AND SETTING BINS USING GENEIOUS (Figure 23, Step 4)

We scored alleles using Geneious V10.0.7 (Biomatters Ltd). We initially prompted Geneious to “Predict Bins” and copied a subset of ~1000 samples from across sites into one combined folder (Geneious begins to lag perceptibly when viewing more than 1000 samples simultaneously). We then reduced the spacing to 0 and selected “Allow Vertical Overlap” in order to visualize the pattern of peak calls (specifically looking for allelic drift) and adjust bins to appropriate settings (details regarding appropriate menu options can be found in Appendix I).

After scoring alleles and setting appropriate bins, genotypes were exported from Geneious. I used Convert V1.31 (Glaubitz, 2004) and BBEdit V11.6.7 (Bare Bones Software, Inc.) to create input files for multiple software programs including Micro-checker V2.2.3 (Van Oosterhout et al, 2004), Colony V.2.0.6.4 (Jones & Wang, 2004), Genepop on the web V4.2 (Raymond & Rousset, 1995) and R V3.0.0 (R Development Core Team, 2012). Scripts for R functions described in the following analyses can be found in Appendix J. I used JMP® Pro, V13.0.0 (SAS Institute 2007) to complete all analyses of variance (ANOVA). Genotypes are stored in the PSU Box system.

AMPLIFICATION SUCCESS (Figure 23, Step 5)

Individuals that did not successfully amplify at ≥ 7 loci were reprocessed (exceeds the precedent set in Lepais et al, 2010). If amplification success did not ultimately improve, individuals were excluded from the 6,323 collected B. impatiens foragers using the function ‘missingno(x, “geno”)’ from the R package ‘poppr’. These individuals were likely suffering from poor quality DNA and could bias results when evaluating non-species-specific loci for use in a new species (Figure 23, Step 5a).

After removing individuals with poor quality DNA, I examined amplification success per locus across all individuals using the function ‘missingno(x, “loci”)’ from the R package ‘poppr’. When loci failed to amplify across ≥ 95% of samples, the master mix recipe was adjusted and the amount of primer was increased. To minimize re-processing, a preliminary evaluation of amplification success was conducted early on a subset of 800 individuals (~13%) so that necessary adjustments could be made before all individuals were processed (Figure 23, Step 5b). Loci that ultimately failed to amplify across ≥ 95% of individuals were discarded.
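The two filtering steps can be sketched in plain Python as follows. This is not the poppr ‘missingno’ implementation used in the study; the data layout, the function names and the toy genotypes are hypothetical, and the individual-level rule is interpreted here as keeping individuals with at least a minimum number of successfully amplified loci.

```python
def amplified_count(genotype):
    """Number of loci with an allele call for one individual (None = failed)."""
    return sum(1 for call in genotype.values() if call is not None)

def keep_individuals(genotypes, min_loci=7):
    """Keep individuals that amplified at min_loci or more loci."""
    return {ind: g for ind, g in genotypes.items()
            if amplified_count(g) >= min_loci}

def locus_success_rates(genotypes):
    """Proportion of individuals with a call at each locus."""
    loci = sorted({locus for g in genotypes.values() for locus in g})
    n = len(genotypes)
    return {locus: sum(1 for g in genotypes.values() if g.get(locus) is not None) / n
            for locus in loci}

def loci_to_discard(genotypes, threshold=0.95):
    """Loci whose amplification success falls below the threshold."""
    return [locus for locus, rate in locus_success_rates(genotypes).items()
            if rate < threshold]

# Toy example with three loci; allele sizes are made up for illustration.
toy = {
    "bee1": {"B124": (250, 252), "BT10": (150, 150), "BL11": (130, 134)},
    "bee2": {"B124": None, "BT10": None, "BL11": (130, 130)},
    "bee3": {"B124": (250, 250), "BT10": None, "BL11": (132, 134)},
}
kept = keep_individuals(toy, min_loci=2)   # threshold lowered for the toy data
print(sorted(kept))
print(loci_to_discard(kept))
```

Filtering individuals before loci, as in the workflow, prevents a handful of poor quality DNA extractions from dragging down the per-locus success rate.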

IDENTIFYING RELATED INDIVIDUALS (Figure 23, Step 6)

To identify foragers that belong to the same full-sibship family, we used the maximum likelihood sibship reconstruction method implemented in Colony V.2.0.6.4 (Jones & Wang, 2004). We ran Colony with the following parameters: monogamy for males and females, without inbreeding, without clones, dioecious, haplodiploid, medium-length run, no sibship prior and unknown allele frequencies. We used genotyping error rates of 0.5-5% per locus based on results of rescoring 96 individuals in Geneious and estimates of null allele frequencies of 1-4% per locus based on preliminary results from Microchecker using a subset of sites (Figure 23, Step 6a ‘Error Information;’ additional details in Appendix K). Only sisterhoods with an inclusion probability of 0.8 or higher were accepted (precedent set in Carvell et al, 2017). Duplicate full-sibship members were removed from each site to create a reduced dataset which included only 1 randomly selected forager per full-sibship per site.
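The reduction to one forager per full-sibship per site can be sketched as below. The input format, function name and toy records are hypothetical and do not reflect Colony's output files. The sketch also assumes that members of sisterhoods falling below the 0.8 inclusion probability are retained as unrelated singletons; the handling of rejected sisterhoods is not spelled out above, so treat that choice as an assumption.

```python
import random

def reduce_to_one_per_family(assignments, min_probability=0.8, seed=0):
    """Keep one randomly chosen forager per accepted full-sibship per site.

    assignments: iterable of (site, forager_id, family_id, inclusion_probability).
    Foragers in sisterhoods below min_probability are kept individually (assumption).
    """
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    families = {}
    kept = []
    for site, forager, family, probability in assignments:
        if probability >= min_probability:
            families.setdefault((site, family), []).append(forager)
        else:
            kept.append((site, forager))
    for (site, _family), members in sorted(families.items()):
        kept.append((site, rng.choice(members)))
    return sorted(kept)

# Toy records for a single hypothetical site
records = [
    ("S1", "f1", 1, 0.95), ("S1", "f2", 1, 0.95),   # accepted sisterhood of two
    ("S1", "f3", 2, 0.90),                          # accepted singleton family
    ("S1", "f4", 3, 0.50), ("S1", "f5", 3, 0.50),   # rejected sisterhood
]
print(reduce_to_one_per_family(records))
```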

TESTING NON-SPECIES-SPECIFIC LOCI FOR SIGNIFICANT ISSUES (Figure 23, Step 7) The dataset of 4,902 individuals containing only 1 forager per full-sibship per site was used to test each locus for significant issues related to null alleles, monomorphic loci, deviations from Hardy-Weinberg equilibrium and linkage disequilibrium. If significant issues are detected in the pooled dataset and/or in ≥ 10% of sites (3 of 30 sites for this study), the effect of these potentially problematic loci is evaluated by performing several common genetic tests for each site with and without questionable loci and comparing the results. The common genetic tests performed per site included measures of genetic diversity, overall population deviations from Hardy-Weinberg equilibrium, and inbreeding.
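The ≥ 10% of sites rule can be written as a small decision function. This is a sketch under the thresholds stated above; the function name is hypothetical, and the site counts are those reported for this study in the Results.

```python
def needs_further_evaluation(n_sites_affected, n_sites=30, threshold=0.10):
    """Flag a locus when an issue appears in >= 10% of sites (3 of 30 here)."""
    return n_sites_affected / n_sites >= threshold

# Number of sites (out of 30) in which each issue was significant.
sites_affected = {
    "BT28 (monomorphic)": 18,
    "BTMS0081 (null alleles)": 2,
    "BL11 (null alleles)": 1,
}
flagged = [name for name, k in sites_affected.items()
           if needs_further_evaluation(k)]
```

Only BT28 crosses the threshold, which matches the loci carried forward for with/without comparisons.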

Genetic diversity was measured as expected heterozygosity (HE) and allelic richness (AR).

HE is based on Nei’s unbiased estimate of gene diversity and was calculated using the function “poppr(x)” in the R package “poppr” (Kamvar et al, 2014) with sample sizes standardized to the smallest sample size of 124 genotypes per site. AR was calculated per locus using 112 alleles for rarefaction to correct for varying sample sizes between sites with the function “allelic.richness” in the R package “hierfstat” (Goudet, 2005). AR per locus was averaged across all loci per site to provide a single value of AR per site. The effect of questionable loci on values of genetic diversity is tested with ANOVA. Changes in patterns of genetic diversity between sites are also noted.
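For readers unfamiliar with these two metrics, the sketch below computes them for a single locus from a list of allele copies. It assumes the standard formulas (Nei's unbiased gene diversity with a sample-size correction on gene copies, and rarefaction to g gene copies) and is not the code used by ‘poppr’ or ‘hierfstat’; the function names and toy alleles are hypothetical.

```python
from collections import Counter
from math import comb

def expected_heterozygosity(alleles):
    """Unbiased gene diversity: n/(n-1) * (1 - sum of squared allele frequencies)."""
    n = len(alleles)  # number of gene copies sampled
    freqs = [c / n for c in Counter(alleles).values()]
    return n / (n - 1) * (1 - sum(p * p for p in freqs))

def rarefied_allelic_richness(alleles, g):
    """Expected number of distinct alleles in a random subsample of g gene copies."""
    n = len(alleles)
    return sum(1 - comb(n - c, g) / comb(n, g)
               for c in Counter(alleles).values())

alleles = [120, 120, 120, 124, 124, 128]  # six gene copies at one toy locus
he = expected_heterozygosity(alleles)
ar = rarefied_allelic_richness(alleles, g=4)
```

In the study, g corresponds to the 112 alleles used for rarefaction, and the per-locus values are averaged across loci to give one AR per site.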

I tested for significant deviation from Hardy-Weinberg equilibrium per site using Fisher’s method to complete a global test across all loci implemented in Genepop on the Web by selecting option 1.3 (probability test), without enumeration of alleles and the default Markov chain parameters (dememorization number: 1000, number of batches: 100, number of iterations per batch: 1000). Significance was found at less than alpha = 0.002 after Bonferroni corrections based on testing 30 sites. Changes in significance when excluding questionable loci are noted.
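Two pieces of arithmetic underlie this paragraph: the Bonferroni-adjusted threshold and Fisher's method for combining per-locus p-values into one global test. The sketch below assumes a family-wise alpha of 0.05 (not stated explicitly in the text, but 0.05 / 30 rounds to the reported 0.002) and uses the closed-form chi-squared tail for even degrees of freedom; it does not reproduce Genepop's Markov chain procedure, and all p-values must be greater than 0.

```python
from math import exp, log

def bonferroni_alpha(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def fisher_combined_p(p_values):
    """Fisher's method: X = -2 * sum(ln p) follows chi-squared with 2k df."""
    k = len(p_values)
    half_x = -sum(log(p) for p in p_values)  # X / 2
    # Survival function of chi-squared with 2k df: exp(-X/2) * sum_{i<k} (X/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return exp(-half_x) * total

site_alpha = bonferroni_alpha(0.05, 30)        # assumed family-wise alpha of 0.05
global_p = fisher_combined_p([0.5, 0.5])       # two hypothetical per-locus p-values
```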

Inbreeding coefficients (FIS), described by 95% confidence intervals for each site, were calculated using the function “boot.ppfis(x)” in the R package “hierfstat” (Goudet, 2005). FIS is not significantly different from 0 if the 95% confidence interval overlaps or is bounded by 0, which indicates no inbreeding, i.e. random mating for each site. Changes in significance when excluding questionable loci are noted.
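The decision rule for FIS can be illustrated with a simple bootstrap. This sketch assumes loci are resampled with replacement and that FIS is computed as 1 minus the ratio of summed observed to summed expected heterozygosity; the internals of ‘boot.ppfis’ may differ, and the per-locus values below are hypothetical.

```python
import random

def fis(ho, hs):
    """Multilocus FIS = 1 - sum(observed het.) / sum(expected het.)."""
    return 1 - sum(ho) / sum(hs)

def bootstrap_fis_ci(ho, hs, n_boot=1000, seed=1):
    """95% percentile interval for FIS, resampling loci with replacement."""
    rng = random.Random(seed)
    idx = range(len(ho))
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(idx) for _ in idx]
        stats.append(fis([ho[i] for i in sample], [hs[i] for i in sample]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

def significant_inbreeding(lower, upper):
    """Significant only when the interval neither overlaps nor is bounded by 0."""
    return not (lower <= 0 <= upper)

ho = [0.69, 0.91, 0.81, 0.38]  # hypothetical per-locus observed heterozygosity
hs = [0.70, 0.91, 0.80, 0.37]  # hypothetical per-locus expected heterozygosity
lower, upper = bootstrap_fis_ci(ho, hs)
```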

Null Alleles (Figure 23, Step 7b & d): I examined 4,902 individuals by site to test each locus for significant instances of null alleles using Microchecker V2.2.3. Appropriate motif repeats were selected for each locus (Table 13, ‘Repeat Motif’), maximum expected allele size was set at 360 bp for all loci, confidence interval was set to Bonferroni, analysis was set to 1000x and any unusual observations were first verified as missing data and then omitted from analysis. When null alleles were detected in ≥ 10% of sites for any locus, I considered it a universal issue and monitored the effect of questionable loci on genetic tests.

Monomorphic Loci (Figure 23, Step 7a-d): I pooled the 4,902 individuals to test for universally significant monomorphic loci using the function ‘informloci’ from the R package ‘poppr’, which comprises two tests. The first looks for complete fixation, meaning any locus that contains fewer than 2 individuals with alleles diverging from the most common allele (0.03% for this dataset). The second looks for any locus where the minor allele frequency (MAF) is less than 0.01 or 1%. The MAF refers to the frequency at which the second most common allele occurs. When a locus was universally monomorphic, individuals were examined by site to understand severity. If the locus is monomorphic in ≥ 10% of sites, I monitored its effect on genetic tests.
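The two checks described above can be sketched for a single locus as follows. This follows the description in the text rather than the source code of ‘informloci’; the function name, argument names and toy genotypes are hypothetical.

```python
from collections import Counter

def locus_flags(genotypes, maf_cutoff=0.01, min_divergent=2):
    """genotypes: list of (allele, allele) pairs for one locus, pooled across individuals."""
    alleles = [a for pair in genotypes for a in pair]
    counts = Counter(alleles).most_common()
    major = counts[0][0]
    # Test 1: number of individuals carrying any allele other than the most common one.
    divergent = sum(1 for pair in genotypes if any(a != major for a in pair))
    # Test 2: frequency of the second most common allele.
    maf = counts[1][1] / len(alleles) if len(counts) > 1 else 0.0
    return {"fixed": divergent < min_divergent,
            "maf": maf,
            "monomorphic": maf < maf_cutoff}

# Toy locus: 990 homozygotes for allele 1 and 10 heterozygotes carrying allele 2.
toy = [(1, 1)] * 990 + [(1, 2)] * 10
flags = locus_flags(toy)
```

The toy locus is not fixed (10 divergent individuals) but has a MAF of 0.005, so it would be treated as monomorphic, which is the same situation reported for BT28.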

In addition to evaluating the effect of monomorphic loci on common genetic tests, I used ANOVA to monitor the effect of monomorphic loci on the number of full-sibship families identified by Colony by comparing mean full-sibship families identified with and without monomorphic loci for all 30 sites.

Hardy-Weinberg Equilibrium (Figure 23, Step 7a-d): I pooled the 4,902 individuals to test each locus for universal deviations from Hardy-Weinberg Equilibrium (HWE) with the function ‘hw.test’ from the R package ‘pegas’, which performs the classical Chi-squared test based on the expected genotype frequencies calculated from the allelic frequencies. Significance was found at alpha = 0.002 after Bonferroni corrections based on testing 30 sites. When a locus universally deviated from HWE, individuals were examined by site and the effect of questionable loci on genetic tests was further evaluated.
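The classical Chi-squared test for HWE compares observed genotype counts with those expected from allele frequencies. The sketch below shows the statistic and its degrees of freedom, assumed here to be k(k-1)/2 for k alleles; it is an illustration only, not the ‘pegas’ implementation, and the toy genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement

def hwe_chi_squared(genotypes):
    """Return (chi-squared statistic, degrees of freedom) for one diploid locus."""
    n = len(genotypes)
    allele_counts = Counter(a for pair in genotypes for a in pair)
    freq = {a: c / (2 * n) for a, c in allele_counts.items()}
    observed = Counter(tuple(sorted(pair)) for pair in genotypes)
    chi2 = 0.0
    for a, b in combinations_with_replacement(sorted(freq), 2):
        # Expected count: n * p^2 for homozygotes, n * 2pq for heterozygotes.
        expected = n * (freq[a] ** 2 if a == b else 2 * freq[a] * freq[b])
        chi2 += (observed.get((a, b), 0) - expected) ** 2 / expected
    k = len(freq)
    return chi2, k * (k - 1) // 2

in_hwe = [(1, 1)] * 25 + [(1, 2)] * 50 + [(2, 2)] * 25   # matches expectations exactly
no_hets = [(1, 1)] * 50 + [(2, 2)] * 50                  # complete heterozygote deficit
stat_ok, df_ok = hwe_chi_squared(in_hwe)
stat_bad, df_bad = hwe_chi_squared(no_hets)
```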

Linkage Disequilibrium (Figure 23, Step 7a-d): I pooled the 4,902 individuals to test each pair of loci for universal significant linkage disequilibrium (LD) using the log likelihood ratio statistic (G-test) implemented in Genepop on the Web V4.2 with the default settings for the Markov chain algorithm of Raymond & Rousset, 1995a (dememorization number: 1000, number of batches: 100, number of iterations per batch: 1000). Given the large number of site-by-locus tests in this dataset (n = 330), I viewed the use of Bonferroni corrections to be overly conservative, so I considered tests significant at alpha = 0.001 (precedent in Lozier et al, 2011). When loci pairs exhibited universal LD, individuals were examined by site and the effect of questionable loci on genetic tests was further evaluated.

LOCI DIVERSITY (Figure 23, step 8a) After determining which non-species-specific loci are appropriate for use in B. impatiens, I pooled all 6,222 individuals (regardless of relatedness) to calculate diversity metrics for each locus with the R package ‘poppr’. The total number of alleles (AN), observed heterozygosity (HO), and expected heterozygosity (HE) based on Nei’s estimate of gene diversity (Nei, 1978) were obtained for each locus using the ‘locus_table’ function. Allelic frequency distributions were obtained using the ‘rraf’ function.

RESULTS

VISUALIZING ALLELIC DRIFT AND SETTING BINS USING GENEIOUS (Figure 23, Step 4) Using Geneious to visualize the pattern of allele peaks revealed several loci impacted by allelic drift, including BT10 (Figure 24A). Peaks appear early in bins at the beginning of the range, but clearly drift until appearing late in bins at the end of the range. To account for this drift, I hand-adjusted bins, widening them uniformly across the entire range. Other loci, including BL11, did not exhibit allelic drift, but peaks were diverse at each allele and predicted bins initially excluded many peaks. Visualization revealed the clear clustering of peaks, and widening bins uniformly across the entire range successfully captured appropriate peak diversity for each allele (Figure 24B). I discarded one locus, BL15, which lacked clear peak clusters and could not be appropriately binned with any adjustments (Figure 24C). Binning data can be found in Appendix L.

AMPLIFICATION SUCCESS AND LOCI DIVERSITY (Figure 23, Step 5 & 8a) The range for Locus BTMS0074 overlapped with other loci in each dye and couldn’t be included in this multiplex design. Locus BTMS0083 was discarded for failing to amplify entirely. Loci BTMS0062 and B96 amplified at < 60% initially, so primer volume was increased. After optimizing our master mix, 6,222 foragers successfully genotyped at ≥ 7 loci and were used to evaluate overall amplification success of the remaining eleven loci, all of which successfully amplified across > 97.4% of individuals (Table 14; more details in Appendix M). Based on 6,012 – 6,099 individuals per locus, the number of alleles ranged from 3 – 54 per locus and both expected and observed heterozygosity ranged from 0.02 – 0.96 per locus. Allelic frequency tables are available for each locus in Appendix N.

TESTING NON-SPECIES-SPECIFIC LOCI FOR SIGNIFICANT ISSUES (Figure 23, Step 7) After using Colony to remove related foragers with similar genotypes (Figure 23, Step 6), I evaluated loci using 4,902 individuals. The results from all tests are summarized in Table 14. Ultimately 11 non-species-specific loci were verified for future use in Bombus impatiens.

Monomorphic Loci (Figure 23, Step 7a-d): No locus was universally fixed with fewer than 2 samples (0.04%) diverging from the most common allele. However, the universal minor allele frequency (MAF) for BT28 was only 0.0086, which is just shy of the minimum threshold of 0.01 to be considered polymorphic. When examined by site, BT28 was monomorphic (MAF < 0.01) in 18 of 30 sites, prompting further evaluation. Excluding BT28 resulted in a slight, yet significant increase for both measures of genetic diversity (Figure 25A & B; HE: F1,58 = 1364.27, P < 0.001, R² = 0.96; AR: F1,58 = 306.61, P < 0.001, R² = 0.84). Because this increase was uniform across sites (HE: variance < 0.001; AR: variance = 0.102), there was no effect on the pattern of genetic diversity among sites. Contrary to my original hypothesis, excluding BT28 resulted in slightly fewer full-sibship families per site, however the difference was not significant (Figure 25C; F1,58 = 0.349, P = 0.557, R² = 0.01). No sites deviated from HWE when using all 11 loci, but excluding BT28 revealed significant deviations for 2 sites. Low levels of significant inbreeding were detected in 4 sites when using all 11 loci, and excluding BT28 revealed significant inbreeding in a 5th site (more details can be found in Appendix O).
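As a consistency check on the statistics above, R² for a one-way ANOVA can be recovered from the F statistic and its degrees of freedom as R² = (F × df_effect) / (F × df_effect + df_error). The short sketch below applies this standard identity to the reported F values; the function and variable names are hypothetical.

```python
def r_squared_from_f(f_stat, df_effect, df_error):
    """R-squared implied by a one-way ANOVA F statistic."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

# F values reported above, each with 1 and 58 degrees of freedom.
reported_f = {"HE": 1364.27, "AR": 306.61, "families": 0.349}
r2 = {name: round(r_squared_from_f(f, 1, 58), 2)
      for name, f in reported_f.items()}
```

The recovered values (0.96, 0.84 and 0.01) agree with the R² values reported for Figure 25.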

Null Alleles (Figure 23, Step 7b & d): I detected isolated incidents of significant null alleles for 5 loci, but in only 1 or 2 sites for each locus: one site each for BTMS0066, BTMS0073, BL11 and BT30, and 2 sites for BTMS0081 (details in Appendix P). Because significant null alleles were only present in 1 or 2 sites (<10%), these loci were not considered universally problematic and were retained without need for further evaluation.

Hardy-Weinberg Equilibrium (HWE) (Figure 23, Step 7a-d): I detected universal deviations from HWE for one locus, BTMS0081 (Chi² = 646.7, Df = 36, P < 0.001). When examining by site, significant deviations were only found in Site 6 (Chi² = 224.9, Df = 15, P < 0.001) and Site 25 (Chi² = 178.9, Df = 10, P < 0.001). Removing those 2 sites eliminated the universal deviation from HWE, and therefore BTMS0081 was retained without further evaluation (more details in Appendix Q).

Linkage Disequilibrium (LD) (Figure 23, Step 7a-d): I detected 9 universally linked loci pairs. When examining by site, each pair was significantly linked in only 1 or 2 sites (more details in Appendix R). Removing Site 4 eliminated 7 of the 9 universal significant linkages, while removing Site 18 and Site 6 one at a time eliminated the remaining 2 universal significant linkages. Because each universal linkage could be eliminated by removing 1 site, all loci were retained without further evaluation.

DISCUSSION

Previous studies have published step-by-step processes and checklists for designing multiplex primer sets and evaluating issues inherent to microsatellite analysis (Guichoux et al, 2011; Selkoe and Toonen, 2006). The workflow suggested in this study (Figure 23) provides a model for objectively eliminating or including potentially problematic loci based on clear thresholds that assess the severity of each potential issue, as well as a guide for evaluating the impact of potentially problematic loci on subsequent analyses. Future researchers may elect to test for additional issues or alter their tolerance for including potentially problematic loci, in which case this workflow can serve as a blueprint for designing objective evaluation protocols. The goal should be to report clear results garnered through systematic and objective processes with clearly defined protocols, all of which should be repeatable and user friendly.

The new method for visualizing allelic drift in Geneious is a simple solution to a pervasive problem, particularly when working with large sample sizes. Although it does not have the statistical prowess available in previously reported methods for setting bins (Guichoux et al, 2011; Alberto, 2009; Amos et al, 2007; Idury & Cardon, 1997), this visualization technique is intended to provide a practical, time-saving method that can reduce potential errors associated with reformatting and transferring datasets across software platforms. Using this technique, I identified insurmountable binning problems for locus BL15 (Figure 24C) and, based on the frequency of peaks that appear at both odd and even alleles, it’s possible that this locus (originally isolated from B. lucorum) may contain a single-base pair mutation in Bombus impatiens. To verify possible mutations, future studies could examine the sequences of DNA in addition to the lengths of each allele.

Including a monomorphic locus decreased overall genetic diversity per site but did not alter the pattern of genetic diversity between the 30 sites (Figure 25). This suggests that within a single study, including monomorphic loci will not affect the relative understanding of genetic diversity for a particular species. Furthermore, it’s possible that monomorphic loci would not impact comparisons made between different studies or between species, even when only one study or species includes monomorphic loci. For example, Goulson (2010) summarizes HE and AR reported for multiple bumble bee species and concludes that higher genetic diversity is a characteristic of stable species, with HE ranging from 0.52 – 0.85. Without monomorphic BT28, we found an average HE of 0.711, and with BT28, average HE dropped to 0.648. Both of those values fall within the range of “higher genetic diversity” and excluding BT28 did not change our interpretation. However, it is possible that had we considered fewer loci or if our other loci had been less diverse, the effect of monomorphic loci could have been greater. Future studies should consider the role of monomorphic loci when comparing between studies and species. To enable such considerations, researchers should test for and report results of monomorphic loci.

With BT28, all sites demonstrated HWE, but several were borderline with p-values just slightly greater than alpha = 0.002 (Appendix O). When excluding BT28, two borderline sites were pushed past the significance threshold and were considered deviating from HWE. A similar phenomenon was observed with FIS, wherein with BT28, the confidence interval for one site just barely overlapped 0, but removing BT28 caused a minor increase for the lower confidence limit to 0.001, resulting in significant inbreeding for that site. While the changes were subtle for both HWE and FIS, removing BT28 revealed a few additional significant issues. It’s possible that monomorphic loci conceal population-level issues and future studies should consider the potential impact of monomorphic loci when evaluating borderline results.

Interestingly, excluding the monomorphic locus had the opposite effect on the number of full-sibship families than was expected. I hypothesized that BT28 might inaccurately group unrelated foragers together, thus creating fewer families with more members each. Instead, the number of full-sibship families slightly increased with BT28, although it was not a significant increase (Figure 25C). I believe we see this infinitesimal increase in full-sibship families because BT28 is not entirely fixed and at times, the rare minor alleles of BT28 enable Colony to discern when two bees are not related. It could be that these rare alleles are glitches and splitting these bees into different full-sibship families is inappropriate, but I think we can conclude with confidence that monomorphic loci are not inaccurately grouping unrelated foragers together – particularly when using 11 loci to determine relatedness.

Aside from promoting new methods for visualizing allelic drift and evaluating the effect of monomorphic loci on several tests, this study has produced a well-characterized multiplex of 11 non-species-specific microsatellite loci for use in B. impatiens. This multiplex has been used by collaborators at several institutions to genotype both B. impatiens (Nicholson et al, unpublished) and B. terrestris (Reynolds et al, unpublished) with great success. Additionally, the protocols provided in the appendices have been used in tandem with multiplexes appropriate for other species (Döke et al, unpublished) to produce clear results. The appended protocols and recipes are intended to streamline the wet-lab process for future researchers, particularly those new to microsatellite analysis.

Recent calls for long term monitoring of native pollinators suggest that research on the common eastern bumble bee should continue, as this species supports both the agricultural industry and native ecosystems throughout the eastern United States. Our well-characterized multiplex can be used to standardize future studies for B. impatiens.

REFERENCES

Alberto, F. (2009). MsatAllele_1.0: an R package to visualize the binning of microsatellite alleles. Journal of Heredity, 100, 394–397.
Amos, W., Hoffman, J. I., Frodsham, A., Zhang, L., Best, S., & Hill, A. V. S. (2007). Automated binning of microsatellite alleles: Problems and solutions. Molecular Ecology Notes, 7(1), 10-14. doi:10.1111/j.1471-8286.2006.01560.x
Brookfield, J. (1996). A simple new method for estimating null allele frequency from heterozygote deficiency. Molecular Ecology, 5(3), 453-455. http://dx.doi.org/10.1046/j.1365-294x.1996.00098.x
Carvell, C., Bourke, A., Dreier, S., Freeman, S., Hulmes, S., & Jordan, W. et al. (2017). Bumblebee family lineage survival is enhanced in high-quality landscapes. Nature, 543(7646), 547-549. http://dx.doi.org/10.1038/nature21709
Chakraborty, R., Andrade, M., Daiger, S., & Budowle, B. (1992). Apparent heterozygote deficiencies observed in DNA typing data and their implications in forensic applications. Annals Of Human Genetics, 56(1), 45-57. http://dx.doi.org/10.1111/j.1469-1809.1992.tb01128.x
Darvill, B., Knight, M., & Goulson, D. (2004). Use of genetic markers to quantify bumblebee foraging range and nest density. Oikos, 107(3), 471-478. http://dx.doi.org/10.1111/j.0030-1299.2004.13510.x
Döke, M., McGrady, C., Otieno, M., Grozinger, C., & Frazier, M. (2018). Evaluating the role of local stock adaptation in overwintering success in honey bees (Hymenoptera: Apidae) in the Northeastern United States (submitted, Journal of Economic Entomology).
Dudgeon, C. L., Blower, D. C., Broderick, D., Giles, J. L., Holmes, B. J., Kashiwagi, T., . . . Ovenden, J. R. (2012). A review of the application of molecular genetics for fisheries management and conservation of sharks and rays. Journal of Fish Biology, 80(5), 1789-1843. doi:10.1111/j.1095-8649.2012.03265.x
Estoup, A., Solignac, M., Cornuet, J., Goudet, J., & Scholl, A. (1996). Genetic differentiation of continental and island populations of Bombus terrestris (Hymenoptera: Apidae) in Europe. Molecular Ecology, 5(1), 19-31. http://dx.doi.org/10.1111/j.1365-294x.1996.tb00288.x
Funk, C.R., Schmid-Hempel, R., & Schmid-Hempel, P. (2006). Microsatellite loci for Bombus spp. Molecular Ecology Notes, 6(1), 83-86. http://dx.doi.org/10.1111/j.1471-8286.2005.01147.x
Geib, J., Strange, J., & Galen, C. (2015). Bumble bee nest abundance, foraging distance, and host-plant reproduction: implications for management and conservation. Ecological Applications, 25(3), 768-778. http://dx.doi.org/10.1890/14-0151.1
Glaubitz, J.C. (2004). CONVERT: A user-friendly program to reformat diploid genotypic data for commonly used population genetic software packages. Molecular Ecology Notes, 4, 309-310.
Goudet, J. (2005). hierfstat, a package for R to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5(1), 184-186. http://dx.doi.org/10.1111/j.1471-8286.2004.00828.x
Goulson, D., Lepais, O., O’Connor, S., Osborne, J., Sanderson, R., & Cussans, J. et al. (2010). Effects of land use at a landscape scale on bumblebee nest density and survival. Journal Of Applied Ecology, 47(6), 1207-1215. http://dx.doi.org/10.1111/j.1365-2664.2010.01872.x
Goulson, D. (2010). Bumblebees behaviour, ecology, and conservation (2nd ed.). Oxford: Oxford Univ. Press.
Guichoux, E., Lagache, L., Wagner, S., Chaumeil, P., Léger, P., Lepais, O., . . . Petit, R. J. (2011). Current trends in microsatellite genotyping. Molecular Ecology Resources, 11(4), 591-611. doi:10.1111/j.1755-0998.2011.03014.x
Idury, R., & Cardon, L. (1997). A simple method for automated allele binning in microsatellite markers. Genome Research, 7(11), 1104-1109.
Jayashree, B., Reddy, P.T., Leeladevi, Y. et al. (2006). Laboratory Information Management Software for genotyping workflows: applications in high throughput crop genotyping. BMC Bioinformatics, 7, 383.
Jombart, T. (2008). adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics, 24, 1403-1405. doi:10.1093/bioinformatics/btn129
Jones, O., & Wang, J. (2010). COLONY: a program for parentage and sibship inference from multilocus genotype data. Molecular Ecology Resources, 10(3), 551-555. http://dx.doi.org/10.1111/j.1755-0998.2009.02787.x
Kalinowski, S. (2004). Counting Alleles with Rarefaction: Private Alleles and Hierarchical Sampling Designs. Conservation Genetics, 5(4), 539-543. http://dx.doi.org/10.1023/b:coge.0000041021.91777.1a
Kamvar, Z., Tabima, J., & Grünwald, N. (2014). Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. Peerj, 2, e281. http://dx.doi.org/10.7717/peerj.281
Lepais, O., Darvill, B., O’connor, S., Osborne, J., Sanderson, R., & Cussans, J. et al. (2010). Estimation of bumblebee queen dispersal distances using sibship reconstruction method. Molecular Ecology, 19(4), 819-831. http://dx.doi.org/10.1111/j.1365-294x.2009.04500.x
Meirmans, P., & Hedrick, P. (2010). Assessing population structure: FST and related measures. Molecular Ecology Resources, 11(1), 5-18. http://dx.doi.org/10.1111/j.1755-0998.2010.02927.x
Morin, P.A., Martien, K.K., Archer, F.I. et al. (2010). Applied conservation genetics and the need for quality control and reporting of genetic data used in fisheries and wildlife management. Journal of Heredity, 101, 1–10.
Nei, M. (1978). Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics, 89(3), 583–590.
Paradis, E. (2010). pegas: an R package for population genetics with an integrated-modular approach. Bioinformatics, 26(3), 419-420. http://dx.doi.org/10.1093/bioinformatics/btp696
Rao, S., & Strange, J. (2012). Bumble Bee (Hymenoptera: Apidae) Foraging Distance and Colony Density Associated With a Late-Season Mass Flowering Crop. Environmental Entomology, 41(4), 905-915. http://dx.doi.org/10.1603/en11316
Raymond, M., & Rousset, F. (1995). GENEPOP (version 1.2): population genetics software for exact tests and ecumenicism. J. Heredity, 86, 248-249.
Roesti, M., Salzburger, W., & Berner, D. (2012). Uninformative polymorphisms bias genome scans for signatures of selection. BioMed Central Evolutionary Biology, 12, 94. http://www.biomedcentral.com/1471-2148/12/94
Rousset, F. (2008). Genepop'007: a complete reimplementation of the Genepop software for Windows and Linux. Mol. Ecol. Resources, 8, 103-106.
Selkoe, K., & Toonen, R. (2006). Microsatellites for ecologists: a practical guide to using and evaluating microsatellite markers. Ecology Letters, 9(5), 615-629. http://dx.doi.org/10.1111/j.1461-0248.2006.00889.x
Stolle, E., Rohde, M., Vautrin, D., Solignac, M., Schmid-Hempel, P., Schmid-Hempel, R., & Moritz, R. (2009). Novel microsatellite DNA loci for Bombus terrestris (Linnaeus, 1758). Molecular Ecology Resources, 9(5), 1345-1352. http://dx.doi.org/10.1111/j.1755-0998.2009.02610.x
Van Oosterhout, C., Hutchinson, W., Wills, D., & Shipley, P. (2004). micro-checker: software for identifying and correcting genotyping errors in microsatellite data. Molecular Ecology Notes, 4(3), 535-538. http://dx.doi.org/10.1111/j.1471-8286.2004.00684.x
Vieira, M., Santini, L., Diniz, A., & Munhoz, C. (2016). Microsatellite markers: what they mean and why they are so useful. Genetics And Molecular Biology, 39(3), 312-328. http://dx.doi.org/10.1590/1678-4685-gmb-2016-0027
Woodard, S., Lozier, J., Goulson, D., Williams, P., Strange, J., & Jha, S. (2015). Molecular tools and bumble bees: revealing hidden details of ecology and evolution in a model system. Molecular Ecology, 24(12), 2916-2936. http://dx.doi.org/10.1111/mec.13198
Winter, D. (2012). mmod: an R library for the calculation of population differentiation statistics. Molecular Ecology Resources, 12(6), 1158-1160. http://dx.doi.org/10.1111/j.1755-0998.2012.03174.x

[Figure 23 flowchart: Systematic Workflow for Evaluating Microsatellite Loci. Steps shown: 1) Obtain tissue sample per specimen; 2) DNA extraction; 3) PCR amplification and sequencing; 4) Score alleles and assess allelic drift; 5a) Amplification success per individual across all loci; 5b) Amplification success per locus across all individuals; 6a) Identify related individuals; 6b) Remove duplicates; 7) Test each locus for fundamental issues (null alleles, monomorphic loci, deviation from Hardy-Weinberg equilibrium, linkage disequilibrium): 7a) Test pooled samples for universal issues, 7b) Test each site independently, 7c) Remove affected sites and re-test universal significance, 7d) Run preliminary genetic analyses with/without locus; 8a) Final analyses with all individuals; 8b) Final analyses with unrelated individuals.]

Figure 23. A standardized workflow for evaluating microsatellite loci, particularly when using non-species-specific loci for the first time. Actions or processes (blue squares) provide results (circles) which are used to make decisions (diamonds) about discarding or retaining loci for subsequent analyses.

Table 13. 14 microsatellite loci from Bombus spp. and their characteristics in Bombus impatiens. All loci were isolated from B. terrestris unless otherwise specified in parentheses under the locus name. Primers labeled with the same fluorescent dye color (Dye) do not have overlapping size ranges. The number of nucleotides in the repeat motifs (tri-, di-) are reported for this study, while the motif repeat sequence is reported from the references. Size ranges, measured in number of base pairs (bp), reflect the distribution of alleles found in this study based on sample sizes in parentheses. Loci that were trialed and ultimately discarded are included in gray (the last three rows). During PCR, all primers had an annealing temperature of 55 °C.

Locus | Primer Sequence | Repeat Motif | Size Range (bp) (n) | Dye | µl per sample | Ref
BTMS0066 | F: CATGATGACACCACCCAACG; R: TTAACGCCCAATGCCTTTCC | Trinucleotide ACG | 118-193^ (6199) | FAM | 0.135 | Stolle et al, 2009
B124 | F: GCAACAGGTCGGGTTAGAG; R: CAGGATAGGGTAGGTAAGCAG | Dinucleotide CT, GC, GGCT | 231-305^ (6185) | FAM | 0.2 | Estoup et al, 1995
Btern01 (B. ternarius) | F: CGTGTTTAGGGTACTGGTGGTC; R: GGAGCAAGAGGGCTAGACAAAAG | Dinucleotide AG | 122-168^ (6199) | VIC | 0.07 | Funk et al, 2006
BT28 | F: TTGCTGACGTTGCTGTGACTGAGG; R: TCCTCTGTGTGTTCTCTTACTTGGC | Trinucleotide GTT, GTTGCT | 178-199^ (6187) | VIC | 0.1 | Funk et al, 2006
BTMS0062 | F: CTGTCGCATTATTCGCGGTT; R: CTGGGCGTGATTCGATGAAC | Dinucleotide CT | 233-343^ (6172) | VIC | 0.272 | Stolle et al, 2009
BTMS0073 | F: CGATATCGCGATCTTCGTACAC; R: GTAGCATGCTCTCCGTGTTG | Trinucleotide AAG | 111-135 (6012) | NED | 0.135 | Stolle et al, 2009
BT10 | F: TCTTGCTATCCACCACCCGC; R: GGACAGAAGCATAGACGCACCG | Dinucleotide CT | 139-193 (6126) | NED | 0.05 | Funk et al, 2006
BL11 (B. lucorum) | F: AAGGGTACGAAATGCGCGAG; R: TGACGAGTGCGGCCTTTTTC | Dinucleotide TG | 122-160^ (6059) | PET | 0.109 | Funk et al, 2006
BT30 | F: ATCGTATTATTGCCACCAACCG; R: CAGCAACAGTCACAACAAACGC | Trinucleotide GCT, GTT | 178-208^ (6173) | PET | 0.091 | Funk et al, 2006
B96 | F: GGGAGAGAAAGACCAAG; R: GATCGTAATGACTCGATATG | Dinucleotide CG, CT | 227-277^ (6176) | PET | 0.29 | Estoup et al, 1996
BTMS0081 | F: ACGCGCGCCTTCTACTATC; R: AGGGACACGCGAACAGAC | Trinucleotide * | 286-331 (6198) | PET | 0.14 | Stolle et al, 2009
BTMS0074‡ | F: TTACGCGGAGATTGGACG; R: CGAACCGGCTATAGCGAA | Trinucleotide TGC | 137-156 (48) | FAM | 0.15 | Stolle et al, 2009
BL15 (B. lucorum) | Not recommended for use in Bombus impatiens - difficult to score genotypes due to inconsistent peak morphology, possible allelic drift and probable single base pair mutations | | | | | Funk et al, 2006
BTMS0083 | Not recommended for use in Bombus impatiens - complete failure to amplify indicating flanking regions not conserved between Bombus species | | | | | Stolle et al, 2009

^ Alleles appeared outside these ranges, but only in < 0.001 of all individuals.
* BTMS0081 was reported for B. terrestris with a tetranucleotide repeat motif of CCTT, but this study has revealed it to be a trinucleotide repeat motif of unknown nucleotides for B. impatiens.
‡ BTMS0074 range overlapped with other loci for every dye and could not be used in this multiplex.

[Figure 24 panels: (A) Locus BT10; (B) Locus BL11; (C) Locus BL15]

Figure 24. Visualizing cluster patterns of allele peaks and setting bins for 3 loci using Geneious. Colors represent the dye used to label primers during PCR. The thin, dark lines represent the midpoint of a peak for a single individual. The broad shaded regions represent the bin for a single allele. Peaks for (A) locus BT10 appear early in bins early in the range and exhibit drift, appearing late in bins late in range. Peaks for (B) locus BL11 are diverse for each allele, but still exhibit clear clustering and can be binned. Peaks for locus (C) BL15 vary considerably without clear patterns, resulting in many peaks falling outside of bins.

Table 14. Summary of universal issues and diversity metrics for eleven non-species-specific microsatellite loci when used in Bombus impatiens. In row 1, thresholds for universal significance are indicated below the name of each test. The % amplification success (Amp) is based on 6,222 successfully genotyped foragers followed by (sample size). The number of sites with significant null alleles (Null) is based on 4,902 unrelated individuals evaluated per site. Universal significance of monomorphic loci (Mono), deviations from Hardy-Weinberg Equilibrium (HWE), and linkage disequilibrium (LD) are based on 4,902 unrelated, pooled individuals. Monomorphic loci are indicated by reporting the minor allele frequency (maf). Values reported for HWE are p-values based on Chi-squared tests with the number of sites driving universal deviations in parentheses. Values for LD are the linked loci with the number of sites driving universal significant linkages in parentheses. Gray LD values indicate a repeat of linked loci reported earlier in the chart. Total number of alleles (AN), observed heterozygosity (HO) and expected heterozygosity based on Nei’s 1978 estimate of gene diversity (HE) are based on 6,222 pooled individuals.

Locus | Amp (< 95% of individuals) | Null (≥ 10% of sites) | Mono (maf < 0.01) | HWE (p < 0.002) | LD (p < 0.001) | AN | HO | HE
BTMS0066 | 99.8% (6199) | 3.3% (1) | | 0.811 | B124 (1), BT10 (2), BT30 (1) | 25 | 0.69 | 0.7
B124 | 99.6% (6185) | | | 0.979 | BTMS0066 (1), Btern01 (2), BT30 (1), B96 (1) | 33 | 0.91 | 0.91
Btern01 | 99.7% (6199) | | | 0.995 | B124 (2), BT30 (1) | 24 | 0.81 | 0.8
BT28 | 99.5% (6187) | | 0.0086 (18) | 0.968 | | 3 | 0.02 | 0.02
BTMS0062 | 99.2% (6172) | | | 0.040 | | 55 | 0.96 | 0.96
BTMS0073 | 96.6% (6012) | 3.3% (1) | | 0.540 | | 5 | 0.21 | 0.22
BT10 | 98.5% (6126) | | | 0.884 | BTMS0066 (1), B96 (2) | 28 | 0.92 | 0.93
BL11 | 97.4% (6059) | 3.3% (1) | | 0.087 | B96 (2) | 20 | 0.9 | 0.91
BT30 | 99.4% (6173) | 3.3% (1) | | 0.999 | BTMS0066 (1), B124 (1), Btern01 (1) | 11 | 0.38 | 0.37
B96 | 99.4% (6176) | | | 0.288 | B124 (1), BT10 (2), BL11 (2) | 26 | 0.79 | 0.79
BTMS0081 | 99.6% (6198) | 6.6% (2) | | 0.000 (2) | | 8 | 0.51 | 0.51

[Figure 25: three panels, each comparing "All 11 Loci" with "Excluding BT28". (A) Expected Heterozygosity: P < 0.001, R2 = 0.96. (B) Rarified Allelic Richness: P < 0.001, R2 = 0.84. (C) Full-sibship Families: P = 0.557, R2 = 0.01.]

Figure 25. Comparing results for several analyses when including (green) or excluding (gray) monomorphic locus BT28. Excluding BT28 resulted in a small but significant increase in genetic diversity when measured with both (A) expected heterozygosity (including: 0.648 ± 0.001 SE; excluding: 0.711 ± 0.001 SE) and (B) rarified allelic richness (including: 12.52 ± 0.04 SE; excluding: 13.57 ± 0.04 SE). Excluding BT28 did not significantly affect (C) the average number of full-sibship families detected by Colony (including: 163.4 ± 2.9 SE; excluding: 160.9 ± 2.8 SE). Excluding BT28 revealed 2 sites that significantly deviated from Hardy-Weinberg equilibrium and 1 additional site with significant inbreeding. Whiskers depict the range, with the mid-line representing the mean. Significance is represented by a bolded P-value. Sample size is 60 per test, 30 sites in each group.

APPENDICES

APPENDIX G Two wet lab recipes (5% Chelex Solution, Proteinase K Solution) and one protocol (DNA Chelex Extraction) optimized for bumble bee DNA extraction used in microsatellite analysis

FLEISCHER LAB RECIPES: Chelex Solution
V2: 16 May 2016

5% Chelex Solution Recipe

Time to Complete: 10 minutes

Reagents
5 grams Chelex 100
95 mL Milli-Q H2O

Materials
Gloves
100mL graduated cylinder
Label Tape
Electronic Balance
Fine-tipped Sharpie
Scientific Spatula
200mL Erlenmeyer flask
Plastic Weighing Tray
Magnetic Stir Bar
Parafilm

Procedure
1. Wash hands, don gloves
2. Label a 200mL Erlenmeyer flask with label tape: 5% Chelex Solution – Creation Date: # Month, ####
3. Place magnetic stir bar in flask and set aside.
4. Place plastic weighing tray on electronic balance and tare. Make sure units are grams.
5. Use the scientific spatula to measure out 5g of Chelex 100 into plastic weighing tray. Close balance doors for most accurate results.
6. Carefully scrape the Chelex from the plastic weighing tray into the flask using the scientific spatula
7. Measure 95 mL of Milli-Q H2O with a graduated cylinder and pour into flask
8. Cover flask with parafilm and store in a 4°C refrigerator
   *can be stored indefinitely – as long as it stays clean / uncontaminated

WARNING: Be careful when transporting the Chelex solution from 4°C to work space and back. If the solution sloshes up the sides of the Erlenmeyer flask, bits of solution are left behind and cannot be reclaimed.

This recipe yields approximately 100mL of solution = 100,000µL of solution
100,000µL of solution can service approximately seven 96 well plates

Clean up:
- Store the 5% Chelex 100 Solution in a 4°C refrigerator
- Store the powdered Chelex at room temperature
- Discard the plastic weighing tray & gloves
- Clean the following with Milli-Q H2O:
  o Scientific spatula – rinse with Milli-Q H2O and allow to air dry. Place in labeled ziplock bag
  o Graduated cylinder – rinse with Milli-Q H2O and allow to air dry

FLEISCHER LAB RECIPES: Proteinase K
V2: 16 May 2016

Proteinase K (10mg/mL) Solution Recipe

Estimated Time to Completion: 10 mins

Reagents:
10.1 mg Proteinase K
1010 µL Molecular H2O

Materials:
Gloves
Label Pen
Scientific Spatula
1000µL Pipette / tips
10µL Pipette / tips
1.5 mL microcentrifuge tube and cap
Electronic Balance
Vortex
Microcentrifuge

Procedure:
1. Wash hands, don gloves
2. Remove 10mg Proteinase K from -20°C freezer and allow to warm to room temperature.
3. Label microcentrifuge tube: “Proteinase K (10mg/ml)” and the creation date
4. Place microcentrifuge tube with cap on electronic balance and tare. Make sure units are in mg.
5. Use a scientific spatula to measure 10.1 mg Proteinase K directly into microcentrifuge tube. For most accurate results, close the doors of the electronic balance.
6. Using pipettes, add 1010µL molecular H2O to microcentrifuge tube
7. Vortex solution for 5 seconds and spin down. If there are flecks of Proteinase K in the lid, turn the tube upside down and vortex for 5 more seconds
8. Store at -20°C until needed.

This recipe yields approximately 1010µL of solution
1010µL of solution can service approximately two 96 well plates

Clean Up:
- Store the following in a -20°C freezer:
  o 10mg lyophilized Proteinase K
  o Proteinase K (10mg/ml) Solution
- Rinse scientific spatula with Milli-Q H2O and allow to air dry. Store in labeled ziplock bag
- Discard gloves


FLEISCHER LAB PROTOCOLS: DNA Chelex Extraction Protocol
V2: 16 May 2016

DNA Chelex Extraction Protocol

Estimated Time to Completion: ~1 hr per 96 well plate to prepare + 1 hr 31 min for thermocycling

Materials
Gloves
96 well plate tray
Multichannel Pipette + tips (1 box + extras)
200µL & 1000µL Pipette + tips
Kimwipe
Plastic Trough – Pro K
Label Pen / Tape
Magnetic Stir Plate
Thermocycler

Previously Prepared Solutions
Proteinase K (10 mg/ml) solution
5% Chelex 100 solution
96 well plate of clipped legs

Procedure:
1. Wash hands, don gloves

2. Follow Proteinase K Recipe (if solution was pre-prepared, remove from freezer and allow to thaw)

3. Follow 5% Chelex Solution Recipe (if solution was pre-prepared, remove from fridge and begin stirring on magnetic stir plate at 400 – 600 RPM)

4. Retrieve the 96 well plate containing the Bombus impatiens leg samples from the -20 freezer and place in 96 well plate tray

5. Ensure the well plate is labeled correctly with field name, year, and well plate number. Add the extraction date to plate face with label pen. Create a label on label tape that says:
   Extraction Plate mm.dd.yy
   Field name Year
   Box # Well plate 3
   initials

6. Number the cap strips 1-12 at the top of each strip and remove caps one column at a time, placing them on a clean kimwipe.

7. Use a 200μL pipette to transfer 150μL of 5% Chelex® 100 solution from the Erlenmeyer flask into each well. Be sure to pipette from Chelex solution that is actively mixing on the Magnetic Stir Plate. Change tips every three rows AND if tip comes into contact with the inside of a well or leg sample.

8. Use a 1000μL pipette to slowly dispense 505μL of Proteinase K evenly into plastic trough.*

9. Use a multichannel pipette to pipette 5μL of Proteinase K solution into each well. Mix Proteinase K and Chelex by gently pumping pipette. Ensure leg samples are fully submerged in solution. Change tips with each column.*

* Proteinase K is a viscous and “sticky” solution that clings to the side of pipette tips. Always pipette slowly when working with the Proteinase K to get as much as possible into the solution


10. Recap wells firmly after each column with the correct strip of caps in the correct orientation.

11. Do not spin down plate before incubating the samples in a thermocycler with the following protocol:
    55°C for 1 Hour
    99°C for 15 minutes
    37°C for 1 minute
    99°C for 15 minutes
    Hold at 15°C

This reaction takes just over 1hr 30mins. You now have an Extraction Well Plate. Extraction Well Plates must be stored at 4°C until needed, for short term storage (about a week or so). If you are not running your PCR within a week, then keep the plate in the -20°C freezer for use within a few months.

Clean Up:
- Store Proteinase K solution in a -20°C freezer – pipette any solution remaining in trough back into microcentrifuge tube for future use
- Store the following in a 4°C refrigerator:
  o Extraction Well Plate
  o 5% Chelex 100 Solution
- Rinse Pro K plastic trough with Milli-Q H2O and allow to air dry. Store in labeled ziplock bag
- Autoclave the following:
  o Erlenmeyer Flask – if 5% Chelex solution is depleted during the protocol, rinse the flask and autoclave. Allow flask to air dry, then store flask in a labeled ziplock bag with stir bar
  o Magnetic Stir bar – if 5% Chelex solution is depleted during the protocol, rinse the magnetic stir bar and autoclave. Allow bar to air dry, then store bar in a labeled ziplock bag with flask
- Discard the following:
  o Microcentrifuge tube for Proteinase K Solution, if Proteinase K is depleted during this protocol
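The volumes quoted in the two recipes and the extraction protocol can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration in Python; the function names are hypothetical, and the per-well and per-plate volumes are the ones stated in the protocol (150 µL Chelex and 5 µL Proteinase K per well, 505 µL of Proteinase K dispensed into the trough per plate).

```python
import math

WELLS_PER_PLATE = 96
CHELEX_UL_PER_WELL = 150    # extraction protocol, step 7
PROK_UL_IN_TROUGH = 505     # step 8: 96 x 5 uL = 480 uL, plus 25 uL of slack
CHELEX_BATCH_UL = 100_000   # yield of one 5% Chelex recipe
PROK_BATCH_UL = 1010        # yield of one Proteinase K recipe


def reagents_for(plates):
    """Volumes and recipe batches needed for a number of 96 well plates."""
    chelex_ul = plates * WELLS_PER_PLATE * CHELEX_UL_PER_WELL
    prok_ul = plates * PROK_UL_IN_TROUGH
    return {
        "chelex_ul": chelex_ul,
        "chelex_batches": math.ceil(chelex_ul / CHELEX_BATCH_UL),
        "prok_ul": prok_ul,
        "prok_batches": math.ceil(prok_ul / PROK_BATCH_UL),
    }


def incubation_minutes():
    """Sum of the timed thermocycler steps in step 11 (the final hold is excluded)."""
    return 60 + 15 + 1 + 15
```

One plate uses 14,400 µL of Chelex solution, so a 100 mL batch covers 6.9 plates, which is why the recipe says "approximately seven": a seventh full plate would need 100,800 µL. The timed incubation steps sum to 91 minutes, matching the 1 hr 31 min quoted at the top of the protocol.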


APPENDIX H
Two protocols for amplifying and sequencing DNA for microsatellite analysis

FLEISCHER LAB PROTOCOLS: Amplification Protocol
V2: 16 May 2016

Amplification Protocol

Estimated time to complete: Depends on pipetting speed of individual researcher

Materials
Gloves
Ice / ice bucket
Label pen / Sharpie / Label Tape
Two 96 well plate trays
96 well Plates
96 well plate caps
Multichannel Pipette / 10µL tips (2 boxes + extra)
200µL Pipette / tips
1000µL Pipette / tips
Plastic Trough
Thermocycler

Previously Prepared Materials
Master Mix
Extraction Well Plate

Procedure
1. Wash hands, don gloves.

2. Prepare work station for quickest work flow to reduce primer exposure to light:
   • Get ice and cover ready; place bag of cap strips nearby
   • Lay out both well plate trays, a clean kimwipe and the open plastic trough
   • Remove plastic film from 2 boxes of 10µL pipette tips, set multichannel to 1.2µL

3. Remove extraction plate from -20 / 4 / thermocycler and allow to thaw as necessary. Spin plate down.

4. Use a label pen to label the Amplification Well Plate directly on the plate face:
   Field Name Year WP # Amp: mm.dd.yy

5. Use a sharpie to create an Amplification Well Plate Label with label tape:
   Amplification Plate mm.dd.yy
   Field name year
   Run #
   Box # Well Plate #
   initials

6. Place extraction plate in well plate tray and remove the cap strips, placing them on the kimwipe.

7. Use the multichannel pipette to transfer 1.2µL of DNA extract from each well to the corresponding wells of the Amplification Well Plate. Be careful to transfer only clear DNA solution and not the white, sandy Chelex at the bottom of the DNA extraction wells. Check each pipette tip to ensure a complete sample from each well.

8. After each column, change pipette tips. After every three columns, replace caps on the extraction well plate.

9. When complete, place extraction plate in the 4 °C refrigerator to await results. Cover amplification plate in its tray and set to the side where it won’t be bumped or contaminated.

10. Prepare PCR Master Mix using the recipe page. Once prepared, use a 1000µL pipette to transfer the master mix into a plastic trough.

11. Use a multichannel pipette to transfer 9 µL of master mix from the plastic trough to each well of the Amplification Well Plate.

12. After each column, change pipette tips. After every three columns, place cap strips on the amplification well plate tightly to avoid dehydration during thermocycling. Number each cap strip at the top to ensure correct replacement in future steps. Discard any remaining master mix.

13. Spin down amp plate @ 1000rpm for 10 seconds to ensure all droplets of DNA and master mix are together in the bottom of each well

14. Incubate the Amplification well plate in a thermocycler set to this protocol:
    Initial Denaturation Step:
      3 minutes 30 seconds @ 95 °C
    Then, repeat these three Amplification steps for 30 cycles:
      30 seconds @ 95 °C
      1 minute 15 seconds @ 55 °C
      45 seconds @ 72 °C
    Followed by the final Extension step:
      15 minutes @ 72 °C
    Hold at 15 °C

This reaction takes just over 2 hours. After thermocycling, you will now have PCR Product! DNA Extract + Master Mix + Thermocycling is called PCR Product. It is mostly pure DNA with leftover bits of master mix.

Cleaning Up
- Update the active reagent box with new reagents for a complete master mix and return all reagents to the -20 °C freezer
- Place the following in a 4 °C refrigerator:
  o DNA Extraction Plate can be stored at 4 °C until you receive accurate results from the sequencing facility. At that point, move the DNA Extraction Plate to -20 °C
  o Amplification Plate can be stored at 4 °C until you receive accurate results from the sequencing facility. At that point, move the Amplification Plate to -20 °C. Make sure amplification plate is stored in a light proof box to protect photosensitive primers.
- Dispose of the following:
  o Master mix microcentrifuge tube
  o Gloves
  o Any remaining master mix in plastic trough
- Rinse master mix trough with Milli-Q H2O. Allow trough to air dry and then store in a labeled ziplock bag.
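As a rough check on the run time quoted for the amplification program, the programmed hold times can be summed. This is a small Python sketch with a hypothetical helper name; it counts only the timed steps listed in step 14. Attributing the rest of the "just over 2 hours" to the thermocycler ramping between temperatures is an assumption, not something the protocol states.

```python
def program_minutes(initial_s, cycle_steps_s, cycles, final_s):
    """Total programmed hold time of a PCR program, in minutes."""
    total_s = initial_s + cycles * sum(cycle_steps_s) + final_s
    return total_s / 60


# Values from step 14 of the Amplification Protocol.
pcr_hold_minutes = program_minutes(
    initial_s=3 * 60 + 30,        # initial denaturation, 3 min 30 s at 95 C
    cycle_steps_s=[30, 75, 45],   # 95 C, 55 C and 72 C steps of each cycle
    cycles=30,
    final_s=15 * 60,              # final extension at 72 C
)

# Each well holds 1.2 uL of DNA extract plus 9 uL of master mix.
reaction_volume_ul = 1.2 + 9
```

The programmed steps add up to 93.5 minutes, so the remaining half hour or so of the stated run time would have to come from temperature transitions and instrument overhead.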


FLEISCHER LAB PROTOCOLS: Pre-Sequencing Protocol
V2: 16 May 2016

Pre-Sequencing Protocol

Estimated time to complete: ~15 mins

Materials
Gloves
Label pen
Two 96 well plate trays
Sequencing 96 well plate*
Well Plate Sealing film*
Multichannel Pipette / tips: 1 box of 10µL pipette tips per plate
Kimwipes
Tin foil

Previously Prepared Materials
Amplification Well Plate

*At Penn State, the Sequencing facility provides the 96 well plate and the well plate sealing film that is compatible with their machines. Plates can be obtained in advance from the Chandlee Laboratory, 4th floor.

Procedure
1. Submit an order to the sequencing facility at dnalims.huck.psu.edu and print the 3 page order form.
2. Wash hands, don gloves.
3. Using a label pen, label the Sequencing plate directly on the plate face with “Fleischer Lab” and the order number (obtained from the order form)
4. Prepare work station for quickest work flow to reduce primer exposure to light:
   • Tear off two sheets of tin foil, set out sharpie marker
   • Remove plastic film from tip box, set multichannel pipette to 1µL
   • Lay out both well plate trays and place empty sequencing plate in one tray
   • Lay out kimwipe and sealing film
5. Retrieve Amplification plate from thermocycler and place it in the other well plate tray. Carefully remove all cap strips and place on a clean kimwipe
6. Use the multichannel pipette to transfer 1µL of PCR Product from the Amplification Well Plate to the corresponding wells in the Sequencing Well Plate.
7. After each column, change pipette tips. After every three columns, replace caps on the amplification well plate.
8. Carefully cover the Sequencing Well Plate with the well plate sealing film. This film is hard to readjust, so be careful to get a perfect seal the first time. Wrap the plate in one of the tin foil sheets to protect the primers from light exposure. Write “Order: #####” on the tin foil with sharpie
9. Wrap Amplification Well Plate in the 2nd sheet of foil and write:
   Field Name Year
   Well Plate #
   Run #
   Amp ##.##.##
10. Put amplification plate in 4 °C refrigerator to await results from the sequencing facility.
11. Store the sequencing plate in 4 °C for one night to get the best results. Transport the sequencing plate and order form to the Chandlee Laboratory, 4th floor for sequencing by 8:30am the following morning. Keep the plate as flat as possible during transport.

Cleaning Up
- Amplification Plate can be stored at 4 °C until you receive accurate results from the sequencing facility. Once you receive results, move both the Amplification plate and the Extraction plate to the -20 °C freezer, in the appropriate freezer box


APPENDIX I

[Annotated screenshot of the Geneious display settings. The callouts read:]
- Reduce X Scale such that the entire range of one locus is visible
- Reduce Spacing to 0
- Select “Allow Vertical Overlap”
- Select “Scale X Axes”
- Ensure appropriate loci document is selected
- Only select one dye at a time
- De-select “Show Traces”
- Select “Show Peak Calls”
- De-select “Show Peak Labels”
- Initially de-select “Show Bins” to provide a clear visual of peak clustering
- Then, select “Show Bins” and evaluate the bins predicted by Geneious and make necessary adjustments for allelic drift or peak diversity
- Callout on the remaining settings: “These settings do not impact this process”

Figure I26. Necessary settings for visualizing allelic drift using Geneious.

APPENDIX J

R script for testing microsatellite loci for significant issues (Figure 23, Step 7a)

setwd("")
library("adegenet") #http://adegenet.r-forge.r-project.org/files/tutorial-basics.pdf
library("poppr") #https://cran.r-project.org/web/packages/poppr/vignettes/mlg.html
library("magrittr") #https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html
library("pegas") #https://cran.r-project.org/web/packages/pegas/pegas.pdf
library("lattice") #https://cran.r-project.org/web/packages/lattice/lattice.pdf

#######Read in data file (I used BBEdit to convert excel files to .gen files)
#all individuals included regardless of relatedness, sorted into sites
All<-read.genepop("file name",ncode=3)

#just checking some basic info to make sure it’s the correct file
#make note of the number of individuals
summary(All) #adegenet
popNames(All) #adegenet

########Create file which excludes individuals with poor quality DNA that amplified at < 7 loci
#cutoff value indicates that any bee missing more than 37% of its data (5 or more of the 11 loci) should be removed
AllClean <- missingno(All, "geno", cutoff =.37) #poppr
summary(AllClean) #adegenet

#make sure the number of individuals has dropped from the original 'All' file
nInd(All) #adegenet #number of individuals before excluding those missing 5+ loci
nInd(AllClean) #adegenet #number of individuals AFTER excluding those missing 5+ loci

#######TESTING LOCI
#1. Check Locus for amplification success (missing data less than 5%)
missingno(AllClean, "loci") #poppr

#2. Check for any uninformative loci
informloci(AllClean) #poppr

#3. Check each locus for Null Alleles --> LEAVE R, use Microchecker
#use Microchecker results to create error file for COLONY
#use COLONY to create data files that only have 1 individual per full-sibship family

####Read in new data files that only have 1 bee per colony
#single individual per full-sibship family, sorted into sites
mydataf<-read.genepop("file name", ncode=3)
nInd(mydataf) #adegenet
#double check that these new files include only samples with 7+ loci
missingno(mydataf, "geno", cutoff =.37) #poppr
summary(mydataf) #adegenet

#4. Check for Linkage Disequilibrium between loci pairs --> LEAVE R, use Genepop on the Web

#5. Check that each locus is in Hardy-Weinberg equilibrium, thus proving they are neutral markers
#pooled dataset
(overallhwe.full <- hw.test(mydataf, B = 1000)) #pegas #per locus

#if you find a locus that is not in HWE (P<0.002), examine each site separately to determine if significance is driven by just a few sites

#By Field
(fieldhwe <- seppop(mydataf) %>% lapply(hw.test, B = 1000)) #adegenet, magrittr, pegas #per field, can be difficult to visualize depending on how large your dataset is

#creating heatmap of HWE per field, so you can see what might be driving issues in HWE
(fieldhwe.mat <- sapply(fieldhwe, "[", i = TRUE, j = 3))
alpha <- 0.001
fieldmat <- fieldhwe.mat
fieldmat[fieldmat > alpha] <- 1
levelplot(t(fieldmat)) #lattice
#I found just two fields that were out of HWE for BTMS0081, so I removed them and re-analyzed

#to remove Hootsy and Muncy because of BTMS0081 not in HWE
popNames(mydataf) #adegenet
toRemove = c(6, 14)
mydatafnohoomun=mydataf[pop=-toRemove]
popNames(mydatafnohoomun) #adegenet #check it’s been removed
mydatafnohoomunclean <- missingno(mydatafnohoomun, "geno", cutoff =.37)
(overallhwe.full <- hw.test(mydatafnohoomunclean, B = 1000)) #pegas #per locus (simple, doesn't need heatmap)

######After testing our LOCI for missing info, uninformative, null, linkage, hwe, we can obtain basic information about the loci...
#1. LOCI DIVERSITY
locus_table(AllClean) #poppr

#2. Allelic Frequency Distribution per Loci
rraf(AllClean) #poppr #overall frequencies, every bee sampled to capture rare alleles
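The heatmap step in the script above keeps only the per-field p-values at or below `alpha` and sets every other cell to 1, so that the fields driving a locus out of Hardy-Weinberg equilibrium stand out. The same filtering idea can be sketched without any genetics packages. The Python below is only an illustration: the function name, the field names and the p-values are hypothetical, and only the threshold of 0.001 is taken from the script.

```python
ALPHA = 0.001  # same value the R script assigns to `alpha`


def fields_driving_deviation(pvalues_by_field, alpha=ALPHA):
    """Map each locus to the fields whose HWE p-value is at or below alpha.

    pvalues_by_field: {field name: {locus name: p-value}}
    Loci with no flagged field are left out of the result.
    """
    flagged = {}
    for field, by_locus in pvalues_by_field.items():
        for locus, p in by_locus.items():
            if p <= alpha:
                flagged.setdefault(locus, []).append(field)
    return flagged


# Hypothetical p-values for three fields and two loci.
example = {
    "Field A": {"BTMS0081": 0.0004, "BT10": 0.62},
    "Field B": {"BTMS0081": 0.31, "BT10": 0.88},
    "Field C": {"BTMS0081": 0.0009, "BT10": 0.12},
}
flagged = fields_driving_deviation(example)
```

In this made-up example only BTMS0081 is flagged, and only in Field A and Field C, which mirrors the situation the script comments describe: a locus that deviates overall because of a small number of sites.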

R script for monitoring the effect of questionable loci on common genetic tests per each site (Figure 23, Step 7d)

setwd("")
library(hierfstat) #https://cran.r-project.org/web/packages/hierfstat/hierfstat.pdf
library("poppr") #https://cran.r-project.org/web/packages/poppr/vignettes/mlg.html

#######Read in data file (I used BBEdit to convert excel files to .gen files)
#all individuals included regardless of relatedness, sorted into sites
mydataf<-read.genepop("file name", ncode=3)

########1. Genetic Diversity
#Expected Heterozygosity
poppr(mydataf) #poppr #http://rpackages.ianhowson.com/cran/poppr/man/poppr.html for details on the output

#Allelic richness
allelic.richness(mydataf, min.n=112, diploid=TRUE) #hierfstat #min.n is the rarified sample size

#########2. Check that each population is in Hardy-Weinberg equilibrium --> leave R, use Genepop on the Web, option 1.3 and look at results for Fisher’s test across all loci for each population

######3. Inbreeding
boot.ppfis(mydataf) #hierfstat #provides 95% confidence intervals for each site
basic.stats(mydataf) #hierfstat #provides an overall measure of inbreeding

#to remove bt28 because monomorphic
locNames(mydataf) #what are the column headers for each locus
toRemove= c(4) #BT28 is the 4th locus listed
mydataf28=mydataf[loc=-toRemove]
locNames(mydataf28) #check that BT28 has been removed
mydataf28Clean <- missingno(mydataf28, "geno", cutoff =.31) #create doc without BT28 where all individuals have 7+ loci
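The two `missingno` cutoffs used in the scripts above (0.37 with all eleven loci, 0.31 once BT28 is removed) are both chosen so that every retained bee has genotypes at 7 or more loci. The arithmetic is sketched below in Python. It assumes that an individual is dropped when its fraction of missing loci exceeds the cutoff, which is how the script comments describe the filter; the function names are hypothetical.

```python
import math


def max_missing_loci(n_loci, cutoff):
    """Largest number of missing loci whose missing fraction does not exceed cutoff."""
    # The small epsilon guards against floating point error when
    # cutoff * n_loci lands exactly on a whole number.
    return math.floor(cutoff * n_loci + 1e-9)


def min_scored_loci(n_loci, cutoff):
    """Smallest number of scored loci a retained individual can have."""
    return n_loci - max_missing_loci(n_loci, cutoff)
```

With 11 loci and a cutoff of 0.37, a bee may be missing at most 4 loci (4/11 = 36.4%), so it keeps at least 7. With 10 loci and a cutoff of 0.31, a bee may be missing at most 3 loci (30%), which again leaves at least 7.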

APPENDIX K

When identifying related foragers, Colony incorporates information regarding 2 types of potential errors in the genotypes: genotyping error rate and the frequency of null alleles. Genotyping error rate is based on the individual researcher responsible for scoring alleles. After scoring genotypes initially, the researcher rescores a subset and determines the rate at which they made errors or changed subjective calls. This type of error will decrease as the skill and experience of the researcher increase (Table K15). The frequency of null alleles is estimated by running a preliminary analysis of null alleles in Microchecker, following the same process as the true analysis for null alleles with two differences: the preliminary analysis includes all individuals, because we have not yet determined which bees are related, and it is conducted on only a subset of fields, because the goal is a representative estimate. Furthermore, we are interested in the frequency of null alleles regardless of whether that frequency is considered significant. We ran a preliminary null allele analysis on 23 of our 30 fields (likely more than necessary) using the Oosterhout, Chakraborty, Brookfield 1, and Brookfield 2 equations (Chakraborty, 1992; Brookfield, 1996) implemented through Microchecker V2.2.3. Each equation varies slightly in the method used to estimate null allele frequencies and thus provides slightly different answers. For each locus, we averaged the frequencies generated by all equations across the 23 fields to determine a preliminary estimate of null alleles (Table K15).
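The two error inputs described above reduce to simple ratios and averages. The Python sketch below illustrates them with made-up numbers; the function names, the rescoring counts and the estimator values are all hypothetical and are not taken from the study. Microchecker itself is not called here.

```python
from statistics import mean


def genotyping_error_rate(calls_changed, calls_rescored):
    """Fraction of rescored allele calls that differed from the first scoring."""
    return calls_changed / calls_rescored


def preliminary_null_frequency(estimates_by_field):
    """Average every estimator's null allele frequency across all fields for one locus.

    estimates_by_field: list of {estimator name: frequency}, one dict per field.
    """
    values = [v for field in estimates_by_field for v in field.values()]
    return mean(values)


# Hypothetical output for one locus in two fields, four estimators each.
one_locus = [
    {"Oosterhout": 0.03, "Chakraborty": 0.04, "Brookfield1": 0.03, "Brookfield2": 0.02},
    {"Oosterhout": 0.04, "Chakraborty": 0.03, "Brookfield1": 0.03, "Brookfield2": 0.02},
]
null_freq = preliminary_null_frequency(one_locus)
error_rate = genotyping_error_rate(calls_changed=2, calls_rescored=200)
```

Because every field contributes the same four estimators, pooling all values gives the same result as averaging the per-field means. In this invented example the locus would enter Colony with a null allele frequency of about 0.03 and a genotyping error rate of 0.01.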

Table K15. Error rates used by Colony when identifying related foragers

Locus | Genotyping Error Rate | Preliminary Frequency of Null Alleles
BTMS0066 | 0.005 | 0.02
B124 | 0.02 | 0.01
Btern01 | 0.01 | 0.01
BT28 | 0.005 | 0.03
BTMS0062 | 0.02 | 0.01
BTMS0073 | 0.005 | 0.04
BT10 | 0.04 | 0.01
BL11 | 0.05 | 0.01
BT30 | 0.005 | 0.03
B96 | 0.01 | 0.01
BTMS0081 | 0.02 | 0.03

APPENDIX L

Table L16. Binning information for each locus including locus name, dye color (6-FAM, VIC, NED, PET), allele length (allele) and the minimum (min) and maximum (max) for each allele’s bin.

BTMS0066 (6-FAM)
Allele | Min | Max
118 | 116.7 | 118.7
121 | 119.7 | 121.7
124 | 122.7 | 124.7
127 | 125.7 | 127.7
130 | 128.7 | 130.7
133 | 131.7 | 133.7
136 | 134.7 | 136.7
139 | 137.7 | 139.7
142 | 140.7 | 142.7
145 | 143.7 | 145.7
148 | 146.7 | 148.7
151 | 149.7 | 151.7
154 | 152.7 | 154.7
157 | 155.7 | 157.7
160 | 158.7 | 160.7
163 | 161.7 | 163.7
166 | 164.7 | 166.7
169 | 167.7 | 169.7
172 | 170.7 | 172.7
175 | 173.7 | 175.7
178 | 176.7 | 178.7
181 | 179.7 | 181.7
184 | 182.7 | 184.7
187 | 185.7 | 187.7
190 | 188.7 | 190.7
193 | 191.7 | 193.7

Btern01 (VIC)
Allele | Min | Max
122 | 120.5 | 122.2
124 | 122.5 | 124.2
126 | 124.5 | 126.2
128 | 126.5 | 128.2
130 | 128.5 | 130.2
132 | 130.5 | 132.2
134 | 132.5 | 134.2
136 | 134.5 | 136.2
138 | 136.5 | 138.2
140 | 138.5 | 140.2
142 | 140.5 | 142.2
144 | 142.5 | 144.2
146 | 144.5 | 146.2
148 | 146.5 | 148.2
150 | 148.5 | 150.2
152 | 150.5 | 152.2
154 | 152.5 | 154.2
156 | 154.5 | 156.2
158 | 156.5 | 158.2
160 | 158.5 | 160.2
162 | 160.5 | 162.2
164 | 162.5 | 164.2
166 | 164.5 | 166.2
168 | 166.5 | 168.2

BTMS0073 (NED)
Allele | Min | Max
111 | 110.6 | 113
114 | 113.6 | 116
117 | 116.6 | 119
120 | 119.6 | 122
123 | 122.6 | 125
126 | 125.6 | 126.9
129 | 128.6 | 131
132 | 131.6 | 134
135 | 134.6 | 136.8

BL11 (PET)
Allele | Min | Max
122 | 121.7 | 123.3
124 | 123.7 | 125.3
126 | 125.7 | 127.3
128 | 127.7 | 129.3
130 | 129.7 | 131.3
132 | 131.7 | 133.3
134 | 133.7 | 135.3
136 | 135.7 | 137.3
138 | 137.7 | 139.3
140 | 139.7 | 141.3
142 | 141.7 | 143.3
144 | 143.7 | 145.3
146 | 145.7 | 147.3
148 | 147.7 | 149.3
150 | 149.7 | 151.3
152 | 151.7 | 153.3
154 | 153.7 | 155.3
156 | 155.7 | 157.3
158 | 157.7 | 159.3
160 | 159.7 | 161.3

BT28 (VIC)
Allele | Min | Max
178 | 177.5 | 179.4
181 | 180.5 | 182.4
184 | 183.5 | 185.4
187 | 186.5 | 188.4
190 | 189.5 | 191.4
193 | 192.5 | 194.4
196 | 195.5 | 197.4
199 | 198.5 | 200.4

BT10 (NED)
Allele | Min | Max
139 | 137.9 | 139.5
141 | 139.9 | 141.5
143 | 141.9 | 143.5
145 | 143.9 | 145.5
147 | 145.9 | 147.5
149 | 147.9 | 149.5
151 | 149.9 | 151.5
153 | 151.9 | 153.5
155 | 153.9 | 155.5
157 | 155.9 | 157.5
159 | 157.9 | 159.5
161 | 159.9 | 161.5
163 | 161.9 | 163.5
165 | 163.9 | 165.5
167 | 165.9 | 167.5
169 | 167.9 | 169.5
171 | 169.9 | 171.5
173 | 171.9 | 173.5
175 | 173.9 | 175.5
177 | 175.9 | 177.5
179 | 177.9 | 179.5
181 | 179.9 | 181.5
183 | 181.9 | 183.5
185 | 183.9 | 185.5
187 | 185.9 | 187.5
189 | 187.9 | 189.5
191 | 189.9 | 191.5
193 | 191.9 | 193.5

BT30 (PET)
Allele | Min | Max
178 | 177.3 | 179.5
181 | 180.3 | 182.5
184 | 183.3 | 185.5
187 | 186.3 | 188.5
190 | 189.3 | 191.5
193 | 192.3 | 194.5
196 | 195.3 | 197.5
199 | 198.3 | 200.5
202 | 201.3 | 203.5
205 | 204.3 | 206.5
208 | 207.3 | 209.5


B124 (6-FAM)
Allele | Min | Max
231 | 230.1 | 231.6
233 | 232.1 | 233.6
235 | 234.1 | 235.6
237 | 236.1 | 237.6
239 | 238.1 | 239.6
241 | 240.1 | 241.6
243 | 242.1 | 243.6
245 | 244.1 | 245.6
247 | 246.1 | 247.6
249 | 248.1 | 249.6
251 | 250.1 | 251.6
253 | 252.1 | 253.6
255 | 254.1 | 255.6
257 | 256.1 | 257.6
259 | 258.1 | 259.6
261 | 260.1 | 261.6
263 | 262.1 | 263.6
265 | 264.1 | 265.6
267 | 266.1 | 267.6
269 | 268.1 | 269.6
271 | 270.1 | 271.6
273 | 272.1 | 273.6
275 | 274.1 | 275.6
277 | 276.1 | 277.6
279 | 278.1 | 279.6
281 | 280.1 | 281.6
283 | 282.1 | 283.6
285 | 284.1 | 285.6
287 | 286.1 | 287.6
289 | 288.1 | 289.6
291 | 290.1 | 291.6
293 | 292.1 | 293.6
295 | 294.1 | 295.6
297 | 296.1 | 297.6
299 | 298.1 | 299.6
301 | 300.1 | 301.6
303 | 302.1 | 303.6
305 | 304.1 | 305.6

BTMS0062 (VIC)
Allele | Min | Max
233 | 232 | 233.6
235 | 234 | 235.6
237 | 236 | 237.6
239 | 238 | 239.6
241 | 240 | 241.6
243 | 242 | 243.6
245 | 244 | 245.6
247 | 246 | 247.6
249 | 248 | 249.6
251 | 250 | 251.6
253 | 252 | 253.6
255 | 254 | 255.6
257 | 256 | 257.6
259 | 258 | 259.6
261 | 260 | 261.6
263 | 262 | 263.6
265 | 264 | 265.6
267 | 266 | 267.6
269 | 268 | 269.6
271 | 270 | 271.6
273 | 272 | 273.6
275 | 274 | 275.6
277 | 276 | 277.6
279 | 278 | 279.6
281 | 280 | 281.6
283 | 282 | 283.6
285 | 284 | 285.6
287 | 286 | 287.6
289 | 288 | 289.6
291 | 290 | 291.6
293 | 292 | 293.6
295 | 294 | 295.6
297 | 296 | 297.6
299 | 298 | 299.6
301 | 300 | 301.6
303 | 302 | 303.6
305 | 304 | 305.6
307 | 306 | 307.6
309 | 308 | 309.6
311 | 310 | 311.6
313 | 312 | 313.6
315 | 314 | 315.6
317 | 316 | 317.6
319 | 318 | 319.6
321 | 320 | 321.6
323 | 322 | 323.6
325 | 324 | 325.6
327 | 326 | 327.6
329 | 328 | 329.6
331 | 330 | 331.6
333 | 332 | 333.6
335 | 334 | 335.6
337 | 336 | 337.6
339 | 338 | 339.6
341 | 340 | 341.6
343 | 342 | 343.6

B96 (PET)
Allele | Min | Max
227 | 227.2 | 228.9
229 | 229.1 | 230.8
231 | 231.1 | 232.8
233 | 233 | 234.7
235 | 235 | 236.7
237 | 236.9 | 238.6
239 | 238.9 | 240.6
241 | 240.8 | 242.5
243 | 242.8 | 244.5
245 | 244.7 | 246.4
247 | 246.7 | 248.4
249 | 248.6 | 250.3
251 | 250.6 | 252.3
253 | 252.5 | 254.2
255 | 254.5 | 256.2
257 | 256.4 | 258.1
259 | 258.4 | 260.1
261 | 260.3 | 262
263 | 262.3 | 264
265 | 264.2 | 265.9
267 | 266.2 | 267.9
269 | 268.1 | 269.8
271 | 270.1 | 271.8
273 | 272 | 273.7
275 | 274 | 275.7
277 | 276 | 277.6

BTMS0081 (PET)
Allele | Min | Max
286 | 284.9 | 287.4
289 | 287.9 | 290.4
292 | 290.9 | 293.4
295 | 293.9 | 296.4
298 | 296.9 | 299.4
301 | 299.9 | 302.4
304 | 302.9 | 305.4
307 | 305.9 | 308.4
310 | 308.9 | 311.4
313 | 311.9 | 314.4
316 | 314.9 | 317.4
319 | 317.9 | 320.4
322 | 320.9 | 323.4
325 | 323.9 | 326.4
328 | 326.9 | 329.4
331 | 329.9 | 332.4
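The bins in Table L16 are used to turn a measured peak position into an allele call: a peak is assigned to the allele whose min-max range contains it, and a peak that falls between bins is left uncalled. A minimal Python sketch of that lookup follows. It is not the Geneious implementation; the function name is hypothetical, the BT28 bins are copied from the table, and the peak positions in the usage lines are invented.

```python
from bisect import bisect_right

# BT28 bins from Table L16 as (allele, min, max), sorted by min.
BT28_BINS = [
    (178, 177.5, 179.4), (181, 180.5, 182.4), (184, 183.5, 185.4),
    (187, 186.5, 188.4), (190, 189.5, 191.4), (193, 192.5, 194.4),
    (196, 195.5, 197.4), (199, 198.5, 200.4),
]


def call_allele(peak, bins):
    """Return the allele whose bin contains the peak, or None if it is outside all bins.

    Assumes bins are sorted by their minimum and do not overlap.
    """
    mins = [b[1] for b in bins]
    i = bisect_right(mins, peak) - 1   # last bin whose min is <= peak
    if i >= 0 and peak <= bins[i][2]:
        return bins[i][0]
    return None


# Invented peak positions for illustration.
inside = call_allele(184.2, BT28_BINS)    # inside the 184 bin
between = call_allele(179.9, BT28_BINS)   # in the gap between the 178 and 181 bins
```

A peak returned as `None` corresponds to the situation shown for locus BL15 in Figure 24, where peaks fall outside of the defined bins.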

APPENDIX M

After removing 98 individuals with poor quality DNA, 6,222 foragers were successfully genotyped at ≥ 7 loci and were used to evaluate overall amplification success of the remaining eleven loci. All loci amplified successfully in more than 95% of foragers (Table M17, row ‘Overall’). Several isolated incidents of less than 95% amplification success do occur at some sites, largely due to wet lab processing errors. For example, the BTMS0073 primer was unintentionally excluded from the master mix used to amplify 96 samples from field 11, leading to an isolated incident of only 48.5% amplification success. In another isolated instance of only 78.3% amplification success, supplies of the BL11 primer were depleted near the end of the experiment, and the stock solution procured from another lab turned out to be substandard, resulting in weaker amplification and missing data in some of the final collections processed, including field 11. In both examples, lack of amplification success was not due to inherent issues with the loci.

Table M17. Percent of individuals that successfully amplified at each locus, displayed by site (n = 30) and summarized across all samples (overall). 100% amplification success is indicated by a period (.). Shading depicts a gradient of amplification success from 0 – 100% with the darkest colors indicating the least success.

Site | BTMS0066 | B124 | Btern01 | BT28 | BTMS0062 | BTMS0073 | BT10 | BL11 | BT30 | B96 | BTMS0081 | Mean
Overall | 99.8% | 99.6% | 99.7% | 99.5% | 99.2% | 96.6% | 98.5% | 97.4% | 99.4% | 99.4% | 99.6% | 99.0%
Field 1 | . | 99.1% | . | . | 99.1% | . | 98.2% | 99.5% | . | 98.2% | . | 99.5%
Field 2 | 99.4% | 99.4% | 98.7% | . | . | . | 96.8% | 95.5% | . | . | . | 99.1%
Field 3 | . | 99.5% | 99.5% | . | 99.5% | . | 99.5% | . | 99.5% | 99.5% | . | 99.7%
Field 4 | 99.5% | 99.0% | . | 99.5% | 99.0% | 99.5% | 99.5% | 98.5% | 99.5% | 99.5% | 99.0% | 99.3%
Field 5 | . | 99.0% | . | . | 98.5% | 99.5% | . | 95.1% | 99.5% | 99.5% | 98.5% | 99.1%
Field 6 | . | . | 99.0% | . | . | . | . | 98.5% | 97.5% | . | . | 99.5%
Field 7 | 98.7% | 99.2% | 98.7% | . | 96.6% | 84.3% | . | 98.3% | 99.2% | 97.5% | 97.9% | 97.3%
Field 8 | 99.6% | . | 99.6% | 99.6% | . | 83.4% | . | 99.1% | . | . | . | 98.3%
Field 9 | . | . | . | . | 99.6% | 99.6% | 99.2% | 98.8% | . | 99.6% | . | 99.7%
Field 10 | . | 99.1% | 99.5% | 99.5% | 99.1% | . | 99.5% | 97.2% | . | . | 99.5% | 99.4%
Field 11 | 99.6% | . | 99.6% | . | . | . | 99.2% | 78.2% | . | . | . | 97.9%
Field 12 | . | 99.0% | 99.5% | 99.5% | 99.5% | . | 96.0% | 98.0% | 98.5% | 99.5% | 98.5% | 98.9%
Field 13 | . | . | 99.5% | . | 99.5% | . | . | . | 99.0% | 99.5% | 99.5% | 99.7%
Field 14 | . | 99.5% | . | . | . | . | 99.5% | 99.1% | . | . | 99.5% | 99.8%
Field 15 | . | 99.5% | . | 99.5% | 95.0% | . | 97.0% | 98.5% | 97.5% | 96.5% | 99.5% | 98.5%
Field 16 | . | . | . | 93.8% | 97.9% | 48.5% | 95.9% | 96.9% | 99.0% | 99.0% | 99.0% | 93.6%
Field 17 | 99.0% | . | 99.0% | 98.0% | 99.0% | 92.1% | 99.5% | 99.0% | 98.5% | 97.5% | 99.5% | 98.3%
Field 18 | . | 99.0% | 99.5% | . | 98.6% | . | 99.0% | 92.3% | 99.5% | 99.5% | . | 98.9%
Field 19 | . | 99.5% | . | 99.5% | 99.5% | . | 99.1% | 97.7% | . | . | 99.5% | 99.5%
Field 20 | 99.4% | 99.4% | . | 98.9% | . | . | 98.9% | 99.4% | 99.4% | 99.4% | . | 99.5%
Field 21 | 99.1% | 99.5% | . | 99.5% | 99.5% | 95.2% | 97.1% | 99.5% | 98.1% | 99.1% | . | 98.8%
Field 22 | . | . | . | . | . | 99.5% | 97.2% | . | . | . | . | 99.7%
Field 23 | . | . | . | 98.1% | . | . | 99.5% | 99.5% | 99.5% | 99.5% | . | 99.7%
Field 24 | . | . | . | 99.6% | . | . | 97.3% | . | . | 99.6% | . | 99.7%
Field 25 | . | . | . | 99.5% | . | . | 97.6% | . | . | . | . | 99.7%
Field 26 | 99.5% | . | . | . | 99.0% | 98.4% | 93.7% | 98.4% | . | . | 99.0% | 98.9%
Field 27 | 99.5% | 98.5% | 99.5% | 99.0% | 99.5% | . | 96.5% | 99.0% | 99.0% | . | . | 99.1%
Field 28 | . | 99.5% | 99.5% | . | 99.5% | 99.5% | 98.1% | 99.0% | . | 99.0% | . | 99.5%
Field 29 | . | 99.1% | 99.1% | . | . | . | . | 93.1% | . | 99.1% | . | 99.1%
Field 30 | . | 99.4% | . | . | 99.4% | . | 98.9% | 97.8% | 98.9% | . | . | 99.5%

APPENDIX N

Allelic frequency tables are available for each locus in Figure N27, along with sample size (N), number of alleles (AN) and expected heterozygosity based on Nei’s gene diversity (HE).

[Figure N27, panels A-K: one bar chart per locus, with Alleles on the x-axis and Allelic Frequency on the y-axis.]
(A) Locus BTMS0066: N = 6199, AN = 25, HE = 0.7
(B) Locus B124: N = 6185, AN = 33, HE = 0.91
(C) Locus Btern01: N = 6199, AN = 24, HE = 0.8
(D) Locus BTMS0062: N = 6172, AN = 54, HE = 0.96
(E) Locus BT28: N = 6187, AN = 3, HE = 0.02
(F) Locus BTMS0073: N = 6012, AN = 5, HE = 0.22
(G) Locus BT10: N = 6126, AN = 28, HE = 0.93
(H) Locus BL11: N = 6059, AN = 20, HE = 0.91
(I) Locus B96: N = 6176, AN = 26, HE = 0.79
(J) Locus BT30: N = 6173, AN = 11, HE = 0.37
(K) Locus BTMS0081: N = 6198, AN = 8, HE = 0.51

Figure N27. Allele frequency distributions for 11 loci (A-K) based on the 6,222 B. impatiens genotypes examined in this study. For each locus, N denotes the number of foragers successfully genotyped, AN denotes the total number of alleles, and HE denotes the expected heterozygosity based on Nei’s 1978 estimate of gene diversity. Alleles missing from the x-axis indicate no individuals were found to have that allele. Colors indicate the fluorescent dye used during PCR.
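The HE values reported per locus rest on Nei's (1978) gene diversity. As a minimal sketch of that calculation, assuming diploid genotypes and the usual unbiased form HE = 2N(1 - Σp²)/(2N - 1), the function below works from allele counts at a single locus. The function name and the example counts are hypothetical and are not taken from the thesis data.

```python
from typing import Sequence


def nei_expected_heterozygosity(allele_counts: Sequence[int]) -> float:
    """Unbiased expected heterozygosity (Nei 1978) for one locus.

    allele_counts holds the number of gene copies observed for each allele,
    so for N diploid individuals sum(allele_counts) == 2 * N.
    """
    total_copies = sum(allele_counts)  # 2N gene copies
    if total_copies < 2:
        raise ValueError("need at least two gene copies")
    # Sum of squared allele frequencies (expected homozygosity)
    sum_sq = sum((count / total_copies) ** 2 for count in allele_counts)
    # Small-sample correction 2N / (2N - 1)
    return (total_copies / (total_copies - 1)) * (1.0 - sum_sq)


# Hypothetical locus: 3 alleles counted across 10 diploid foragers (20 copies)
example_he = nei_expected_heterozygosity([10, 6, 4])
# A locus with a single allele has no expected heterozygosity
monomorphic_he = nei_expected_heterozygosity([20])
```

A locus dominated by one allele, such as BT28 in panel E, gives a value near zero under this formula, while loci with many alleles at similar frequencies approach one.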

APPENDIX O

For each site (1-30), the number of individuals genotyped, the number of full-sibship families identified, measures of genetic diversity (HE and AR), population HWE and inbreeding are reported when including and excluding BT28, a monomorphic locus (Table O18). The number of full-sibship families decreased, but not significantly, when excluding BT28. Both HE and AR increased significantly when excluding monomorphic loci. Two of 30 sites were found to significantly deviate from HWE when excluding BT28. One additional site (5 total) had significant yet low levels of inbreeding when excluding BT28, but the overall inbreeding coefficient (FIS = 0.0058) did not change.

Table O18. Results of common genetic tests performed with (11 loci) and without monomorphic locus BT28 (Ex BT28). Number of individuals successfully genotyped (NG) were first sorted into full-sibship families (NF) using Colony. Genetic diversity is measured with expected heterozygosity (HE) and allelic richness (AR). Values of NF, HE, and AR when excluding BT28 are colored when values decrease (red) or increase (green). Deviations from Hardy-Weinberg Equilibrium (HWE) are significant at p < 0.002 after Bonferroni corrections. Inbreeding (FIS) is significant when positive 95% confidence intervals do not overlap 0. Significance is bolded and changes in significance when excluding BT28 have *. Tests for HE, AR, HWE and FIS were run with sample sizes listed under NF, 11 loci.

In the column headers, (11) = 11 loci and (Ex) = excluding BT28.

Site   NG      NF(11)  NF(Ex)  HE(11)  HE(Ex)  AR(11)  AR(Ex)  HWE(11)  HWE(Ex)      FIS(11)            FIS(Ex)
1      219     161     160     0.645   0.704   12.2    13.2    0.227    0.394        (-0.024, 0.007)    (-0.028, 0.011)
2      156     132     133     0.653   0.716   12.6    13.7    0.559    0.328        (-0.016, 0.031)    (-0.009, 0.041)
3      193     124     120     0.638   0.7     12.2    13.2    0.691    0.613        (-0.035, 0.017)    (-0.034, 0.015)
4      205     155     151     0.641   0.704   12.5    14      0.088    0.019        (-0.034, 0.039)    (-0.04, 0.034)
5      203     168     166     0.655   0.716   13      13.6    0.977    0.968        (-0.028, 0.024)    (-0.024, 0.023)
6      196     166     161     0.641   0.702   12.4    13.5    0.003    0.0005*      (-0.01, 0.03)      (-0.006, 0.03)
7      236     178     176     0.651   0.714   12.6    13.3    0.504    0.224        (-0.017, 0.011)    (-0.018, 0.012)
8      229     185     185     0.652   0.716   13      13.7    0.199    0.103        (-0.033, 0.012)    (-0.031, 0.016)
9      242     191     186     0.653   0.718   12.5    13.4    0.155    0.086        (-0.011, 0.038)    (-0.007, 0.034)
10     211     146     147     0.648   0.71    12.3    13.8    0.129    0.271        (-0.023, 0.038)    (-0.021, 0.042)
11     239     183     181     0.644   0.708   12.2    13.8    0.476    0.573        (-0.01, 0.032)     (-0.017, 0.029)
12     199     168     168     0.652   0.715   12.7    13.8    0.121    0.032        (-0.004, 0.024)    (-0.001, 0.021)
13     205     170     168     0.647   0.71    12.6    13.2    0.176    0.175        (-0.001, 0.058)    (0.001, 0.055)*
14     219     148     148     0.638   0.7     12.4    13.6    0.750    0.543        (-0.002, 0.035)    (-0.004, 0.028)
15     201     159     156     0.638   0.701   12.2    13.8    0.326    0.347        (-0.015, 0.021)    (-0.014, 0.021)
16     194     146     143     0.645   0.708   12.4    13.7    0.653    0.672        (-0.003, 0.03)     (-0.009, 0.024)
17     202     168     167     0.651   0.714   12.4    14      0.421    0.319        (-0.013, 0.028)    (-0.01, 0.029)
18     207     148     148     0.647   0.709   12.7    13.7    0.028    high. sig.*  (0.012, 0.055)     (0.015, 0.055)
19     218     177     172     0.654   0.714   12.6    13.4    0.895    0.689        (-0.021, 0.009)    (-0.021, 0.007)
20     176     155     152     0.651   0.712   12.7    13.5    0.955    0.918        (-0.001, 0.035)    (-0.005, 0.031)
21     210     172     169     0.649   0.712   12.2    13.3    0.372    0.181        (0.004, 0.033)     (0.007, 0.038)
22     211     175     171     0.646   0.708   12.7    13.4    0.560    0.445        (-0.006, 0.031)    (-0.003, 0.032)
23     210     171     167     0.641   0.705   12.6    13.6    0.194    0.213        (-0.028, 0.011)    (-0.026, 0.013)
24     221     179     179     0.642   0.705   12.7    13.2    0.787    0.788        (-0.007, 0.034)    (-0.007, 0.035)
25     212     177     171     0.648   0.711   12.6    13.6    0.421    0.392        (0.008, 0.035)     (0.006, 0.036)
26     191     153     151     0.659   0.722   12.6    13.3    0.454    0.234        (-0.021, 0.025)    (-0.02, 0.028)
27     202     159     159     0.659   0.723   12.3    13.7    0.784    0.601        (0.002, 0.05)      (0.005, 0.053)
28     205     167     161     0.663   0.726   12.4    13.7    0.642    0.372        (-0.042, -0.001)   (-0.038, -0.004)
29     231     181     176     0.648   0.711   12.4    13.4    0.352    0.411        (-0.026, 0.009)    (-0.027, 0.011)
30     179     140     137     0.644   0.707   12.8    13.9    0.345    0.3432       (-0.037, 0.029)    (-0.028, 0.011)

Total: NG = 6,222+; NF = 163.4 ± 2.9SE^ (11 loci), 160.9 ± 2.8SE^ (Ex BT28); HE = 0.648 ± 0.001SE^ (11 loci), 0.711 ± 0.001SE^ (Ex BT28); AR = 12.52 ± 0.04SE^ (11 loci), 13.57 ± 0.04SE^ (Ex BT28); HWE = 0* (11 loci), 2* (Ex BT28); FIS = 0.0058* (4) (11 loci), 0.0058* (5) (Ex BT28)

+Summed across sites, ^Averaged across sites, *Calculated overall

APPENDIX P

For all loci, null alleles were the result of stuttering, indicated by a deficit of heterozygotes separated by a single repeat motif (Table P19). No locus was affected by large allelic dropout, as is suggested by a general excess of homozygotes for most allele size classes.

Table P19. For each locus, the total observed homozygotes (TOBS) significantly exceeds the total expected homozygotes (TEXP). In all instances, the source of error (Error) is identified as stuttering. Equations from Chakraborty 1992 and Brookfield 1996 are used to provide estimates of null allele frequencies (Oosterhout, Chakraborty, Brookfield 1, Brookfield 2).

Locus (Site)         TEXP    TOBS  Error       Oosterhout  Chakraborty  Brookfield 1  Brookfield 2
BTMS0066 (site 13)   51.6    64    Stuttering  0.0548      0.0551       0.0429        0.0429
BTMS0073 (site 17)   115.9   124   Stuttering  0.0838      0.1038       0.0401        0.217
BL11 (site 6)        16.5    26    Stuttering  0.0332      0.033        0.0303        0.0473
BT30 (site 23)       81.1    87    Stuttering  0.0809      0.0893       0.0385        0.186
BTMS0081 (site 4)    76.3    92    Stuttering  0.0982      0.1143       0.0685        0.1191
BTMS0081 (site 27)   78      91    Stuttering  0.0779      0.0871       0.0541        0.0541
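To make the null allele columns easier to follow, the sketch below applies the forms in which the Chakraborty and Brookfield 1 estimators are usually written, r = (He - Ho)/(He + Ho) and r = (He - Ho)/(1 + He), to the BTMS0066 (site 13) row. The sample size of 170 is an assumption borrowed from the NF (11 loci) entry for site 13 in Table O18; the thesis does not state the exact number of genotypes used for this locus, so the results should only be expected to land close to the tabled values.

```python
def heterozygosity_from_homozygotes(n_homozygotes: float, n_genotyped: int) -> float:
    """Convert a homozygote count into a heterozygosity proportion."""
    return 1.0 - n_homozygotes / n_genotyped


def chakraborty_null_freq(h_exp: float, h_obs: float) -> float:
    """Null allele frequency as usually attributed to Chakraborty et al. (1992)."""
    return (h_exp - h_obs) / (h_exp + h_obs)


def brookfield1_null_freq(h_exp: float, h_obs: float) -> float:
    """Null allele frequency, Brookfield (1996) estimator 1."""
    return (h_exp - h_obs) / (1.0 + h_exp)


# BTMS0066, site 13 (Table P19): TEXP = 51.6, TOBS = 64 homozygotes
N_GENOTYPED = 170  # assumption: NF for site 13 (11 loci) in Table O18
h_exp = heterozygosity_from_homozygotes(51.6, N_GENOTYPED)
h_obs = heterozygosity_from_homozygotes(64, N_GENOTYPED)

chakraborty = chakraborty_null_freq(h_exp, h_obs)
brookfield1 = brookfield1_null_freq(h_exp, h_obs)
```

Under that assumed sample size both estimates fall within rounding distance of the Chakraborty and Brookfield 1 entries in the first row of Table P19, which suggests the table was produced with these forms.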

APPENDIX Q

Table Q20. Testing each locus for deviations from Hardy-Weinberg Equilibrium across 4,902 foragers.

            All Sites                   Excluding Site 6 and Site 25
Locus       Chi2       Df     P         Chi2       Df     P
BTMS0066    437.964    465    0.811     430.140    465    0.875
B124        559.624    630    0.979     553.037    630    0.988
Btern01     263.221    325    0.995     286.076    325    0.941
BT28        1.366      6      0.968     1.613      6      0.952
BTMS0062    1582.048   1485   0.040     1552.687   1485   0.108
BTMS0073    13.812     15     0.540     13.121     15     0.593
BT10        372.243    406    0.884     362.941    378    0.702
BL11        260.770    231    0.087     264.382    231    0.065
BT30        48.960     91     0.999     51.410     91     0.999
B96         392.903    378    0.288     392.171    378    0.297
BTMS0081    646.724    36     0.000     14.140     36     0.999

Out of curiosity, I evaluated the effect of BTMS0081 on several of the common genetic tests that did not require much additional work. Excluding BTMS0081 had a significant, positive effect on HE (F1, 58 = 64.38, P < 0.001). However, the average increase was relatively small (0.013) and uniform across sites (variance < 0.001), indicating no effect on the pattern of genetic diversity among sites. This change in genetic diversity was unlikely to be caused by the universal deviation from HWE and is more likely the result of BTMS0081 having a low number of alleles. With all 11 loci, low levels of inbreeding were detected in 4 sites. When excluding BTMS0081, inbreeding was detected in only 3 sites, 2 of which had also been detected with all 11 loci and 1 of which was new.

Table Q21. Summary comparison of limited test results when including and excluding BTMS0081.

                                          11 loci – including BTMS0081   10 loci – excluding BTMS0081
Average Expected Heterozygosity (HE)      0.648 ± 0.001SE                0.661 ± 0.001SE
Sites with significant inbreeding (FIS)   4                              3

APPENDIX R

Table R22. Each locus pair included below exhibited highly significant linkage disequilibrium for the universal dataset. The Site column indicates the unique sites driving the universal linkage disequilibrium. P-Value indicates the probability per site that the two loci were not linked given the observed results. Universal P-value indicates the probability that two loci were not linked for the universal dataset once removing the site in ( ).

Universal P-value columns, in order: (Exclude Site 4), (Exclude Site 18), (Exclude Site 6)

Locus#1    Locus#2    Site (P-Value)                          Universal P-value entries
BTMS0066   B124       Site 4 (0.00000)                        0.81  ✓  ✓
B124       Btern01    Site 4 (0.00000), Site 10 (0.00000)     0.38  ✓  ✓
BTMS0066   BT10       Site 13 (0.00000), Site 30 (0.00028)    0.06  ✓  ✓
BTMS0066   BT30       Site 18 (0.00000)                       0.27  ✓
B124       BT30       Site 12 (0.00000)                       0.77  ✓  ✓
Btern01    BT30       Site 4 (0.00000)                        0.68  ✓  ✓
B124       B96        Site 23 (0.00000)                       0.30  ✓  ✓
BT10       B96        Site 4 (0.00000), Site 20 (0.00000)     0.92  ✓  ✓
BL11       B96        Site 6 (0.00000), Site 29 (0.00000)     0.25

APPENDIX S

Instructions for handling reagents used in our optimized master mix of 11 primers for microsatellite analysis as well as our complete master mix recipe.

FLEISCHER LAB RECIPES: Primers for Microsatellite Analysis – order, store, rehydrate, dilute, aliquot
V2: 05 August 2016

Primers for Bombus impatiens Microsatellite Analysis

Ordering

All primers (oligos) are ordered from Life Technologies Corporation. For microsatellite analysis, a pair of primers is needed per locus. In each pair of primers, one will be a forward primer and one will be a reverse primer. In each pair of primers, one will be labeled with a fluorescent dye, which is necessary for sequencing. This fluorescent dye is photosensitive and should be kept covered whenever possible. We always choose to label the forward primer with the fluorescent dye. Create a personal account and make sure your account is properly affiliated with your institution. Then visit https://www.lifetechnologies.com/us/en/home/products-and-services/product-types/primers-oligos-nucleotides/applied-biosystems-custom-primers-probes.html and order 5’ Fluorescent Labeled/Unlabeled Pairs. Primers (oligos) arrive within 1 – 2 weeks.

Information required for ordering:
Oligo Type:                  Labeled/Unlabeled
Delivered Scale:             10,000pmol OR 80,000pmol [depends on # of samples, see reference sheet]
Purification:                Desalted
Format:                      Dry (lyophilized)
5’ Dye:                      [see reference sheet below]
5’ Dye-Labeled Oligo Name:   [see reference sheet below]-F
5’ Dye Labeled Sequence:     [see reference sheet below]
Unlabeled Oligo Name:        [see reference sheet below]-R
Unlabeled Sequence:          [see reference sheet below]

Primer Reference Sheet

Oligo Name    5’ Dye                              Sequence                    10,000pmol                                  80,000pmol
BTMS0066-F    FAM-blue                            CATGATGACACCACCCAACG        66 Well Plates of 96 samples = 6336         533 Well Plates of 96 samples = 51,168
BTMS0066-R                                        TTAACGCCCAATGCCTTTCC
B124-F        FAM-blue                            GCAACAGGCGGGTTAGAG          45 Well Plates of 96 samples = 4,320        363 Well Plates of 96 samples = 34,848
B124-R                                            CAGGATAGGGTAGGTAAGCAG
Btern01-F     VIC-green                           CGTGTTTAGGGTACTGGTGGTC      130 Well Plates of 96 samples = 12,480      1,040 Well Plates of 96 samples = 99,840
Btern01-R                                         GGAGCAAGAGGGCTAGACAAAAG
BT28-F        VIC-green                           TTGCTGACGTTGCTGTGACTGAGG    181 Well Plates of 96 samples = 17,376      1,448 Well Plates of 96 samples = 139,008
BT28-R                                            TCCTCTGTGTGTTCTCTTACTTGGC
BTMS0062-F    VIC-green                           CTGTCGCATTATTCGCGGTT        33 Well Plates of 96 samples = 3,168        264 Well Plates of 96 samples = 25,344
BTMS0062-R                                        CTGGGCGTGATTCGATGAAC
BTMS0073-F    NED-black (originally FAM-blue)     CGATATCGCGATCTTCGTACAC      66 Well Plates of 96 samples = 6336         533 Well Plates of 96 samples = 51,168
BTMS0073-R                                        GTAGCATGCTCTCCGTGTTG
BT10-F        NED-yellow                          TCTTGCTATCCACCACCCGC        91 Well Plates of 96 samples = 8,736        728 Well Plates of 96 samples = 69,888
BT10-R                                            GGACAGAAGCATAGACGCACCG
BL11-F        PET-red                             AAGGGTACGAAATGCGCGAG        83 Well Plates of 96 samples = 7,968        664 Well Plates of 96 samples = 63,744
BL11-R                                            TGACGAGTGCGGCCTTTTTC
BT30-F        PET-red                             ATCGTATTATTGCCACCAACCG      100 Well Plates of 96 samples = 9,600       800 Well Plates of 96 samples = 76,800
BT30-R                                            CAGCAACAGTCACAACAAACGC
B96-F         PET-red                             GGGAGAGAAAGACCAAG           31 Well Plates of 96 samples = 2976         248 Well Plates of 96 samples = 23,808
B96-R                                             GATCGTAATGACTCGATATG
BTMS0081-F    PET-red                             ACGCGCGCCTTCTACTATC         65 Well Plates of 96 samples = 6,240        520 Well Plates of 96 samples = 49,920
BTMS0081-R                                        AGGGACACGCGAACAGAC

Storage

Always keep primers (oligos) covered as much as possible to protect photosensitive fluorescent dye.

Lyophilized (Dry): Primers arrive as a dry film stuck to the sides of a clear microcentrifuge tube that may or may not be visible. They can be stored this way for several weeks at room temp or longer at -20°C.

200μM Stock Solution: -80°C for at least 12 months, no longer than 2 years. Avoid freeze/thaw.

10μM Working Aliquots: -20°C up to 3 years. Avoid freeze/thaw.

Stock Solutions

Stock solutions are always made and kept in the original Life Technologies tube and re-suspended in molecular grade H2O. Note the following formulas:

10,000pmol primer + 50μL water = 50μL of 200μM Stock Solution
80,000pmol primer + 400μL water = 400μL of 200μM Stock Solution

Reagents
Lyophilized forward primer
Lyophilized reverse primer
50μL / 400μL Molecular grade H2O

Materials
Gloves
Label Marker
200μL Pipette & Tips / 1000μL Pipette & Tips
Vortex
Microcentrifuge
Freezer Box & lid

Procedure
1. Put on gloves.
2. Remove both the forward and reverse lyophilized primers from the -20°C freezer. Throughout procedure, keep primers covered as often as possible to protect photosensitive fluorescent dye.
3. Spin down dry primer for 5 seconds to ensure all primer is within tube and not stuck to the cap. Then carefully remove caps from primers, keeping track of which cap matches which primer.
4. For 10,000pmol primers, use a 200μL pipette to add 50μL of molecular H2O to each tube of primer. For 80,000pmol primers, use a 1000μL pipette to add 400μL of molecular H2O to each tube of primer. Change tips between tubes.
5. Recap tightly and allow the rehydrated primers to sit, covered, for a few minutes to allow lyophilized primer to be completely dissolved by the molecular H2O.
6. Vortex the rehydrated primers for 3 seconds. If any flecks of primer adhered to the inside of the cap, turn the tubes upside down and vortex.
7. Spin down the tubes after vortexing to ensure the entire solution is within the tube.
8. Label the tubes with “200μM Stock Solution” and the date of creation.
9. Store stock solutions in -80°C (for up to 3 years) or continue and create working aliquots.
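The two stock solution formulas follow from the fact that 1 μM equals 1 pmol/μL, so the water volume is the primer amount divided by the target concentration. A minimal sketch of that arithmetic is given below; the function name is hypothetical and the 200μM default simply mirrors the stock concentration used in this recipe.

```python
def rehydration_volume_ul(primer_pmol: float, stock_um: float = 200.0) -> float:
    """Water volume (μL) needed to dissolve primer_pmol to a stock of stock_um μM.

    1 μM is 1 pmol/μL, so volume = amount / concentration.
    """
    if primer_pmol <= 0 or stock_um <= 0:
        raise ValueError("amount and concentration must be positive")
    return primer_pmol / stock_um


# The two delivered scales listed on the primer reference sheet
small_scale_ul = rehydration_volume_ul(10_000)
large_scale_ul = rehydration_volume_ul(80_000)
```

The two calls reproduce the 50μL and 400μL volumes given in the formulas above.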

Working Aliquots

Working aliquots of primers are used to reduce contamination issues and to avoid multiple freeze/thaw of stock solutions. To make, simply dilute the 200μM stock solution to a 10μM working concentration, and aliquot the diluted solution. Use the following ratio for proper dilution:

5μL of 200μM Stock Solution + 95μL of molecular grade H2O = 100μL of 10μM Working Solution

Materials
Gloves
Label Maker / Scissors
Label Marker
1.5mL Microcentrifuge tubes, pigmented
Microcentrifuge tube caps, colored
Pipette & Tips
Vortex
Microcentrifuge
Freezer Box & lid

Reagents
200μM Primer Stock Solution
Molecular grade H2O

Procedure
1. Wash hands, wear gloves.
2. Thaw stock solution. Keep covered.
3. Depending on the number of aliquots you intend to create, retrieve the appropriate number of 1.5mL microcentrifuge tubes and caps.
4. Using a label maker, create labels for each tube with the following information:
   Forward Primers: #μL of 10μM [primer name] – F, Date of Creation
   Reverse Primers: #μL of 10μM [primer name] – R, Date of Creation
5. Vortex stock solution and spin down.
6. In a new, unlabeled 1.5mL microcentrifuge tube, add appropriate volume of molecular H2O for the dilution. Then add the appropriate volume of 200μM stock primer to create final working concentration. Mix by pipetting up/down and vortexing.
   Example: You want to make 980μL of 10μM working solution.
   Use Ratio:
   5μL of 200μM Stock Solution + 95μL of molecular grade H2O = 100μL of 10μM Working Solution
   49μL of 200μM Stock Solution + 931μL of molecular grade H2O = 980μL of 10μM Working Solution
   Use a 1000μL pipette to add 931μL of molecular grade H2O to the 1.5mL tube. Then use a 200μL pipette to add 49μL of 200μM stock solution to create 980μL of 10μM working solution.
7. Aliquot this diluted solution into the labeled 1.5mL tubes. I usually aliquot in volumes that will supply enough primer for 4 well plates.
8. Store aliquots in labeled freezer box in -20°C freezer.

This procedure will be different for each primer, depending on how much primer the master mix recipe calls for.

The following is a cheat sheet for each primer as of August 8th, 2016.

              1 sample   96 well plate   4 plates = Master Mix Batch   Raw Form
BTMS0066-F    0.136      15              60 + wiggle = 65              80,000 pmoles
BTMS0066-R    0.136      15              60 + wiggle = 65              80,000 pmoles
80,000 pmole Dry Primer + 400μl of molecular H2O = 400μl of 200μM Stock Solution
400μl of 200μM Stock Solution + 7,600μl of molecular H2O = 8,000μl of 10μM working solution
8,000μl working solution / 15μl working solution needed per plate = 533.3 plates serviced

              1 sample   96 well plate   4 plates = Master Mix Batch   Raw Form
B124-F        0.200      22.0            88 + wiggle = 90              80,000 pmoles
B124-R        0.200      22.0            88 + wiggle = 90              80,000 pmoles
80,000 pmole Dry Primer + 400μl of molecular H2O = 400μl of 200μM Stock Solution
400μl of 200μM Stock Solution + 7,600μl of molecular H2O = 8,000μl of 10μM working solution
8,000μl working solution / 22μl working solution needed per plate = 363.63 plates serviced

              1 sample   96 well plate   4 plates = Master Mix Batch   Raw Form
Btern01-F     0.070      7.7             30.8 + wiggle = 33            10,000 pmoles
Btern01-R     0.070      7.7             30.8 + wiggle = 33            10,000 pmoles
10,000 pmole Dry Primer + 50μl of molecular H2O = 50μl of 200μM Stock Solution
50μl of 200μM Stock Solution + 950μl molecular H2O = 1,000μl of 10μM working solution
1,000μl working solution / 7.7μl working solution needed per plate = 129.87 plates serviced
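The dilution ratio and the "plates serviced" lines in the cheat sheet can be checked with the standard relation C1·V1 = C2·V2. The sketch below is only an illustration of that arithmetic; the function names are hypothetical, and the defaults (200μM stock, 10μM working solution) are the concentrations used in this recipe.

```python
from typing import Tuple


def dilution_volumes(final_ul: float,
                     stock_um: float = 200.0,
                     working_um: float = 10.0) -> Tuple[float, float]:
    """Return (stock_ul, water_ul) needed for final_ul of working solution.

    Uses C1 * V1 = C2 * V2, so V1 = final_ul * working_um / stock_um.
    """
    if working_um > stock_um:
        raise ValueError("working concentration cannot exceed stock concentration")
    stock_ul = final_ul * working_um / stock_um
    return stock_ul, final_ul - stock_ul


def plates_serviced(working_ul: float, ul_per_plate: float) -> float:
    """Number of 96-well plates a volume of working solution can supply."""
    return working_ul / ul_per_plate


# Worked example from step 6: 980 μL of 10 μM working solution
example_stock_ul, example_water_ul = dilution_volumes(980.0)

# Cheat sheet, BTMS0066: a full 400 μL stock tube diluted to 8,000 μL
tube_stock_ul, tube_water_ul = dilution_volumes(8000.0)
btms0066_plates = plates_serviced(8000.0, 15.0)
```

Swapping in 22μL or 7.7μL per plate gives the corresponding figures for B124 and Btern01.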

FLEISCHER LAB RECIPES: Hydrate, Dilute and Aliquot Reagents for Master Mix
V1: 24 March 2015

Dilute, Combine and Aliquot dNTPs

Materials
400μl of 40 umol dATP
400μl of 40 umol dGTP
400μl of 40 umol dCTP
400μl of 40 umol dTTP
Molecular H2O
1.5mL Microcentrifuge Tubes
10µL Pipette / tips
200µL Pipette / tips
1000µL Pipette / tips
Label marker
Vortex

Procedure:
1. Aliquot stock solutions of dNTPs:
   • Divide 400ul dATP into 4 tubes of 100ul each (3 new tubes, 1 original tube)
   • Divide 400ul dGTP into 4 tubes of 100ul each (3 new tubes, 1 original tube)
   • Divide 400ul dCTP into 4 tubes of 100ul each (3 new tubes, 1 original tube)
   • Divide 400ul dTTP into 4 tubes of 100ul each (3 new tubes, 1 original tube)
2. Label the tubes with “40umol dXTP” and the date. Return 3 of the tubes to the -20°C freezer and work with only 1 tube of each dNTP.
3. Create 4000ul of a 10mM working solution of a dNTP mix:
   • Add 3,600ul of molecular H2O to a 15mL centrifuge tube
   • Add 100ul stock solution of each individual dNTP
     • 100uL dATP
     • 100uL dCTP
     • 100uL dGTP
     • 100uL dTTP
4. Vortex the mix for 2 seconds.
5. Aliquot the 4,000ul of working solution into 10 tubes of 400ul working solution.
6. Store in -20°C freezer.

Each tube of 400ul working solution will service 6 plates with 4ul solution remaining.
Originally, the supplies provided enough dNTPs for 240 plates.
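The figures in the dNTP recipe can be cross-checked with a few lines of arithmetic. This sketch assumes that "400μl of 40 umol" means 40 μmol of nucleotide dissolved in 400μL (a 100mM stock), which is an interpretation rather than something the recipe states outright, and it takes the 66μL of dNTP mix per plate from the master mix recipe. All names are hypothetical.

```python
def concentration_mm(amount_umol: float, volume_ul: float) -> float:
    """Concentration in mM; 1 μmol/μL equals 1 M, i.e. 1000 mM."""
    return amount_umol * 1000.0 / volume_ul


# Assumption: each stock tube holds 40 μmol in 400 μL
stock_mm = concentration_mm(40.0, 400.0)

# Step 3: 100 μL of each of four dNTP stocks in a 4,000 μL final volume
each_dntp_mm = stock_mm * 100.0 / 4000.0
total_dntp_mm = 4 * each_dntp_mm

# One 400 μL aliquot at 66 μL of dNTP mix per plate
plates_per_aliquot, leftover_ul = divmod(400, 66)

# 10 aliquots per batch, and 4 batches from the four 100 μL stock aliquots
total_plates = plates_per_aliquot * 10 * 4
```

Under these assumptions the mix works out to 2.5mM of each nucleotide (10mM combined), and the plate counts agree with the "6 plates with 4ul remaining" and "240 plates" statements above.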

Aliquot Taq, MgCl2, and 5x Buffer

5 tubes containing 100µL GoTaq® Flexi DNA Polymerase
- Store in -20°C freezer
- When creating master mix for an Amplification Well Plate, add 8.8ul of Taq to each master mix.
- Keep Taq covered and on ice whenever it is not in the freezer
- Once Taq is depleted to less than 8.8ul, discard microcentrifuge tube and remaining Taq

15 tubes containing 1.2ml of MgCl2
- Aliquot 1 tube of 1200ul into 4 microcentrifuge tubes of 250ul, with 200ul remaining in original tube
- Label tubes with contents and date
- Store in -20°C freezer
- When creating Master Mix batches to service 4 Amplification Well Plates, add 246.4ul of MgCl2 to each batch
- Discard the microcentrifuge tube and remaining MgCl2

20 tubes containing 1mL of 5X Colorless GoTaq® Flexi Buffer
- No need to aliquot
- Each Master Mix Batch for 4 Amplification Well Plates will use 880ul of buffer
- Save remaining 120ul in original tubes
- Store in -20°C freezer

20 tubes containing 1mL of 5X Green GoTaq® Flexi Buffer
- Discard Green Buffer

Dilute and Aliquot BSA

Materials:
1mL of 10mg/ml Stock solution BSA
Molecular water
20ul pipette / tips
200ul pipette / tips
1000ul pipette / tips
Label Pen
1.5mL microcentrifuge tubes

Procedure:
1. Create 2 working solutions of 900ul each
   - Label 2 microcentrifuge tubes with “900ul working solution BSA” and the date
   - In each tube, combine 9ul of stock solution BSA and 891ul of molecular water
   - Vortex for 2 seconds
   - Refreeze stock solution of BSA and 1 microcentrifuge tube of working solution
2. Aliquot the remaining working solution into batches of 90ul each
   - Label 10 microcentrifuge tubes with “90ul working solution BSA” and the original date
   - Use a 200ul pipette to dispense 90ul of working solution into each of the 10 microcentrifuge tubes
   - Store all in -20°C freezer
   - When making master mix batch for 4 amplification well plates, use 1 tube of 90ul of working solution. Discard microcentrifuge tube and remaining 2ul of working solution BSA

FLEISCHER LAB RECIPES: Master Mix Recipe
v2: 16 May 2016

Bombus impatiens Master Mix Recipe

Estimated time to complete: Depends on your pipetting speed

Materials
Gloves
Vortex
1.5 ml Microcentrifuge tube (pigmented)
Microcentrifuge
10µL, 20µL, 200µL & 1000µL Pipettes + tips
Reagents – in some sort of box / container
Ice Bucket / Ice (for Taq)

Procedure
1. Wash hands, don gloves.
2. Spin down Taq for 2 seconds and place on ice, covered. Taq is heat sensitive and should be added last.
3. Remove all other reagents from freezer and, when thawed, vortex each component for 2 seconds. Keep primers covered until using – they are photosensitive.
4. Label a 1.5 ml microcentrifuge tube and use pipettes to combine the following reagents in the following ratios.

Reagents             1 Sample   96 Well Plate   Two 96 Well Plates     Four 96 Well Plates
Buffer 5x Promega    2.000      220.0           440                    880
MgCl2                0.560      61.6            123.2                  246.4
BTMS0066-F           0.135      15              30                     60
BTMS0066-R           0.135      15              30                     60
B124-F               0.200      22.0            44                     88
B124-R               0.200      22.0            44                     88
Btern01-F            0.070      7.7             15.4                   30.8
Btern01-R            0.070      7.7             15.4                   30.8
BT28-F               0.100      11.0            22                     44
BT28-R               0.100      11.0            22                     44
BTMS0062-F           0.272      30              60                     120
BTMS0062-R           0.272      30              60                     120
BTMS0073-F           0.135      15              30                     60
BTMS0073-R           0.135      15              30                     60
BT10-F               0.050      5.5             11                     22
BT10-R               0.050      5.5             11                     22
BL11-F               0.109      12.0            24                     48
BL11-R               0.109      12.0            24                     48
BT30-F               0.091      10.0            20                     40
BT30-R               0.091      10.0            20                     40
B96-F                0.290      32              64                     128
B96-R                0.290      32              64                     128
BTMS0081-F           0.140      15.4            30.8                   61.6
BTMS0081-R           0.140      15.4            30.8                   61.6
BSA                  0.200      22.0            44                     88
Molecular H2O        2.882      317.0           634                    1268
dNTP (add last)      0.600      66.0            66.0 x 2 plates        66.0 x 4 plates
Taq (add last)       0.080      8.8             8.8 x 2 plates         8.8 x 4 plates
TOTAL                9          1046.6          1943.6 + dNTP + Taq    3887.2 + dNTP + Taq

* The Well Plate amounts calculate in a 15% error rate

5. Once all ingredients are added, vortex the final solution for 2 seconds, then spin down.
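The per-plate columns can be regenerated from the per-sample column. The sketch below assumes each 96-well plate is budgeted as 110 reactions, a figure inferred from the buffer row (220.0μL / 2.000μL per sample) and consistent with the stated 15% margin; it is not a number the recipe states directly. Only a handful of reagents are included, and the function name is hypothetical. Some tabled primer volumes appear to have been rounded by hand (for example 0.135 × 110 = 14.85 is listed as 15), so exact agreement should not be expected for every row.

```python
from typing import Dict

# Assumption: 96 wells plus roughly 15% overage, inferred from the buffer row
REACTIONS_PER_PLATE = 110


def scale_recipe(per_sample_ul: Dict[str, float],
                 n_plates: int = 1,
                 reactions_per_plate: int = REACTIONS_PER_PLATE) -> Dict[str, float]:
    """Scale per-sample reagent volumes (μL) up to whole plates, rounded to 0.1 μL."""
    n_reactions = n_plates * reactions_per_plate
    return {reagent: round(volume * n_reactions, 1)
            for reagent, volume in per_sample_ul.items()}


# Per-sample volumes copied from a subset of the table rows
per_sample = {
    "Buffer 5x Promega": 2.000,
    "MgCl2": 0.560,
    "B124-F": 0.200,
    "BSA": 0.200,
    "dNTP": 0.600,
    "Taq": 0.080,
}

one_plate = scale_recipe(per_sample)
four_plates = scale_recipe(per_sample, n_plates=4)
```

For these rows the scaled volumes match the "96 Well Plate" and "Four 96 Well Plates" columns, including the 66.0μL of dNTP and 8.8μL of Taq added per plate.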

ALTERNATIVE – Create Batches of Master Mix that will service Four 96 well plates at one time

Materials
20 ml Centrifuge tube (wrapped in tin foil)
1.5 ml Microcentrifuge tubes (pigmented)
10µL Pipette / tips
20µL Pipette / tips
200µL Pipette / tips
1000µL Pipette / tips
Ice Bucket / Ice (for Taq & dNTP)
Reagents – in some sort of box / container
Vortex
Microcentrifuge

Procedure
1. Using pipettes, combine reagents in a 20 mL centrifuge tube wrapped in foil, following the amounts listed in the “Four 96 Well Plates” column. DO NOT ADD dNTP OR TAQ.
2. When all reagents except dNTP and Taq have been added, you should have a stock solution of 3887.2µL. Vortex mixture for 5 seconds and “sling” down with arm rotations.
3. Using 1000µL Pipette, dispense 971.8µL of solution into each of four 1.5 mL microcentrifuge tubes. Label each tube with “Master Mix Recipe – 1 plate, No Taq, No dNTPs”, and date of creation. Store tubes, covered, in the -20°C freezer for future use.
   *If you don’t have enough for 971.8 for each tube, it’s probably because the “sling” down method doesn’t bring all the solution to the bottom of the tube. Just make certain to divide the master mix evenly.

THEN…
4. Before beginning the Amplification and Pre-Sequencing Protocol, retrieve one tube of master mix, Taq and dNTP from the freezer and allow them to thaw. Keep the Taq covered and on ice. Keep the master mix covered to protect photosensitive primers from light.
5. At step 4 of the Amplification and Pre-Sequencing Protocol:
   - use a 200µL pipette to add 66µL of dNTP to the prepared master mix.
   - use a 10µL pipette to add 8.8µL of Taq to the prepared master mix.
6. Vortex for 2 seconds and spin down the solution. Proceed to step 5 of the Amplification and Pre-Sequencing Protocol.

Clean-up:
- Store the following in -20°C freezer:
  o All individual reagents – all remaining raw reagents should be stored
  o All pre-mixed batches of master mix – all unused master mixes should be stored
- Dispose of the following:
  o 20 ml Centrifuge tube used to create Master Mix Batch
  o 1.5ml microcentrifuge tubes used for reagents – if raw reagents are depleted during this protocol
  o Ice – empty ice bucket and allow to air dry
  o Gloves


Chapter 4: Wild Bumble bee colony abundance, mediated by field size, predicts pollination services in agroecosystems

INTRODUCTION / BACKGROUND

Bumble bee foragers are key pollinators in managed and natural ecosystems worldwide. Because bumble bees are eusocial insects living in colonies, the availability of pollinating foragers depends on the abundance and size of colonies nesting in the surrounding landscape. Efforts have been made to measure wild bumble bee colony abundance and identify factors influencing colony success. Directly counting bumble bee colonies is difficult (see Osborne et al, 2008; Waters et al, 2011; O’Connor et al, 2012; Goulson et al, 2017 for a variety of methods), so genetic techniques are commonly used to estimate the number of colonies from a collection of foragers. By genotyping foragers with microsatellite loci, related foragers can be sorted into sister groups, each of which represents a unique colony for monogamous bumble bee species. Because representatives from some colonies will be missed during sampling, a non-lethal mark-recapture statistical model can analyze the distribution of sister groups containing 1, 2, …, k foragers to provide estimates of total colony abundance, including unsampled colonies (Goulson et al, 2010). Previous studies throughout Europe and the western United States have estimated total colony abundance per sampling location ranging from an average of 20.4 colonies per site (B. terrestris – Darvill et al, 2004) to 630 colonies per site (B. terrestris – Wood et al, 2015). However, colony abundance studies for key pollinators in the eastern United States are lacking. Additionally, trends in colony abundances across successive years and distinct geographical regions have never been examined.

Recent trends in colony abundance and stability will provide valuable insights regarding the status of bumble bee populations under current environmental conditions. But as global climate conditions change and land cover is altered to suit anthropogenic needs, a species’ survival will depend, at least in part, on its genetic capacity to respond to selective pressures.
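The idea of estimating unsampled colonies from the distribution of sister-group sizes, described above, can be illustrated with a short sketch. It assumes that the number of foragers sampled per colony follows a Poisson distribution, so that observed sister groups follow a zero-truncated Poisson; this is one simple model of the kind used in the literature and is not necessarily the estimator applied in this study. The function names and the example counts are hypothetical.

```python
import math
from typing import Dict, Tuple


def fit_truncated_poisson(mean_group_size: float, tol: float = 1e-10) -> float:
    """Solve m = lam / (1 - exp(-lam)) for lam by bisection (requires m > 1)."""
    if mean_group_size <= 1.0:
        raise ValueError("mean sister-group size must exceed 1 to fit the model")
    # The left-hand side increases with lam and always exceeds lam,
    # so the root lies strictly between 0 and the observed mean.
    lo, hi = 1e-9, mean_group_size
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid / (1.0 - math.exp(-mid)) < mean_group_size:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def estimate_total_colonies(group_size_counts: Dict[int, int]) -> Tuple[float, float]:
    """Return (estimated total colonies, fitted lam).

    group_size_counts maps sister-group size k to the number of groups
    (detected colonies) represented by exactly k sampled foragers.
    """
    n_detected = sum(group_size_counts.values())
    n_foragers = sum(k * count for k, count in group_size_counts.items())
    lam = fit_truncated_poisson(n_foragers / n_detected)
    p_unsampled = math.exp(-lam)  # probability a colony contributes 0 foragers
    return n_detected / (1.0 - p_unsampled), lam


# Hypothetical site: 160 detected colonies represented by 212 foragers
observed = {1: 120, 2: 30, 3: 8, 4: 2}
total_colonies, fitted_lam = estimate_total_colonies(observed)
unsampled_colonies = total_colonies - sum(observed.values())
```

The estimate always exceeds the number of detected colonies, and the gap widens as singletons make up a larger share of the sample, which is why sampling intensity matters for this kind of inference.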
Bumble bee species characterized by a single genetically connected breeding population throughout their range appear stable (Shao et al, 2004; Pirounakis et al, 1998; Estoup et al, 1996; 1995), as opposed to declining species characterized by fragmented and genetically isolated populations (Darvill et al, 2006; Ellis et al, 2006; Bourke & Hammond, 2002). Furthermore, high genetic diversity is a hallmark of stable bumble bee populations, compared with declining species characterized by lower levels of genetic diversity (Darvill et al, 2006, 2010; Ellis et al, 2006; Lozier & Cameron, 2009; Charman et al, 2010; Cameron et al, 2011; Lozier et al, 2011). Fine-scale genetic analyses regarding patterns of genetic structuring and diversity for native bumble bees can provide a better understanding of the potential resilience of key pollinating species.

Genetically derived colony abundance is at least a conservative estimate of the true number of bumble bee colonies within foraging range and can be used to address population-level hypotheses. For example, agricultural intensification is often thought to have a negative effect on pollinators, but studies concerning the effects of mass-flowering crops provide conflicting results. Westphal et al, 2003 concluded that the percentage of oilseed rape in a given landscape positively influenced the density of bumble bee foragers per square meter of Phacelia flowers. However, Holzschuh et al, 2016 found that as the percent cover of oilseed rape, sunflower and orange crops increased, the density of bumble bee foragers decreased both within the mass-flowering crops and in the surrounding semi-natural habitats. In another study, the percent coverage of mass-flowering crops had conflicting effects on bumble bee forager density and species richness, depending on when the crop was in flower (Kallioniemi et al, 2017).
The majority of these studies have been limited to mass-flowering crops that provide pollen resources and have focused only on individual foragers. For eusocial species, colony data would be helpful for determining population-level responses to mass-flowering crops. The effect of nectar-rich mass-flowering crop field size on wild bumble bee colony abundance is unknown. Several studies highlight landscape features influencing colony success: Jha & Kremen, 2013 found that oak woodland-chaparral was positively linked with B. vosnesenskii nest density, Goulson et al, 2010 found that B. lapidarius colony survival across a single season was positively affected by garden habitat, and Carvell et al, 2017 found that mixed semi-natural landscapes positively affected queen survival across successive seasons. These and other studies often encourage conserving or increasing landscape features positively associated with colony success in order to bolster bumble bee populations. This ubiquitous desire to promote colony success indicates an underlying assumption that increased colony abundance will result in additional beneficial ecosystem services in natural and managed landscapes. However, only a few studies have addressed the assumption that colony abundance positively affects pollination services. Hermann et al, 2007 concluded that in crop field margins, B. pascuorum forager abundance was the result of increased colony size, while Geib et al, 2015 found that colony abundance positively influenced forager abundance and clover reproduction in alpine habitats in the western US. With just two published studies, the relationship between wild bumble bee colony abundance and pollination services remains understudied, particularly for crop pollination. In order to address the above knowledge gaps, we examine the common eastern bumble bee, Bombus impatiens Cresson, in Cucurbita agroecosystems in Pennsylvania. In the eastern United States, B. 
impatiens is a key pollinator of many native plants and crops, including commercially produced pumpkin throughout the mid-Atlantic (see Chapter 2). While B. impatiens is likely a stable and possibly expanding bumble bee species (Colla et al, 2012; Cameron et al, 2011; Grixti et al, 2008; however, see Beckham et al, 2016), population dynamics studies are lacking for this economically and ecologically valuable pollinator. It is critical to capture baseline data of common species for long-term monitoring as global climate conditions and land use patterns affect native pollinators, particularly those with the potential to replace or supplement pollination services provided by managed pollinators. Using B. impatiens as a model system will allow us to draw inferences from large-scale sampling without the negative ecological ramifications associated with destructively sampling rare species. Furthermore, bumble bees are most likely only nectar-foraging in pumpkin flowers, making Cucurbita agroecosystems a unique system in which to examine the effect of a nectar-rich mass-flowering crop that reliably attracts a large abundance of B. impatiens foragers. The main purpose of this study was to estimate the abundance of wild bumble bee colonies providing foragers to commercial pumpkin fields and to calculate visitation rates, and to use these data to examine (1) the abundance, stability and potential genetic resilience of a key native pollinator, (2) the effect of mass-flowering crops on eusocial pollinator populations and pollination services and (3) the relationship between colony abundance and pollination services. Here we use microsatellite analysis and statistical inferences to estimate the abundance of B. impatiens colonies providing foragers to 30 commercial pumpkin fields spread across 4 years and 3 regions. Because B. 
impatiens queens may exhibit some degree of polygamy (Cnaani et al, 2002, Payne et al, 2003), we evaluate the effect of mating assumptions on colony abundance analysis for the first time. We evaluate patterns of population structuring, genetic diversity and levels of inbreeding to better understand potential resilience of B. impatiens pollinators. We determine the relationship between pumpkin field size and colony numbers. From a subset of 21 fields, we record pollination services and evaluate the relationship between pollination services and colony abundances.

METHODS

DATA COLLECTION

Forager collection: Over the course of 4 years and across 3 geographical regions, we collected approximately 200 Bombus impatiens foragers per field from 30 commercial pumpkin fields for a total of ~6000 individuals. Each collection was completed on a single morning (0800 – 1200 EST) within peak pumpkin flower bloom (range: July 24th – Aug 29th). Along random walks throughout the entirety of each field, actively foraging bees were collected by placing a 20 ml plastic or glass scintillation vial over a worker interacting with the reproductive portion of the pumpkin flower. Collected specimens were pinned and the right mesothoracic leg was removed for DNA extraction.

Pollination Activity: From a subset of 21 fields, pollination activity was measured within 2 weeks of forager collections between 0630 and 1200 hours EST when weather conditions were favorable for bee activity (>15.5 °C with low wind speeds). Visitation rates were measured within a single hectare of each pumpkin field by walking along four 100 m transects spaced 0, 25, 50 and 100 m from the field edge. Along each transect, the number of B. impatiens visits to all designated pumpkin flowers within ~1–2 m2 (regardless of flower sex) was recorded in 45 second intervals. Approximately 60 observations were completed for each transect and all ~240 observations were averaged to obtain a single mean visitation rate per hectare for each field (sampling support details in Appendix T).
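The averaging step described above can be sketched as follows. This is a minimal illustration rather than the procedure used for the thesis data: the function name, the transect labels and the counts are hypothetical, and each count is assumed to already be expressed as visits per flower per 45 second observation.

```python
from statistics import mean

def mean_visitation_rate(transects):
    """Pool every observation from all transects and return the field mean.

    `transects` maps a transect label to a list of visits per flower,
    one value per 45 second observation (assumed input format).
    """
    pooled = [count for counts in transects.values() for count in counts]
    return mean(pooled)

# Hypothetical counts (not thesis data): four transects, three observations each.
example_field = {
    "0 m": [0, 1, 0],
    "25 m": [1, 0, 0],
    "50 m": [0, 0, 1],
    "100 m": [0, 0, 0],
}
rate = mean_visitation_rate(example_field)
print(f"{rate:.2f} visits per flower per 45 s")
```

Pooling all observations before averaging weights each observation equally, which matches the description only if every transect contributes a similar number of observations.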

MOLECULAR METHODS

Forager DNA was extracted using the Chelex protocol detailed in Appendix G, in Chapter 3. B. impatiens individuals were genotyped at 11 microsatellite loci (BTMS0066, B124, Btern01, BT28, BTMS0062, BTMS0073, BT10, BL11, BT30, B96 and BTMS0081; detailed in Appendix U) (Estoup et al, 1995; Estoup et al, 1996; Funk et al, 2006; Stolle et al, 2009). DNA with microsatellite markers was amplified with a PCR multiplex protocol detailed in Appendix H in Chapter 3. Foragers successfully genotyped at >7 loci were included in subsequent analysis. The genetic data are archived on the PSU Box system.

COLONY ABUNDANCE

Full-Sibship Families and Detected Colony Numbers: When calculating genetically-derived colony numbers, genotyped foragers are first assigned to full-sibship families (FS families) using the maximum likelihood method implemented in Colony (Jones & Wang, 2010). Then, the number of FS families is used to calculate the number of unique colonies, commonly referred to as detected colony numbers. Because most bumble bee species are monogamous, all previous studies have (a) selected monogamous mating strategies when running Colony and (b) interpreted each FS family as a unique colony. This is appropriate for monogamous species because when a queen mates with only one male, all offspring are full-siblings and thus each colony contains only 1 FS family. However, evidence suggests that 20 – 30% of B. impatiens queens mate with multiple males (Cnaani et al, 2002; Payne et al, 2003). Assuming polygamous mating could impact both the analysis and interpretation of colony data, resulting in different detected colony numbers per site. First, selecting a polygamous mating strategy changes the way Colony identifies and groups related foragers, potentially altering the number of FS families per site. Furthermore, under polygamous assumptions, each unique colony could include more than 1 FS family. Foragers produced by the same queen but fathered by different males would only be considered half-sisters and would be assigned to different FS families within the same colony. Therefore, the number of FS families included in each colony depends on the average number of mates per queen, which has been found to range from 1.06 – 1.2 mates per queen for B. impatiens (Cnaani et al, 2002; Payne et al, 2003). To understand the effect of mating assumptions on colony measures, I first identified the number of FS families at each field when selecting (a) monogamous and (b) polygamous mating strategies for females in Colony. 
I then calculated the ‘detected colony number’ by dividing the number of FS families by a range of possible ‘average mates per queen,’ at increasing 0.02 increments from 1 to 1.24. For 1 mate per queen, I divided the number of monogamous FS families by 1. For 1.02 – 1.24 mates per queen, I divided the number of polygamous FS families by 1.02, 1.04…, 1.24.
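The conversion from FS families to detected colony numbers can be sketched as below. This is an illustration under stated assumptions: the function name is hypothetical and the input counts (160 monogamous and 180 polygamous FS families) are made-up values for a single field, not thesis data.

```python
def detected_colonies(mono_fs, poly_fs, max_mates=1.24, step=0.02):
    """Return {average mates per queen: detected colony number}.

    At 1 mate per queen the monogamous FS family count is used;
    for every larger value the polygamous FS family count is divided
    by the assumed average number of mates per queen.
    """
    n_steps = round((max_mates - 1.0) / step)
    result = {1.0: mono_fs / 1.0}
    for i in range(1, n_steps + 1):
        mates = round(1.0 + i * step, 2)
        result[mates] = poly_fs / mates
    return result

# Hypothetical FS family counts for one field (not thesis data).
table = detected_colonies(mono_fs=160, poly_fs=180)
for mates, colonies in table.items():
    print(f"{mates:.2f} mates per queen -> {colonies:.1f} detected colonies")
```

With these inputs the polygamous estimate falls below the monogamous one once mates per queen exceeds 180 / 160 = 1.125, which illustrates why the two assumptions can yield either higher or lower detected colony numbers.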

ESTIMATED TOTAL COLONY ABUNDANCE

Because detected colony numbers are likely an underestimate of total colonies visiting a field, I used Capwire (Miller et al, 2005) to provide estimates of total colony abundance by determining the number of unsampled colonies based on the distribution of detected colonies represented by 1 forager, 2 foragers, …, k foragers per site. Capwire provides 2 estimates of total colony abundance per site using 2 different models detailed in Goulson et al, 2010. In keeping with previous studies and biological assumptions of non-random within-field distribution, we elected to use estimates based on the Two Innate Rates Model (TIRM) in subsequent analyses. I examined both the abundance of colonies foraging in an entire field and the abundance of colonies foraging per hectare of pumpkin, calculated by dividing the number of total colonies per field by the field area. All fields were mapped and ground-truthed, and field area was calculated using the polygon feature in Google Maps.
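The two bookkeeping steps around the Capwire analysis, tabulating how many detected colonies are represented by 1, 2, …, k foragers and converting a total colony estimate to a per-hectare value, can be sketched as follows. The sketch does not implement the TIRM likelihood itself; the function names and all values are hypothetical.

```python
from collections import Counter

def capture_frequencies(colony_ids):
    """Map k -> number of detected colonies represented by exactly k foragers.

    `colony_ids` holds one colony label per genotyped forager
    (assumed to come from the sibship assignment step).
    """
    foragers_per_colony = Counter(colony_ids)
    return dict(sorted(Counter(foragers_per_colony.values()).items()))

def colonies_per_hectare(total_colonies, field_area_ha):
    """Divide the estimated total colony abundance by the field area in hectares."""
    return total_colonies / field_area_ha

# Hypothetical assignments for eight foragers (not thesis data).
assignments = ["c1", "c2", "c2", "c3", "c4", "c4", "c4", "c5"]
freqs = capture_frequencies(assignments)
density = colonies_per_hectare(total_colonies=500, field_area_ha=4.0)
print(freqs, density)
```

A distribution dominated by colonies seen only once, as in this toy example, is the situation in which many colonies are expected to have gone unsampled.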

POPULATION GENETICS

For analyses of population structuring and genetic diversity, I removed duplicate members of the same FS family so as not to bias the results of genetic tests. Population genetics tests were performed in R and a complete script can be found in Appendix V.

Population Structuring: To test previous findings that suggest B. impatiens individuals are members of a single, panmictic population, I looked for a lack of population structuring by field and by region using G-statistics and Analysis of Molecular Variance (AMOVA). In order to assess a single generation at a time, I analyzed foragers from each year separately. I tested for genetic differentiation among individuals within the same year using a comparison of three G-statistics (Nei’s Gst, Hedrick’s Gst, Jost’s Dest) with the function ‘diff_stats’ in the R package “mmod” (Winter, 2012). All three G-statistics determine the degree of genetic differentiation between populations by comparing genetic diversity within and between populations. Values close to 0 indicate low differentiation, suggesting a panmictic population, whereas values close to 1 indicate populations that are completely differentiated (Meirmans & Hedrick, 2011). For Jost’s Dest, I report values based on average heterozygosity as opposed to harmonic means. Within each year, I tested for genetic differentiation between fields and then regions, using global and pairwise comparisons. I looked for problematic increases in genetic differentiation across successive years. Using AMOVA, I looked for sources of genetic variation by partitioning variance between individuals, fields and regions for each year using the function ‘poppr.amova’ from the R package ‘poppr’ (Kamvar et al. 2014). Significant and relatively high variance between fields and regions could suggest population structuring. I looked for changes in sources of genetic variation across successive years.
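To make the comparison of within- and between-population diversity concrete, the sketch below computes the three statistics for a single locus from known allele frequencies, using the textbook definitions without the sample-size corrections that estimation software applies. It is not the 'diff_stats' implementation; the function name and the frequencies are hypothetical.

```python
def g_statistics(pop_freqs):
    """Single-locus Nei's Gst, Hedrick's G'st and Jost's D.

    `pop_freqs` holds one allele-frequency list per population, with the
    same allele order in each list and each list summing to 1.
    Undefined (division by zero) if Ht is 0 or Hs is 1.
    """
    k = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    # Mean within-population expected heterozygosity.
    hs = sum(1 - sum(p * p for p in freqs) for freqs in pop_freqs) / k
    # Expected heterozygosity of the pooled population.
    mean_freqs = [sum(freqs[a] for freqs in pop_freqs) / k for a in range(n_alleles)]
    ht = 1 - sum(p * p for p in mean_freqs)
    gst = (ht - hs) / ht
    hedrick = gst * (k - 1 + hs) / ((k - 1) * (1 - hs))
    jost_d = ((ht - hs) / (1 - hs)) * (k / (k - 1))
    return gst, hedrick, jost_d

# Two hypothetical populations sharing no alleles, each with two equally common alleles.
no_shared = g_statistics([[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]])
# Two hypothetical populations with identical allele frequencies.
identical = g_statistics([[0.5, 0.5], [0.5, 0.5]])
print(no_shared, identical)
```

In the first example Nei's Gst is only about 0.33 even though the populations share no alleles, while Hedrick's G'st and Jost's D both equal 1. This dependence of Gst on within-population diversity is one reason to report the three statistics side by side for highly variable microsatellite loci.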

GENETIC DIVERSITY AND INBREEDING

I evaluated the levels of genetic diversity and inbreeding for the population(s) each year. To quantify genetic diversity, I calculated expected heterozygosity (HE) and allelic richness (AR) across the entire population. Expected heterozygosity (HE) is based on Nei’s unbiased estimate of gene diversity and was calculated using the function “poppr” in the R package “poppr” (Kamvar et al, 2014), with sample sizes standardized to the smallest sample size of 293 genotypes per year. Values range from 0 – 1, with 1 indicating the highest level of diversity. Allelic richness (AR) was calculated per locus using 100 alleles for rarefaction to correct for varying sample sizes between years with the function “allelic.richness” in the R package “hierfstat” (Goudet, 2005). AR per locus was averaged across all loci per year to provide a single value of AR per year. Values could range from 0 – infinity, with higher values indicating higher genetic diversity. I looked for problematic decreases in genetic diversity across successive years.

I also calculated inbreeding coefficients (FIS) using the function “boot.ppfis” in the R package “hierfstat” (Goudet, 2005). When the 95% confidence interval includes 0, the FIS is not significantly different from 0, which indicates no inbreeding, i.e. random mating for the population. I looked for problematic increases in inbreeding across successive years.
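The quantities named above can be illustrated for a single locus with a short sketch. It uses Nei's unbiased gene diversity, HE = (2n / (2n - 1)) (1 - Σ p²), and the simple ratio FIS = 1 - HO / HE; it omits the rarefaction and bootstrapping that the R packages perform, and the function name and genotypes are hypothetical.

```python
from collections import Counter

def locus_summary(genotypes):
    """Return (expected heterozygosity, observed heterozygosity, FIS) for one locus.

    `genotypes` is a list of (allele, allele) tuples, one per diploid individual.
    """
    n = len(genotypes)
    total = 2 * n  # number of gene copies sampled
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    sum_p2 = sum((count / total) ** 2 for count in allele_counts.values())
    he = (total / (total - 1)) * (1 - sum_p2)  # Nei's unbiased gene diversity
    ho = sum(1 for a, b in genotypes if a != b) / n
    fis = 1 - ho / he
    return he, ho, fis

# Hypothetical genotypes for four workers at one locus (not thesis data).
he, ho, fis = locus_summary([(1, 2), (1, 2), (1, 1), (2, 2)])
print(round(he, 3), ho, round(fis, 3))
```

A positive FIS means fewer heterozygotes were observed than expected under random mating; with only four individuals, as here, such a value would not be distinguishable from 0, which is why confidence intervals are needed.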

STATISTICAL ANALYSIS

JMP® Pro, Version 13.0.0 (SAS Institute 2007) was used to complete all analyses of variance (ANOVA), mean comparisons and regressions. For all analyses, significance was assessed at α = 0.05. Simple linear regressions were completed using “Fit Model” with model personality “Standard Least Squares” and emphasis “Effect Leverage.” For curvilinear relationships, quadratic terms were tested. Visitation rates and colony abundances per field were normally distributed and did not require transformations. After removing a single outlier, colony abundances per hectare were also normally distributed.

Effect of Mating Assumptions on Colony Analysis: I used an ANOVA to determine the effect of mating assumptions on the number of FS families identified by Colony by comparing mean number of FS families identified under polygamous vs monogamous mating assumptions across all fields. I then used regression to determine if mates per queen had a significant effect on the mean number of colonies detected under polygamous assumptions. I then compared colony numbers detected under monogamous conditions with colony numbers detected under polygamous conditions to determine the effect of mating assumptions on detected colony numbers.

Recent Trends in Colony Abundance Per Field: To explore the stability of estimated total colony abundances per field across time and space, I used multiple one-way ANOVAs to individually test the effect of year (2012, 2013, 2014 and 2015) and region (Center, Columbia and Lancaster) on mean estimated total colony abundances per field using all 30 fields. I used a two-way ANOVA on a subset of 28 fields to evaluate year, region and the interaction term year*region. Fields from 2012 (n = 2) were excluded from the two-way ANOVA because only one region (Columbia) was sampled in 2012.

Mass-flowering crop effects on B. impatiens populations and pollination services: To explore the effect of mass-flowering crops on B. impatiens populations, I used simple linear regression to examine the relationships between pumpkin field area and both colony abundance per field and colony abundance per hectare. To explore the effect of mass-flowering crops on pollination services, I used simple linear regression to examine the relationship between commercial pumpkin field area and B. impatiens visitation rates to pumpkin flowers.

Colony Abundance and Pollination Services: To explore the relationship between wild bumble bee colony abundance and pollination services in agroecosystems, I used simple linear regression to examine the effect of B. impatiens colony abundance per field and colony abundance per hectare independently on B. impatiens visitation rates to pumpkin flowers.

RESULTS

Commercial pumpkin fields ranged from 0.61 – 11.86 ha with an average field area of 4.67 ± 0.56SE ha (Table 23). We collected 210.8 ± 3.3SE B. impatiens foragers per field and successfully genotyped 207.4 ± 3.4SE foragers per field (Table 23). On average, each pumpkin flower received 0.3 ± 0.05SE visits from B. impatiens foragers every 45 seconds (Table 23).

EFFECTS OF MATING ASSUMPTIONS ON COLONY ANALYSIS

Under monogamous assumptions, we identified 163.8 ± 2.9SE FS families per field (Figure 28; Table 23). For 15 fields (50%), repeated Colony runs yielded stable numbers of FS families. When FS families differed between runs, results never deviated by more than 1.6% from the average (Table 23). Under polygamous assumptions, we identified 183.8 ± 3.6SE FS families per field (Figure 28; Table 1). Mating assumptions had a significant effect on the number of FS families, with average polygamous FS families approximately 12.2% greater than average monogamous FS families (Figure 28, F1,58 = 18.06, P < 0.0001, R2 = 0.23).

When interpreting polygamous full-sibship families as polygamous detected colony numbers based on the average number of mates per queen, detected colonies significantly decreased as mates per queen increased (Figure 29, F1,328 = 91.86, P < 0.0001, R2 = 0.21). The monogamous colony mean is equal to the polygamous colony mean at approximately 1.12 mates per queen (Figure 29, solid red line). Monogamous colony numbers are less than polygamous colony numbers when mates per queen is less than 1.12, and the 95% confidence intervals do not overlap at 1.02 mates per queen (Figure 29, dotted red line). Monogamous colony numbers are greater than polygamous colony numbers when mates per queen is more than 1.12, and confidence intervals do not overlap at 1.22 mates per queen (Figure 29, dotted red line). Current studies suggest 1.06 – 1.2 mates per B. impatiens queen (Figure 29, green line) (Cnaani et al, 2002; Payne et al, 2003), which is within the range where the polygamous 95% CI overlaps with the monogamous 95% CI. Therefore, it is unlikely that mating assumptions are significantly impacting detected colony numbers for B. impatiens, and so we used colony numbers detected under monogamous assumptions to estimate total colony abundances.

ESTIMATES OF TOTAL COLONY ABUNDANCE

Using the TIRM, Capwire generated total colony abundances of 543.7 ± 21.7SE per field (Table 23). Dividing estimated total colony abundances per field by field area yielded colony abundances per hectare of 135.2 ± 15SE (Table 23). For the entire dataset, total colony abundance per field was stable across all years (Figure 30A, F3,26 = 1.1, P = 0.36, R2 = 0.11) and regions (Figure 30B, F2,27 = 0.61, P = 0.55, R2 = 0.04). Furthermore, there was no year by region interaction for 2013, 2014 and 2015 (F8,19 = 0.7, P = 0.6874, R2 = 0.22).

EFFECT OF NECTAR-RICH MASS FLOWERING CROP FIELD AREA

Field area did not affect total colony abundances per field (Figure 31A, F1,28 = 1.73, P = 0.198, R2 = 0.058). However, field area did have a negative effect on colony abundance per hectare (Figure 31B, F2,23 = 61.16, P < 0.0001, R2 = 0.84). The quadratic term was significant, indicating a curvilinear relationship between field area and colonies per hectare. We excluded fields smaller than 1 ha from analyses to avoid extrapolating beyond what was physically measured. Visitation rates also decreased as field area increased (Figure 31C, F1,19 = 7.5, P = 0.013, R2 = 0.28).

COLONY ABUNDANCE AND POLLINATION SERVICES

When examining the 21 fields with both colony abundance measures and visitation rates, we found that colony abundance per field did not affect visitation rates (Figure 32A, F1,19 = 0.38, P = 0.5457, R2 = 0.02). However, colony abundance per hectare had a positive effect on visitation rates (F1,19 = 4.5, P = 0.046, R2 = 0.19) when including all 21 fields, including a single outlier (Field 7, CH = 249.4). After removing the outlier, colony abundance per hectare had a slightly stronger positive relationship with visitation rates (Figure 32B, F1,18 = 5.3, P = 0.0322, R2 = 0.23).
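For a least-squares model with an intercept, the F statistic and R2 are linked by R2 = (df1 × F) / (df1 × F + df2), where df1 and df2 are the model and error degrees of freedom. The sketch below applies this standard identity to the regression statistics reported above as a consistency check; the function name and the tuple layout are illustrative choices, while the numbers are copied from the text.

```python
def r_squared_from_f(f_value, df_model, df_error):
    """Recover R2 from an overall F test of a least-squares model with an intercept."""
    return (df_model * f_value) / (df_model * f_value + df_error)

# (description, F, model df, error df, reported R2), values as reported in the text.
reported = [
    ("colonies per field vs field area", 1.73, 1, 28, 0.058),
    ("colonies per hectare vs field area (quadratic)", 61.16, 2, 23, 0.84),
    ("visitation rate vs field area", 7.5, 1, 19, 0.28),
    ("visitation rate vs colonies per field", 0.38, 1, 19, 0.02),
    ("visitation rate vs colonies per hectare, all fields", 4.5, 1, 19, 0.19),
    ("visitation rate vs colonies per hectare, outlier removed", 5.3, 1, 18, 0.23),
]

recovered = [r_squared_from_f(f, df1, df2) for _, f, df1, df2, _ in reported]
for (label, *_rest, r2_reported), r2 in zip(reported, recovered):
    print(f"{label}: recovered {r2:.3f}, reported {r2_reported}")
```

Each recovered value agrees with the reported R2 to within rounding, which also confirms that the stray digit preceding several of these statistics in the extracted text is the exponent of R2.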

POPULATION STRUCTURING

For each year, overall estimates of genetic differentiation between sites or regions were never significantly different from zero, with confidence intervals always bounded by or overlapping zero (Table 24). Pairwise estimates of genetic differentiation were always less than 0.007 between fields and always less than 0.003 between regions (details in Appendix W). For each year, nearly all genetic variation (>99.85%, P = 0.03) was found between individuals (Table 25). Differences between fields accounted for 0.14% or less of total genetic variance and were only significant in 2013 (P = 0.01, Table 25). Differences between regions never accounted for more than 0.13% of total genetic variance, although these tiny amounts were significant in 2014 (P = 0.02) and 2015 (P = 0.01, Table 25).

GENETIC VARIATION

Because I found little evidence to suggest population structuring among fields or regions, I grouped all individuals together as a single panmictic population and examined genetic diversity for each year separately. HE ranged from 0.648 – 0.650 per year and AR ranged from 12.14 – 12.28 per year. Each year, the panmictic population appeared to have low levels of inbreeding (FIS < 0.02), which were only significantly different from 0 in 2014 and 2015.

DISCUSSION

MASS-FLOWERING CROP EFFECTS ON WILD BUMBLE BEE POPULATIONS

B. impatiens colony abundance was not affected by commercial pumpkin field area (Figure 31A). This suggests that increasing the size of nectar-rich mass-flowering crop fields is unlikely to negatively affect the abundance of wild bumble bee colonies nesting within foraging distance of these fields. Furthermore, in central Pennsylvania, it appears that pumpkin agroecosystems provide floral resources that can support many wild B. impatiens colonies. If floral resources were limited, we would expect to find larger fields accommodating foragers from a greater number of colonies; instead we find that a single pumpkin field can accommodate up to 829 B. impatiens colonies (Table 23). Given this independence between field area and colony abundance per field, it is reasonable to expect that colony abundance per hectare would decrease as fields get larger. This is, in fact, exactly what we see: colony abundance per hectare declines in an apparently exponential fashion across larger and larger fields (Figure 31B). Correspondingly, field size also had a negative effect on visitation rates (Figure 31C). These findings support the ‘landscape-moderated concentration and dilution hypothesis’ articulated in Tscharntke et al, 2012, which suggests that pollinators may be diluted across increasingly larger agricultural floral resources, resulting in a decrease of pollination services without negatively impacting pollinator populations.

COLONY ABUNDANCE POSITIVELY INFLUENCES POLLINATION SERVICES

Because increasing colony abundance per hectare results in higher visitation rates (Figure 32B), we can conclude that wild bumble bee colony abundance, mediated by field size, positively influences pollination services in commercial pumpkin agroecosystems in our study area. Colony abundance per hectare is responsible for approximately 23% of the variation in visitation rates, indicating other factors are important as well. Indeed, there are some fields characterized by relatively high visitation rates from relatively low colony abundances per hectare (Figure 32B; Table 23, fields 13, 14 and 16). This may be attributed to variation in colony size (e.g., foragers per colony) and / or colony proximity. In order for colonies to grow large, abundant floral resources with quality protein should be available throughout the life cycle of the colony. Future studies could investigate the succession of floral blooms throughout the season to determine which landscapes could potentially support larger colonies. Bumble bees do not recruit nest-mates to floral resources with the same intensity and accuracy as honey bees, but bumble bee foragers will encourage nest-mates to seek certain resources by communicating the existence and quality of choice resources (Dornhaus & Chittka, 1999). If colonies are nesting in close proximity to pumpkin fields, more foragers may have a higher probability of encountering those resources once primed to do so by nest-mates. More studies are needed to evaluate landscape characteristics that are conducive to bumble bee nesting and variation in colony size within these landscapes.

Because colony abundance per hectare drops as fields get larger, conserving abundant populations of wild B. impatiens could help promote pollination services throughout large mass-flowering crop fields. The fact that pumpkin field sizes are large enough to dilute current B. impatiens populations might at first appear worrying for the pollination services they provide. However, consider the ‘worst-case scenario’ from our study. Aside from one field anecdotally suffering from horticultural issues, the lowest visitation rate we observed was 0.07 visits per flower per 45 s. That equates to roughly 1 visit every 11 minutes. Pumpkin flowers are open for at least 4 hours a day, which means a single flower will receive approximately 21 visits from B. impatiens foragers. The visitation rates reported in this study are for male and female flowers combined, and we previously demonstrated that B. impatiens preferentially visits female flowers (Figure 10 in Chapter 2). Therefore, under the same conditions, a female flower will be visited >21 times. Previous studies have shown that bumble bees can adequately fertilize cucurbits with just 2 – 8 visits (Artz & Nault, 2011; Pfister et al, 2017) and so, even in the worst-case scenario in our study, wild B. impatiens colonies are currently supplying enough foragers to provide >2.5x the amount of required pollination services in Cucurbita agroecosystems.
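The worst-case arithmetic can be reproduced with a short sketch. The function and variable names are illustrative; the inputs (0.07 visits per flower per 45 s, a 4 hour bloom and an upper requirement of 8 visits) are taken from the text, and the rate is assumed to stay constant over the whole bloom period.

```python
def visits_per_bloom(rate_per_interval, interval_s=45, bloom_hours=4):
    """Expected visits to one flower over the bloom, assuming a constant rate."""
    intervals = bloom_hours * 3600 / interval_s
    return rate_per_interval * intervals

worst_case_rate = 0.07                           # visits per flower per 45 s
minutes_between = 45 / worst_case_rate / 60      # about 10.7 minutes per visit
visits = visits_per_bloom(worst_case_rate)       # about 22 visits in 4 hours
surplus = visits / 8                             # relative to the 8-visit upper requirement

print(f"{minutes_between:.1f} min between visits, "
      f"{visits:.1f} visits per bloom, {surplus:.1f}x the requirement")
```

Without intermediate rounding the result is about 22 visits, or roughly 2.8 times the 8-visit requirement; rounding the interval up to 11 minutes first gives just under 22 visits, in line with the approximately 21 visits and >2.5x stated in the text.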

MATING ASSUMPTIONS

Mating assumptions play a key role when using genetic techniques to estimate colony abundance. No previous studies using microsatellites to estimate colony abundances have taken into account the effect of multiple mating, in part because Bombus species in the UK (Schmid-Hempel & Schmid-Hempel, 2000), where the foundational work for this subject was conducted, have largely been shown to be singly mated. However, because there is evidence of multiply mated queens for bumble bee species in the subgenera , including B. impatiens (Owen & Whidden, 2013), we examined the effect of mating assumptions on colony analysis. We found, contrary to popular belief, that assuming monogamy results in an under-estimate of ‘detected colony numbers’ for polygamous species exhibiting low levels of polygamy (i.e. average mates per queen < 1.12). While the number of FS families was always greater under polygamous assumptions (Figure 28), the number of colonies detected under polygamous assumptions may be more or less than under monogamy, depending on the average number of mates per queen (Figure 29). When mates per queen is approximately 1.13, assuming polygamy yields detected colony numbers similar to those under monogamy. When mates per queen is <1.12 (i.e. low levels of polygamy), assuming monogamy yields an under-estimate of detected colony numbers. Only when mates per queen is ≥ 1.14 (i.e. high levels of polygamy) does assuming monogamy result in an over-estimate of detected colony numbers.

Published studies concerning mating frequencies of B. impatiens suggest that queens exhibit low levels of polygamy. A study based on wild-caught queens suggested 1.06 mates per queen (Payne et al, 2003), while a study based on commercially reared queens (which are thought to be bred in conditions that artificially inflate queens’ exposure to mates) suggested 1.2 mates per queen. However, there are only 2 studies regarding mating frequency of B. impatiens, and these studies were conducted with small sample sizes and using only 1 or 3 microsatellites. Future studies with more robust sampling design (more colonies, more offspring per colony, more discerning loci) focused on wild-caught queens would provide a better understanding of mating frequency, which could be used for more reliable estimates of detected colony numbers. Given our current understanding, mating assumptions do not significantly affect detected colony numbers when average mates per queen are between 1.02 and 1.22, which includes the published range for B. impatiens (1.06 – 1.2).

B. IMPATIENS POPULATION ABUNDANCE, STABILITY AND POTENTIAL RESILIENCE We estimated that between 291 – 829 B. impatiens colonies provide foragers to commercial pumpkin fields in our study area (Table 23, Figure 31A). This suggests an abundant B. impatiens population nesting in the surrounding landscape within foraging distance of these fields. For the first time, the stability of wild bumble bee colonies was examined across short time spans and geographic ranges. Average colony abundance per field (~540) did not significantly vary between 3 distinct regions spaced out over 5000 km2 (Figure 30B), nor did colony abundance per field change over the course of 4 years (Figure 30A); if anything, a slight positive yearly trend may exist. No other study that we know of has sampled repeatedly in distinct regions to evaluate colony dynamics across time and space. Recent temporal and spatial trends in B. impatiens colony

abundance per field suggest wild populations are abundant and stable in central Pennsylvania and, therefore, can be relied upon to provide pollination services in managed and native ecosystems alike, given the current environmental conditions.

As global climate and land use patterns are altered, primarily by anthropogenic forces, the genetic capacity to respond to selective pressures may predict a species' potential resilience. Bumble bee species characterized by fragmented populations are often found to be in decline, shrinking from localized losses and possibly facing extinction. Fortunately, we found little evidence of population structuring for B. impatiens: both genetic differentiation (Table 24) and genetic variance (Table 25) were essentially non-existent between foragers found in different fields or regions. We can conclude that B. impatiens forms a single, panmictic population across our 13,000 km² study region. This finding confirms previous work on B. impatiens conducted nationwide by Lozier et al, 2011 and in Massachusetts by Suni et al, 2017.

The genetic diversity of our panmictic B. impatiens population was relatively high. Total number of alleles (262, across all loci and all individuals) and average allelic richness (AR) (12.5 ± 0.22 per year, Table 26) were much higher than values reported in other studies of B. impatiens, including Lozier et al, 2011, which reported an average allelic richness of 4.737 ± 0.191. However, our high allelic richness could be a result of our much larger forager sample sizes (n = 100 per year in our study, and n = 10 per site in Lozier et al, 2011), as well as the inclusion of different loci with potentially larger numbers of alleles. When comparing estimates of expected heterozygosity (HE), our overall value of 0.64 ± 0.006 more closely resembles previously reported values, such as 0.687 ± 0.018 from Lozier et al, 2011. Values of expected heterozygosity are still affected by the allelic diversity of loci, and because we used different loci than other studies, they are not directly comparable. However, because all measures of expected heterozygosity are on a scale of 0 to 1, values are potentially easier to compare across studies.

Patterns of genetic variance, genetic differentiation and genetic diversity did not change across successive years (Table 24, Table 25, Table 26). The fact that average genetic diversity per site did not decrease across 4 years indicates that populations are large enough to maintain current levels of genetic diversity with each successive generation. We did not expect to find changes in genetic patterns over a period of just 4 years, but it is important to validate basic assumptions with empirical evidence when possible. These data on both colony abundance and genetic diversity can be used as an important baseline for future long-term monitoring studies regarding the genetic status of economically valuable wild B. impatiens populations.
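The contrast drawn above between allele counts and expected heterozygosity can be illustrated with a minimal R sketch. It assumes the textbook single-locus form HE = 1 - sum(p_i^2) without any sample-size correction; the function name expected_het and the allele counts are hypothetical and are not data from this study.

```r
# Expected heterozygosity for one locus: HE = 1 - sum(p_i^2),
# where p_i are the allele frequencies (uncorrected textbook form; an assumption).
expected_het <- function(allele_counts) {
  p <- allele_counts / sum(allele_counts)
  1 - sum(p^2)
}

# Hypothetical allele counts for two loci (illustration only, not study data).
few_alleles  <- c(50, 30, 20)                          # locus with 3 alleles
many_alleles <- c(40, 20, 10, 10, 5, 5, 4, 3, 2, 1)    # locus with 10 alleles

he_few  <- expected_het(few_alleles)
he_many <- expected_het(many_alleles)

# Allele number is unbounded, whereas HE always falls between 0 and 1.
cat("alleles:", length(few_alleles),  " HE:", round(he_few, 3),  "\n")
cat("alleles:", length(many_alleles), " HE:", round(he_many, 3), "\n")
```

In this toy case the locus with more alleles has the higher HE, which is why estimates from different marker sets are only roughly comparable even though they share the same scale.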

CONCLUSIONS

Our study strongly suggests that a key pollinator in the eastern United States, B. impatiens, is a genetically diverse, panmictic population characterized by high, stable colony abundances. These findings bode well for the natural and agricultural ecosystems that rely on B. impatiens for pollination services. The role of B. impatiens as a critical pollinator in many agricultural crops has been well documented, specifically for pumpkins in the northeastern US (Artz & Nault, 2011; Petersen et al, 2013; Chapter 2). These studies, based on individual foragers, were unable to estimate the potential persistence of B. impatiens populations. It was previously unknown whether the critically important forager force originated from many colonies or few colonies. Here, we present empirical evidence of high colony abundances exhibiting temporal and spatial stability, which suggests a resilient forager abundance that will not be diminished by random losses of a few colonies.

Increasing sizes of nectar-rich mass-flowering crops do not appear to be affecting B. impatiens populations, neither acting as a limiting factor nor promoting populations. However, colony abundance per hectare decreases as the field size of mass-flowering crops increases. Because colony abundance per hectare is a positive predictor of visitation rates, increasing the field size of mass-flowering crops results in decreased pollination services without impacting the pollinator population. Even though B. impatiens pollination services are diluted across large commercial pumpkin fields, current populations can provide ~2.5x the required visits in even the worst-case scenarios. Furthermore, there is evidence to suggest that genetic structuring is not increasing and genetic diversity is not decreasing with successive years. Therefore, this wild pollinator, and the economically valuable pollination services it provides, are likely a resilient force that will continue to persist in the foreseeable future. This study provides a robust baseline on which to build long-term monitoring of B. impatiens populations as environmental conditions change in the future.

REFERENCES

Beckham, J., Warriner, M., Atkinson, S., & Kennedy, J. (2016). The Persistence of Bumble Bees (Hymenoptera: Apidae) in Northeastern Texas. Proceedings Of The Entomological Society Of Washington, 118(4), 481-497. http://dx.doi.org/10.4289/0013-8797.118.4.481

Bourke, A.F.G. and Hammond, R.L. (2002). Genetics of the scarce bumble bee, Bombus distinguendus, and nonlethal sampling of DNA from bumble bees. A Report for the RSPB, January 2002.

Cameron, S., Lozier, J., Strange, J., Koch, J., Cordes, N., Solter, L., & Griswold, T. (2011). Patterns of widespread decline in North American bumble bees. Proceedings Of The National Academy Of Sciences, 108(2), 662-667. http://dx.doi.org/10.1073/pnas.1014743108

Carvell, C., Bourke, A., Dreier, S., Freeman, S., Hulmes, S., & Jordan, W. et al. (2017). Bumblebee family lineage survival is enhanced in high-quality landscapes. Nature, 543(7646), 547-549. http://dx.doi.org/10.1038/nature21709

Carvell, C., Jordan, W., Bourke, A., Pickles, R., Redhead, J., & Heard, M. (2011). Molecular and spatial analyses reveal links between colony-specific foraging distance and landscape-level resource availability in two bumblebee species. Oikos, 121(5), 734-742. http://dx.doi.org/10.1111/j.1600-0706.2011.19832.x

Chapman, R., Wang, J., & Bourke, A. (2003). Genetic analysis of spatial foraging patterns and resource sharing in bumble bee pollinators. Molecular Ecology, 12(10), 2801-2808. http://dx.doi.org/10.1046/j.1365-294x.2003.01957.x

Charman, T., Sears, J., Green, R., & Bourke, A. (2010). Conservation genetics, foraging distance and nest density of the scarce Great Yellow Bumblebee (Bombus distinguendus). Molecular Ecology, 19(13), 2661-2674. http://dx.doi.org/10.1111/j.1365-294x.2010.04697.x

Cnaani, J., Schmid-Hempel, R., & Schmidt, J. (2002). Colony development, larval development and worker reproduction in Bombus impatiens Cresson. Insectes Sociaux, 49(2), 164-170. http://dx.doi.org/10.1007/s00040-002-8297-8

Colla, S., Richardson, L., & Williams, P. (2011). Bumble bees of the eastern United States. Washington, D.C.: U.S. Dept. of Agriculture Pollinator Partnership.

Darvill, B., Ellis, J., Lye, G., & Goulson, D. (2006). Population structure and inbreeding in a rare and declining bumblebee, Bombus muscorum (Hymenoptera: Apidae). Molecular Ecology, 15(3), 601-611. http://dx.doi.org/10.1111/j.1365-294x.2006.02797.x

Darvill, B., Knight, M., & Goulson, D. (2004). Use of genetic markers to quantify bumblebee foraging range and nest density. Oikos, 107(3), 471-478. http://dx.doi.org/10.1111/j.0030-1299.2004.13510.x

Dornhaus, A., & Chittka, L. (1999). Evolutionary origins of bee dances. Nature, 401(6748), 38-38. doi:10.1038/43372

Ellis, J., Knight, M., Darvill, B., & Goulson, D. (2006). Extremely low effective population sizes, genetic structuring and reduced genetic diversity in a threatened bumblebee species, Bombus sylvarum (Hymenoptera: Apidae). Molecular Ecology, 15(14), 4375-4386. http://dx.doi.org/10.1111/j.1365-294x.2006.03121.x

Estoup A, Gamery L, Solignac M, Cornuet JM (1995b) Microsatellite variation in honey bee (Apis mellifera L.): hierarchical genetic structure and test of the infinite allele and step wise mutation models. Genetics, 140, 679-695.

Estoup, A., Solignac, M., Cornuet, J., Goudet, J., & Scholl, A. (1996). Genetic differentiation of continental and island populations of Bombus terrestris (Hymenoptera: Apidae) in Europe. Molecular Ecology, 5(1), 19-31. http://dx.doi.org/10.1111/j.1365-294x.1996.tb00288.x

Funk, C.R., Schmid-Hempel, R., & Schmid-Hempel, P. (2006). Microsatellite loci for Bombus spp. Molecular Ecology Notes, 6(1), 83-86. http://dx.doi.org/10.1111/j.1471-8286.2005.01147.x

Geib, J., Strange, J., & Galen, C. (2015). Bumble bee nest abundance, foraging distance, and host-plant reproduction: implications for management and conservation. Ecological Applications, 25(3), 768-778. http://dx.doi.org/10.1890/14-0151.1

Goudet, J. (2005). hierfstat, a package for r to compute and test hierarchical F-statistics. Molecular Ecology Notes, 5(1), 184-186. http://dx.doi.org/10.1111/j.1471-8286.2004.00828.x

Goulson, D. (2010). Bumblebees behaviour, ecology, and conservation (2nd ed.). Oxford: Oxford Univ. Press.

Goulson, D., Lepais, O., O’Connor, S., Osborne, J., Sanderson, R., & Cussans, J. et al. (2010). Effects of land use at a landscape scale on bumblebee nest density and survival. Journal Of Applied Ecology, 47(6), 1207-1215. http://dx.doi.org/10.1111/j.1365-2664.2010.01872.x

Goulson, D., O’Connor, S., and Park, K.J. (2017). The impacts of predators and parasites on wild bumblebee colonies. Ecological Entomology. DOI: 10.1111/een.12482

Grixti, J., Wong, L., Cameron, S., & Favret, C. (2009). Decline of bumble bees (Bombus) in the North American Midwest. Biological Conservation, 142(1), 75-84. http://dx.doi.org/10.1016/j.biocon.2008.09.027

Herrmann, F., Westphal, C., Moritz, R., & Steffan-Dewenter, I. (2007). Genetic diversity and mass resources promote colony size and forager densities of a social bee (Bombus pascuorum) in agricultural landscapes. Molecular Ecology, 16(6), 1167-1178. http://dx.doi.org/10.1111/j.1365-294x.2007.03226.x

Holzschuh, A., Dainese, M., González-Varo, J., Mudri-Stojnić, S., Riedinger, V., & Rundlöf, M. et al. (2016). Mass-flowering crops dilute pollinator abundance in agricultural landscapes across Europe. Ecology Letters, 19(10), 1228-1236. http://dx.doi.org/10.1111/ele.12657

Holzschuh, A., Dormann, C., Tscharntke, T., & Steffan-Dewenter, I. (2012). Mass-flowering crops enhance wild bee abundance. Oecologia, 172(2), 477-484. http://dx.doi.org/10.1007/s00442-012-2515-5

Jha, S., & Kremen, C. (2012). Resource diversity and landscape-level homogeneity drive native bee foraging. Proceedings Of The National Academy Of Sciences, 110(2), 555-558. http://dx.doi.org/10.1073/pnas.1208682110

Jones, O., & Wang, J. (2010). COLONY: a program for parentage and sibship inference from multilocus genotype data. Molecular Ecology Resources, 10(3), 551-555. http://dx.doi.org/10.1111/j.1755-0998.2009.02787.x

Kallioniemi, E., Åström, J., Rusch, G. M., Dahle, S., Åström, S., Gjershaug, J. O. (2017). Local resources, linear elements and mass-flowering crops determine bumblebee occurrences in moderately intensified farmlands. Agriculture, Ecosystems and Environment, 239, 90-100.

Kamvar, Z., Tabima, J., & Grünwald, N. (2014). Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. Peerj, 2, e281. http://dx.doi.org/10.7717/peerj.281

Kerr, J.T., Pindar, A., Galpern, P., Packer, L., Potts, S.G., Roberts, S.M., Rasmont, P., Schweiger, O., Colla, S.R., Richardson, L.L., Wagner, D.L., Gall, L.F., Sikes, D.S., Pantoja, A. (2015). Climate change impacts on bumblebees converge across continents. Science, 349(6244), 177-180. DOI: 10.1126/science.aaa7031

Klein AM, Vaissiere BE, Cane JH, Steffan-Dewenter I, Cunningham SA, Kremen C, Tscharntke T. 2007. Importance of pollinators in changing landscapes for world crops. Proceedings of the Royal Society B 274:303–313 DOI 10.1098/rspb.2006.3721.

Knight, M., Martin, A., Bishop, S., Osborne, J., Hale, R., Sanderson, R., & Goulson, D. (2005). An interspecific comparison of foraging range and nest density of four bumblebee (Bombus) species. Molecular Ecology, 14(6), 1811-1820. http://dx.doi.org/10.1111/j.1365-294x.2005.02540.x

Lozier, J., & Cameron, S. (2009). Comparative genetic analyses of historical and contemporary collections highlight contrasting demographic histories for the bumble bees Bombus pensylvanicus and B. impatiens in Illinois. Molecular Ecology, 18(9), 1875-1886. http://dx.doi.org/10.1111/j.1365-294x.2009.04160.x

Lozier, J., Strange, J., Stewart, I., & Cameron, S. (2011). Patterns of range-wide genetic variation in six North American bumble bee (Apidae: Bombus) species. Molecular Ecology, 20(23), 4870-4888. http://dx.doi.org/10.1111/j.1365-294x.2011.05314.x

Meirmans, P., & Hedrick, P. (2010). Assessing population structure: FST and related measures. Molecular Ecology Resources, 11(1), 5-18. http://dx.doi.org/10.1111/j.1755-0998.2010.02927.x

Miller, C., Joyce, P., & Waits, L. (2005). A new method for estimating the size of small populations from genetic mark-recapture data. Molecular Ecology, 14(7), 1991-2005. http://dx.doi.org/10.1111/j.1365-294x.2005.02577.x

O’Connor, S., Park, K.J. and Goulson, D. (2012) Humans versus dogs; a comparison of methods for the detection of bumblebee nests. Apidologie 51, 204-211.

Osborne, J.L., Martin, A.P., Shortall, C.R., Todd, A.D., Goulson, D., Knight, M.E., Hale, R.J. and Sanderson, R.A. (2008) Quantifying and comparing bumblebee nest densities in gardens and countryside habitats. Journal of Applied Ecology 45, 784-792.

Owen, R., & Whidden, T. (2013). Monandry and polyandry in three species of North American bumble bees (Bombus) determined using microsatellite DNA markers. Canadian Journal Of Zoology, 91(7), 523-528. http://dx.doi.org/10.1139/cjz-2012-0288

Payne, C., Laverty, T., & Lachance, M. (2003). The frequency of multiple paternity in bumble bee (Bombus) colonies based on microsatellite DNA at the B10 locus. Insectes Sociaux, 50(4), 375-378. http://dx.doi.org/10.1007/s00040-003-0692-2

Pirounakis K, Koulianos S, Schmid-Hempel P, 1998. Genetic variation among European populations of Bombus pascuorum (Hymenoptera: Apidae). Eur J Entomol. 95:27-33.

Plath, O. E. (1934). Bumblebees and their ways. New York: The Macmillan Company.

Rao, S., & Strange, J. (2012). Bumble Bee (Hymenoptera: Apidae) Foraging Distance and Colony Density Associated With a Late-Season Mass Flowering Crop. Environmental Entomology, 41(4), 905-915. http://dx.doi.org/10.1603/en11316

Schmid-Hempel, R., & Schmid-Hempel, P. (2000). Female mating frequencies in Bombus spp. from Central Europe. Insectes Sociaux, 47(1), 36-41. http://dx.doi.org/10.1007/s000400050006

Shao, Z.Y., Mao, H.X., Fu, W.J., Ono, M., Wang, D.-S., Bonizzoni, M., Zhang, Y.-P. (2004). Genetic Structure of Asian Populations of Bombus ignitus (Hymenoptera: Apidae), Journal of Heredity, 95(1), 46–52. https://doi.org/10.1093/jhered/esh008

Sidhu, C. (2012). Farm-level and landscape-level effects on Cucurbit pollinators on small farms in diversified agroecosystems. Ph.D. dissertation. Pennsylvania State University.

Stolle, E., Rohde, M., Vautrin, D., Solignac, M., Schmid-Hempel, P., Schmid-Hempel, R., & Moritz, R. (2009). Novel microsatellite DNA loci for Bombus terrestris (Linnaeus, 1758). Molecular Ecology Resources, 9(5), 1345-1352. http://dx.doi.org/10.1111/j.1755-0998.2009.02610.x

Suni, S., Scott, Z., Averill, A., & Whiteley, A. (2017). Population genetics of wild and managed pollinators: implications for crop pollination and the genetic integrity of wild bees. Conservation Genetics, 18(3), 667-677. http://dx.doi.org/10.1007/s10592-017-0955-5

Tscharntke, T., Tylianakis, J. M., Rand, T. A., Didham, R. K., Fahrig, L., Batáry, P., . . . Westphal, C. (2012). Landscape moderation of biodiversity patterns and processes - eight hypotheses. Biological Reviews, 87(3), 661-685. doi:10.1111/j.1469-185x.2011.00216.x

Waters, J., O’Connor, S., Park, K., & Goulson, D. (2011). Testing a detection dog to locate bumblebee colonies and estimate nest density. Apidologie, 42(2), 200-205. http://dx.doi.org/10.1051/apido/2010056

Westphal, C., Steffan-Dewenter, I., Tscharntke, T. (2003). Mass flowering crops enhance pollinator densities at a landscape scale. Ecology Letters, 6, 961-965.

Winter, D. (2012). mmod: an R library for the calculation of population differentiation statistics. Molecular Ecology Resources, 12(6), 1158-1160. http://dx.doi.org/10.1111/j.1755-0998.2012.03174.x

Woodard, S., Lozier, J., Goulson, D., Williams, P., Strange, J., & Jha, S. (2015). Molecular tools and bumble bees: revealing hidden details of ecology and evolution in a model system. Molecular Ecology, 24(12), 2916-2936. http://dx.doi.org/10.1111/mec.13198

Wood, T., Holland, J., Hughes, W., & Goulson, D. (2015). Targeted agri-environment schemes significantly improve the population size of common farmland bumblebee species. Molecular Ecology, 24(8), 1668-1680. http://dx.doi.org/10.1111/mec.13144

Table 23. Bombus impatiens colony abundances and visitation rates were measured for a total of 30 fields, sampled across 4 years (Y) and 3 regions (R). Full-sibship families detected by Colony under monogamous assumptions (FS families Mono) is the average across three separate runs (numbers in parentheses indicate when separate runs detected different numbers of full-sibship families) and is the same value as the detected colony numbers used in subsequent analysis. FS families Poly indicates full-sibship families detected by Colony under polygamous assumptions. Colonies per field is the estimated total colony abundance per field provided by Capwire; Colonies per ha is the total colony abundance per field ÷ field area. Visitation rate is the mean bee visits per pumpkin flower per 45 s for each field and is based on ~120 observations averaged per transect, and then 4 transects averaged per field. The final row gives the mean of each column.

Field  Y     R   Date       Field area (ha)  Bees collected  Bees genotyped  FS families Mono       FS families Poly  Colonies per field  Colonies per ha  Visitation rate
1      2012  Co  24-Jul-12  2.09             222             219             161                    187               408                 195.2
2      2012  Co  29-Aug-12  4.45             163             156             132                    145               509                 114.4
3      2013  Co  20-Aug-13  6.47             195             193             124                    138               291                 45.0             0.09 ± 0.01
4      2013  Co  21-Aug-13  11.86            205             205             156                    170               441                 37.2             0.25 ± 0.07
5      2013  Co  21-Aug-13  5.06             204             203             168.7 (168, 169, 169)  189               603                 119.2            0.46 ± 0.03
6      2013  Co  22-Aug-13  7.28             196             196             166                    179               719                 98.8             0.34 ± 0.09
7      2013  L   5-Aug-13   1.62             239             236             179 (178, 179, 180)    209               482                 297.5            0.45 ± 0.07
8      2013  L   8-Aug-13   2.91             229             229             185                    210               588                 202.1            0.46 ± 0.07
9      2013  Ce  31-Jul-13  0.61             245             242             192.3 (191, 192, 194)  210               623                 1021.3*
10     2013  Ce  18-Aug-13  1.52             218             211             148.3 (146, 149, 150)  174               366                 240.8
11     2013  Ce  22-Aug-13  0.61             242             239             183                    214               520                 852.5*
12     2014  Co  7-Aug-14   5.63             202             199             169 (168, 168, 171)    190               653                 116.0            0.18 ± 0.02
13     2014  Co  11-Aug-14  5.02             207             205             170.3 (170, 170, 171)  189               630                 125.5            0.79 ± 0.002
14     2014  Co  14-Aug-14  5.17             223             219             148                    162               386                 74.7             0.59 ± 0.03
15     2014  Co  14-Aug-14  4.61             207             201             159                    186               487                 105.6            0.19 ± 0.04
16     2014  L   8-Aug-14   2.43             202             194             146.3 (146, 146, 147)  163               421                 173.3            0.75 ± 0.20
17     2014  L   8-Aug-14   6.92             206             202             169.3 (168, 169, 171)  190               645                 93.2             0.33 ± 0.03
18     2014  Ce  24-Aug-14  1.70             214             207             148                    163               428                 251.8
19     2014  Ce  28-Aug-14  0.61             220             218             177.3 (177, 177, 178)  191               623                 1021.3*
20     2015  Co  31-Jul-15  6.03             181             176             157.3 (155, 158, 159)  173               829                 137.5            0.34 ± 0.03
21     2015  Co  2-Aug-15   4.78             217             210             172.6 (172, 173, 173)  200               601                 125.7            0.10 ± 0.09
22     2015  Co  12-Aug-15  4.78             212             211             175                    188               693                 145.0            0.08 ± 0.04
23     2015  Co  12-Aug-15  8.98             212             210             171                    197               573                 63.8             0.01 ± 0.12
24     2015  Co  13-Aug-15  8.82             224             221             179                    210               568                 64.4             0.07 ± 0.04
25     2015  Co  13-Aug-15  11.53            214             212             177.6 (177, 178, 178)  194               667                 57.8             0.07 ± 0.03
26     2015  L   28-Jul-15  3.85             196             191             153                    167               476                 123.6            0.25 ± 0.03
27     2015  L   29-Jul-15  4.90             206             202             159                    183               479                 97.8             0.42 ± 0.07
28     2015  L   5-Aug-15   7.57             210             205             165.3 (161, 167, 168)  187               577                 76.2             0.11 ± 0.06
29     2015  Ce  15-Aug-15  1.74             232             231             182.3 (181, 181, 185)  205               578                 332.2
30     2015  Ce  26-Aug-15  0.61             180             179             140.3 (140, 140, 141)  151               448                 734.4*
Mean                        4.7 ± 0.57       210.8 ± 3.3     207.4 ± 3.4     163.8 ± 2.9            183.8 ± 3.6       543.7 ± 21.7        135.2 ± 15       0.3 ± 0.05

*Values excluded from mean CH and subsequent analyses because fields were less than 1 ha
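The derived column in Table 23 follows directly from the caption (colonies per hectare = colonies per field ÷ field area). A minimal R sketch of that arithmetic is given here for four fields whose areas and Capwire estimates are copied from the table; the data frame and column names are our own and are only illustrative.

```r
# Field areas (ha) and Capwire colony estimates copied from Table 23
# for two small fields (7, 8) and two large fields (4, 25).
fields <- data.frame(
  field              = c(7, 8, 4, 25),
  area_ha            = c(1.62, 2.91, 11.86, 11.53),
  colonies_per_field = c(482, 588, 441, 667)
)

# Colonies per hectare = colonies per field / field area (Table 23 caption).
fields$colonies_per_ha <- round(fields$colonies_per_field / fields$area_ha, 1)

print(fields)
```

Total colony numbers are of similar magnitude in the small and large fields, so dividing by area gives a much lower density in the large fields, which is the dilution pattern described in the text.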

[Figure 28 image: mean full-sibship families (y-axis) under Monogamous and Polygamous mating assumptions (x-axis); P < 0.0001, R2 = 0.23, n = 30 per group.]

Figure 28. Mating assumptions have a significant effect on mean full-sibship families identified by Colony. Error bars depict standard error.

[Figure 29 image: detected colony numbers (full-sibship families ÷ mates per queen; y-axis) against mates per queen (x-axis); legend: Monogamous, Polygamous, published estimates for B. impatiens; P < 0.0001, R2 = 0.21, n = 330 (30 per group).]

Figure 29. Average mates per queen has a significant effect on polygamous detected colony numbers. The x-axis depicts hypothetical mates per queen, ranging from 1 to 1.22 in increments of 0.02. Each point represents a mean of all 30 fields, surrounded by error bars depicting 95% confidence intervals. Values for 1.02 to 1.22 mates per queen are based on the number of full-sibship families identified by Colony under polygamous assumptions (black). The shaded region (gray) depicts a 95% CI surrounding the regression line of fit. Included for comparison, the value for 1 mate per queen is based on the number of full-sibship families identified by Colony under monogamous assumptions (red) and is not part of the regression analysis. The monogamous mean (red solid line) is approximately equal to the polygamous mean at 1.12 mates per queen. The 95% CI for the monogamous mean (red dotted line) excludes 1.02 and 1.22. Estimated mates per queen for B. impatiens (green line) are taken from previous studies (Cnaani et al, 2002; Payne et al, 2003).
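The adjustment plotted in Figure 29 can be approximated from the column means in Table 23. The R sketch below assumes the adjustment is the simple division named on the figure axis (full-sibship families ÷ mates per queen) and applies it to the overall means rather than to each of the 30 fields; variable names are our own.

```r
# Column means from Table 23.
mean_fs_poly <- 183.8   # full-sibship families, polygamous assumptions
mean_fs_mono <- 163.8   # full-sibship families, monogamous assumptions

# Hypothetical mates per queen, as on the x-axis of Figure 29.
mates <- seq(1.02, 1.22, by = 0.02)

# Detected colony numbers = full-sibship families / mates per queen.
detected <- mean_fs_poly / mates

# Mates-per-queen value whose adjusted polygamous mean is closest
# to the monogamous mean.
closest <- mates[which.min(abs(detected - mean_fs_mono))]

print(data.frame(mates_per_queen = mates,
                 detected_colonies = round(detected, 1)))
cat("Closest to the monogamous mean at", closest, "mates per queen\n")
```

Under these assumptions the closest match falls at 1.12 mates per queen, in agreement with the caption.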

[Figure 30 image: bar plots of mean colony abundance per field (y-axis). Panel A, by year: 2012 = 458.5, 2013 = 514.7, 2014 = 534.1, 2015 = 589.9; P = 0.36, R2 = 0.11, n = 30. Panel B, by region: Centre = 512.3, Columbia = 566.1, Lancaster = 524; P = 0.55, R2 = 0.04, n = 30.]

Figure 30. Mean colony abundance per field was stable across (A) 4 successive years and (B) 3 geographic regions. Error bars depict standard error.

[Figure 31 image: three panels sharing the x-axis MFC field area (ha). Panel A: total colony abundance per field (Min: 291, Max: 829); P = 0.2, R2 = 0.05, n = 30. Panel B: colony abundance per hectare; P < 0.0001, R2 = 0.84, n = 26. Panel C: mean visitation rate (bee / flower / 45s); P = 0.01, R2 = 0.28, n = 21.]

Figure 31. Field area did not affect B. impatiens (A) total colony abundance per field, but did have a negative relationship with both (B) colony abundance per hectare and (C) mean visitation rate. The x-axis (field area) is uniform for all graphs and each point represents a single commercial pumpkin field. Error bars in (A) represent the 95% confidence intervals (CI) of estimated colony abundances supplied by Capwire. The shaded region (blue) depicts the 95% CI for the line of fit for significant relationships. For (B), fields less than 1 ha (n = 4) were excluded.

[Figure 32 image: visitation rate (bee / flower / 45s; y-axis) against (A) colony abundance per field (P = 0.55, R2 = 0.02, n = 20) and (B) colony abundance per hectare (P = 0.032, R2 = 0.23, n = 21); point color legend for field size runs from 2 ha to 12 ha.]

Figure 32. Visitation rates have no relationship with (A) colony abundance per field, but they are positively affected by (B) colony abundance per hectare. Each point represents a single pumpkin field and is colored according to field size, depicted in the legend. The blue line of fit, indicating significance (F1, 19 = 5.3, P = 0.0322, R2 = 0.23), is surrounded by a shaded region depicting a 95% confidence interval. The dotted line indicates the line of fit when including an outlier for colony abundance per hectare, depicted as an unfilled circle (F1, 19 = 4.5, P = 0.046, R2 = 0.19).

Table 24. Overall estimates of genetic differentiation between sites or regions using 3 G-statistics for each year separately. 95% confidence intervals are in parentheses and always include or overlap zero, indicating no significance.

Year  Level   Nei’s GST                Hedrick’s GST            Jost’s DEST
2013  Site    0.0006 (-0.000 – 0.001)  0.0018 (-0.011 – 0.007)  0.0012 (-0.000 – 0.003)
      Region  0.000 (-0.000 – 0.000)   0.0001 (-0.001 – 0.001)  0.0001 (-0.001 – 0.001)
2014  Site    0.0007 (-0.000 – 0.002)  0.0023 (-0.000 – 0.005)  0.0015 (-0.000 – 0.003)
      Region  0.0006 (0.000 – 0.001)   0.0024 (0.000 – 0.005)   0.0016 (0.000 – 0.003)
2015  Site    0.0004 (-0.000 – 0.001)  0.0013 (-0.001 – 0.003)  0.0008 (-0.000 – 0.002)
      Region  0.0003 (-0.000 – 0.001)  0.0012 (-0.000 – 0.003)  0.0008 (-0.000 – 0.002)
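To make the three statistics in Table 24 concrete, the R sketch below computes them for a single hypothetical locus in two subpopulations. It uses the uncorrected textbook definitions as an assumption; the estimators in the mmod package apply sample-size corrections, so this is an illustration of the quantities and not a reproduction of our analysis. The allele frequencies are invented.

```r
# Hypothetical allele frequencies at one locus in k = 2 subpopulations
# (rows sum to 1; not data from this study).
p <- rbind(pop1 = c(0.50, 0.30, 0.20),
           pop2 = c(0.48, 0.31, 0.21))
k <- nrow(p)

hs <- mean(1 - rowSums(p^2))    # mean within-subpopulation heterozygosity
ht <- 1 - sum(colMeans(p)^2)    # heterozygosity of the pooled population

# Uncorrected definitions (assumption; mmod uses corrected estimators).
gst_nei     <- (ht - hs) / ht
gst_hedrick <- gst_nei * (k - 1 + hs) / ((k - 1) * (1 - hs))
d_jost      <- ((ht - hs) / (1 - hs)) * (k / (k - 1))

print(round(c(Gst_Nei = gst_nei, Gst_Hedrick = gst_hedrick, D_Jost = d_jost), 5))
```

When allele frequencies are nearly identical between subpopulations, as in this toy case, all three statistics are close to zero, which is the pattern reported in Table 24.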

Table 25. Sources of genetic variation partitioned between individuals, sites and regions using AMOVA for each year independently.

Year   Source of genetic variation                Df    σ2      % variance  ϕ-statistic  P-value
2013   Between Regions                            2     <0.001  <0.001      -0.00005     0.55
       Between Fields Within Regions              6     0.009   0.14        0.001        0.01
       Between individuals within Fields (error)  1487  6.32    99.86       0.001        0.01
2014*  Between Regions                            2     0.008   0.13        0.001        0.02
       Between Fields Within Regions              5     <0.001  <0.001      -0.0002      0.74
       Between individuals within Fields (error)  1276  6.13    99.88       0.001        0.03
2015   Between Regions                            2     0.004   0.07        0.0007       0.01
       Between Fields Within Regions              8     0.003   0.04        0.0004       0.11
       Between individuals within Fields (error)  1818  6.37    99.89       0.001        0.01

*For 2014, locus BTMS0073 was not included because data were missing from more than 5% of samples due to human error.
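The "% variance" column of Table 25 is each variance component expressed as a share of the total. A short R sketch for the 2013 rows is shown below; it assumes the component reported as <0.001 can be treated as 0 for illustration, and the vector names are our own.

```r
# Variance components (sigma^2) for 2013 from Table 25.
# The between-region component is reported as <0.001 and is set to 0 here
# (an assumption made only for this illustration).
sigma2 <- c(between_regions               = 0,
            between_fields_within_regions = 0.009,
            within_fields                 = 6.32)

# Percent of total variance attributable to each level.
pct <- 100 * sigma2 / sum(sigma2)

print(round(pct, 2))
```

With these inputs the within-field share rounds to 99.86% and the between-field share to 0.14%, matching the 2013 rows of the table.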

Table 26. Genetic diversity metrics for the panmictic population each year. N1pc is the number of foragers reduced to a single representative per colony per site. AN is the total number of alleles from all loci. AP is the number of private alleles found only in that year. AR is the allelic richness calculated from a rarified sample size of 100 individuals per year. HE is the expected heterozygosity calculated from the smallest sample size of 293 samples per year. FIS is the inbreeding coefficient, expressed as a 95% confidence interval with significance, represented in bold, when intervals do not overlap 0.

Year  N1pc  AN   AP  AR     HE     FIS
2012  293   186  2   12.14  0.649  (-0.012, 0.0125)
2013  1496  235  8   12.28  0.648  (-0.0059, 0.0115)
2014  1284  230  7   12.22  0.648  (0.0054, 0.0196)
2015  1829  236  11  12.28  0.650  (0.0022, 0.012)

APPENDICES

APPENDIX T

Table T27. Sampling support for colony abundance and pollination activity. Field: unique number and initials to identify each field, shortened to protect identity of grower collaborators. Region: the geographical location of each field, which could be Columbia (Co), Lancaster (L) or Centre (Ce). DateC: date ~200 B. impatiens individuals were collected. NIC: number of individuals collected. DaysC-VR: days between forager collection date and visitation rate observation date. DateVR: date visitation rates were observed. Transect (m): distance from field edge at which visitation rates were recorded; NVR: number of visitation rate samples taken per transect, listed in the same order as the transects. – indicates visitation rate data were not collected for that field.

Field     Year  Region  DateC      NIC  DaysC-VR  DateVR     Transect (m)       NVR
1 – CE    2012  Co      24-Jul-12  222  –         –          –                  –
2 – Wh    2012  Co      29-Aug-12  163  –         –          –                  –
3 – Th1   2013  Co      20-Aug-13  195  0         20-Aug-13  0 / 25 / 50 / 100  45 / 37 / 33 / 45
4 – LNK   2013  Co      21-Aug-13  205  -6        15-Aug-13  0 / 25 / 50 / 100  59 / 50 / 58 / 56
5 – St    2013  Co      21-Aug-13  204  0         21-Aug-13  0 / 25 / 50 / 100  52 / 59 / 58 / 58
6 – Ho    2013  Co      22-Aug-13  196  0         22-Aug-13  0 / 25 / 50 / 100  59 / 60 / 60 / 60
7 – GH    2013  L       5-Aug-13   239  0         5-Aug-13   0 / 25 / 50 / 100  60 / 60 / 60 / 60
8 – Ra    2013  L       8-Aug-13   229  -2        6-Aug-13   0 / 25 / 50 / 100  60 / 60 / 60 / 60
9 – Wa    2013  Ce      31-Jul-13  245  –         –          –                  –
10 – Rs3  2013  Ce      18-Aug-13  218  –         –          –                  –
11 – Ha3  2013  Ce      22-Aug-13  242  –         –          –                  –
12 – Mc   2014  Co      7-Aug-14   202  0         7-Aug-14   0 / 25 / 50 / 100  60 / 60 / 60 / 60
13 – Pi   2014  Co      11-Aug-14  207  0         11-Aug-14  0 / 25 / 50 / 100  60 / 60 / 60 / 60
14 – Pa   2014  Co      14-Aug-14  223  -13       1-Aug-14   0 / 25 / 50 / 100  60 / 60 / 60 / 60
15 – Yo   2014  Co      14-Aug-14  207  -14       31-Jul-14  0 / 25 / 50 / 100  60 / 60 / 60 / 60
16 – NG   2014  L       8-Aug-14   202  0         8-Aug-14   0 / 25 / 50 / 100  60 / 60 / 60 / 60
17 – RQ   2014  L       8-Aug-14   206  0         8-Aug-14   0 / 25 / 50 / 100  60 / 60 / 60 / 60
18 – Rs4  2014  Ce      24-Aug-14  214  –         –          –                  –
19 – Ha4  2014  Ce      28-Aug-14  220  –         –          –                  –
20 – Th5  2015  Co      31-Jul-15  181  0         31-Jul-15  0 / 25 / 50 / 100  60 / 60 / 60 / 60
21 – Mc2  2015  Co      2-Aug-15   217  -9        23-Jul-15  0 / 25 / 50 / 100  60 / 60 / 60 / 60
22 – Ko   2015  Co      12-Aug-15  212  -11       1-Aug-15   0 / 25 / 50 / 100  60 / 60 / 60 / 60
23 – Ne   2015  Co      12-Aug-15  212  -12       31-Jul-15  0 / 25 / 50 / 100  60 / 60 / 60 / 60
24 – KL   2015  Co      13-Aug-15  224  -11       2-Aug-15   0 / 25 / 50 / 100  60 / 60 / 60 / 60
25 – Mu   2015  Co      13-Aug-15  214  -13       31-Jul-15  0 / 25 / 50 / 100  60 / 60 / 60 / 60
26 – TE   2015  L       28-Jul-15  196  0         28-Jul-15  0 / 25             30 / 30
27 – Ni   2015  L       29-Jul-15  206  0         29-Jul-15  0 / 25 / 50 / 100  60 / 90 / 60 / 30
28 – Qu   2015  L       5-Aug-15   210  0         5-Aug-15   0 / 25 / 50 / 100  30 / 30 / 30 / 30
29 – Rs5  2015  Ce      15-Aug-15  232  –         –          –                  –
30 – Ha5  2015  Ce      26-Aug-15  180  –         –          –                  –


APPENDIX U

Table U28. Multiplex of 11 microsatellite loci, isolated from Bombus spp., used to genotype Bombus impatiens. All loci were isolated from B. terrestris unless otherwise specified in parentheses under the locus name. Primers labeled with the same fluorescent dye color (Dye) do not have overlapping size ranges. The number of nucleotides in the repeat motifs (tri-, di-) is reported for this study, while the motif repeat sequence is reported from the references. Size ranges, measured in number of base pairs (bp), reflect the distribution of alleles found in this study based on sample sizes in parentheses. During PCR, all primers had an annealing temperature of 55 °C.

Locus           Primer Sequence               Dye  µl per sample  Repeat Motif   Size Range (bp)  Ref
BTMS0066        F: CATGATGACACCACCCAACG       FAM  0.135          Trinucleotide  118-193^         Stolle et al, 2009
                R: TTAACGCCCAATGCCTTTCC                           ACG            (6199)
B124            F: GCAACAGGTCGGGTTAGAG        FAM  0.2            Dinucleotide   231-305^         Estoup et al, 1995
                R: CAGGATAGGGTAGGTAAGCAG                          CT, GC, GGCT   (6185)
Btern01         F: CGTGTTTAGGGTACTGGTGGTC     VIC  0.07           Dinucleotide   122-168^         Funk et al, 2006
(B. ternarius)  R: GGAGCAAGAGGGCTAGACAAAAG                        AG             (6199)
BT28            F: TTGCTGACGTTGCTGTGACTGAGG   VIC  0.1            Trinucleotide  178-199^         Funk et al, 2006
                R: TCCTCTGTGTGTTCTCTTACTTGGC                      GTT, GTTGCT    (6187)
BTMS0062        F: CTGTCGCATTATTCGCGGTT       VIC  0.272          Dinucleotide   233-343^         Stolle et al, 2009
                R: CTGGGCGTGATTCGATGAAC                           CT             (6172)
BTMS0073        F: CGATATCGCGATCTTCGTACAC     NED  0.135          Trinucleotide  111-135          Stolle et al, 2009
                R: GTAGCATGCTCTCCGTGTTG                           AAG            (6012)
BT10            F: TCTTGCTATCCACCACCCGC       NED  0.05           Dinucleotide   139-193          Funk et al, 2006
                R: GGACAGAAGCATAGACGCACCG                         CT             (6126)
BL11            F: AAGGGTACGAAATGCGCGAG       PET  0.109          Dinucleotide   122-160^         Funk et al, 2006
(B. lucorum)    R: TGACGAGTGCGGCCTTTTTC                           TG             (6059)
BT30            F: ATCGTATTATTGCCACCAACCG     PET  0.091          Trinucleotide  178-208^         Funk et al, 2006
                R: CAGCAACAGTCACAACAAACGC                         GCT, GTT       (6173)
B96             F: GGGAGAGAAAGACCAAG          PET  0.29           Dinucleotide   227-277^         Estoup et al, 1996
                R: GATCGTAATGACTCGATATG                           CG, CT         (6176)
BTMS0081        F: ACGCGCGCCTTCTACTATC        PET  0.14           Trinucleotide  286-331          Stolle et al, 2009
                R: AGGGACACGCGAACAGAC                             *              (6198)

^ Alleles appeared outside these ranges, but only in < 0.001 of all individuals
* BTMS0081 was reported for B. terrestris with a tetranucleotide repeat motif of CCTT, but this study has revealed it to be a trinucleotide repeat motif of unknown nucleotides for B. impatiens.

APPENDIX V

All loci were checked for issues including monomorphism, significant null alleles, Hardy-Weinberg disequilibrium, and linkage disequilibrium, as described in Chapter 3. Duplicate FS family members were removed based on results from Colony.

setwd("/Users/...")
library(hierfstat) #https://cran.r-project.org/web/packages/hierfstat/hierfstat.pdf
library(adegenet) #http://adegenet.r-forge.r-project.org/files/tutorial-basics.pdf
library("poppr") #https://cran.r-project.org/web/packages/poppr/vignettes/mlg.html

#Read in data file with duplicate family members removed (used BBEdit to convert excel files to .gen files)
mydataf<-read.genepop("Bumble_Pop_Dyn.gen",ncode=3) #all years, individuals sorted into 30 fields
mydatac<-read.genepop("Bumble_Pop_Dyn_County.gen",ncode=3) #all years, individuals sorted into 3 regions
mydatay<-read.genepop("Bumble_Pop_Dyn_Year.gen",ncode=3) #separated by years

#just basic info to ensure the correct files were read in
summary(mydataf)
summary(mydatac)
summary(mydatay)

#double check that these new files include only samples with 7+ loci
missingno(mydataf, "geno", cutoff =.37)
missingno(mydatac, "geno", cutoff =.37)

#Create a file for each unique year, separated by fields
#2013
popNames(mydataf)
toRemove = c(1, 2, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 19, 20, 21, 22, 23, 27, 28, 29, 30)
mydataf13=mydataf[pop=-toRemove]
popNames(mydataf13) #check it's been removed

#2014
popNames(mydataf)
toRemove = c(1, 2, 3, 4, 5, 6, 11, 12, 13, 14, 15, 16, 17, 18, 21, 22, 23, 24, 25, 26, 29, 30)
mydataf14=mydataf[pop=-toRemove]
popNames(mydataf14) #check it's been removed

#2015
popNames(mydataf)
toRemove = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 17, 18, 19, 20, 24, 25, 26, 27, 28)
mydataf15=mydataf[pop=-toRemove]
popNames(mydataf15) #check it's been removed

#Read in files for each unique year, separated by regions
All13<-read.genepop("Bumble_Pop_Dyn_County2013.gen",ncode=3)
summary(All13) #just checking some basic info to make sure we read in the correct file; mark the number of individuals
All14<-read.genepop("Bumble_Pop_Dyn_County2014.gen",ncode=3)
summary(All14) #just checking some basic info to make sure we read in the correct file; mark the number of individuals
All15<-read.genepop("Bumble_Pop_Dyn_County2015.gen",ncode=3)
summary(All15) #just checking some basic info to make sure we read in the correct file; mark the number of individuals

#######Genetic Differentiation Tests #######1. G-statistics ########## library("mmod") ###Testing - 2013 ###BY Field pop_stru<-diff_stats(mydataf13) pop_stru set.seed(41199) # SET RANDOM SEED!! Be sure to set a seed for any random analysis! bs_repsf <- chao_bootstrap(mydataf13, nreps = 1000) #generate bootstrap replicates summarise_bootstrap(bs_repsf, Gst_Nei) summarise_bootstrap(bs_repsf, Gst_Hedrick) summarise_bootstrap(bs_repsf, D_Jost) # Using the D_Jost function to summarize pairwise_D(mydataf13) pairwise_Gst_Hedrick(mydataf13) pairwise_Gst_Nei(mydataf13)

###BY REGION pop_stru<-diff_stats(All13) pop_stru set.seed(41199) # SET RANDOM SEED!! Be sure to set a seed for any random analysis! bs_repsf <- chao_bootstrap(All13, nreps = 1000) #generate bootstrap replicates summarise_bootstrap(bs_repsf, Gst_Nei) summarise_bootstrap(bs_repsf, Gst_Hedrick) summarise_bootstrap(bs_repsf, D_Jost) # Using the D_Jost function to summarize pairwise_D(All13) pairwise_Gst_Hedrick(All13) pairwise_Gst_Nei(All13)

###Testing - 2014
###BY Field
pop_stru<-diff_stats(mydataf14)
pop_stru
set.seed(2012901) # SET RANDOM SEED!! Be sure to set a seed for any random analysis!
bs_repsf <- chao_bootstrap(mydataf14, nreps = 1000) #generate bootstrap replicates
summarise_bootstrap(bs_repsf, Gst_Nei)
summarise_bootstrap(bs_repsf, Gst_Hedrick)
summarise_bootstrap(bs_repsf, D_Jost) # Using the D_Jost function to summarize
pairwise_D(mydataf14)
pairwise_Gst_Hedrick(mydataf14)
pairwise_Gst_Nei(mydataf14)

###BY REGION
pop_stru<-diff_stats(All14)
pop_stru
set.seed(41199) # SET RANDOM SEED!! Be sure to set a seed for any random analysis!
bs_repsf <- chao_bootstrap(All14, nreps = 1000) #generate bootstrap replicates
summarise_bootstrap(bs_repsf, Gst_Nei)
summarise_bootstrap(bs_repsf, Gst_Hedrick)
summarise_bootstrap(bs_repsf, D_Jost) # Using the D_Jost function to summarize
pairwise_D(All14)
pairwise_Gst_Hedrick(All14)
pairwise_Gst_Nei(All14)

###Testing - 2015
###BY Field
pop_stru<-diff_stats(mydataf15)
pop_stru
set.seed(20178832) # SET RANDOM SEED!! Be sure to set a seed for any random analysis!
bs_repsf <- chao_bootstrap(mydataf15, nreps = 1000) #generate bootstrap replicates
summarise_bootstrap(bs_repsf, Gst_Nei)
summarise_bootstrap(bs_repsf, Gst_Hedrick)
summarise_bootstrap(bs_repsf, D_Jost) # Using the D_Jost function to summarize
pairwise_D(mydataf15)
pairwise_Gst_Hedrick(mydataf15)
pairwise_Gst_Nei(mydataf15)

###BY REGION
pop_stru<-diff_stats(All15)
pop_stru
set.seed(45) # SET RANDOM SEED!! Be sure to set a seed for any random analysis!
bs_repsf <- chao_bootstrap(All15, nreps = 1000) #generate bootstrap replicates
summarise_bootstrap(bs_repsf, Gst_Nei)
summarise_bootstrap(bs_repsf, Gst_Hedrick)
summarise_bootstrap(bs_repsf, D_Jost) # Using the D_Jost function to summarize
pairwise_D(All15)
pairwise_Gst_Hedrick(All15)
pairwise_Gst_Nei(All15)

#######2. AMOVA ##########
library("poppr")

###2013
bombus_strata<-read.table("bombus_strata_2013.txt", header=TRUE, sep="\t")
head(bombus_strata)
strata(mydataf13)<-bombus_strata #stratify data based on the strata file I just created and read in
#AMOVA in poppr: 3 levels of organization: County, site, individual
res1 <- poppr.amova(mydataf13, ~County/Field, within = FALSE) #this takes forever to run
res1
set.seed(244514)
randtest(res1, nrepet = 99)

###2014
bombus_strata<-read.table("bombus_strata_2014.txt", header=TRUE, sep="\t")
head(bombus_strata)
strata(mydataf14)<-bombus_strata #stratify data based on the strata file I just created and read in
#AMOVA in poppr: 3 levels of organization: County, site, individual
res1 <- poppr.amova(mydataf14, ~County/Field, within = FALSE) #this takes forever to run
res1
set.seed(244514)
randtest(res1, nrepet = 99)

###2015
bombus_strata<-read.table("bombus_strata_2015.txt", header=TRUE, sep="\t")
head(bombus_strata)
strata(mydataf15)<-bombus_strata #stratify data based on the strata file I just created and read in
#AMOVA in poppr: 3 levels of organization: County, site, individual
res1 <- poppr.amova(mydataf15, ~County/Field, within = FALSE) #this takes forever to run
res1
set.seed(244514)
randtest(res1, nrepet = 99)

############Genetic Diversity#############
######1. Expected Heterozygosity per year
library("poppr")
poppr(mydatay)

######2. Allelic Richness per year
library(hierfstat)
allelic.richness(mydatay, min.n = 100) #allelic richness per year rarefied to a sample size of 100 per year
library("poppr") #number of private alleles per year
private_alleles(mydatay)
library(adegenet) #number of individuals per year and number of alleles per year
summary(mydatay)

###########Inbreeding##############
library(hierfstat)
boot.ppfis(mydatay)

APPENDIX W

Table W29. Pairwise genetic differentiation for site pairs within each year (2013, 2014, 2015). Maximum value across all years (bold) was less than 0.0069. Cells are colored according to value with the higher values shaded darker.

2013
           Site 3   Site 5   Site 4   Site 6   Site 8   Site 7  Site 10   Site 9
Site 5     0.0021
Site 4     0.0050   0.0012
Site 6     0.0001  -0.0002   0.0040
Site 8     0.0018  -0.0010   0.0000   0.0010
Site 7     0.0026   0.0003   0.0044   0.0010   0.0013
Site 10    0.0015  -0.0008   0.0017   0.0002   0.0002   0.0009
Site 9     0.0028   0.0001   0.0025   0.0016   0.0001   0.0011   0.0011
Site 11    0.0002   0.0005   0.0026  -0.0003  -0.0002   0.0013   0.0007   0.0001

2014
           Site 15   Site 13   Site 14   Site 12   Site 17   Site 16   Site 18
Site 13    0.00121
Site 14   -0.00004   0.00103
Site 12    0.00079  -0.00151   0.00074
Site 17    0.00294   0.00143   0.00557   0.00055
Site 16    0.00338   0.00292   0.00681   0.00379   0.00164
Site 18    0.00068   0.00078   0.00147  -0.00018   0.00152   0.00316
Site 19   -0.00014   0.00065   0.00284  -0.00064   0.00005   0.00145  -0.00048

2015
          Site 24  Site 22  Site 21  Site 25  Site 23  Site 20  Site 27  Site 28  Site 26  Site 29
Site 22    0.0011
Site 21    0.0024   0.0001
Site 25    0.0018   0.0004   0.0017
Site 23    0.0003  -0.0006   0.0030   0.0000
Site 20   -0.0005  -0.0003  -0.0006   0.0015   0.0006
Site 27    0.0019   0.0005   0.0009   0.0007   0.0023   0.0010
Site 28    0.0021   0.0028   0.0021   0.0016   0.0023   0.0008   0.0000
Site 26    0.0009  -0.0005  -0.0011  -0.0012   0.0003  -0.0011  -0.0007  -0.0017
Site 29    0.0018   0.0014   0.0007  -0.0003   0.0017   0.0023   0.0015   0.0009  -0.0007
Site 30    0.0026   0.0010   0.0016  -0.0004   0.0017   0.0008   0.0002   0.0024   0.0011   0.0013

Table W30. Pairwise genetic differentiation for regional pairs within each year (2013, 2014, 2015). Maximum value across all years (bold) was less than 0.003. Cells are colored according to value with the higher values shaded darker.

                  2013                  2014                  2015
           Region 1  Region 2    Region 1  Region 2    Region 1  Region 2
Region 2    0.00013               0.00291               0.00090
Region 3    0.00013   0.00002     0.00064   0.00118     0.00059   0.00081

APPENDIX X

Protocol for collecting bumble bee foragers for colony abundance analyses.

FLEISCHER LAB PROTOCOLS: Bumble bee Forager collection, pinning and storage
V1: Aug 2014

Bumble Bee Forager Collection

Time to Complete: ~1 hr, depending on forager abundance and # of collectors

Materials
20 ml scintillation vials – glass or plastic
Gallon Ziploc bags
Sharpie
Cooler with ice
2 Tote Bags per person
-20°C freezer

Procedure
1. Collect approximately 200 actively foraging Bombus impatiens workers per field
   a. Carry two tote bags: one for empty vials and one for vials with collected bees. This saves time when searching for an empty vial for collection and reduces the risk of accidentally trying to use a full vial and releasing already collected bees
2. Only collect individuals that are actively collecting nectar or pollen from pumpkin flowers
3. Place a scintillation vial over the actively feeding bee and shake gently so the bee will move up into the vial
4. Bring the cap to the vial in the flower and cap immediately; this decreases the possibility of bees escaping from vials
5. Do not collect bees in the same area; move along the rows all throughout the pumpkin field to collect bees across the entire field; walk at least 5 paces between bee collections
6. Immediately after collecting foragers from a field, place all vials from a single field in a Ziploc bag
7. Using a permanent marker, label the Ziploc bag with the following information:
   • Field name
   • Collection date
   • Collectors' names
   • Length of collection time
   • If more than one Ziploc bag is necessary, number the bags 1 of 2, 2 of 2
8. Place bags of vials containing foragers in cooler with ice for transportation
9. Upon returning to the lab, transfer Ziploc bags to -20°C freezer to kill specimens and preserve DNA until pinning and snipping protocol (Note: we stored bees in -20°C freezer 2 years before further processing and DNA was sufficiently preserved for subsequent molecular work)

Clean up:
- Discard ice and leave coolers open to air-dry

APPENDIX Y

Protocol for processing collected specimens for use in microsatellite analysis.

FLEISCHER LAB PROTOCOLS: Bumble bee Forager collection, pinning and storage
V1: Aug 2014

Pinning and Snipping Specimens

Time to Complete: ~20 mins for a batch of 20 bees

Materials
Cornell pinning box
9” Paper strips labeled 1 – 12 horizontally
3.5” Paper strips labeled A – H vertically
Unit Tray with foam pinning bottom 8.5”x7.5”
#2 Insect Pins
Pinning Block
Insect Identification Labels – locality, species
Insect pinning forceps
Curved insect dissecting scissors
Kim wipes
Plastic wash bottle with integrated spout
70% Ethanol
96 Well plate
96 Well plate caps
96 well plate freezer block
Label marker
Label tape
-20°C freezer
Insect proof storage cabinets

Procedure
1. Number each Cornell box and pin 6 sets of paper strips into boxes to match 96 well-plate design
2. Place 96 well plate in freezer block and label with the following information: Field Name, Year, Box #, Well plate #, initials
3. Print & cut ample quantities of locality & species labels
4. Remove only 20 vials of bees from the freezer at a time (processing bees in batches of 20 is efficient and reduces the tissue's exposure to ambient temperatures to preserve DNA for future molecular work)
5. Uncap vials and shake bees into unit tray
6. Use pinning block and insect pins to mount and label each bee, leaving them pinned in unit tray
7. Once all 20 bees are pinned, begin leg clipping. Use forceps to hold the mid right leg away from the body and clip with dissecting scissors (do not simply tear the leg off because you might end up with a big chunk of internal muscle; too much tissue can overwhelm molecular analysis)
8. Place leg in a 96 well plate and pin the bee in the Cornell box in the corresponding position that matches well number (1-12) and letter (A-H)
9. Sanitize dissecting scissors between each bee with a squirt of 70% Ethanol from wash bottle; wipe scissors dry with Kim wipe, removing any leftover tissue or hairs
10. Repeat steps 7-9 for each bee
11. If time allows, continue with the next batch of 20 bees and repeat steps 4 – 10.
12. To best preserve tissue for molecular work, place 96 well plates containing legs in -20°C freezer during steps 4 – 7 with each subsequent batch. Be careful not to jostle or dislodge bee legs when moving well plate
13. Sanitize forceps after each batch to reduce buildup of pollen from bee bodies
14. When all wells contain a bee leg, cap 96 well-plate and place in -20°C freezer

APPENDIX Z

A total of 6,222 workers (98.4% of the 6,320 workers collected) were successfully genotyped at 7 or more loci; only 560 (9% of the 6,222 successfully genotyped workers) were missing any data, and only 102 (1.6% of 6,222) were missing data from more than 1 locus (Figure Z33).

Figure Z33. The smaller circle represents the total number of collected B. impatiens foragers, distinguishing between individuals removed from all analyses for failing to amplify in 5 or more loci (gray) and individuals included in subsequent analysis (dark green). The larger circle represents those 6,222 individuals included in subsequent analysis, categorizing individuals based on the number of loci that failed to amplify.
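
The percentages above follow directly from the reported counts. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative only, and the derived counts (workers removed, workers missing exactly one locus) are computed here from the reported totals rather than quoted from the thesis.

```python
# Counts reported in Appendix Z.
collected = 6320       # B. impatiens workers collected
genotyped = 6222       # workers genotyped at 7 or more loci
missing_any = 560      # genotyped workers missing data at one or more loci
missing_multi = 102    # genotyped workers missing data at more than one locus


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Derived counts (not quoted in the text; computed from the totals above).
removed = collected - genotyped            # workers excluded from all analyses
missing_one = missing_any - missing_multi  # workers missing exactly one locus

print("retained (%):", percent(genotyped, collected))
print("missing any data (%):", percent(missing_any, genotyped))
print("missing >1 locus (%):", percent(missing_multi, genotyped))
print("removed:", removed, "| missing exactly one locus:", missing_one)
```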

APPENDIX AA

We ran Colony with the following parameters: monogamous or polygamous mating for females, monogamous mating for males, without inbreeding, without clones, dioecious, haplodiploid, medium-length run, no sibship prior and unknown allele frequencies. We used genotyping error rates of 0.5-5% per locus based on results of rescoring 96 individuals and null allele estimates of 1-4% per locus based on results from Microchecker reported in Chapter 3. Only sisterhoods with an inclusion probability of 0.8 or higher were accepted (precedent set in Carvell et al, 2017). To ensure stable detected colony numbers, we ran Colony 3 times per site (each run with a different random number seed but with all other parameters kept equal) and used the mean of the three runs as the final detected colony number.
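
The two post-processing rules described above (accept sisterhoods at an inclusion probability of 0.8 or higher, then average the detected colony number over three runs) can be sketched in Python as follows. This is only an illustration: the input layout, the function names and all example values are hypothetical, and it does not reproduce Colony's actual output format or the scripts used in this thesis.

```python
from statistics import mean

MIN_INCLUSION_PROB = 0.8  # acceptance threshold stated in Appendix AA


def accepted_sisterhoods(sisterhoods):
    """Return the IDs of sisterhoods that meet the inclusion threshold.

    `sisterhoods` is a list of (family_id, inclusion_probability) pairs.
    This layout is a hypothetical stand-in, not Colony's output format.
    """
    return [fam for fam, prob in sisterhoods if prob >= MIN_INCLUSION_PROB]


def final_colony_number(run_counts):
    """Average the detected colony numbers from replicate Colony runs."""
    return mean(run_counts)


# Made-up values for a single site, for illustration only.
example_sisterhoods = [("F1", 0.95), ("F2", 0.62), ("F3", 0.80)]
example_run_counts = [148, 151, 151]  # three runs with different seeds

print(accepted_sisterhoods(example_sisterhoods))
print(final_colony_number(example_run_counts))
```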

Reference Carvell, C., Bourke, A., Dreier, S., Freeman, S., Hulmes, S., & Jordan, W. et al. (2017). Bumblebee family lineage survival is enhanced in high-quality landscapes. Nature, 543(7646), 547-549. http://dx.doi.org/10.1038/nature21709

APPENDIX AB

The TIRM method in Capwire estimated an average of 543.7 ± 118.7 colonies per site, while the ECM method estimated an average of 450.7 ± 118.4 colonies per site (Figure AB34). The ECM estimates were significantly more conservative (Figure AB34; ANOVA; F = 9.24, Df = 1, P = 0.004), as the TIRM method provided estimates that were approximately 1.2 times higher than ECM estimates (Figure AB35). We found that the ECM method was the more likely model in 23 out of our 30 sites (77%) when using the likelihood ratio test. However, we elected to use the TIRM model because (1) it may align more closely with biological reality, (2) all previous studies have used the TIRM model (Goulson et al, 2010; Jha & Kremen, 2013; Wood et al, 2015), and (3) the TIRM model suggests a higher number of colonies to achieve the same visitation rate, and we want to safely estimate the minimum number of colonies required.

[Figure AB34: bar chart titled "Mean comparison of ECM & TIRM colony estimates". X-axis: Capwire Model (ECM, TIRM); y-axis: Mean Estimates of Colony Abundance. Bar labels: ECM = 450.7, TIRM = 543.7; P = 0.004.]

Figure AB34. Comparison between model mean estimates of colony abundance using ANOVA.

[Figure AB35: chart titled "Comparison of TIRM and ECM Colony Estimates". X-axis: Collection Sites; y-axis: Estimates of Colony Abundance (Capwire). Annotations: TIRM Min: 291, TIRM Max: 829; ECM Min: 200, ECM Max: 760.]

Figure AB35. Colony estimates provided by Capwire, ranked low to high for all 30 sites. Each site has two estimates, provided by the two different models, TIRM (light blue) and ECM (dark blue). The error bars represent the 95% confidence interval.
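
As a quick check on the comparison reported in this appendix, the Python sketch below recomputes the TIRM-to-ECM ratio and the share of sites at which the likelihood ratio test favored ECM. The input numbers are those reported above; the variable names are illustrative.

```python
# Mean colony estimates per site reported in Appendix AB.
tirm_mean = 543.7
ecm_mean = 450.7

# Sites where the likelihood ratio test favored the ECM model.
ecm_favored_sites = 23
total_sites = 30

tirm_to_ecm_ratio = tirm_mean / ecm_mean
ecm_favored_percent = 100 * ecm_favored_sites / total_sites

print("TIRM / ECM ratio:", round(tirm_to_ecm_ratio, 2))
print("Sites favoring ECM (%):", round(ecm_favored_percent))
```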

Chapter 5: Thesis Discussion

As previously discussed in Chapter 2, B. impatiens was the most frequently collected and most active pollinator in commercial pumpkin fields (Table 4; Figure 9). Bombus spp. accounted for one third of the total female flower visitation rates, with an estimated visit every 2 min 21 s (Table 7). Similar to previous studies, we found that the proportion of Bombus spp. visits to female flowers was up to twice as high as the proportion of female flowers observed, suggesting a preference for female flowers; this preference decreased as female flowers occurred farther from the field edge.

Nectar and pollen collecting behaviors were not specifically documented in our study; nevertheless, most observers agreed that Bombus spp. foragers appeared to be primarily collecting nectar when visiting pumpkin flowers. It is possible that Bombus spp. were collecting pumpkin pollen as well, although pumpkin pollen may lack appropriate macronutrient ratios. Vaudo et al (2016) reported B. impatiens foraging for pollen with an optimal protein to lipid ratio of 5:1, and Treanor (2017) reported that pumpkin pollen has only a 1.45:1 protein to lipid ratio. In microcolony experiments, B. impatiens workers fed only pumpkin pollen lost significantly more weight than workers fed other diets (Treanor, 2017). Certain species of bees might also avoid pumpkin pollen due to secondary plant compounds. Plants in the Cucurbitaceae family, including C. p. pepo, contain cucurbitacins, defensive toxins found in vegetative material that protect the plant from herbivory. Preliminary work from the Irwin lab at NCSU is finding that secondary plant compounds occur at higher concentrations in pollen than in nectar and, at times, can equal levels found in vegetative material (Heiling, unpublished).

Based on previous studies, we determined that a female Cucurbita flower needs 8 Bombus spp. visits to achieve adequate pollination, given the pollination efficiency of our most active bee taxa (Table 6, Table 7). Given the high Bombus spp. visitation rates and their preference for female flowers, we calculated that a single female flower was visited ~102 times by Bombus spp. (Table 7), suggesting that wild Bombus spp. populations are providing almost 13x the necessary pollination services for commercial pumpkins in Pennsylvania. Excessive Bombus spp. pollination services in Cucurbita agroecosystems have been reported previously. Pfister et al (2017) used modeling to determine that only 11% of Bombus spp. pollination activity was necessary to adequately pollinate Cucurbita maxima cv “Hokkaido” pumpkins in Germany. Julier & Roulston (2009) reported 3.1 B. impatiens foragers in Cucurbita flowers every minute. Additional previous studies have reported a single female flower receiving ~19 Bombus spp. visits in a lifetime (Artz & Nault, 2011; Pfister et al, 2017).

Even so, pollination services supplied by Bombus spp. in our study appear to be greater than those reported by other studies in surrounding areas. Although these differences could be artificial, I believe our results represent actual differences in visitation rates, with our system experiencing substantially more visits from Bombus spp. than other systems because of larger wild populations. Although many studies report intense native bee activity, the idea that native Bombus spp. populations are more abundant in Pennsylvania should encourage efforts to conduct context-specific research, even in closely related systems.

Furthermore, Bombus spp. visitation rates increased with increasing male flower density (Figure 12). This suggests that C. p. pepo blooms act as a mass floral resource and attract bee foragers.
Increased visitation rates in response to increasing floral densities also suggest that pollinator populations in our current agroecosystems are large enough to keep up and even increase visitation rates in the face of additional flowers, as opposed to becoming diluted.

Bombus spp. visitation rates also had a discernible relationship with certain pumpkin yield metrics in certain years. Seed set was positively influenced by Bombus spp. visits in 2013. Pumpkin weight was positively affected by Bombus spp. visits to male flowers in 2013 and female flowers in 2014 (Figure 19). A causal relationship between visitation rates and pumpkin weight would be mediated by increased seed set, as seen in 2013. In 2014, when there was no relationship between Bombus spp. visitation rates and seed set, it is possible that the relationship with pumpkin weight is a correlation rather than a causal relationship. Both pumpkin weight and Bombus foragers may simultaneously be responding to horticultural conditions. While pumpkin weight is in part due to seed (Figure 16), the plant must also have adequate resources to grow large fruit. Therefore, when horticultural conditions are sufficient to provide plants with resources to grow larger fruit, the plants may also produce better floral resources, namely more nectar. Artz et al (2011) demonstrated that in better horticultural conditions, pumpkin plants produced greater nectar volumes and more sugary nectar. If Bombus spp. respond to this improved floral reward, then both Bombus visits and pumpkin weight would respond positively to horticultural conditions. It is possible that increasing Bombus visitation rates independently results in heavier pumpkins, but more studies are needed to understand whether relationships between native bee visitation rates and yield metrics are causal or are correlated responses to horticultural conditions. When taking pollination efficiency into account, Bombus spp. (which deposit 3x-6x as many pollen grains per visit as A. mellifera or P. pruinosa) are likely the most valuable pollinators in Cucurbita agroecosystems in PA, a theme reported in other studies (Artz & Nault, 2011).

The findings outlined above (excessive pollination services from Bombus spp., populations responding positively to increased floral resources, and positive relationships with pumpkin yield) are promising and may be cause to discontinue the use of managed pollinators. However, certain variables that impact native bee visitation rates should be considered when making pollination management decisions. Overall, we found that visitation rates decreased slightly as distance from the field edge increased; however, it was not a strong relationship (Figure 15). But recall that we did find that Bombus spp. preference for female flowers was affected by spatial dynamics: as distance from field edge increased, preference for female flowers decreased (Figure 10). This could be due in large part to plant structure and flower placement. Female flowers, located close to the ground, can be obscured by leafy vegetation. Pumpkin plants tend to get more lush and vegetative as distance from field edge increases, which could conceal female flowers and make female flower foraging more energy intensive for the pollinator. If male flowers, which occur in much higher numbers, provided adequate nectar resources, it could have been too high a fitness cost for bees to exert additional effort seeking out female flowers. We noticed Bombus spp. foragers flying awkwardly through dense foliage, often bumping into spiky pumpkin stems when trying to reach flowers amongst thick vegetation. Even with these spatial dynamics at play, female flowers 100 m from the edge were still visited at a relatively high rate of 0.7 ± 0.11 SE bee visits per flower per 45 s.
If this decreasing trend in visitation rates continues at distances even greater than 100 m from the edge, eventually there could be a negative effect on production objectives in certain field layouts. Any square field larger than 4 ha (200 m x 200 m) or circular field larger than 3.14 ha (100 m radius) could begin to experience yield issues toward the center. It is interesting to note, however, that cultivation practices in Pennsylvania often follow contours in hilly landscapes, resulting in a large edge-to-area ratio. This agricultural practice, typically implemented by farmers for soil conservation goals, may be helping ensure pollination services in our agroecosystems.

Temporal dynamics across the season affected Bombus spp. visitation rates, which increased throughout the season (Figure 14B), exhibiting a pattern similar to previous studies (Julier & Roulston, 2009). Unlike with the managed pollinators used in this system, the increase in Bombus spp. visitation rates cannot be attributed to added colonies, because our grower collaborators did not stock commercial bumble bees. Instead, all Bombus spp. visits were supplied by wild populations. The number of Bombus colonies is greatest in early spring, when overwintering queens first emerge and found colonies. Over time, colonies fail due to lack of resources, parasitism, predation or disease, and thus colony numbers inevitably decrease throughout the season (Goulson, 2010). However, the colonies that do persist grow in size as the queen continuously lays eggs and additional workers emerge. Colonies of B. impatiens, the most common Bombus species encountered in our study, are estimated to contain between 25 and 450 workers, with the largest colonies reported later in the season (Plath, 1934). Therefore, we believe that Bombus spp. visitation rates increased throughout the season due to the increasing size, rather than number, of colonies.
If growers hope to rely on native Bombus spp. for pollination services, this could be worrying, depending on the number of colonies nesting within foraging distance. If Bombus foragers originate from a few large colonies, pollination services could be vulnerable to the loss of a few key colonies. Therefore, to better understand the reliability of Bombus spp. and their pollination services, we carried out additional research in Chapter 4 to estimate the abundance and stability of Bombus spp. colonies in this region.

To estimate colony abundance, we used genetic techniques coupled with statistical models because locating actual colonies in the environment is difficult. To obtain genetically derived colony estimates, we first had to collect and genotype foragers using microsatellites. Chapter 3 was dedicated to thoroughly evaluating non-species-specific loci for use in Bombus impatiens and optimizing an 11-primer multiplex for reliable and accurate genetic data.

As previously discussed in Chapter 4, we estimated that between 291 and 829 B. impatiens colonies provide foragers to commercial pumpkin fields in our study area (Table 23, Figure 31A). This suggests an abundant B. impatiens population nesting in the surrounding landscape within foraging distance of these fields. For the first time, the stability of wild bumble bee colonies was examined across short time spans and geographic ranges. Average colony abundance per field (~540) did not vary significantly among 3 distinct regions spread over 5,000 km² (Figure 30B), nor did colony abundance per field change over the course of 4 years (Figure 30A); if anything, a slight positive yearly trend may exist. No other study that we know of has sampled repeatedly in distinct regions to evaluate colony dynamics across time and space. Recent temporal and spatial trends in B. impatiens colony abundance per field suggest that wild populations are abundant and stable in central Pennsylvania and can therefore be relied upon to provide pollination services in managed and native ecosystems alike, given the current environmental conditions.

As global climate and land use patterns are altered, primarily by anthropogenic forces, the genetic capacity to respond to selective pressures may predict a species' potential resilience. Our study strongly suggests that a key pollinator in the eastern United States, B. impatiens, is a genetically diverse, panmictic population characterized by high, stable colony abundances. These findings bode well for the natural and agroecosystems that rely on B. impatiens for pollination services, including the commercial pumpkin fields studied in Chapter 2. It was previously unknown whether the critically important forager force originated from many colonies or few colonies. Here, we present empirical evidence of high colony abundances exhibiting temporal and spatial stability, which suggests a resilient forager abundance that will not be diminished by random losses of a few colonies.

Furthermore, in central Pennsylvania, it appears that pumpkin agroecosystems provide floral resources that can support many wild B. impatiens colonies. The size of nectar-rich mass-flowering crop fields appears unrelated to B. impatiens colony abundance per field (Figure 31A), neither acting as a limiting factor nor promoting populations. Given this independence between field area and colony abundance per field, it is reasonable to expect that colony abundance per hectare would decrease as fields get larger. This is, in fact, what we found: colony abundance per hectare declines in an apparently exponential fashion across larger and larger fields (Figure 31B). Correspondingly, field size also had a negative effect on visitation rates (Figure 31C).
These findings support the ‘landscape-moderated concentration and dilution hypothesis’ articulated in Tscharntke et al, 2012, which suggests that pollinators may be diluted across increasingly larger agricultural floral resources, resulting in a decrease of pollination services without negatively impacting pollinator populations. Even though B. impatiens pollination services are diluted across large commercial pumpkin fields, current populations can provide ~2.5x the required visits in even the worst-case scenarios. Furthermore, there is evidence to suggest that genetic structuring is not increasing and genetic diversity is not decreasing with successive years. Therefore, this wild pollinator, and the economically valuable pollination services it provides, are likely a resilient force that will continue to persist in the foreseeable future. This study provides a robust baseline on which to build long-term monitoring of B. impatiens populations as environmental conditions change in the future.
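
Two of the back-of-envelope figures used in this discussion can be reproduced with a few lines of Python: the pollination surplus (~102 Bombus spp. visits received per female flower against the 8 visits required) and the field areas at which the center of a field lies 100 m from the nearest edge. This is a minimal sketch; the function names are illustrative and not part of the thesis analysis.

```python
import math

SQUARE_METERS_PER_HECTARE = 10_000


def service_multiple(visits_received, visits_needed):
    """How many times over the required number of visits was supplied."""
    return visits_received / visits_needed


def square_field_area_ha(side_m):
    """Area in hectares of a square field with the given side length (m)."""
    return side_m ** 2 / SQUARE_METERS_PER_HECTARE


def circular_field_area_ha(radius_m):
    """Area in hectares of a circular field with the given radius (m)."""
    return math.pi * radius_m ** 2 / SQUARE_METERS_PER_HECTARE


# ~102 visits received vs. 8 visits needed per female flower.
print(service_multiple(102, 8))

# Fields whose center is exactly 100 m from the nearest edge.
print(square_field_area_ha(200))                # 200 m x 200 m square
print(round(circular_field_area_ha(100), 2))    # 100 m radius circle
```

The first value corresponds to the "almost 13x" surplus cited above, and the two areas match the 4 ha and 3.14 ha thresholds given for square and circular fields.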

REFERENCES

Artz, D. R., Hsu, C. L., & Nault, B. A. (2011). Influence of Honey Bee, Apis mellifera, Hives and Field Size on Foraging Activity of Native Bee Species in Pumpkin Fields. Environmental Entomology, 40(5), 1144-1158. doi:10.1603/en10218
Artz, D. R., & Nault, B. A. (2011). Performance of Apis mellifera, Bombus impatiens, and Peponapis pruinosa (Hymenoptera: Apidae) as Pollinators of Pumpkin. Journal of Economic Entomology, 104(4), 1153-1161. doi:10.1603/ec10431

Goulson, D. (2010). Bumblebees: behaviour, ecology, and conservation (2nd ed.). Oxford: Oxford Univ. Press.
Julier, H. E., & Roulston, T. H. (2009). Wild Bee Abundance and Pollination Service in Cultivated Pumpkins: Farm Management, Nesting Behavior and Landscape Effects. Journal of Economic Entomology, 102(2), 563-573. doi:10.1603/029.102.0214
Pfister, S. C., Eckerter, P. W., Schirmel, J., Cresswell, J. E., & Entling, M. H. (2017). Sensitivity of commercial pumpkin yield to potential decline among different groups of pollinating bees. Royal Society Open Science, 4(5), 170102. doi:10.1098/rsos.170102
Plath, O. E. (1934). Bumblebees and their ways. New York: The Macmillan Company.
Treanor, E. (2017). Supporting Bombus and other bees in Cucurbita agroecosystems. M.S. thesis. Pennsylvania State University.
Tscharntke, T., Tylianakis, J. M., Rand, T. A., Didham, R. K., Fahrig, L., Batáry, P., . . . Westphal, C. (2012). Landscape moderation of biodiversity patterns and processes - eight hypotheses. Biological Reviews, 87(3), 661-685. doi:10.1111/j.1469-185x.2011.00216.x
Vaudo, A. D., Patch, H. M., Mortensen, D. A., Tooker, J. F., & Grozinger, C. M. (2016). Macronutrient ratios in pollen shape bumble bee (Bombus impatiens) foraging strategies and floral preferences. Proceedings of the National Academy of Sciences, 113(28). doi:10.1073/pnas.1606101113
