!"#$%&'()*+$),$-.*/+$0##$1&2#*'&,&3.(&)4$0.'#5$)4$6&2#$7#4#'$8/9'$:)*;")/)<+ =9(")*>'?@$0*+.4$AB$1.4,)*("C$D#5)4&.$D&;#'C$E#44&,#*$6.4N3(B$JKC$OKKP?C$;;B$JQJJRSJQJOL 89T/&'"#5$T+@$National Academy of Sciences D(.T/#$HUV@$http://www.jstor.org/stable/30051512 =33#''#5@$JMWKXWOKJK$JM@QJ

Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use.

Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=nas.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission.

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

National Academy of Sciences is collaborating with JSTOR to digitize, preserve and extend access to Proceedings of the National Academy of Sciences of the United States of America.

http://www.jstor.org The history of early diversificationbased on five genes plus morphology Bryan N. Danfortht, Sedonia Sipes*, Jennifer FangS, and Sean G. Brady'

Department of Entomology, 3119 Comstock Hall, Cornell University, Ithaca, NY 14853

Communicated by Charles D. Michener, University of Kansas, Lawrence, KS, May 16, 2006 (received for review February 2, 2006) , the largest (>16,000 species) and most important radiationof tongued (LT) bee families Megachilidaeand Apidae and the pollinating , originated in early to mid-Cretaceous,roughly short-tongued(ST) bee familiesColletidae, Stenotritidae, Andreni- in synchrony with the angiosperms (flowering plants). Under- dae, Halictidae,and Melittidaesensu lato (s.l.)II(9). Colletidaeis standing the diversification of the bees and the coevolutionary widelyconsidered the mostbasal family of bees (i.e.,the sister group history of bees and angiosperms requires a well supported phy- to the restof the bees),because all femalesand most males possess logeny of bees (as well as angiosperms). We reconstructeda robust a glossa(tongue) with a bifid(forked) apex, much like the glossaof phylogeny of bees at the family and subfamily levels using a data an apoidwasp (18-22). set of five genes (4,299 nucleotide sites) plus morphology (109 However,several authors have questioned this interpretation (9, characters).The molecular data set included protein coding (elon- 23-27) and have hypothesizedthat the earliestbranches of bee gation factor-la, RNApolymerase II,and LWrhodopsin), as well as phylogenymay have been either Melittidaes.l., LT bees, or a ribosomal (285 and 185) nuclear gene data. Analyses of both the monophyleticgroup consisting of both. The most recentmorpho- DNA data set and the DNA+morphology data set by parsimony logicalanalysis of family-levelphylogeny in bees (17) obtainedtwo and Bayesian methods yielded a single well supported family-level differenttree topologiesbased on alternativecoding of relatively tree topology that places Melittidae as a paraphyleticgroup at the few mouthpartcharacters. One tree topologyplaces as base of the phylogeny of bees. This topology ("Melittidae-LT sisterto the restof the bees ("Colletidaebasal"), whereas the other basal") is significantly better than a previously proposed alterna- places Melittidaes.l.+LT bees as sister to the rest of the bees tive topology ("Colletidae basal") based both on likelihood and ("Melittidae-LTbasal"). The majordifference between the Col- Bayesian methods. Our results have important implications for letidaebasal and Melittidae-LT basal topologies involves the place- understanding the early diversification, historical biogeography, ment of the root node of bees (27). Placingthe root between host-plant evolution, and fossil record of bees. The earliest Colletidaeand the rest of the bees yields the Colletidaebasal branches of bee phylogeny include lineages that are predomi- topology,whereas placementof the root node near or within nantly host-plant specialists, suggesting that host-plant specificity Melittidaes.l. yieldsthe Melittidae-LTbasal topology. The biolog- is an ancestral trait in bees. Our results suggest an African origin icalimplications of these alternativetopologies are radicallydiffer- for bees, because the earliest branches of the tree include pre- ent. The Colletidaebasal topologyimplies an Australianand/or dominantly African lineages. These results also help explain the SouthAmerican origin for bees andsuggests the earliestbees were predominance of Melittidae, Apidae, and Megachilidae among the a mix of floral generalistsand specialists.Melittidae-LT basal earliest fossil bees. impliesan African origin for bees andindicates that the earliestbees were likelyto havebeen floralspecialists. These alternativetopol- bee phylogeny I bee evolutionI molecularevolution I ogiesalso have implications for understandingthe fossilrecord and molecularsystematics I coevolution antiquityof bees. To resolvethe root node of bees, we combined>4,000 bp of (floweringplants), with an estimated250,000- DNA sequencedata with the previousmorphological data set of Angiosperms260,000 species (1), representthe largestand most diverse Alexanderand Michener(17). We reporthere the resultsof an lineage of vascular plants on earth. To Darwin, the rapid analysisof the largestmolecular and morphological study to dateon emergence and early diversificationof the angiospermswas an "abominablemystery" (ref. 2 andrefs. therein).Among the most traits attributableto Author contributions: B.N.D. and 5.5. designed research; B.N.D., 5.5., J.F., and S..B. important the explosive radiationof the performed research; S.G.B. contributed new reagents/analytical tools; B.N.D. analyzed angiospermsis -mediatedpollination (3-7). Insectsare by data; and B.N.D. wrote the paper. far the most importantanimal pollinators ("70% of angiosperm The authors declare no conflict of interest. species are pollinated;ref. 8) and amonginsects, bees are Freely available online through the PNAS open access option. the most specializedand All of the importantpollinator group. Abbreviations: ST, short-tongued; LT, long-tongued; GTR, general time-reversible; I+G, >16,000 speciesof bees livingtoday (9) relyvirtually exclusively gamma distribution plus a proportion of invariant sites; s.l., sensu lato; s.s., sensu stricto. on angiospermproducts, including pollen and nectar for adult Data deposition: The sequences reported in this paper have been deposited in the GenBank and larvalnutrition (10), floral oils for larvalnutrition (11, 12), database (accession nos. listed in Table 4, which is published as supporting information on floral waxes and perfumesthat serve as sexual attractants(13), the PNAS web site). The data matrix has been deposited in the TreeBASE database, and resins for nest construction(14). Bees are morphologically www.treebase.org (accession nos. M2878 and 51599). adaptedto collecting,manipulating, carrying, and storingpollen tTo whom correspondence should be addressed. E-mail: [email protected]. and other plant products (15, 16), and many bee species are *Present address: Department of Plant Biology, Life Science 11420, Mailcode 6509, Southern specialistson one or a few closely related host plants (10). Illinois University, Carbondale, IL62901-6509. One toward Darwin's"abominable is SPresent address: Department of Physiology, P.O. Box 245051, University of Arizona, step resolving mystery" to AZ 85724-5051. developa betterunderstanding of the role thatbees playedin the Tucson, and diversificationof the A qPresent address: Department of Entomology and Laboratories of Analytical Biology, evolutionaryhistory angiosperms. Smithsonian Institution, Suitland, MD 20746. robustphylogeny of bees wouldallow us to infer attributesof the IlMelittidae in the sense of Michener (9) is a paraphyletic group based on our results. We earlybees and to reconstructthe typesof interactionsthat existed refer to the three melittid subfamilies as families, following an earlier suggestion by betweenthe earliestbees andtheir angiosperm hosts. Higher-level Alexander and Michener (17). The three families are Melittidae (s.s.), Dasypodaidae, and (family-and subfamily-level)bee phylogenyis poorlyunderstood. Meganomiidae. Currently,bees are dividedinto seven extantfamilies: the long- C 2006 by The National Academy of Sciences of the USA

15118-15123 I PNAS I October 10,2006 I vol.103 I no. 41 www.pnas.org/cgi/doi/10.1073/pnas.0604033103 Cerceris sp. 77 Stigmus sp. 69 95 Clypeadon sp. 86 I Philanthus gibbosus 80 Ochleroptera bipunctata 100 Bicyrtes ventralis 100 0 Bembix americana 979-83 100 Sceliphron caementarium eo mexicana 997 100 199Isodontia10Podalonia sp. 51 79 61 Anacrabro ocellatus 8 99 Plenoculus sp. 51 99 Tachysphex sp. 100 96 100 Hesperapis richtersveldensis 100 Hesperapislarreae 100 100 1 00 Hesperapis regularis Dasypodaidae 100 75 1 Haplomelitta griseonigra 100 Dasypoda argentata 68 100 Dasypoda hirtipes 74 Dasypoda visnaga Meganomia binghami Meganomiidae 99 100 Macropis europaea 98 77 100 Macropisnuda 100" Rediviva mcgregori 88 100 100 Redivivoides simulans 100 100 Melitta eickworti Melittidae s.s. 100 79 Melittaarrogans 7 100 Melittaleporina (A) 100 Melittaleporina (B) 100 Fideliopsis idium oblongatum Bees 100 98 66 Chelostoma fuliginosum Megachilidae 96 Lithurgusechinocacti 86 100 Megachile pugnata 90 84 100 Anthophora montana 100 Pachymelus peringueyi 94 98 Centris rhodopus 72 Ctenoplectra albolimbata 92 98 781 Apis mellifera z Leiopodus singularis 0 56 Ceratina calcarata Apidae 58 Holcopasites ruthae 0 LT bees59 Paranomada velutina L 73 Triepeolusrobustus ii 78100 Thyreusdelumbatus 100 Zacosmia maculata 86 Protoxaea gloriosa - 100 Megandrena enceliae 99 1000 56 AndrenaAlocandrena brooksi porteri 99 100 Melitturgaclavicomis 100 100 Panurgus calcaratus 100 100 -- Calliopsis pugionis 100 Calliopsisanthidia 100 Calliopsis fracta Dieunomia nevadensis 87 98 Pseudapis unidentata S98 facilis 9 100 8Nomioides91 Augochlorella pomoniella 72 97 Agapostemon tyleri 100 95 Halictus rubicundus Conanthalictus wilmattae j Halictidae 98 Penapis penai 88 65 Xeralictus bicuspidariae 91" 6 100 60 Rophites algirus 90 100 Dufourea mulledn 90s 72 Systropha curvicomis Stenotritus sp. Diphaglossa Stenotritidae S OutgroupsOt s100 ou 100 Caupolicanagayi vestita 98 o100 100 Caupolicana yarrowi S "Melittidae" 99 90 Callomelittaantipodes Colletes inaequalis 100 100 Colletes skinned Trichocolletes sp. S LT bees 100 100 Leiproctus bathycyaneus 851 57 100 100 Leioproctus bathycyaneus 852 100 Leioproctus fimbriatinus S Andrenidae oo100100oo Leioproctusperfasciatus s 100 100 Leioproctus irroratus Colletidae 87 100 Leioproctus plumosus Haicide78 78 Scraptererubescens Halictidae 100 Scrapteraniger 100 100 Scrapter heterodoxus 100 IZ Colletidae 57 93 ScrapterrChilimelissa ficomis rozeni Chilicola styliventris Xanthesma furcifera Stenotritidae Euryglossacalliopsella Euryglossina globuliceps 100 Hylaeus proximus - 100 67 amiculus 50 changes 56 HylaeusHylaeuselegans

Fig. 1. Equal weights of parsimony analysis of five genes combined (with gaps coded as a fifth state) plus morphology. Bootstrap values above the nodes are for the DNA+morphology analysis; bootstrap values below the nodes are for the analysis of the DNA data alone. Families are color-coded. The clade united by the presence of a hemicryptic midcoxa is indicated by the black dot. bee family-level phylogeny. Our analyses provide insights into the total aligned nucleotide sites, 438 parsimony informative sites); phylogeny, historical biogeography, host-plant associations, and RNA polymerase II (889 total aligned nucleotide sites, 300 parsi- fossil record of bees. mony informative sites); LW rhodopsin (716 total aligned nucleo- tide sites, 362 parsimony informative sites); 28S rDNA (772 total Results aligned nucleotide sites, 448 parsimony informative sites); and 18S Our data set consisted of a total of 4,299 total aligned nucleotide rDNA (781 total aligned nucleotide sites, 100 parsimony informa- sites {1,648 parsimonyinformative sites [elongation factor-la; 1,141 tive sites)] plus 109 morphological characters (104 parsimony

J Danforth et al. PNAS I October10,2006 vo1.103 I no. 41 I 15119 Table 1. Summaryof support measures (parsimony bootstrap values and Bayesian posterior probabilities) for different bee lineages Parsimony Parsimony K2P+ i + G HKY+ I+ G GTR+ G GTR+ I + G GTR+SYM Clade DNA data + morphology post. prob. post. prob. post. prob. post. prob. post. prob. Apiformes (bees) 96 100 100 100 100 100 100 Dasypodaidae 100 100 100 100 100 100 100 Melittidae (s.s.) 88 77 100 100 100 100 100 Melittidae + Meganomiidae 98 99 100 100 100 100 100 Melittidae/Meganomiidae + all others 90 86 52 61 83 81 78 LTbees 84 100 100 100 100 100 100 Megachilidae 100 100 100 100 100 100 100 Apidae 92 94 100 100 100 100 100 LTbees + Andrenidae + Halictidae + 86 78 98 100 100 100 98 Stenotritidae + Colletidae Andrenidae 99 99 100 100 100 100 100 100 100 100 100 100 100 100 Panurginae 100 100 100 100 100 100 100 Panurginae + Oxaeinae NA NA 100 100 100 98 100 Andrenidae + Halictidae + 64 87 100 100 100 100 100 Stenotritidae + Colletidae Halictidae 100 100 100 100 100 100 100 Halictinae 72 79 100 100 100 100 100 Rophitinae 100 100 100 100 100 100 100 Halictidae + Stenotritidae + Colletidae 90 91 100 100 100 100 100 Stenotritidae + Colletidae 99 98 100 100 100 100 100 Colletidae 100 100 100 100 100 100 100 Arithmeticmean - In(burnin = 2000) -74,130.22 -74,011.09 -74,569.35 -73,940.96 -73,936.85 Harmonicmean - In(burnin = 2000) -74,192.73 -74,076.42 -74,637.03 -74,006.94 -74,005.60

Posteriorprobabilities (post. prob.) calculated based on the last 8,000 trees from each analysis.K2P, Kimura two-parameter; HKY, Hasegawa-Kishino-Yano; NA, not applicable. informativecharacters; see Table2, whichis publishedas support- set with the model preferredby MrModelTestVer. 2.2 [general ing informationon the PNAS web site)}. Intronsfor both EF-la time reversible(GTR)+SYM[18s] + gammadistribution plus a (two introns)and opsin (three introns)were excludedfrom the proportionof invariantsites (I+G)] as well as the most complex analysis,because alignments were ambiguous. model (GTR+I+G with separate gamma distributionsand a separateproportion of invariantsites for each gene) yieldedwell ParsimonyAnalyses. DNAdata. When the combined five-gene data supportedtrees (Fig. 2). All families(excluding Melittidae s.l.) are set was analyzedby equal-weightparsimony, we obtained one supportedby posteriorprobabilities of 100%.Relationships among tree of 17,284steps. This tree recoveredmonophyly of the bees the familiesare identicalto the parsimonyresults, with Melittidae and all bee families excludingMelittidae s.l. (Fig. 1, Table 1). s.l. forminga paraphyleticassemblage from whichthe other bees Melittidaes.l. appearsas a basal paraphyleticgroup relativeto arose.Most basalnodes in the tree are well supported,although the other bees (Fig. 1). The basal branchof the bees appearsto monophylyof the bees, excludingDasypodaidae, is not strongly be the "melittid" subfamily Dasypodainae (Dasypodaidae). supportedin the Bayesiananalyses (Fig. 2, Table1). Analyseswith Bootstrapanalysis indicates that monophylyof severalfamilies is alternativemodels (Table 1) yieldedlargely congruent results. Our well our supported by data, including Megachilidae (100% resultsstrongly support the Melittidae-LTbasal hypothesis. None Andreni- bootstrapsupport), Apidae (92% bootstrapsupport), of the multipleanalyses of the combineddata sets or any analyses dae (99% bootstrapsupport), Colletidae (100% bootstrap sup- of the individual data sets the Colletidaebasal and Halictidae gene supported port), (100% bootstrapsupport). Relationships hypothesis. among the ST bee families (excludingMelittidae s.l.) were also well Stenotritidaeis sisterto Colleti- supported. unambiguously HypothesisTesting Using MaximumLikelihood and the Bayes Factor. dae (99%bootstrap support), Halictidae forms the sister group Using the Kishino-Hasegawa(28, 29) and Shimodaira-Hasegawa to Stenotritidae+Colletidae(90% bootstrapsupport), and An- drenidae forms the sister to these three families (30) tests, as implementedin PAUP*, we detected significant group (64% for the Melittidae-LTbasal overthe Colletidae bootstrap support). Overall, the parsimonyresults support a support hypothesis basal whichis as infor- highly derivedposition for Colletidae and a basal position for hypothesis(Table 3, published supporting on web in Melittidae s.l. Manipulationsof the outgroup topology (e.g., mation the PNAS site).The difference -In likelihoodfor constrainingbees to be the sister group to Crabronidae)and the two testswas 44.04,and the Colletidae-basalhypothesis could exclusionof outgroupsdid not alterthe topologywithin the bees. be rejectedwith P < 0.05. DNAdata plus morphology. Addition of morphologyto the molecular Using the constraint option in MrBayes Ver. 3.1.2, we data set did not alterthe relationshipsamong families obtained in calculatedthe harmonicmean of the -lIn likelihood values for the combinedmolecular data set (Fig. 1). However,inclusion of an unconstrained tree (Melittidae-LT basal) and the tree morphologicaldata did alter relationships among the subfamiliesof constrained to Colletidae-basal.The harmonic mean of the Colletidae.Examination of bootstrapvalues (Fig. 1, Table 1) -In likelihood of the last 8,000 trees from the constrained indicatesthat morphologyincreases overall levels of bootstrap analysis(Colletidae-basal) was -74,052.54. The corresponding supportin the tree. In particular,morphology adds support to LT harmonicmean for the unconstrainedanalysis (Melittidae-LT bee monophyly. basal) was -74,005.60; twice the difference is 85.64. A value of 6-10 is strong support, and a value of >10 is very strong BayesianAnalyses. Results of the Bayesiananalyses were largely support for the alternative model (31). Our results provide congruentwith the parsimonyresults (Fig. 2). Analysisof the data very strong support for the Melittidae-LT basal hypothesis.

15120 I www.pnas.org/cgi/doi/10.1073/pnas.0604033103 Danforthet al. 99 I Cerceris sp. 100 71 98 Stigmus sp. 70 Clypeadon sp. 100 Philanthus gibbosus 1 Ochleroptera bipunctata 10 100 100 ! 100 Bicyrtes ventralis 00 100 Bembix americana 1 0 100 Xerostictia sp. 100 100 100 Sceliphron caementarium 100110 0 Isodontia mexicana 100100 Podalonia sp. 100 1 Anacrabro ocellatus 100 10 Plenoculus sp. 100 Tachysphex sp. 100 0 100 Hesperapis richtersveldensis 100 100 1 Hesperapis larreae 100 Hesperapis regularis 100 78 Haplomelittagriseonigra 79 Dasypoda argentata Dasypodaidae 100 0 Dasypoda hirtipes 100 Dasypoda visnaga Meganomia binghami 100 19 Macropis europaea Meganomiidae 100 1 100 Macropis nuda Rediviva 10 mcgregorl 100 1 19 1 Redivivoidessimulans Melittaeickworti ( s.s. 189 Melittaarrogans Melittidae 100 9 Melittaleporina (A) 100 Melittaleporina (B) 100 Fideliopsis major 100 100 Lithurgusechinocacti 100 100 Megachile pugnata Bees 57 78 100 Anthidiumoblongatum Megachilidae 100 55 Chelostoma fuliginosum 81 72 Ceratina calcarata 100 76 100 Anthophora montana 100 Pachymelus peringueyi 100 100 1 Centris rhodopus albolimbata 100 100 Ctenoplectra z 100 Apis mellifera 0 100 Leiopodussingularis 100 100 Thyreus delumbatus Apidae 100 LTbees 100 59 Zacosmia maculata L0uJ 59 100 Holcopasites ruthae 100 100 88 Paranomada velutina 85 Triepeolus robustus 100 100 Megandrena enceliae 100 100 Alocandrena porteri 100 85 brooksi 100 Protoxaea gloriosa 98 100 Melitturgaclavicomis 98 100 100 Panurgus Andrenidae 100 100 Calliopsis pugioniscalcarata 100 100 Calliopsis anthidia 100 Calliopsis fracta 100 Dieunomia nevadensis 100 Pseudapis unidentata 100o 100 100 90 Nomioidesfacilis 9 S 100 pomoniella 100 100 100 Agapostemon tyleri 100 rAugochlorella100 Halictus rubicundus 100 r 80 [ PenapisConanthalictus penal wilmattae Halictidae 81 Outgroups Xeralictus biscuspidariae 100 100 77 100 71 Rophites algirus Mt100 100 Dufourea mulled S "Melittidae" 7 99 Systropha curvicomis 199 Stenotritus sp. Stenotritidae 100 Diphaglossa gayi LTbees 1 100 Caupolicana vestita S 100 Caupolicana yarrowi 100 Trichocolletes sp. 100 100 Leioproctus bathycyaneus S Andrenidae 100 100 100 100 100 Leioproctus bathycyaneus 100 Leioproctus fimbriatinus 100 100 100 Leioproctus perfasciatus 100 100 Leioproctusirroratus SHalictidae O 100 Leioproctus plumosus 100 100 Callomelittaantipodes Colletidae 100 100 Colletes inaequalis Colletidae 100 Colletes skinned 98 100 Chilimelissa rozeni Stenotritidae 97 100 100 Chilicola styliventris Se100 100 88 Hylaeus proximus 100 100 Hylaeus amiculus 100 1 70 Hylaeus elegans 100 100 Euryglossa calliopsella 0 Euryglossina globuliceps 100 100 5554 Xanthesma furcifera 100 100 Scrapter erubescens 100 100 Scrapter niger 100 100 Scrapter heterodoxus 100 Scrapter ruficomis

Fig. 2. Bayesiananalysis of five genes combined. Posterior probabilitiesfor the GTR+SYM[18s]+I+Gmodel (the preferred model) are shown above the branches,and posteriorprobabilities for the GTR+I+Gmodel are shown below the branches.Bayesian posterior probabilitiesare based on the last 8,000 trees of each analysis.Families are color-coded.The clade united by the presence of a hemicrypticmidcoxa is indicated by the black dot.

This conclusion is consistent with the maximum likelihood see Fig. 3, which is publishedas supportinginformation on the results (above). PNAS web site). The Colletidaebasal topology is more widely acceptedbecause of the perceptionthat the bifidglossa of Colleti- Discussion dae is a plesiomorphictrait shared with apoidwasps. This hypoth- This studyestablishes phylogenetic relationships among bee fami- esis appearsin numerouspublications but is rarelysupported by lies and subfamilieswith highlevels of support.Our results unam- charactersother than the overallappearance of the glossa(18-22). biguouslysupport the Melittidae-LTbasal topology (Figs. 1 and 2; Althougha numberof morphologicalstudies have questionedthe

Danforthetal. PNAS I October10,2006 I vo1.103 I no.41 I 15121 Colletidaebasal hypothesis (23-26), the morphologicalsupport for describedgenera are LT bees (Apidaeand Megachilidae;ref. 22). the Melittidae-LTbasal hypothesis has been largelyoverlooked in Melittidbees are also well representedin the Eocene both from the bee phylogeneticliterature. Among the most convincingmor- Baltic amber (Eomacropis;ref. 22) and French Eocene amber phologicalcharacters that supportthe tree presentedherein is the (Paleomacropis;ref. 38). The oldest fossil bee, Cretotrigonaprisca, morphologyof the midcoxa.Michener (32) discoveredthat in apoid is an apidbee closelyrelated to extantstingless bees (Meliponini; wasps,Melittidae s.l., andLT bees, the midcoxais exposed,whereas refs. 39-41). In contrast,ST bee families,such as Halictidae,are in the remainingST bee families,the upperportion of the midcoxa muchless well representedin the Eocene, and representativesof is internal and hidden beneath the mesopleuron(a condition Andrenidaeand Colletidaeare completelyabsent in the fossil describedas "hemicryptic").The hemicrypticcondition is a unique record up until the Miocene (42, 43). The high proportionof and unreversedcharacter congruent with monophylyof Andreni- Melittidaes.l. and LT bees in the Eocene fossil deposits has dae, Halictidae,Colletidae, and Stenotritidae(Figs. 1 and 2), thus generallybeen interpretedas an artifactbecause of the poorfossil stronglysupporting the Melittidae-LTbasal topology. recordof bees andpossibly a biastoward resin collecting bees, most of which are LT bees (44). However,if one acceptsthe Melitti- Implicationsfor Bee HistoricalBiogeography and Host-Plant Evolu- dae-LTbasal hypothesis, Melittidae s.l., Megachilidae,and Apidae tion. Melittidae-LTbasal topology substantiallyalters prevailing representearly branchesin the phylogenyof the bees and are hypothesesof bee phylogeny,biogeography, evolution, and early thereforerelatively old comparedwith some familiesof ST bees, diversification.The hypothesisthat Melittidaess.l. representsthe such as Halictidae,Colletidae, and Stenotritidae. earliestbranch(es) of bee phylogenysuggests an African rather than an Australianor SouthAmerican origin for the bees.Melittidae s.l. Materials and Methods is absentfrom Australia and South America, and Africa is the only Data Sets Analyzed.Molecular data. We generated a data set based on continentwhere all majorlineages (e.g., families)of Melittidaes.l. five nucleargenes that have previously shown promise for resolving occur (20). Meganomiidaeare restrictedto Africa,and for Dasy- deep divergencesin insects and other :elongation podaidaeand Melittidaesensu stricto (s.s.), Africa is the continent factor-lac(45), RNA polymeraseII (27), LW rhodopsin(46), 28S thathosts the greatestnumber of generaand species. Based on this rDNA (47), and 18S rDNA (48). PCR and sequencingprotocols distribution,Michener (20) hypothesizedan Africanorigin for all followedstandard methods detailed in Danforthet al. (27, 49, 50). familiesof Melittidaes.l. Giventheir placement in ourphylogenetic PCR productswere gel-purifiedovernight on low-melting-point trees, this would suggestan African origin for bees in general. agarose gels, and bands were extractedby using the Promega Disjunctbiogeographic distributions in some "melittid"genera, WizardPCR purification system (Promega, Madison, WI). All PCR such as Hesperapis(in the Dasypodaidae;ref. 9), providefurther productswere sequencedin both directions.Sequencing was per- supportfor the antiquityof this group. formedby usingan AppliedBiosystems (Surrey, U.K.) Automated The placementof a paraphyleticMelittidae s.l. at the baseof the 3730DNA Analyzer.We used Big Dye Terminatorchemistry and phylogenyalso supportsthe viewthat the earliestbees were narrow AmpliTaq-FSDNA polymerase. host-plant specialists.Host-plant specializationis widespread Morphologicaldata. We obtained109 morphological characters from amongthe melittidsas well as amongmany basal lineages in other a previousstudy of family-levelphylogenetic relationships in bees families,including Rophitinae (Halictidae), Andreninae and Pan- (17). Characterswere treatedas unorderedand of equalweight. urginae (Andrenidae),Colletinae (Colletidae), and Fideliinae Codingof charactersfollowed the SeriesI codingsof Alexanderand (Megachilidae).Most speciesin Melittidaes.s. and Dasypodaidae Michener (17). This is the coding method that supports the are well known to be host-plantspecialists. Female Hesperapis Colletidaebasal topology. oraria,for example, are monolectic and forage for pollen exclusively on Balduinaangustifolia (Asteraceae; ref. 33). FemaleH. trochan- PhylogeneticMethods and TaxonSampling. We included a total of 94 terataforage exclusivelyon plants in the genus Nama (Boragi- species (14 apoidwasp outgroupsand 80 bee ingroups;Table 4, naceae)and haveelongate slender heads with specializedhairs on whichis publishedas supporting information on the PNASweb site) the mouthpartsfor extractingpollen from the tubularflowers (34). representingall seven familiesof bees and all 21 of the currently Otherspecies of Hesperapisspecialize on a smallnumber of closely recognizedsubfamilies (9). Ourtaxon sampling was extensiveand relatedhost-plant species within diverseangiosperm families, in- includedrepresentatives of threepreviously recognized bee families cludingPolemoniaceae, Rosaceae, Zygophyllaceae,Onagraceae, (Oxaeidae,Ctenoplectridae, and Fideliidae). We focusedparticular Papaveraceae,Fabaceae, and Malvaceae. Host-plant specialization attentionon samplingwithin the five ST bee families and in is widespreadwithin Melittidaes.s., includingMelitta, Rediviva, particularin the twofamilies previously considered to be potentially Redivivoides,and Macropis.All species of Macropisare narrow the most basal lineages of bees: Colletidaeand Melittidaes.l. host-plantspecialists on oil-producingplants in the genusLysima- Samplingwithin Melittidae s.l. was consideredparticularly impor- chia(Primulaceae; ref. 35), andall speciespossess modified legs for tant,because this family is not clearlymonophyletic (17). The only collectingand manipulatingviscous floral oils. Speciesof Rediviva melittidtribe lacking from our data set is Promelittini. areinvolved in an intimatehost-plant association with plants in the Outgroupsincluded representatives of twoof the fourapoid wasp oil-producinggenus Diascia (Scrophulariaceae), in whichvariation families,Crabronidae and Sphecidae (51). Voucherspecimens are in floralspur length in the hostplants is paralleledby variationand depositedin the CornellUniversity Insect Collection.Complete extremeexaggeration in foreleglength in bees (36, 37). Giventhe localitydata, GenBank accession nos., and our combineddata set placementof Melittidaes.l. as a paraphyleticgroup at the base of are availableTable 4. Our data set is depositedin TreeBASE the tree, our resultsindicate that host-plantspecialization is the (www.treebase.org/treebase/index.html)as submission nos. M2878 primitivestate for bees. and S1599. Alignmentsfor all genes were generated in the Lasergene Implicationsfor Understandingthe Bee Fossil Record.The Melitti- DNAStar(Madison, WI) softwarepackage using Clustal W. LW dae-LTbasal hypothesismay help explainthe chronologicalap- rhodopsinpresented particular problems for sequencingas well as pearanceof bee familiesin the fossilrecord. If the Colletidaebasal alignmentbecause of pronouncedvariation in the lengthsof introns topologywere indeedcorrect, one of the most puzzlingaspects of I andIII within Colletidae. Long introns in some subfamilies(e.g., the bee fossil recordwould be the abundanceof Melittidaes.l., Euryglossinae,Hylaeinae, and Xeromelissinae)required manual Apidae,and Megachilidaein the oldest deposits,such as Eocene alignmentsor exclusionof introns.Alignments for the 28S D2-D4 (Baltic) amber(22, 38) and Cretaceousamber from New Jersey regionwere adjustedby eye, and some unalignableregions were (39-41). Among the bees in Baltic amber deposits, 15 of 18 excluded from the analysis.Reading frames and intron/exon

15122 I www.pnas.org/cgi/doi/10.1073/pnas.0604033103 Danforth etal. boundarieswere determinedby comparisonwith sequencesob- Bayesianmethods. Maximum likelihood methods involved both the tainedfor the honeybee,Apis mellifera. Kishino-Hasegawa(28, 29) and Shimodaira-Hasegawa(30) tests, Parsimonymethods. We performedmaximum parsimony analyses as implementedin PAUP*.These tests are appropriate, because the using PAUP* Ver. 4.0b10 (52). Initially,we performedequal- two alternativehypotheses (Colletidae basal vs. Melittidae-LT weightparsimony analyses on each of the six data sets separately basal)were proposed previously based on the morphologicalanal- and then combinedthe data sets into a single analysis.Branch ysis of Alexanderand Michener(17). We implementedthese tests supportfor the individualdata sets aswell as the combineddata set by firstconstraining the tree topologyto one or the othertopology was estimatedby using bootstrapanalysis (53). For parsimony andthen performing a one-tailedtest of significance.We used1,000 additions.For searches,we performed500 random sequence bootstrapreplicates and the RELL approximationand applieda we 500 calculatingbootstrap proportions, performed replicates singleGTR+I+G model acrossall five gene partitions.Goldman with 10 randomsequence additions per replicate. et al. (59) provideguidelines for when this test is appropriate. Bayesianmethods. Analysis of individualgene partitions by MrModel- For in the framework,we used the Test Ver. 2.2 (54) indicatedthat the GTR+I+G model was the hypothesistesting Bayesian most for and and that the Bayes factor (31, 60). We used MrBayesVer. 3.1.2 to calculate appropriate EF-la, 28S, opsin, pol II, the harmonicmean of the likelihoodvalues of the Markovchain SYM+I+G modelwas the most appropriatefor the 18Sdata set. The only differencebetween the two models is that GTR allows MonteCarlo samples from the combinedfive-gene data set for the empiricalbase frequencies,whereas SYM assumes equal base last8,000 trees. We thencalculated the harmonicmean of the like- frequencies(55). The 18S data showed little evidence of base- lihood values when the tree topology was constrainedto the compositionalbias (Table2). We thus applieda model in which Colletidaebasal topology for the same sample of trees. Twice GTR+I+G was appliedto EF-la, 28S,opsin, and pol II, whereas the differencein log likelihoodscan be usedto estimatethe extent a SYM+ I+G modelwas applied to the 18Sdata set (referredto as to whichthe observedresult (Melittidae-LT basal) differedfrom the GTR+SYM[18s]+I+Gmodel). In addition,we exploredal- the null hypothesis(Colletidae basal). Twice the differencein log ternativemodels to evaluatethe robustnessof the dataset to model likelihoodcan be interpretedby using tables in refs. 31 and 60. choice. Values >10 are consideredto be very strong supportfor the z 0 For the Bayesiananalyses, we used MrBayesVer. 3.1.2 (refs.56 alternativehypothesis. To imposethe Colletidaebasal topology, we LO and57). We analyzedthe combineddata set usinga rangeof models constrainedall familiesto be monophyleticand then constrained 0o includingthe Kimuratwo-parameter, Hasegawa-Kishino-Yano, the overalltree topology to matchfigure la in ref.27. No constraints SYM(55), and GTR models (58). Various among-site rate variation were placedon the topologywithin the families,and the affinities modelswere used to accountfor ratevariation among genes and of the Stenotritidaewere not constrained. codonpositions, including gamma distribution (G) as well as I+G. Forall analyses, we treatedthe separategenes as "unlinked,"so that LaurencePacker, Robert Minckley, Eduardo Almeida, and Sophie separateparameter estimates were obtainedfor each gene for all Cardinalprovided helpful comments on the manuscript. This project was runs.We rantwo simultaneousruns with fourchains each for 1 x supportedby collaborativeNational Science Foundation Research 106generations and sampledtrees every 100 generations.Plots of Grantsin SystematicBiology [DEB-0211701 (to B.N.D.)and DEB- the -In likelihoodscores over generation time showedthat stable 0211837(to S.S.)].S.G.B. is currentlysupported by NationalScience parameterestimates were obtainedafter trees. FoundationGrant EF-0431330. A National Grantin Aidof 1,000 Geographic Hypothesistesting. To compareour results with the previousfamily- Research(NGS Grant 6946-01) provided travel funds for the collection level phylogeniesof bees, we used both maximumlikelihood and of someimportant specimens for thisstudy.

1. Soltis PS, Soltis DE (2004) Am J Bot 91:1614-1626. 30. ShimodairaH, Hasegawa M (1999) Mol Biol Evol 16:1114-1116. 2. Davies TJ, BarracloughTG, Chase MW, Soltis PS, Soltis DE, SavolainenV (2004) 31. Nylander JAA, Ronquist F, Huelsenbeck JP, Nieves-Aldrey JL (2004) Syst Biol Proc NatlAcad Sci USA 101:1904-1909. 53:47-67. 3. Regal PJ (1977) Science 196:622-629. 32. Michener CD (1981) J Kansas Entomol Soc 54:319-326. 4. Burger WC (1981) BioScience31:572-581. 33. Cane JRR. Snelling RR, Kervin LJ (1996) J KansasEntomol Soc Suppl69:238-247. 5. Crepet WL, Friis EM (1987) in The Originsof Angiospermsand TheirBiological 34. Rozen JG, Jr (1987) Am Mus Novitates,no 2887. Consequences,eds Friis, EM, Chaloner, WG, Crane PR (Cambridge Univ Press, 35. Cane JH, EickwortGC, WesleyFR, SpielholtzJ (1983)Am MidlandNat110:257-264. Cambridge,UK), pp 181-201. 36. Steiner KE, Whitehead VB (1990) Evolution (Lawrence,Kans) 44:1701-1707. 6. Eriksson O, Bremer B (1992) Evolution(Lawrence, Kans) 46:258-266. 37. Steiner KE, Whitehead VB (1991) Evolution (Lawrence,Kans) 45:1493-1501. 7. Pellmyr O (1992) TrendsEcol Evol 7:46-49. 38. Michez D, Nel A, Menier J-J (2006) Zool J Linn Soc, in press. 8. Schoonhoven LM, Jermy T, van Loon JJA (1998) Insect-Plant Biology: From 39. Michener CD, GrimaldiDA (1988) Am Mus Novitates,no 2917. Physiologyto Evolution(Chapman & Hall, London). 40. Michener CD, GrimaldiDA (1988) Proc Natl Acad Sci USA 85:6424-6426. 9. Michener CD (2000) TheBees of the World(Johns Hopkins Univ Press, Baltimore). 41. Engel MS (2000) Am Mus Novitates,no 3296. 10. Wcislo WT, Cane JH (1996) Annu Rev Entomol 41:257-286. 42. Michener CD, Poinar PO, Jr (1996) 1 KansasEntomol Soc Suppl 69:353-361. 11. Simpson BB, J.L. Neff (1981) Ann Mo Bot Gard 68:301-322. 43. Engel MS (1999) Entomol Scand 30:453-458. 12. BuchmannSL (1987) Annu Rev Ecol Syst 18:343-369. 44. Roig-Alsina A, Michener CD (1993) Univ Kansas Sci Bull 55:123-173. 13. Dressier RL (1982) Annu Rev Ecol Syst 13:373-394. 45. Danforth BN, Ji S (1998) Mol Biol Evol 15:225-235. 14. ArmbrusterWS (1984) Am J Bot 71:1149-1160. 46. MardulynP, Cameron S (1999) Mol PhylogenetEvol 12:168-176. 15. Thorp RW (1979) Ann Mo Bot Gard 66:788-812. 47. Cameron SA, MardulynP (2001) SystBiol 50:192-214. 16. Thorp RW (2000) Plant SystEvol 222:211-223. 48. Caterino MS, Cho S, Sperling FAH (2000) Annu Rev Entomol 45:1-54. 17. Alexander BA, Michener CD (1995) UnivKansas Sci Bull 55:377-424. 49. Danforth BN, Sauquet H, Packer L (1999) Mol PhylogenetEvol 13:605-618. 18. Michener CD (1944) Bull Am Mus Nat Hist 82:151-326. 50. Danforth BN, Brady SG, Sipes SD, Pearson A (2004) SystBiol 53:309-326. 19. Michener CD (1974) The Social Behaviorof the Bees (Harvard Univ Press, Cam- 51. Melo GAR (1999) ScientificPapers, Univ Kansas Nat Hist Mus 14:1-55. bridge, MA). 52. Swofford DL (2002). PAUP*, PhylogeneticAnalysis Using Parsimony (and Other 20. Michener CD (1979) Ann Mo Bot Gard 66:277-347. Methods) (Sinauer, Sunderland,MA), Version 4.0b10. 21. Malyshev SI (1968) Genesis of the Hymenopteraand Phases of Their Evolution 53. Felsenstein J (1985) Evolution (Lawrence,Kans) 39:783-791. (Methuen, London). 54. NylanderJAA (2004) MrModelTest(Evolutionary Biology Centre, Uppsala Univer- 22. Engel MS (2001) Bull Am Mus Nat Hist 259:1-192. sity, Uppsala, Sweden), Version 2.2. 23. Perkins RCL (1912) Ann Mag Nat Hist (ser 8) 9:96-121. 55. ZharkikhA (1994) J Mol Evol 39:315-329. 24. McGinley RJ (1980) J KansasEntomol Soc 53:539-552. 56. Huelsenbeck JP, Ronquist F (2001) Bioinformatics17:754-755. 25. Radchenko VG, Pesenko YA (1994) Entomol Rev 75:140-162. 57. Ronquist F, Huelsenbeck JP (2003) Bioinformatics19:1572-1574. 26. Michener CD (2005) J HymenRes 14:78-83. 58. SwoffordDL, Olsen GJ, Waddell PJ, Hillis DM (1996) in MolecularSystematics, eds 27. Danforth BN, Fang J, Sipes S (2006) Mol PhylogenetEvol 39:358-372. Hillis DM, Moritz C, Mable BK (Sinauer, Sunderland,MA), 2nd Ed, pp 407-514. 28. Hasegawa M, Kishino M (1989) Evolution (Lawrence,Kans) 43:672-677. 59. Goldman N, Anderson JP, Rodrigo AG (2000) Syst Biol 49:652-670. 29. Kishino H, Hasegawa M (1989)J Mol Evol 29:170-179. 60. Kass RE, Raftery AE (1995) JAm Stat Assoc 90:773-795.

Danforth et al. PNAS I October 10, 2006 I vol. 103 I no. 41 I 15123