Available online at www.sciencedirect.com

Molecular Phylogenetics and Evolution 47 (2008) 783–798 www.elsevier.com/locate/ympev

Towards the phylogeny of chafers (Sericini): Analysis of alignment-variable sequences and the evolution of segment numbers in the antennal club

Dirk Ahrens a,b,*, Alfried P. Vogler b,c

a Zoologische Staatssammlung München, Münchhausenstr. 21, 81247 Munich, Germany
b Department of Entomology, Natural History Museum, London SW7 5BD, United Kingdom
c Division of Biology, Faculty of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, United Kingdom

Received 17 August 2007; revised 7 February 2008; accepted 8 February 2008
Available online 23 February 2008

Abstract

Scarabaeoid display a distinctive lamellate antenna carrying olfactory sensillae which show various trends of surface enlargement, including the increased number of the terminal lamellate antennomeres. The presence of >3 lamellae (‘plurilamellate’ antennae) in some groups has been used in the classification of chafers () and in particular in the tribe Sericini. However, this character may not be phylogenetically conservative. Here we present a phylogenetic analysis based on partial 28S rRNA, cytochrome oxidase I (cox1) and 16S rRNA (rrnL) for 183 species of , representing all traditionally recognized subfamilies, with particular focus on Sericini. Alignments of length-variable sequences were obtained applying various alignment algorithms and parameter settings. Tree topologies from the combined analysis were very similar when rrnL alignment was based on the progressive alignment algorithm MAFFT, MUSCLE, and less so Clustal, but differed greatly when using the probabilistic PRANK and the ‘local’ alignment procedure BlastAlign, while alignment conditions for the smaller 28S rRNA had little impact on the combined analysis. Preferred conditions were chosen based on an extensive analysis of character congruence between markers and recovery of well established taxonomic groups. Combined analyses on the best alignments using parsimony, maximum likelihood and Bayesian inference generally supported the traditional classification, including the monophyly of Scarabaeidae, with Glaphyridae as its sister, the monophyly of Cetoniinae, and the monophyly of most tribes included. Various levels of support were also obtained in favor of a proposed sister relationship of Sericini with Ablaberini, their close relationships to a melolonthine clade consisting of several tribes with exclusively Southern Hemisphere distribution, and the monophyly of Old World Sericini. In contrast, the generic level relationships were not consistent with the existing taxonomy. The large genera Neoserica, Microserica, and each split in several distantly related branches. The segment number of the antennal club when optimized onto the preferred tree revealed that plurilamellate antennae originated repeatedly (9–10 times in Sericini, plus multiple origins in other Melolonthinae). This invalidates the use of this trait in the generic classification. The number of lamellae is likely to be relevant to mate recognition, as it affects the spatial organization and number of olfactory sensillae. The high level of homoplasy in antennal characters may indicate a causal link between the morphological diversity of the antennae and the great species richness of Sericini and related melolonthines.
© 2008 Elsevier Inc. All rights reserved.

Keywords: Scarabaeidae; Melolonthinae; Classification; Progressive sequence alignment; Antennal lamellae; Chemical communication; Sexual selection; Species diversification

1. Introduction

The Scarabaeidae Pleurosticti (Erichson, 1847) are a well characterized group of some 25,000 described species of beetles (Scholtz and Grebennikov, 2005) which includes more than two thirds of all species in the superfamily Scarabaeoidea (=Lamellicornia). They are defined by a distinct synapomorphy, the placement of the abdominal spiracles on the upper portion of the sternites (e.g. Balthasar, 1963; Ritcher, 1969). The Pleurosticti are usually subdivided into four major subfamilies, including Dynastinae, Rutelinae, Melolonthinae, and Cetoniinae, plus several small groups. All are phytophagous; the adults usually feed on leaves, flowers and pollen, while the larvae feed primarily on living roots, soil humus or decaying wood. The tribe Sericini is a particularly diverse group with some 3500 species in about 200 genera. Like other members of the Melolonthinae, they feed on leaves of angiosperm plants as adults and on roots in the larval stages. They occur predominantly in the Old World, while only five genera are known in the Neotropics (Evans and Smith, 2007) and they are absent entirely from the Australian region. Sericines are inconspicuous and morphologically uniform. Their homogeneity and the resulting taxonomic difficulties have limited studies of their distribution, ecology, and morphology (e.g. Scholtz and Chown, 1995; Ahrens, 2006a).

An important synapomorphy of all scarabaeoid beetles, including the Pleurosticti, is the distinctive lamellate antenna consisting of several leaf-like terminal segments (Browne and Scholtz, 1999). These lamellae are the location of olfactory sensillae (Meinecke, 1975; Leal et al., 1998) required for the detection of feeding sources as well as mate finding and recognition (Reinecke et al., 2006). While most lineages of possess a short antennal club composed of three antennomeres, here referred to as ‘trilamellate’, a large group of pleurostict scarabs shows various modifications and enlargement of this structure. The two major trends are modification of the common funicular (spherical or cylindrical) antennomeres into lamellate (layered) antennomeres, in particular in the males, and the increased number of lamellate antennomeres from the modal three, to between four and seven. This increased number of lamellae can be referred to as ‘plurilamellate’, following Burmeister’s (1855) German term ‘mehrgliedrig’ [=with more (than 3) segments]. The number of antennomeres composing the club in the male sex is the most easily recognizable diagnostic feature, and the current classification of Sericini at the genus-level is largely based on these antennal characters. For example, sets of species have been separated into two different genera because the antennal club was composed of three vs. four antennomeres (Reitter, 1896; Brenske, 1897). However, variation in number as well as size and shape of clavicular antennomeres are also seen in other pleurosticts, rendering these traits homoplastic. This fact may raise doubts about the validity of antennal characters as a major trait for the generic classification of Sericini. Yet, the phylogenetic context of antennal traits can not be established conclusively, as relationships within Sericini remain poorly known (Ahrens, 2006a). This also hampers the efforts to examine the evolutionary biology of chemical communication and their effect on species diversification, in particular in how they relate to the diversification of antennal club characters.

Here we address relationships in Sericini using partial sequences from two mitochondrial genes, cytochrome oxidase subunit I (cox1) and the large subunit rRNA (rrnL), and one nuclear gene, the large subunit rRNA (28S). The latter two markers were highly variable in length, due to the broad taxon sampling that we used to explore relationships of Sericini with other scarab lineages, introducing uncertainty for the phylogenetic analysis. The difference in evolutionary dynamics of mitochondrial and nuclear rRNA genes would suggest differences in sensitivity to the choice of alignment parameters (e.g. Vogler and Pearson, 1996; Terry and Whiting, 2005). Procedures based on the ‘progressive’ alignment algorithm of Gotoh (1982), as implemented in ClustalX 1.83 (Thompson et al., 1997) remain very popular, despite shortcomings (Gardner et al., 2005). More recent programs are based on a similar principal strategy of progressive incorporation of sequences into the alignment but optimizing these in various ways for the final alignment outputs. These procedures are now available in programs such as MAFFT (Katoh et al., 2002, 2005), MUSCLE (Edgar, 2004a,b) and PRANK (Löytynoja and Goldman, 2005). Clustal settings have also been widely tested and settings optimized for rRNA genes have been suggested (Wilm et al., 2006). This leaves a bewildering range of algorithms and settings to choose from.

Very few studies make use of this range of alignment procedures to test the reliability of a tree. There is little doubt that alignment parameters and choice of methods not only affect the alignment itself, but also the tree topology (Morrison and Ellis, 1997), and regions suffering from the greatest variability in alignment also are those that produce the greatest differences in topologies (Wong et al., 2008). The selection of alignment parameters therefore is probably the most critical step in phylogenetic reconstructions. While the advent of new or improved methods therefore will greatly help in obtaining better tree estimates from alignment-variable sequences, we need to quantify the effect of their application on tree topologies and any bias resulting from the selection of particular methods.

To decide which alignment parameters and procedures are preferable, the criterion of congruence can be applied, based on the notion that an alignment should be an optimal description of homology (i.e. synapomorphy). Testing initial hypotheses of alignment for their congruence with other information, the preferred alignments are those that optimize overall homology, including data from morphological studies or other gene regions (sensitivity analysis of Wheeler, 1995). This principle was implemented here by testing the congruence between the two length-variable regions, rrnL and 28S. We also used a method for ‘local’ alignments based on the BLAST algorithm which rely on the recovery of small segments of sequence recognized among terminals, providing an alternative procedure for homology assignment against which to test the progressive alignment procedures. These comparisons can be formalized using measures of similarity of trees, to test the relative effect of algorithm choice and alignment parameters on the tree topology. Based on trees obtained from the preferred alignments, we finally focus on the origins of various modification in shape and number of the clubbed antennomeres, as an independent assessment of the generic classification of Sericini.

* Corresponding author. Address: Zoologische Staatssammlung München, Entomology, Münchhausenstr. 21, 81247 Munich, Germany. Fax: +49 (0)89 8107300.
E-mail address: [email protected] (D. Ahrens).
1055-7903/$ - see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.ympev.2008.02.010

2. Materials and methods

2.1. Taxon sampling, DNA extraction, and DNA sequencing

A representative sample of Sericini and a less complete sample of other Melolonthinae were collected from Neotropical, Palearctic, Afrotropical, Oriental, and Australasian regions (Table 1) including all principal lineages (subfamilies) of Scarabaeidae. Additionally, two species of Hybosoridae as well as two species of the presumed primitive (Browne and Scholtz, 1999) Glaphyridae were included. All trees were rooted with Hybosorus illigeri. DNA was extracted using Qiagen DNeasy columns or Promega WizardSV extraction plates from thoracal leg muscle tissue. Following DNA extraction, beetles were dry mounted. Vouchers will be deposited in the D. Ahrens collection.

Two mitochondrial and one nuclear gene regions were amplified and sequenced. These included cytochrome oxidase subunit 1 (cox1) and 16S ribosomal RNA (rrnL) with the adjacent regions NAD dehydrogenase subunit 1 (nad1) and tRNA leucine (trnLeu). Hereafter we use rrnL to refer to this region. PCR and sequencing was performed using primers Pat and Jerry for cox1 and 16Sar and ND1A for rrnL (Simon et al., 1994). For a few specimens only a shorter fragment could be obtained using the primer pairs 16SB2 or 16Sb2 (Monaghan et al., 2007). A fragment of nuclear 28S rRNA containing the variable domains D3–D6 was amplified using primers FF and DD (Monaghan et al., 2007) (For primer sequences see Supplementary Table 5). Sequencing was performed on both strands using BigDye v. 2.1 and an ABI3730 automated sequencer. Sequences were assembled and edited using the STARS program (M.-S. Chan and N. Ventress, University of Oxford, 2001). For many sequences, ca. 100 bp on each end of sequencing chromatograms were edited manually using Sequencher v. 4.5 (Genecodes Corp., Ann Arbor, MI, USA) or BioEdit (Hall, 1999). All sequences used in the present study have been submitted to GenBank (Supplementary Table 1).

2.2. Phylogenetic analysis using multiple alignment

Ten alignments were produced for each of the two length-variable markers, referred to as AL1–AL10. The resulting alignments were then tested for congruence in combined analyses in all possible combinations (Vogler and Pearson, 1996). Progressive alignment procedures were conducted in ClustalX 1.83 (Thompson et al., 1997) under default gap opening penalties (Go) of 15 and extension penalties (Ge) of 6.66 (AL1) and under an optimized setting suggested by Wilm et al. (2006) with a gap opening penalty of 22.5 and extension penalty of 0.83 (AL2). Further, we used related methods providing secondary refinement including MAFFT version 5.8 (Katoh et al., 2002, 2005) (AL3) and MUSCLE version 3.6 (Edgar, 2004a,b) (AL4), or probabilistic modeling of indels, as implemented in PRANK (Löytynoja and Goldman, 2005) (AL5). Finally, we used a method for ‘local’ alignments based on the BLAST algorithm which rely on the recovery of small segments of sequence recognized among terminals (BlastAlign, Web Interface Version 0.1; Belshaw and Katzourakis, 2005) (AL6). Additionally, we used a set of gap opening and extension penalties for ClustalX (AL7–AL10) (see Table 1 for exact

settings) to test the effect of varying the parameters for a single algorithmic procedure.

Table 1
Alignments generated and tree statistics from parsimony analysis

                                             rrnL                               28S
Alignment  Algorithm   Parameters            ALL   GC   PIC  TL    CI    RI     ALL  GC   PIC  TL    CI    RI
AL1        ClustalX    15/6.66 (default)     860   190  450  8546  0.14  0.56   828  162  99   893   0.29  0.76
AL2        ClustalX    22.5/0.83             861   194  459  8634  0.14  0.54   812  154  101  916   0.29  0.78
AL3        MAFFT       Automatic; (default)  875   152  451  8496  0.14  0.49   818  149  99   891   0.29  0.75
AL4        MUSCLE      Default               903   516  513  9882  0.14  0.44   821  152  99   899   0.28  0.74
AL5        PRANK       Default               1232  589  687  9149  0.19  0.65   855  202  113  940   0.31  0.76
AL6        BlastAlign  Default               986   549  544  8383  0.15  0.62   719  144  114  1389  0.34  0.73
AL7        ClustalX    25/6.66               852   170  444  8643  0.12  0.56   810  148  111  938   0.31  0.77
AL8        ClustalX    20/6.66               856   173  448  8596  0.13  0.56   810  152  118  954   0.32  0.77
AL9        ClustalX    10/6.66               865   198  456  8519  0.13  0.56   812  154  106  894   0.31  0.77
AL10       ClustalX    6.66/6.66             878   216  466  8506  0.13  0.57   813  156  102  885   0.31  0.77

Ten different alignments (AL1–10) were produced with the alignment algorithms and parameter settings (gap opening cost/gap extension cost) given separately for the length-variable markers. The table lists the total alignment length (ALL; in bp), the number of gapped characters (GC), parsimony-informative characters (PIC), tree length (TL), consistency index (CI) (excluding uninformative characters) and retention index (RI). All gaps were treated as 5th character state.

The resulting alignments were subjected to parsimony tree searches using TNT (Goloboff et al., 2004) with 10 ratchet iterations, 10 cycles of tree drifting and three rounds of tree fusing for each of 200 random addition sequences. Gaps were treated as a fifth character state, to optimize length variation on the phylogenetic tree in the same manner as the variation in nucleotides (Ogden and Rosenberg, 2007; Simmons et al., 2007). Bremer support was calculated for a subset of nodes on the resulting tree by constraining single nodes for non-monophyly and repeating TNT searches. Congruence of trees was assessed using the Incongruence Length Difference (ILD) (Mickevich and Farris, 1981) and the Rescaled ILD (RILD; Wheeler and Hayashi, 1998) applied here to the 3-partitions (cox1, rrnL, 28S) which provides a measure of incongruence between partitions relative to the maximum total homoplasy (incongruence) in the data set. Furthermore, we applied the Meta-Retention Index (MRI; Wheeler et al., 2006) which measures the extra cost from combining the data relative to the total level of homoplasy, but allowing data partitions to be split in various ways, independently of a priori assumptions about functional portions. Finally, we explored the congruence among trees establishing taxonomic consistency (TC) based on the recovery of 14 selected groups (tribes and subfamilies; see Supplementary Tables 2–4) that are well established in the traditional classification. This measure gives the percentage of these groups that are recovered as monophyletic in a given analysis.

Model-based phylogenetic analysis was performed on the complete data set of the preferred alignment combination. Maximum Likelihood (ML; Felsenstein, 1973) searches were performed in PhyML (Guindon and Gascuel, 2003) using a GTR model (as selected by Modeltest using standard AIC; Posada and Crandall, 1998; Akaike, 1974) with all parameters estimated from the data and four substitution rate categories. Bayesian analysis was conducted using parallel MrBayes 3.1.2 (Ronquist and Huelsenbeck, 2003; Altekar et al., 2004) running on the CBSU Web Computing Resources (http://cbsuapps.tc.cornell.edu/mrbayes.aspx). Bayesian MCMC analyses (Yang and Rannala, 1997) were performed with unpartitioned and partitioned data (Nylander et al., 2004; Brandley et al., 2005). Data were partitioned by separating the combined matrix into the three gene loci cox1, rrnL, and 28S (part3), and further separating the cox1 into 3-partitions for each codon position (part5), while the rRNA markers were each separated into conservative length-invariable and highly variable length-variable partitions (part7). Tracer 1.3 (Rambaut and Drummond, 2003) was used to graphically determine stationarity and convergence of runs. The Bayes factor (Nylander et al., 2004) was calculated in order to determine the most appropriate partitioning strategy. Once the preferred scheme for partitioning had been established, in-depth analyses were conducted for 6 × 10^6 generations, using the ML tree of the preferred alignment as a user starting tree and two runs of three heated and one cold Markov chains (heating of 0.05). Chains were sampled every 100 generations and 1.5 × 10^6 generations were discarded as burn-in based on the average standard deviation of split frequencies as well as by plotting −lnL against generation time.

Comparisons of tree topologies obtained from different alignments were made using the d statistic (Estabrook et al., 1985; Goddard et al. 1994) as implemented in PAUP* (Swofford, 2002). This procedure extracts all resolved quartets of terminals in each of two trees and establishes the number of quartets common to both trees. The similarity of tree topologies among multiple trees was summarized by a Neighbor Joining analysis on the resulting tree distances from all pairwise comparisons. To test the robustness of various alternative hypotheses of character optimization, we used the Shimodaira–Hasegawa (SH) test (Shimodaira and Hasegawa, 1999; Goldman et al., 2000) as implemented in PAUP*, comparing the parsimony trees obtained on the preferred alignment with trees constrained for one to four independent origins of plurilamellate antennae among the Sericini. The parsimony optimization of the number of clavicular antennomeres was produced for the phylogenetic hypotheses from the different tree building methods with MacClade 3.04 (Maddison and Maddison 1998) under accelerated optimization (ACCTRAN).

3. Results

3.1. Choice of alignment parameters and tree building

The length of sequences for the part of the rrnL varied from 794 to 837 bp and those of 28S from 682 to 791 bp (795 and 686 bp in the outgroup, respectively). Using the full taxon set (n = 183), alignments for the rrnL and 28S genes were generated under various alignment procedures as well as different parameter values for gap opening and extension (Clustal only; Table 1). The resulting ten different alignments (AL1–AL10) obtained with PRANK, MUSCLE, and BLAST had significantly (p = 0.05) greater number of parsimony-informative characters in rrnL than any of the CLUSTAL alignments, but such differences were not evident for the 28S marker (Table 1). Both length-variable markers were combined in all possible 100 permutations (e.g. alignment AL1/3 combines the alignments for 28S obtained under settings AL1 and for rrnL under settings AL3). Parsimony searches on these matrices, combined with the length constant cox1 fragment (826 bp, 408 parsimony-informative characters), returned three and four overall shortest trees of 22136 steps for alignment combinations AL3/3 and AL10/3, respectively (Supplementary Table 3). The alignment combinations were then used to evaluate character congruence among the partitions (Supplementary Table 3). The alignment combination with highest character congruence differed greatly depending on the measure used (ILD, RILD, MRI; Supplementary Table 3). The least amount of incongruence was obtained for AL4/4 when assessed using the ILD, AL4/7 based on the RILD, and AL8/5 based on the MRI.

Taxonomic consistency (TC) of tree topology with selected ‘known’ monophyletic groups was also assessed for each of these alignment combinations (Supplementary Tables 2 and 4). Among the 14 possible groups, a maximum of 13 (93%) was obtained for a range of alignment combinations under parsimony. Whereas eight groups were recovered for all combinations (Supplementary Table 2), some were encountered only in a minority of trees but none of these groups were rejected in every combination (Supplementary Table 2). The highest TC of 0.93 was obtained for six alignment combinations (AL10/4, AL2/4, AL7/3, AL7/4, AL9/3, AL9/4). The correlation of character congruence and TC was strong (1-RILD = 0.8345; ILD = −0.8354; MRI = −0.6945, all with p0.05 < 0.0001), and in particular for the set of alignments that resulted in trees with high TC (≥0.81) (Fig. 1). Therefore, we chose an overall preferred alignment according to the highest combined value of character congruence and taxonomic congruence (here evaluated as TC). Among the three metrics of character congruence explored, we chose 1-RILD because it provides a length metric of incongruence that is normalized for the maximum incongruence, and therefore more suited to compare different alignments which may greatly differ in their absolute tree length (compromising the ILD and MRI). The best alignment judged by the product of TC × 1-RILD was obtained with alignment parameters AL9/3 (Fig. 1; Supplementary Table 3). This alignment contained 990 parsimony-informative characters, and resulted in two shortest trees of 22146 steps (CI including all characters = 0.10; RI = 0.43). This tree was among the shortest trees obtained from any alignment combination, and represents a small group of alignments that performed well in terms of overall tree length and overall congruence (Fig. 2). However, even this ‘overall preferred’ alignment indicated significant incongruence of the rrnL partition with the two others (28S, cox1) as revealed by the partition homogeneity test implemented in PAUP* (p = 0.02 and 0.01, respectively; while for 28S vs. cox1, p = 0.85). Equally, TC also remained below the possible maximum, although all 14 test nodes were confirmed under some (but not all) of the alignment parameters and therefore should be considered as monophyletic groups confirmed by the molecular analysis. We used the AL9/3 alignment as the best choice from our exploration of the parameter space.

We also performed ML analyses for various alignment combinations, including those with the best and the worst combined value of character and taxonomic consistency (under parsimony), to test for the preferred alignment under model-based criteria using TC and the likelihood scores (Supplementary Table 3). Generally, the likelihood analyses performed much better compared to parsimony based on the same alignment when assessed using TC, i.e. more of the well established nodes were recovered (Supplementary Table 4). We can not discern whether this improvement was caused by the optimality criterion (parsimony vs. likelihood) or the difference in gap coding, as gaps were not treated as character states in our likelihood or Bayesian analyses. For the preferred alignment from the character congruence exercise (AL9/3) the ML tree had the same TC as under parsimony and a likelihood estimate of −lnL = 86157.16.
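The quartet comparison that underlies the d statistic can be illustrated with a minimal sketch. The Python below is not the PAUP* implementation; it assumes trees supplied as nested tuples of taxon labels (a hypothetical input format chosen for brevity), treats every clade as a split of the terminal set, and counts the four-taxon subsets that are resolved identically in both trees.

```python
from itertools import combinations
from math import comb


def clades(tree):
    """Return (leaf_set, clade_leaf_sets) for a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return frozenset([tree]), []
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
    found.append(leaves)
    return leaves, found


def resolved_quartets(tree):
    """Map each resolved four-taxon subset to its 2|2 topology."""
    leaves, splits = clades(tree)
    resolved = {}
    for four in combinations(sorted(leaves), 4):
        four = frozenset(four)
        for split in splits:
            inside = four & split
            if len(inside) == 2:  # this split separates the quartet 2 vs 2
                resolved[four] = frozenset([inside, four - inside])
                break
    return leaves, resolved


def shared_quartets(tree_a, tree_b):
    """Return (identically resolved quartets, total number of quartets)."""
    leaves_a, qa = resolved_quartets(tree_a)
    leaves_b, qb = resolved_quartets(tree_b)
    if leaves_a != leaves_b:
        raise ValueError("trees must share the same terminals")
    shared = sum(1 for four, topo in qa.items() if qb.get(four) == topo)
    return shared, comb(len(leaves_a), 4)


# Toy trees with hypothetical terminals A-E (not taxa from this study).
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))
print(shared_quartets(tree_a, tree_b))  # (3, 5)
```

A distance for clustering could then be taken as one minus the shared fraction, although the exact scaling used by a given program may differ from this simplification.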

Fig. 1. Plot of character congruence (measured by 1-RILD) against taxonomic consistency (TC; only >50% shown) for various alignment combinations of 28S and rrnL partitions (see Supplementary Table 3). Note that the application of uniform alignment procedures to both data partitions (black diamonds) results in inferior congruence values.
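As a rough illustration of the congruence measures plotted here, the sketch below computes the ILD and RILD from tree lengths. It assumes the commonly cited normalised forms, ILD = (L_combined − ΣL_i) / L_combined and RILD = (L_combined − ΣL_i) / (L_max − ΣL_i), where L_max is the maximum possible length of the combined data; the function names and all numbers are hypothetical and are not values from this study.

```python
def ild(partition_lengths, combined_length):
    """Extra steps required by combination, relative to the combined length."""
    extra_steps = combined_length - sum(partition_lengths)
    return extra_steps / combined_length


def rild(partition_lengths, combined_length, max_combined_length):
    """Extra steps rescaled by the maximum possible incongruence."""
    total = sum(partition_lengths)
    return (combined_length - total) / (max_combined_length - total)


# Hypothetical tree lengths for three partitions analysed separately,
# for the combined matrix, and for a fully unresolved combined tree.
lengths = [100, 200, 300]
combined = 630
maximum = 900

print(round(ild(lengths, combined), 4))   # 0.0476
print(rild(lengths, combined, maximum))   # 0.1
```

Because the RILD is bounded by the maximum incongruence, 1 − RILD can be compared across alignments of very different absolute tree length, which is the property used to justify its choice above.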

Fig. 2. Overall congruence [the product of taxonomic consistency and character congruence; (TC) × (1-RILD)] as a function of tree length.
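The overall congruence score can be sketched in a few lines. The example assumes TC is the fraction of the 14 test groups recovered as monophyletic and ranks alignment combinations by TC × (1 − RILD); the combination names and their values are hypothetical placeholders, not entries from the supplementary tables.

```python
def overall_congruence(groups_recovered, groups_tested, rild_value):
    """Product of taxonomic consistency and character congruence."""
    tc = groups_recovered / groups_tested
    return tc * (1.0 - rild_value)


# Hypothetical alignment combinations: (groups recovered, RILD).
candidates = {
    "combo_A": (13, 0.20),
    "combo_B": (13, 0.35),
    "combo_C": (10, 0.15),
}

scores = {
    name: overall_congruence(recovered, 14, rild_value)
    for name, (recovered, rild_value) in candidates.items()
}
best = max(scores, key=scores.get)
print(best)  # combo_A
```

In this toy case combo_C has the lowest RILD but recovers fewer test groups, so the product favours combo_A, which mirrors the reasoning for weighting both criteria jointly.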

Bayesian analysis was performed by applying models of character variation separately to functional data partitions that are presumably affected by different dynamics of sequence evolution. Based on the preferred alignment combination, we used a GTR+I+Γ model with 1-, 3-, 5-, and 7-partitions (see Section 2). The MCMC chain reached the stationary phase after about 100,000 generations, but the final trees after 3.5 × 10^6 generations showed high average standard deviation split frequency for the 5-partition and the 7-partition analysis (>0.04) compared to those of the 1- and 3-partition data (<0.028). The 7-partition analysis still showed numerous polytomies in the consensus-tree at nodes that were fully resolved in the other runs. Therefore we repeated the Bayesian analysis with a user defined starting tree from the ML analysis of the preferred alignment, continuing the chain to 6 × 10^6 generations. This resulted in further minor improvements (but non-significant in an SH test) of the likelihood estimate, and produced the best tree overall when using Bayes factors to assess model fit. Applying criteria for the significance of threshold levels for the strength of support (Kass and Raftery, 1995) Bayes factors provide ‘positive’ evidence against the 5-partition data and ‘strong’ evidence against the unpartitioned and 3-partition data, while the unpartitioned data showed the worst fit (Table 2). The tree topology of the 7-partition analysis from the preferred alignment (AL9/3) differed slightly from those obtained with parsimony from the same alignment. However, comparisons of trees obtained given this alignment with different tree building methods, the differences in ML values between the parsimony trees, the ML tree and the MrBayes tree (part 7) were not significant in pairwise SH tests (Table 3), suggesting that parsimony and model-based analyses do not contradict each other.

To establish which alignment parameters had the greatest effect on the tree topologies, we performed comparison of all 100 trees using a NJ analysis on the d values (Estabrook et al., 1985) of all pairwise combinations of trees. This revealed that the space of tree topologies was highly structured according to the alignment procedure applied to the rrnL gene, whereby trees obtained with particular algorithms and parameter sets (only tested for Clustal) grouped closely together (Fig. 3; Supplementary Fig. 1). The most divergent of these groups of trees were obtained from alignments constructed with PRANK and BlastAlign, whereas trees from MUSCLE and MAFFT alignments were intermixed in a single group. The preferred tree (AL9/3) is embedded in this group of trees where it is part of a broader cluster of trees with low divergences from each other. Trees from the Clustal alignment obtained under the default and the improved (Wilm et al., 2006) parameter values grouped closely together, but were separated from those obtained under three other parameters values. In contrast to the strong structuring according to the rrnL alignment, parameters applied to the 28S alignment did not follow any recognizable pattern (Supplementary Fig. 1).

3.2. Tree topology and taxonomic implications

The tree topologies based on the preferred alignment AL9/3 obtained from parsimony (Fig. 4), ML (Fig. 5) and Bayesian inference (Fig. 6) showed great similarity in major features but with slightly different extent and position of some monophyletic groups. The preferred Bayesian and parsimony trees generally supported the traditional classification, including the monophyly of Scarabaeidae (node 3) (Browne and Scholtz, 1999), with Glaphyridae

Table 2
Comparison of estimated model likelihoods (log_e f(X|M_i)) under different data partition models (parts 1–7) in the Bayesian analysis

Partition models      Model likelihood                  Bayes factor
compared (M1/M0)      log_e f(X|M1)    log_e f(X|M0)    2 log_e B10
Part 3/part 1         −83304.77        −86208.19        6.926
Part 5/part 1         −82340.01        −86208.19        7.175
Part 5/part 3         −82340.01        −83304.77        5.969
Part 7/part 1         −82127.27        −86208.19        7.222
Part 7/part 3         −82127.27        −83304.77        6.142
Part 7/part 5         −82127.27        −82340.01        4.656

Two times the log of the Bayes factor (B10) indicates the increase in model fit in 7-partition models compared to simpler models (2 log_e B10 = 2–6 positive; 2 log_e B10 = 6–10 strong; from M0 to M1).
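The interpretation scale cited in the table note can be written out as a short sketch. It assumes the standard relation 2 log_e B10 = 2 [log_e f(X|M1) − log_e f(X|M0)] and the Kass and Raftery (1995) categories; the two log marginal likelihoods in the example are hypothetical and are not taken from Table 2.

```python
def two_ln_bayes_factor(ln_marginal_m1, ln_marginal_m0):
    """Twice the natural log of the Bayes factor of M1 over M0."""
    return 2.0 * (ln_marginal_m1 - ln_marginal_m0)


def evidence_category(two_ln_b10):
    """Kass and Raftery (1995) verbal scale for evidence in favour of M1."""
    if two_ln_b10 < 2:
        return "not worth more than a bare mention"
    if two_ln_b10 < 6:
        return "positive"
    if two_ln_b10 < 10:
        return "strong"
    return "very strong"


# Hypothetical log marginal likelihoods for a richer (M1) and simpler (M0) model.
statistic = two_ln_bayes_factor(-1000.0, -1004.0)
print(statistic, evidence_category(statistic))  # 8.0 strong
```

Negative values fall in the first category of this sketch and would indicate that the simpler model M0 is favoured.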

(node 1) as its sister, the monophyly of Cetoniinae (node 4; BS = 2, pp = 51%), and the monophyly of most tribes (Fig. 6). In accordance with morphological work (Browne and Scholtz, 1998) we also consistently obtained the highly supported sister relationship of pleurostict scarabs and the ‘dung scarabs’ (Scarabaeinae + Aphodiinae) (node 3; pp = 100%, BS = 10), and the monophyly of Scarabaeinae (true dung beetles) (node 2; BS = 1, pp = 95%). Within the Pleurosticti, Anomalini + Adoretini + Dynastinae were sister to the Cetoniinae (including Valgini) (node 5), albeit with low support (BS = 2, pp = 55%). The Valgini were paraphyletic in both model-based analyses while monophyletic under parsimony (BS = 5). The paraphyly of Rutelinae with respect to Dynastinae, as already hypothesized (Smith et al., 2006; Ahrens et al., 2007), was confirmed with the present data set. Confirming Smith et al. (2006), the Dynastinae were sister of the ruteline tribe Adoretini in the parsimony and unpartitioned ML analysis, albeit poorly supported (BS = 2, 20% bootstrap in the ML tree). In the Bayesian tree Dynastinae themselves were paraphyletic, with Cheiroplatys latipes being sister to the remaining Dynastinae + Anomalini (with pp = 74%).

The relationships of major melolonthine lineages differed substantially under the various tree searching techniques. While under ML there was weak support for monophyly of Melolonthinae as sister of (Anomalini + Adoretini + Dynastinae + Cetoniinae) (node 1; Fig. 5), Melolonthinae was paraphyletic in the parsimony analysis and polyphyletic in the Bayesian tree. Low branch support also affected the identification of the sister group to Sericini. A morphological analysis (Ahrens, 2006a) has shown evidence for Sericini to be sister to Ablaberini, but here we found this relationship only under Bayesian inference (Fig. 6, node 7), while under parsimony the Sericini was not even recovered as monophyletic due to the distant position of the Neotropical genus Astaena (Fig. 4). In both model-based approaches Sericini had close relationships to a melolonthine clade (Fig. 5, node 2; Fig. 6, node 6) with exclusively Southern Hemisphere distribution composed of tribes Heteronychini, Liparetrini, Maechidiini, Phyllotocini, and Scitalini. The Australian Phyllotocini that has been classified as a subtribe (Phyllotocina) of Sericini (Machatschke, 1959; Smith, 2006) represented here by one Phyllotocus species was widely separated from Sericini confirming a recent morphology-based analysis (Ahrens, 2006a). The support for a sister relationship of that clade with the Ablaberini + Sericini in the Bayesian analysis was extremely high (posterior probability = 100%) but was very low in the unpartitioned ML analysis (bootstrap support = 4%).

The topology of basal nodes (marked by black circles in Fig. 6) among the Old World Sericini (node 8; it also includes a single group, , with representatives in the Nearctic) was almost identical under all three tree search approaches, and topologies were generally also consistent with recent morphological data (Ahrens, 2006a). The monophyly of Old World Sericini was well supported by high branch support in the model-based tree searches (low support in the parsimony analysis). Within the latter, a lineage of about 150 species including the genus Triodontella and allies from the Mediterranean and the Afrotropics (Fig. 6; node 9) was sister to the vast majority of Sericini (>3000 species). Similar to the morphology-based analysis the most basal node in this group defines the eastern Mediterranean Omaloplia + Hellaserica. In the more derived clades, some groups (nodes marked with an asterisk in Fig. 6) were retrieved as monophyletic in all analyses and were well supported in the Bayesian analysis, although relationships among these groups differed slightly in various searches. Among the remainder of Sericini, several large genera that are well established in the traditional taxonomy were polyphyletic, including the genera Neoserica, Microserica, Serica, and Maladera.

Fig. 3. Unrooted Neighbor Joining tree based on d statistics of pairwise tree comparisons between parsimony trees from all 100 alignment combinations (28S/rrnL; Table 1, Supplementary Tables 2 and 3) in this study. Different coloring corresponds to the alignment algorithm and parameters used for the rrnL alignment. The ‘preferred’ alignment is indicated by an arrow; * indicates a single outlier (AL6/4) from the MAFFT and MUSCLE cluster. An enlarged image with fully labeled terminals is given in Supplementary Fig. 1.

Fig. 4. Strict consensus of the two most parsimonious trees obtained from the preferred alignment. Bootstrap resampling values above 50% are shown above branches, Bremer support below branches.

Fig. 5. Maximum likelihood tree obtained from the preferred alignment, with bootstrap values shown above branches.

Fig. 6. Majority-rule consensus tree from Bayesian analysis. The tree summarizes 10,000 topologies of the last 5 × 10^6 generations (of a total of 6 × 10^6) using a user defined starting tree (ML tree of AL9/3). Above the branches are given the posterior probabilities of the node. For symbols see the text.

Table 3
Comparison (SH test) of the trees obtained with the different tree building procedures under parsimony or model-based approaches using an unpartitioned GTR+I+Γ model

Tree            −lnL            Diff −lnL    p
TNT tree1       123220.16262    (Best)       -
TNT tree2       123230.20863    10.04601     0.722
ML tree         123318.10686    97.94423     0.367
MrBayes tree    123484.79715    264.63452    0.051

The difference in likelihood (Diff −lnL) is calculated by comparing the best tree (TNT tree 1) to each of the others.
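The Diff −lnL column of Table 3 can be reproduced directly from the −lnL scores. The sketch below uses the four values reported in the table; the dictionary layout and the function name are illustrative choices. The SH-test p-values are not recomputed, since they require site-wise likelihoods and resampling that are not given in the text.

```python
from typing import Dict, Tuple

# -lnL scores as reported in Table 3 (lower is better)
NEG_LNL: Dict[str, float] = {
    "TNT tree1": 123220.16262,
    "TNT tree2": 123230.20863,
    "ML tree": 123318.10686,
    "MrBayes tree": 123484.79715,
}


def likelihood_differences(scores: Dict[str, float]) -> Tuple[str, Dict[str, float]]:
    """Return the best tree and the -lnL difference of every other tree to it."""
    best = min(scores, key=scores.get)
    diffs = {
        name: round(value - scores[best], 5)
        for name, value in scores.items()
        if name != best
    }
    return best, diffs


best_tree, diffs = likelihood_differences(NEG_LNL)
for name, diff in diffs.items():
    print(f"{name}: {diff:.5f} worse than {best_tree}")
```

The recomputed differences agree with the table to within rounding of the last reported digit.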

3.3. Origin of plurilamellate antennae in Sericini

Character transformations in segment number of the antennal club were optimized onto the topology of the single best Bayesian tree using parsimony (Fig. 7). Patterns from fast (ACCTRAN) and slow (DELTRAN) optimizations did not differ, except that there were fewer nodes with equivocal character optimizations under DELTRAN settings. According to this reconstruction, antennae with more than three lamellae (plurilamellate) originated independently 10 times within the Sericini alone (Fig. 7, gray box). For the ML topology, nine independent origins were estimated, while the two most parsimonious trees required nine or 10 transitions, respectively, including one loss of the trait. Within the remaining Melolonthinae, a further nine independent gains (or eight gains plus one loss) of the plurilamellate clubs were evident. Tree searches constrained for a single origin of plurilamellate clubs within Sericini were significantly worse in an SH test (Table 4), thus contradicting a single origin of plurilamellate clubs. Other scenarios with constraints of a twofold, threefold, or fourfold origin of plurilamellate antennae in Sericini were also worse than the unconstrained tree but were less strongly disfavored in the SH test and could not be rejected at the 5% level (Table 4).
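To illustrate how the minimum number of character changes is counted on a fixed topology, the following is a minimal sketch of Fitch parsimony for an unordered character on a rooted binary tree. The five-taxon tree, the taxon labels and the club states are hypothetical and are not taken from Fig. 7. The study itself used MacClade, which additionally resolves ambiguous internal nodes under ACCTRAN or DELTRAN; that second step is omitted here.

```python
from typing import Dict, FrozenSet, Tuple, Union

# A tree is either a leaf name or a pair of subtrees
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, states: Dict[str, int]) -> Tuple[FrozenSet[int], int]:
    """Bottom-up Fitch pass.

    Returns the set of most parsimonious states at this node and the
    minimum number of state changes required in the subtree below it.
    """
    if isinstance(tree, str):  # leaf: observed state, no changes needed
        return frozenset([states[tree]]), 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:  # children are compatible: no extra change at this node
        return shared, left_cost + right_cost
    # children conflict: keep both options and count one change
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical coding: 0 = trilamellate club, 1 = plurilamellate club
toy_tree: Tree = ((("A", "B"), "C"), ("D", "E"))
toy_states = {"A": 1, "B": 0, "C": 1, "D": 0, "E": 0}

root_set, min_changes = fitch(toy_tree, toy_states)
print(sorted(root_set), min_changes)
```

Whether the counted changes are read as independent gains or as a gain followed by a loss depends on how the equivocal nodes are resolved, which is why the text reports alternatives such as nine gains or eight gains plus one loss.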

4. Discussion

4.1. Number of antennomeres and their diagnostic value

Despite their huge species diversity, phylogenetic analyses of phytophagous scarab species are still in their infancy (Browne and Scholtz, 1998; Ahrens, 2006a; Smith et al., 2006). The number of antennal lamellae is an easily scored character distinguishing groups that are difficult to separate otherwise. Especially in the Sericini with their notorious morphological homogeneity, the constitution of the antennal club has been among the most important diagnostic features since the works of Reitter (1896) and Brenske (1897). The number of joints in the antennal club has subsequently been used for the genus-level classification in most melolonthine lineages, and recently also for cladistic analyses (e.g. Sanmartin and Martin-Piera, 2003; Ahrens, 2006b). A further problem of the genus-level classification of sericines is that it was elaborated step by step from geographically limited surveys (e.g. Reitter, 1896; Nomura, 1973, 1974), with the result that characters were not examined on a global scale. Moreover, many taxonomists were using a typological classification from combinations of characters (not distinguishing apomorphies from plesiomorphies) to diagnose genera. This resulted in many endemic ‘genera’ each with a very small number of species, and a genus concept that was not applied uniformly in different geographic areas, e.g. in American, European or Asian taxa. Hence, genera of Sericini tended to be defined based on obvious morphological apomorphies, such as a particular kind of pilosity (e.g. in Pachyserica), a pronotal margin (e.g. Anomalophylla), asymmetric protarsal claws in males, a metatibia or metafemur with serrated longitudinal ridge (e.g. Lasioserica, ), or a certain number (>3) of clavicular antennomeres. This also led to the definition of genera based on autapomorphies of single species, with the result that almost 30% of generic names in Sericini refer to monospecific groups.

Conversely, species lacking such particular features were aggregated into huge groups based on presumably plesiomorphic features. It is therefore not surprising that many well established genera, including Neoserica, Microserica, and Maladera, appear polyphyletic in our analysis and each break up into four widely separated clades. Maladera includes almost 600 inconspicuous species with a trilamellate antennal club mostly associated with ‘broad’ legs, while Neoserica (sensu lato; see Ahrens, 2003, 2004, for a recent reassessment of this group) is an aggregate of some 200 large-bodied species also with mostly broad legs, but with the clavicular antennomeres in males varying from four to seven (three to six in females). Microserica, with currently ca. 170 named species, is distinguished from Neoserica only by smaller body size (Brenske, 1897). The molecular analysis indicates that these taxon concepts need to be revised without recourse to highly homoplastic or sexually dimorphic characters such as the number of clavicular lamellae. A classification based on numbers of club antennomeres is particularly problematic if it is used for the study of evolutionary trends, including the question about the involvement of antennal traits in species diversity and in the diversification of other morphological and physiological characteristics that might mediate these processes, because such classification does not provide the basis for an independent assessment of these traits.

4.2. Choice of preferred alignment

We used several ‘new’ methods for generating primary alignments, assuming that they would widen or refine the parameter space. We found that these alignments differed substantially in their overall structure, including the number of parsimony-informative characters and gapped positions. In particular, the probabilistic alignment procedure implemented in PRANK stood out for its ‘gappier’ alignment, while MUSCLE and BlastAlign also produced alignments with inferred indels more spread out across the matrix (high number of sites with gap character states) than the Clustal and in particular the MAFFT alignments (Table 2). However, the topological differences of trees obtained from these alignments do not necessarily reflect the differences among the alignments on which the tree searches are based. A striking feature of the comparisons was the strong clustering of similar trees obtained with a particular algorithm (or parameter set) and the apparent divergence from other clusters obtained under different conditions (Fig. 3; Supplementary Fig. 1). This demonstrates that the newly available methods indeed diversify the space of accessible homology assignments, but further adds to the problem that the choice of the algorithm makes a substantial contribution to the final outcome of the phylogenetic analysis. Three algorithms used here are derived from the same basic procedure of progressive addition of sequences to the alignment based on a similarity score, but with different procedures for refinement. The MAFFT and MUSCLE procedures are based on a similar iterative alignment step that recalculates the guide tree prior to realignment, which is also reflected in the similarity of the resulting trees. Both methods are the most similar to the Clustal algorithm from which they are derived. Hence it is comforting that this class of methods produces trees that group together but are separated from the two other methods, PRANK and BlastAlign. The former employs a model-based alignment that predicts structural elements and attempts the correct calculations of indels. In contrast, BlastAlign is a so-called local alignment whereby small fragments of high similarity (sequence identity) are lined up against a single ‘most representative sequence’ in the data set (Belshaw and Katzourakis, 2005; Hunt and Vogler, 2008). These methods then provide a range of primary alignments and derived trees to cover a wide space of possibilities.

Fig. 7. The number of antennomeres composing the antennal club optimized using MacClade on the single Bayesian tree with the highest likelihood scores.

Table 4
Comparison of tree topologies from parsimony searches constrained for different numbers of origins of plurilamellate antenna

            Number of origins
            1          2a         2b         3          4          Cn
p-Mean      0.033      0.1921     0.1591     0.1575     0.3095     -
SD          0.00424    0.0292     0.01301    0.01496    0.01909    -
TL          22212      22183      22199      22186      22176      22146
n Trees     1          5          3          3          2          2

Constraints were devised by enforcing one, two, three, or four monophyletic plurilamellate group(s) within Sericini. The table presents the results of SH tests for the trees, including p-values and their standard deviations, from the constrained searches compared to the unconstrained topology (Cn). In cases where the constrained searches produced multiple topologies, we selected those trees which produced plurilamellate groups that were the most similar to the unconstrained tree. For the twofold origin two scenarios for relationships among the plurilamellate clades have been tested (2a and b). Additionally, the parsimony tree length (TL) and the number of equally shortest trees (n Trees) are given.

Objective measures to select among the alignment algorithms and parameter settings were based on Wheeler’s (1995) ‘sensitivity analysis’ and implemented by testing for maximum congruence between the length-variable rrnL and 28S regions (and the length-invariable cox1). Measures of congruence have proliferated in recent years, but unfortunately the results of these tests are frequently contradictory (Aagesen et al., 2005; Pons and Vogler, 2006), therefore further complicating the choice of preferred trees. This was confirmed here, as the preferred (least incongruent) alignments were slightly different in terms of character congruence under two of the three methods (ILD, RILD) while they differed more widely in the third metric (MRI). Character-based congruence measures are preferable because they can measure conflicting signal directly (Giribet and Wheeler, 1999), but here we found the taxonomic consistency test useful to decide principally which measure of character conflict to apply. This test favored the RILD and ILD equally, but we preferred the former because it is a measure of character conflict relative to the total amount of possible conflict in each data set and therefore more suited to comparing different alignments to each other. Using these criteria, the MAFFT and MUSCLE alignments (AL3 and AL4) provided the highest congruence values for rrnL, in combination with Clustal (various settings), PRANK and MAFFT in the 28S alignment.

These settings also performed quite well for taxonomic consistency. The selection of the overall preferred alignment was made from the combined taxonomic and character congruence parameters whereby our particular weighting scheme favored the TC parameter (Supplementary Tables 2 and 4). Again, the MAFFT and MUSCLE (AL3 and AL4) alignments of rrnL had highest congruence, in combination with the Clustal alignments of the 28S marker (under various parameter settings in Clustal). The PRANK alignment overall performed poorly, in particular for rrnL, which is surprising because this methodology attempts to distinguish insertions from deletions and therefore produces a largely tree-based alignment which should help the phylogenetic congruence. Finally, the Blast-based method performed particularly poorly and consistently scored lowest for rrnL, although results were more mixed with 28S. The latter is very important for assessing the effect of removal of length-variable data; the BlastAlign procedure removes ‘unalignable’ sequences (no similarity to any other sequence in the data set under the specified parameters) but does this on a taxon-by-taxon basis rather than removing entire columns (see Gatesy et al., 1993; Castresana, 2000). A possible explanation for this drop in performance may be the loss of data overall and hence reduced strength of the phylogenetic signal (about 8% of primary data were removed in the BlastAlign output for rrnL; data not shown), rather than achieving the removal of mis-aligned bases. The remaining bases were widely spread out, as indicated from the large extent of the alignment matrix (Table 2), which possibly gives too much weight to indels which are coded here as a 5th character state.

Tests for congruence among data sets as a criterion for judging the quality of an alignment are now widely used, ever since the theory of classical tests of homology was extended to the correspondences of nucleotide residues in sequence alignment (De Pinna, 1991; Wheeler, 1995; Brower and Schawaroch, 1996). Whereas the approach is most defensible in tree alignment (Sankoff, 1975; Wheeler, 1996), it can equally be applied to other alignment procedures to test the congruence of partitions (Vogler and Pearson, 1996; Terry and Whiting, 2005). Here the approach was taken further to expand the limited alignment space accessible to Clustal which may be improved upon with newer, diverse alignment methods. Using alignments generated with different procedures therefore may produce better overall alignments, and may also be a way of testing the relative performance of these recent methods (Terry and Whiting, 2005). Our findings suggest that using a combination of different alignment procedures and parameters for various length-variable loci may improve character congruence (Fig. 1). These differences in performance of various alignment procedures may be attributed to the fact that loci frequently evolve at different rates, and may also depend on the specific taxon sampling that influences the way in which indels are reconstructed and hence their number and extent in an alignment (Simmons and Freudenstein, 2003; Pons and Vogler, 2006). In the current case the 28S gene has only minimal influence on trees from the combined alignment due to its smaller size (approximately 1/5 of the number of informative characters compared to rrnL) and lower tree support (despite a higher CI/RI) (Table 1). Yet, as we use it for measures of congruence in selecting among the rrnL alignment parameters, the 28S partition does also have a major effect on the phylogenetic conclusions, independently of its absolute contribution to the combined analysis (although the effect is moderated by the inclusion of cox1 in calculating the character congruence, and by using character congruence together with taxonomic consistency).

4.3. Extended number of antennal lamellae and their role in an evolutionary context

Our results indicate that among melolonthines and especially among the tribe Sericini additional lamellae have originated several times independently. Reversals (reduction of plurilamellate to trilamellate antennae) seem rare according to character optimization on the trees obtained here. Lamellate antennae with >3 segments are limited to the pleurostict scarabs only (exception: Pleocomidae), and although they do appear throughout this group, including a few unsampled lineages of melolonthines (e.g. Sanmartin and Martin-Piera, 2003) and even in one genus of Trichiini (Cetoniinae), the major groups are sufficiently well known to confirm the extra lamellae to be the derived state. There is evidence from a clade of Microserica where shifts from 4 to 5 segmented clubs are also reversed, and these reversals may also occur in related groups not yet included here (Ahrens, 2001a,b, 2002). In a few examples an infraspecific polymorphism of 4 and 5 segmented clubs was found (see below), suggesting that the number of lamellae is affected by considerable plasticity.

The repeated origin of extended number of lamellate antennomeres as the primary location of olfactory sensillae (Meinecke, 1975) and pheromone binding proteins (Leal et al., 1998; Nikonov et al., 2002) leaves the intriguing question about the adaptive significance of these structures. Pleurostict scarabs have developed a complex chemical communication system (Leal, 1998, 1999; Romero-López et al., 2004). Species-specific mate recognition is a two-step olfactory process (Reinecke et al., 2006; Ruther et al., 2000) by which males first detect kairomones released from the host plants upon herbivore feeding, followed by a reaction to pheromones produced by the female. Sexual dimorphism in the number of sensillae, which may be up to 10 times higher in males, is the strongest evidence for the involvement of sensillae in olfactory mate finding (e.g. Romero-López et al., 2004; Tanaka et al., 2006). The greater number of sensillae is achieved by an increased density or by increased surface area (length) of the antennomeres (Allsopp, 1990). An even greater increase should be achieved with the augmentation of the number of lamellae in the antennal club and probably permits a greater spatial resolution of the odorant molecules in the air stream (Loudon and Koehl, 2000; Loudon and Davis, 2005; Koehl, 2006).

Resources of the presumed ancestral feeding style of scarabaeids (saprophagy or mycetophagy) (Scholtz and Chown, 1995) occur in a largely two-dimensional matrix (soil, tree trunks, etc., including dung for coprophagous scarabs), whereas the switch to feeding on live parts of shrubs and trees extends the feeding source into a huge three-dimensional space. While the largely unspecific feeding on leaves in many groups would not have posed difficulties to detect food supply, this causes great problems for mate finding. It is therefore conceivable that the ecological shift triggered the origin of an elaborate system of chemical communication and the optimization of olfactory organs via an augmentation of sensors on the increased surface of antennal lamellae (Allsopp, 1990; Romero-López et al., 2004; Tanaka et al., 2006).

Given their presumed selective advantage, why do not all pleurosticts have plurilamellate antennae? Interestingly, in the two major groups specialized on flowers or woody parts (Cetoniinae and Dynastinae, respectively) plurilamellate clubs are very rare. However, speculations about which parameters favor the extra lamellae, including the possible effect of habitat, feeding resource, pheromone chemical composition, diel activity patterns, community composition, and other environmental or behavioral factors on the density of antennal sensillae would be premature, as counts of sensillae are not yet widely available. Besides species level comparisons, it may also be useful to address this question by comparing local variation at the population level, as preliminary evidence already suggests infraspecific variation in the length of antennal club in some Sericini (Ahrens, 1995, 2001a; Ahrens and Sabatinelli, 1996) which could in part be driven by environmental conditions, for example in species-poor communities at higher altitudes where beetles tend to have a shorter club and smaller eyes. Finally, information about taxonomic distribution of different types of sensillae (Meinecke, 1975; Romero-López et al., 2004; Tanaka et al., 2006) and their receptor proteins and receptor sensitivity (Nikonov et al., 2002), will put these morphological traits in a functional context.

In conclusion, the mate recognition system that relies greatly on chemical communication may also be implicated in the evolution of the great species richness of pleurostict scarabs. As the group is almost entirely dependent on angiosperms, the evolutionary success of the hosts may be causally linked to the beetles’ diversification (Scholtz and Chown, 1995). However, an important peculiarity for pleurosticts is that most species are highly polyphagous. This makes them unlikely to undergo adaptive radiations on particular host lineages which is usually thought to mediate species diversification of herbivores. Instead, diversification may be driven by the particular utilization of the plant resource which requires an elaborate system for mate recognition. Attraction to plant kairomones released in response to feeding damage, in combination with the pheromones produced by the females, might result in local aggregations in a patchy landscape reinforced by species-specific recognition mechanisms which may ultimately result in population separation and species formation. It remains unclear what role the variation in number of antennal lamellae could play in this recognition system, beyond the simple increase in sensitivity and possible greater selectivity. However, the high level of homoplasy in this trait, apparently greatly underestimated because it has been used as a convenient trait for classification, is an expression of the species-specific variation in antennal characters and may be an indication of a causal link of changes in antennal traits with the origination of new species.

Acknowledgments

We are grateful to Anna Papadopoulou and Ruth Wild for help with the laboratory analysis; to Jesus Gomez-Zurita and Ignacio Ribera for helpful discussion, and to Michael Balke, David Lees, Paul Lago, Joachim Schmidt, Andrew Smith, and Charly Werner for collecting specimens. We thank the three anonymous referees who helped to improve the final version of the manuscript. Bayesian analyses were carried out using resources at the Computational Biology Service Unit from Cornell University which is partially funded by Microsoft Corporation. Laboratory analysis at the NHM was partly supported by a SYNTHESYS Grant

(GB-TAF-63 and 917) to DA. DA was supported by the German Science Foundation (DFG/AH175/1-1).

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ympev.2008.02.010.

References

Aagesen, L., Petersen, G., Seberg, O., 2005. Sequence length variation, indel costs, and congruence in sensitivity analysis. Cladistics 21, 15–30. doi:10.1016/j.ympev.2007.11.029.
Ahrens, D., 1995. Taxonomische Studien an Sericinae. Die Gattungen Deroserica Moser, 1915, Microsericaria Nikolaev, 1979, Pachyderoserica Moser, 1920 und Microserica Brenske, 1894 (Ins., Col., Melolonth.). Entomol. Abh. (Dresd.) 57 (2), 37–56.
Ahrens, D., 2001a. Zwei neue Arten der Gattung Chrysoserica aus dem Himalaya (Coleoptera Melolonthidae). Boll. Soc. Entomol. Ital. 133, 131–145.
Ahrens, D., 2001b. Neue Arten der Microserica splendidula (Fabricius, 1801)-Gruppe von West-Malayasia, Nias und Sumatra (Coleoptera: Melolonthidae). Koleopterol. Rundsch. 71, 111–120.
Ahrens, D., 2002. Revision der Arten der Microserica viridicollis Arrow, 1913-Gruppe (Coleoptera, Melolonthinae, Sericini). Linz. Biol. Beitr. 34, 383–412.
Ahrens, D., 2003. Zur Identität der Gattung Neoserica Brenske, 1894, nebst Beschreibung neuer Arten (Coleoptera, Melolonthidae, Sericini). Koleopterol. Rundsch. 73, 169–226.
Ahrens, D., 2004. Monographie der Sericini des Himalaya (Coleoptera, Scarabaeidae). Dissertation. de-Verlag im Internet GmbH, Berlin, 534pp.
Ahrens, D., 2006a. The phylogeny of Sericini and their position within the Scarabaeidae based on morphological characters (Coleoptera: Scarabaeidae). Syst. Entomol. 31, 113–144.
Ahrens, D., 2006b. The phylogeny of the genus Lasioserica inferred from adult morphology - implications on the evolution of montane fauna of the South Asian orogenic belt (Coleoptera: Scarabaeidae: Sericini). J. Zool. Syst. Evol. Res. 44, 34–53.
Ahrens, D., 1996. Revision der Gattung Nepaloserica Frey (Coleoptera, Melolonthidae). Frag. Entomol. 28, 209–243.
Ahrens, D., Monaghan, M.T., Vogler, A.P., 2007. DNA-based taxonomy for associating adults and larvae in multi-species assemblages of chafers (Coleoptera: Scarabaeidae). Mol. Phylogenet. Evol. 44, 436–449.
Akaike, H., 1974. A new look at the statistical model identification. IEEE Trans. Autom. Control 19, 716–723.
Allsopp, P., 1990. Sexual dimorphism in the adult antennae of Antitrogus parvulus Britton and Lepidiota negatoria Blackburn (Coleoptera: Scarabaeidae: Melolonthinae). J. Aust. Entomol. Soc. 29, 261–266.
Altekar, G., Dwarkadas, S., Huelsenbeck, J.P., Ronquist, F., 2004. Parallel metropolis coupled Markov chain Monte Carlo for Bayesian phylogenetic inference. Bioinformatics 20, 407–415.
Balthasar, V., 1963. Monographie der Scarabaeidae und Aphodiidae der palaearktischen und orientalischen Region. Coleoptera: Lamellicornia. Band 1. Prag, Verlag der Tschechoslowakischen Akademie der Wissenschaften, pp. 391.
Belshaw, R., Katzourakis, A., 2005. BlastAlign: a program that uses blast to align problematic nucleotide sequences. Bioinform. Appl. Note 21 (1), 122–123.
Brandley, M.C., Schmitz, A., Reeder, T.W., 2005. Partitioned Bayesian analyses, partition choice, and the phylogenetic relationships of scincid lizards. Syst. Biol. 54 (3), 373–390.
Brenske, E., 1897. Die Serica-Arten der Erde. I. Berl. Ent. Ztschr. 42, 345–438.
Brower, A.V.Z., Schawaroch, V., 1996. Three steps of homology assessment. Cladistics 12, 265–272.
Browne, J., Scholtz, C.H., 1998. Evolution of the scarab hind wing articulation and wing base: a contribution toward the phylogeny of the Scarabaeidae (Scarabaeoidea: Coleoptera). Syst. Entomol. 23, 307–326.
Browne, J., Scholtz, C.H., 1999. A phylogeny of the families of Scarabaeoidea. Syst. Entomol. 24, 51–84.
Burmeister, H., 1855. Handbuch der Entomologie. Vierter Band. Besondere Entomologie. Fortsetzung. 2. Abteilung, Coleoptera Lamellicornia Phyllophaga chaenochela. TCF Enslin, Berlin, x + 569pp.
Castresana, J., 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552.
De Pinna, M.C.C., 1991. Concepts and tests of homology in the cladistic paradigm. Cladistics 7, 367–394.
Edgar, R.C., 2004a. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32 (5), 1792–1797.
Edgar, R.C., 2004b. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinform. 5 (113), 1–19.
Erichson, W.F., 1847. Naturgeschichte der Insecten Deutschlands. Erste Abtheilung. Coleoptera, vol. 3, Lieferung 4. Nicolaische Buchhandlung, Berlin, pp. 481–640.
Estabrook, G.F., McMorris, F.R., Meacham, C.A., 1985. Comparison of undirected phylogenetic trees based on subtrees of four evolutionary units. Syst. Zool. 34, 193–200.
Evans, A.V., Smith, A.B.T., 2007. An electronic checklist of the New World chafers (Coleoptera: Scarabaeidae: Melolonthinae), Version 2, 349pp.
Felsenstein, J., 1973. Maximum likelihood and minimum-step methods for estimating evolutionary trees from data on discrete characters. Syst. Zool. 22, 240–249.
Gardner, P.P., Wilm, A., Washietl, S., 2005. A benchmark of multiple sequence alignment programs upon structural RNAs. Nucleic Acids Res. 33 (8), 2433–2439.
Gatesy, J., DeSalle, R., Wheeler, W.C., 1993. Alignment-ambiguous nucleotide sites and the exclusion of systematic data. Mol. Phylogenet. Evol. 2, 152–157.
Giribet, G., Wheeler, W.C., 1999. On gaps. Mol. Phylogenet. Evol. 13, 132–143.
Goddard, W., Kubicka, E., Kubicki, G., McMorris, F.R., 1994. The agreement metric for labeled binary trees. Math. Biosci. 123, 215–226.
Goldman, N., Anderson, J.P., Rodrigo, A.G., 2000. Likelihood-based tests of topologies in phylogenetics. Syst. Biol. 49 (4), 652–670.
Goloboff, P., Farris, S., Nixon, K., 2004. TNT (Tree analysis using New Technology). Cladistics 20, 84.
Gotoh, O., 1982. An improved algorithm for matching biological sequences. J. Mol. Biol. 162, 705–708.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52 (5), 696–704.
Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98.
Hunt, T., Vogler, A.P., 2008. A protocol for large-scale rRNA sequence analysis: towards a detailed phylogeny of Coleoptera. Mol. Phylogenet. Evol. 47, 289–301.
Kass, R.E., Raftery, A.E., 1995. Bayes factors. J. Am. Stat. Assoc. 90, 773–795.
Katoh, K., Misawa, K., Kuma, K., Miyata, T., 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30 (14), 3059–3066.
Katoh, K., Kuma, K., Toh, H., Miyata, T., 2005. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Res. 33 (2), 511–518.
Koehl, M.A.R., 2006. The fluid mechanics of sniffing in turbulent odor plumes. Chem. Senses 31, 93–105.

Leal, W.S., 1998. Chemical ecology of phytophagous scarab beetles. Sankoff, D., 1975. Minimal mutation trees of sequences. SIAM J. Appl. Annu. Rev. Entomol. 43, 39–61. Math. 28, 35–42. Leal, W.S., 1999. Mechanisms of chemical communication in scarab beetles. Sanmartin, I., Martin-Piera, F., 2003. First phylogenetic analysis of the In: Hidaka, T., Matsumoto, Y., Honda, K., Honda, H., Tatsuki, K. subfamily Pachydeminae (Coleoptera. Scarabaeoidea, Melolonthidae): (Eds.), Environmental entomlology: behaviour, physiology, and chem- the Palearctic Pachydeminae. J. Zool. Syst. Evol. Res. 41, 2–46. ical ecology. University of Tokyo Press, Tokyo, pp. 464–478. Scholtz, C.H., Chown, S.L., 1995. The evolution of habitat use and diet in Leal, W.S., Wojtasek, H., Miyazawa, M., 1998. Pheromone-binding the Scarabaeoidea: a phylogenetic approach. In: Pakaluk, J., Slipinski, proteins of scarab beetles. Ann. NY Acad. Sci. 855, 301–305. S.A. (Eds.), Biology, Phylogeny, and Classification of Coleoptera: Loudon, C., Koehl, M.A.R., 2000. Sniffing by a silkworm moth: wing Papers Celebrating the 80th Birthday of Roy A. Crowson. Muzeum i fanning enhances air penetration through and pheromone interception Instytut Zoologii PAN, Warszawa, pp. 355–374. by antennae. J. Exp. Biol. 203, 2977–2990. Scholtz, C.H., Grebennikov, V.V., 2005. 13. Scarabaeoidea Latreille, 1802. Loudon, C., Davis, E.C., 2005. Divergence of streamlines approaching a In: Beutel, R.G., Leschen, R.A.B. (Eds.), Coleoptera, Beetles. vol. 1: pectinate antenna: consequences for chemoreception. J. Chem. Morphology and Systematics (Archostemata, Adephaga, Myxophaga, Ecol. 31, 1–13. partim). Handbook of Zoology. vol. IV Arthropoda: Insecta Lo¨ytynoja, A., Goldman, N., 2005. An algorithm for progressive multiple part 38. W. de Gruyter-Berlin, New York, pp. 367–426. alignment of sequences with insertions. Proc. Natl. Acad. Sci. USA 102 Shimodaira, H., Hasegawa, M., 1999. Multiple comparisons of log- (30), 10557–10562. 
likelihoods with application s to phylogenetic inference. Mol. Biol. Machatschke, J.W., 1959. Phylogenetische Untersuchungen u¨ber die Evol. 16, 1114–1116. Sericini (sensu Dalla Torre 1912) (Coleoptera: Lamellicornia: Melo- Simmons, M.P., Freudenstein, J.V., 2003. The effects of increasing genetic lonthidae). Beitr. Entomol. 9, 730–746. distance on alignment of, and tree construction from, rDNA internal Maddison, W.P., Maddison, D.R., 1998. MacClade, analysis of phylogeny transcribed spacer sequences. Mol. Phylogenet. Evol. 26, 444–451. and character evolution. Ver. 3.08: Sinauer, Sunderland, MA. Simmons, M.P., Mu¨ller, K., Norton, A.P., 2007. The relative performance Meinecke, C.C., 1975. Riechsensillen und Systematik der Lamellicornia. of indel-coding methods in simulations. Mol. Phylogenet. Evol. 44, (Insecta: Coleoptera). Zoomorphologie 82, 1–42. 724–740. Mickevich, M.F., Farris, J.S., 1981. The implications of congruence in Simon, C., Frati, F., Beckenbach, A., et al., 1994. Evolution, weighting, Menidia. Syst. Zool. 30, 351–370. and phylogenetic utility of mitochondrial gene sequences and a Monaghan, M.T., Inward, D.G., Hunt, T., Vogler, A.P., 2007. A compilation of conserved polymerase chain reaction primers. Ann. molecular phylogenetic analysis of the Scarabaeinae (dung beetles). Entomol. Soc. Am. 87, 651–701. Mol. Phylogenet. Evol. 45, 674–692. Smith, A.B.T., 2006. A review of the family-group names for the Morrison, D.A., Ellis, J.T., 1997. Effects of nucleotide sequence alignment Superfamily Scarabaeoidea (Coleoptera) with corrections to nomencla- on phylogeny estimation: a case study of 18S rDNAs of Apicomplexa. ture and a current classification. Coleopterists Soc. Monogr. 5, 144–204. Mol. Biol. Evol. 14, 428–441. Smith, A.B.T., Hawks, D.C., Heraty, J.M., 2006. An overview of the Nikonov, A.A., Peng, G., Tsurupa, G., Leal, W.S., 2002. 
Unisex classification and the evolution of the major scarb clades pheromon detectors and pheromon binding proteins in Scarab beetles. (Coleoptera: Scarabaeoidea) based on preliminary molecular analyses. Chem. Senses 27, 495–504. Coleopts. Soc. Monogr. 5, 35–46. Nomura, S., 1973. On the Sericini of Japan. I. Toˆhoˆ-Gakuhoˆ 23, 119–152. Swofford, D.L., 2002. PAUP*, Phylogenetic Analysis using Parsimony. Nomura, S., 1974. On the Sericini of Taiwan. Toˆhoˆ-Gakuhoˆ 24, 81–115. Version 4.0b. Sinauer Associates, Sunderland, MA. Nylander, J.A.A., Ronquist, F., Huelsenbeck, J.P., Nieves-Aldrey, J.L., Tanaka, S., Yukuhiro, F., Wakamura, S., 2006. Sexual dimorphism in 2004. Bayesian phylogenetic analysis of combined data. Syst. Biol. 53, body dimensions and antennal sensilla in the white grub beetle, 47–67. Dasylepida ishigakiensis (Coleoptera: Scarabaeidae). Appl. Entomol. Ogden, T.H., Rosenberg, M.S., 2007. How should gaps be treated in Zool. 41, 455–461. parsimony? A comparison of approaches using simulation. Mol. Terry, D.M., Whiting, M.F., 2005. Comparison of two alignment Phylogenet. Evol. 42, 817–826. techniques within a single complex data set: POY versus Clustal. Pons, J., Vogler, A.P., 2006. Size, frequency, and phylogenetic signal of Cladistics 21, 272–281. multiple-residue indels in sequence alignment of introns. Cladistics 22, Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, 144–156. D.G., 1997. The ClustalX windows interface: flexible strategies for Posada, D., Crandall, K.A., 1998. Modeltest: testing the model of DNA multiple sequence alignment aided by quality analysis tools. Nucleic substitution. Bioinform. Appl. Note 14, 817–818. Acids Res. 25, 4876–4882. Rambaut, A., Drummond, A., 2003. Tracer. MCMC Trace Analysis Tool. Vogler, A.P., Pearson, D.L., 1996. A molecular phylogeny of the Tiger Reinecke, A., Ruther, J., Hilker, M., 2006. 
Pre-copulatory isolation in beetles (Cicindelidae): congruence of mitochondrial and nuclear rDNA sympatric Melolontha species (Coleoptera: Scarabaeidae). Agri. Forest data sets. Mol. Phylogenet. Evol. 6, 321–338. Entomol. 8, 289–293. Wheeler, W.C., 1995. Sequence alignment, parameter sensitivity, and the Reitter, E., 1896. Uebersicht der mir bekannten palaearktischen, mit der phylogenetic analysis of molecular data. Syst. Biol. 44, 321–331. Coleopteren-Gattung Serica verwandten Gattungen und Arten. Wie- Wheeler, W.C., 1996. Optimization alignment: the end of multiple ner Entomol. Ztg. 15, 180–188. sequence alignment in phylogenetics? Cladistics 12, 1–9. Ritcher, P.O., 1969. Spiracles of adult Scarabaeoidea (Coleoptera) and Wheeler, W.C., Hayashi, C.Y., 1998. The phylogeny of the extant their phylogenetic significance. I. The abdominal spiracles. Ann. Chelicerate orders. Cladistics 14, 173–192. Entomol. Soc. Am. 62, 8–63. Wheeler, W.C., Ramı´rez, M.J., Aagesen, L., Schulmeister, S., 2006. Romero-Lo´pez, A.A., Arzuffi, R., Valdez, J., Moro´n, M.A., Castrejo´n- Partition-free congruence analysis: implications for sensitivity analysis. Go´mez, V., Villalobos, F.J., 2004. Sensory organs in the antennae of Cladistics 22, 256–263. Phyllophaga obsoleta (Coleoptera: Melolonthidae). Ann. Entomol. Wilm, A., Mainz, I., Steger, G., 2006. An enhanced RNA alignment Soc. Am. 97, 1306–1312. benchmark for sequence alignment programs. Algorithms Mol. Biol. 1 Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3, Bayesian phylogenetic (19), 1–11. inference under mixed models. Bioinformatics 19, 1572–1574. Wong, K.M., Suchard, M.A., Huelsenbeck, J.P., 2008. Alignment Ruther, J., Reinecke, K., Thiemann, K., Tolasch, T., Francke, W., Hilker, uncertainty and genomic analysis. Science 319, 473–476. M., 2000. Mate finding in the forest cockchafer, Melolontha hippoca- Yang, Z., Rannala, B., 1997. Bayesian phylogenetic inference using DNA stani, mediated by volatiles from plants and females. 
Physiol. Entomol. sequences: a Markov chain Monte carlo method. Mol. Biol. Evol. 14, 25, 172–179. 717–724.