<<

Molecular Phylogenetics and Evolution 68 (2013) 42–54

Contents lists available at SciVerse ScienceDirect

Molecul ar Phylo genetics and Evolution

journal homepage: www.elsevier.com/locate/ympev

Reconstructing the phylogeny of (: ) using DNA of the obligate symbiont Buchnera aphidicola ⇑ Eva Nováková a,b, , Václav Hypša a, Joanne Klein b, Robert G. Foottit c, Carol D. von Dohlen d, Nancy A. Moran b a Faculty of Science, University of South Bohemia, and Institute of Parasitology, Biology Centre, ASCR, v.v.i., Branisovka 31, 37005 Ceske Budejovice, Czech Republic b Department of Ecology and Evolutionary Biology, Yale University, 300 Heffernan Dr., West Haven, CT 06516-4150, USA c Agriculture & Agri-Food Canada, Canadian National Collection of , K.W. Neatby Bldg., 960 Carling Ave. Ottawa, Ontario, Canada K1A 0C6 d Department of Biology, Utah State University, UMC 5305, Logan, UT 84322-5305, USA article info abstract

Article history: Reliable phylogene tic reconstruction, as a framework for evolutionary inference, may be difficult to Received 21 August 2012 achieve in some groups of organisms. Particularly for lineages that experienced rapid diversification, lack Revised 7 March 2013 of sufficient information may lead to inconsistent and unstable results and a low degree of resolution. Accepted 13 March 2013 Coincident ally, such rapidly diversifying taxa are often among the biologically most interesting groups. Available online 29 March 2013 Aphids provide such an example. Due to rapid adaptive diversification, they feature variability in many interesting biological traits, but consequently they are also a challenging group in which to resolve phy- Keywords: logeny. Particularly within the family Aphididae, many interesting evolutionary questions remain unan- swered due to phylogene tic uncertainties.In this study, we show that molecular data derived from the Evolution Buchnera symbiotic of the Buchnera can provide a more powerful tool than the aphid-derived Phylogeny sequences. We analyze 255 Buchnera gene sequences from 70 host aphid species and compare the result- Informative markers ing trees to the phylogenies previously retrieved from aphid sequences, only. We find that the host and symbiont data do not conflict for any major phylogenetic conclusions. Also, we demonstrate that the symbiont-deriv ed phylogenies support some previously questionable relationships and provide new insights into aphid phylogeny and evolution. Ó 2013 Elsevier Inc. All rights reserved.

1. Introduction Earlier attempts to reconstruct aphid phylogeny using morphol- ogy resulted in conflicting evolutionary scenarios (Heie, 1987; Aphids form a distinctive clade that features considerable Wojciechowski, 1992). While an overall picture describing three variability in interesting biological traits, such as the presence of distinct groups with either viviparous (Aphididae) or oviparous many distinct, yet genetically identical, forms of females during (Adelgidae and Phylloxeridae) parthenogenetic females was gener- the life cycle (polyphenism), alternation of sexual and asexual ally accepted, the phylogeny of the most diverse group, the Aphid- reproduction, sterile soldier morphs, and seasonal alternation be- idae, remained unclear. More recent efforts have applied DNA tween unrelated groups of host plants. These traits vary among sequence data to the reconstruction of Aphididae phylogeny. An species, reflecting a long evolutionary history of biogeographical emergent problem with such data is a lack of sufficient phyloge- expansions and contractions and codiversification with plant netic signal, mostly ascribed to a rapid adaptive radiation within hosts. Understanding the evolution of these traits thus requires a Aphididae (von Dohlen and Moran, 2000; Martínez-Torres et al., reliable phylogeny as a framework for particular evolutionary 2001). Recently, this difficulty has been partly compensated by hypotheses. However, except for a few generally accepted aspects, combining data from nuclear and mitochondrial genes (Ortiz-Rivas studies on aphid phylogeny have not yet produced a clear picture and Martínez-Torres, 2010); however, even this extended source of of many relationships within this group. information leaves many relationships within Aphididae unre- solved, and several evolutionary questions unanswered. Some questions concern the number of origins of particular aspects of ⇑ Corresponding author at: Faculty of Science, University of South Bohemia, and biology. For example, three tribes with similar life cycles that in- Institute of Parasitology, Biology Centre, ASCR, v.v.i., Branisovka 31, 37005 Ceske clude dwarf sexual forms lacking mouthparts (Eriosomatini, Fordi- Budejovice, Czech Republic. Fax: +420 385 310 388. ni, Pemphigini) have been traditionally classified together in the E-mail addresses: [email protected] (E. Nováková), [email protected] (V. Hypša), [email protected] (R.G. Foottit), [email protected] (C.D. von (formerly Pemphiginae) (Heie, 1980; Remaudière Dohlen), [email protected] (N.A. Moran). and Remaudière, 1997). Furthermore, mainly on the basis of shared

1055-7903/$ - see front matter Ó 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ympev.2013.03.016 E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54 43 life cycle characteristics, Eriosomatinae, , and Ano- et al., 2001), mainly with the aim of testing phylogenetic corre- eciinae have been hypothesized to share a most recent common spondence of host and symbiont. Only a few studies were focused ancestor (Heie, 1987 ). Most molecular phylogenies fail to support explicitly on reconstructing aphid phylogeny using Buchnera mark- these ideas, however, because they recover Eriosomatinae as para- ers; these included analyses performed on Uroleucon, Brachycau- phyletic, and position Eriosomatinae, Hormaphidinae, and Anoecii- dus, and Mollitrichosiphum species (Clark et al., 2000; Jousselin nae as unrelated lineages. Other biological questions that could be et al., 2009; Liu et al., 2013). All three studies ruled out the possi- illuminated by improved phylogenetic reconstruction concern bility of occasional horizontal transfers, even between closely re- whether the common ancestor of extant Aphididae fed on gymno- lated aphids sharing the same host plant. Even within species, sperms or angiosperms, and whether transitions between these phylogenies based on Buchnera genes are congruent with those major plant groups were accompanied by changes in diversifica- based on mitochondrial sequences, confirming that this symbiont tion rates. is strictly maternally inherited (Peccoud et al., 2009; Funk et al., Inclusion of additional markers with appropriate information 2000). Within an individual aphid, Buchnera genes also have higher capacity, as well as additional taxa to bisect long terminal copy number than aphid nuclear genes, so that Buchnera genes are branches, are useful approaches to resolving ambiguous nodes of relatively easy to amplify. DNA of Buchnera symbionts thus serves a rapid radiation (Whitfield and Lockhart, 2007). Nuclear gene as a useful source of information for inferring aphid phylogeny. markers would be the obvious, potentially informative additions However, phylogenetic studies using Buchnera genes so far have for reconstructing the late Cretacous radiation of Aphididae. How- been limited by sparse sampling across aphid taxa, and especially ever, due to the possibility of paralogy, designation of orthologous by taxonomic bias towards the subfamily . nuclear sequences across a broad selection of species could require In this study, we use Buchnera-derived sequences of five genes extensive experimental work. This is particularly relevant for (groEL, trpB, dnaB, ilvD and 16S rDNA) from an extended taxonomic aphids, because a vast amount of gene duplication affecting more set to reconstruct phylogenetic relationships within the Aphididae. than 2000 gene families was confirmed within the Acyrthosiphon We then compare the resulting topologies to those reported from pisum genome (The International Aphid Genomics Consortium, recent multilocus analyses of aphid-derived genes. We focus 2010). This extensive duplication of nuclear genes has been ongo- mainly on the most problematic questions in aphid evolution, such ing during aphid evolution, resulting in complex gene trees in as monophyly of individual subfamilies/tribes and their relation- which orthologs and paralogs are difficult to distinguish and the ships, variation in DNA substitution rates, and rooting of the Aph- species phylogeny is obscured (e.g., Rispe et al., 2008; Nováková ididae tree. We show that symbiont genes yield informative and Moran, 2012). phylogenetic signal and have several methodological advantages. Aphids fortunately possess an additional source of inherited ge- netic material apart from their own genomes, in the form of obli- 2. Materials and methods gate, maternally transferred, and highly derived bacterial mutualists. The association between aphids and bacteria of the 2.1. Sample collection genus Buchnera is one of the most extensively studied symbiotic systems since Buchner (1953) suggested their mutualistic associa- Our study was designed to reconstruct evolutionary relation- tion. As in many other mutualistic relationships, the bacteria sup- ships among a wide diversity of aphids, represented by a broad ply essential nutrients to their hosts, which in turn provide a stable sample of most major Aphididae taxa. The collection includes 70 environment for their bacterial partners. Genomic studies have re- species from 15 of 25 subfamilies and 25 of 36 tribes recognized vealed that particular amino acids, vitamins and sterols are supple- in the most recent, comprehensive classifications of aphids (Rem- mented by Buchnera (Akman Gunduz and Douglas, 2009; Moran audière and Remaudière, 1997; Nieto Nafría et al., 1997)( Table S1). et al., 2005; Moran and Degnan, 2006; Moya et al., 2008; Hansen and Moran, 2011). Buchnera is one of numerous groups of insect- associated symbionts for which evidence supports a long-term, 2.2. DNA extraction, primer design and sequencing strict cospeciation with the host (Moran et al., 1993, 1995; Clark et al., 2000 ). The acquisition of Buchnera symbionts by aphids is For all species, several individuals were pooled and homoge- thought to have been a single event that took place by 150–200 nized, and genomic DNA was extracted using QIAamp DNA Micro MYA and was followed by cospeciation (Moran et al., 1993; Martí- Kit (Qiagen). In efforts to compile sufficient data for a reliable phy- nez-Torres et al., 2001). Due to this close evolutionary association, logenetic reconstruction, we aimed to amplify five Buchnera-de- the symbiont and host phylogenies mirror each other for deeper rived genes, namely 16S rDNA, groEL, trpB, dnaB, and ilvD.To evolutionary divergences. Thus, symbiont-derived data in principle confirm the aphid host species identification, the sequence of the can be used to reconstruct the evolutionary history of hosts. aphid mitochondrial gene COI was obtained for each sample. Such an approach is appealing, because the small and simple Three primer pairs allowing for nine combinations were designed genomes of symbionts may be an easier source of suitable se- for each of the targeted Buchnera genes, except for 16S rDNA. Pairs of quences for phylogenetic analysis. This idea has been considered highly degenerate primers, corresponding to the coding region of for some other insect-symbiont associations. For example, Kölsch groEL, trpB, and dnaB genes, were designed according to alignments and Pedersen (2009) suggested using for elucidat- of partial sequences available in GenBank using Primaclade software ing unresolved questions of reed beetle phylogeny. In addition to (Gadberry et al., 2005). Primers for ilvD were designed in CLC Geno- resolving host relationships, genes have been used mic Workbench (CLC bio A/S) based on an alignmentof Buchnera for calculating divergence times in their hosts (e.g., Cryptocercus genome sequences, with the reverse primers corresponding to a woodroaches( Maekawa et al., 2005 )). Buchnera genomes consist highly conserved region of the flanking tRNA. Primers for amplifying of single-copy genes, most of which are shared across different the 16S rDNA gene and aphid mitochondrial gene COI were from Buchnera genomes (Moran et al., 2009); thus, they lack the compli- previous studies (Folmer et al., 1994; Hajibabaei et al., 2006; Mun- cations presented by gene duplication and paralogy. Bacterial gen- son et al., 1991; Mateos et al., 2006). More detailed information on omes possess several additional advantages, such as haploidy and primer pairs, including the sequence and the length of amplified re- absence of introns. Attempts to infer aphid and Buchnera phyloge- gions, is summarized in Table S2. nies in a common framework have been undertaken several times To amplify PCR products from diverse aphid lineages, we used (e.g., Munson et al., 1991; Moran et al., 1993; Martínez-Torres the two best-performing primer pairs for each gene (Table S2). In 44 E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54 the case of the 16S rRNA gene, primers 10F and 35R were used to analyses were run for three to twelve million generations with tree detect the presence of secondary symbionts commonly associated sampling every 100 generations. Exploration of MCMC conver- with aphids (Fukatsu and Ishikawa, 1993; Sandström et al., 2001; gence and burn-in determination was based on values of the aver- Russell et al., 2003). This primer pair spans roughly a 1600 bp re- age standard deviation of split frequencies, potential scale gion of the 16S-23S rRNA operon in most bacterial genomes. How- reduction factor (PSRF), and the log likelihood of the cold chain. ever, in the reduced genomes of Buchnera, genes encoding 16S Burn-in corresponded to 20–25% of the sampled trees. All the pre- rRNA and 23S rRNA are widely separated on the sented topologies are majority-rule consensus trees calculated (Munson et al., 1993). Thus, in contrast to almost all other bacteria, from the post-burn-in tree samples. the 10F/35R primer pair does not produce an amplicon in Buchnera. The approximately unbiased (AU) test (Shimodaira and Hase- According to results of the first amplification, the product of the gawa, 2001) was used to compare topologies in two ways. First, reaction using the universal 16S rRNA primer pair (10F and we used the AU test to determine whether hypothetical alternative 1507R, amplifying most of the 16S rRNA gene) was directly se- topologies were significantly different from those retrieved by our quenced or was subcloned to enable separation of the Buchnera analyses based on individual genes. In particular, we tested the and secondary symbiont sequences. monophyly/paraphyly of the subfamily Eriosomatinae. Second, To recover Buchnera 16S rDNA sequences in samples with posi- we evaluated robustness of the phylogenetic signal by applying tive 10F/35R primed products indicating presence of other bacteria, the AU test to the best ML topology for each gene and a topology amplicons of successful 10F/1507R PCRs were subcloned into Pro- with an obvious introduced artifact, i.e., polyphyly of Aphidinae. mega pGEM-T Easy vectors. On average, three transformant colonies To obtain the alternative topologies, we constrained particular arti- from each species were picked and their inserts amplified by colony facts and constructed the tree under the ML criterion in PAUPÃ 4.0 PCR using T7 and SP6 primers. Products of colony PCR as well as PCR (Swofford, 2003 ). To optimize computational demands for these products for groEL, trpB, dnaB, ilvD, and the mitochondrial COI gene analyses, we compiled simplified matrices for each gene consisting were Sanger-sequenced in both directions on an ABI3700 sequencer of one representative for each cluster consistently found in all ML at the Yale University DNA sequencing service. Resulting reads were and BI derived topologies (not shown). Significant differences be- assembled into a single contig using CLC Genomic Workbench (CLC tween likelihood site patterns for each topology pair were tested bio A/S) and manually curated to remove base-calling errors. Except using CONSEL version 0.1i (Shimodaira and Hasegawa, 2001). The for two sequences of dnaB amplified from Greenideinae species, results of the AU tests were used to identify the set of genes with bidirectional sequencing did not show any variability. Ambigious sufficient information for phylogenetic inference (i.e., rejecting the positions in dnaB were replaced with an ‘‘N’’. artificial topology). Two concatenated matrices were constructed from the selected genes. The concatenate comprised amino 2.3. Alignments and phylogenetic analysis acid sequences of the coding regions for groEL, dnaB, trpB, and ilvD. The nucleotide concatenate also included 16S rRNA sequences. Accession numbers of the sequences obtained in this study and Both of the matrices were analyzed as described above. those retrieved from GenBank for the phylogenetic analyses are listed in Table S3. Matrices for individual genes were compiled 2.4. Tests of heterogeneity in evolutionary rates of symbiont genes from the data obtained in this study and from data retrieved from GenBank, including corresponding Buchnera sequences and se- In Buchnera genes, rates of evolution at sites under purifying quences for selected outgroup taxa. Outgroup sequences were cho- selection, such as nonsynonymous sites in protein-coding genes, sen from the Ishikawaella capsulata genome (Hosokawa et al., are affected by the host population size and by the number of cells 2006), which shows the lowest pairwise divergence from Buchnera, transferred from mother to offspring (Rispe and Moran, 2000 ). In as well as other members of the , including addition to providing new data useful for inferring aphid phylog- , Salmonella enterica , and Escherichia coli. Com- eny, our Buchnera dataset provides the opportunity to assess plete datasets for groEL, trpB, dnaB, and ilvD were aligned using whether symbiont population structure differs among aphid taxa. the ClustalW algorithm as implemented in SeaView software Differences among aphid groups in life cycles, ecology, and trans- (Gouy et al., 2010). 16S rDNA sequences were aligned in server- mission of symbionts are expected to produce different levels of based program MAFFT, http://mafft.cbrc.jp/alignm ent/server/in- purifying selection, leading to rate heterogeneity among aphid dex.html, using the Q-INS-i algorithm with default parameters, groups. Specifically, we tested for differences among groups in which allows for a more accurate alignment of variable loops of the rate of evolution at nonsynonymous sites (dN) relative to the the rRNA molecule. The raw alignments were manually corrected rate at synonymous sites (dS). in SeaView (Gouy et al., 2010 ) and further processed in GBlocks We computed dN/dS for pairs of sister taxa in each lineage. (Castresana, 2000) to remove unreliably aligned positions. Average dN/dS values were then compared among subfamilies, The resulting matrices were analyzed using maximum likeli- and the data were tested for the presence of statistically deviant hood (ML) and Bayesian inference (BI). To account for the possibil- values using Dixon’s Q test (Rorabacher, 1991). Because dN/dS ra- ity of unreconcilable substitutional saturation in complete tios are meaningless if changes at synonymous sites are saturated, datasests, all sequences were analyzed using only the first two co- the saturation index for synonymous positions was determined for don positions versus all three codon positions for DNA-based anal- each dataset, as calculated in DAMBE (Xia and Xie, 2001), with a yses, and were also analyzed using translations for proportion of invariable positions assessed in JModelTest (Posada, protein-coding genes. The evolutionary models best fitting the 2008). Both saturation pattern tests and dN/dS calculations were data, as well as parameters for ML analyses, were selected using performed only with the sites available for all analyzed taxa. These programs jModelTest (Posada, 2008) and ProtTest 3 (Darriba analyses were not performed for subfamilies represented by a sin- et al., 2011). ML analyses and 100 bootstrap replicates were per- gle species or by a single pair of species. formed for all nucleotide and protein matrices with PhyML 3.0 (Guindon and Gascuel, 2003) with substitution models and param- 2.5. Long branch attraction (LBA) eters estimated from the data. Bayesian analyses were performed with the closest approximation of best-fit evolutionary models (Ta- Preliminary results indicated that some relationships may have ble 1) as implemented in MrBayes 3.1.2 (Ronquist and Huelsen- been affected by LBA, i.e., artifactual clustering of rapidly evolving beck, 2003 ) in two independent runs with four chains. The taxa regardless of their true relationship (for review see Bergsten, E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54 45

Table 1 Basic characteri stics ofmatrices and specification of the evolutionary models used in ML and BI analyses.

Alignment designation Number of taxa Number of characters Variable characters Selected model AICc Mr. Bayes trpB 60 489 311 GTR +I + C GTR + I + C dnaB 52 1209 677 GTR +I + C GTR + I + C groEL 85 1135 666 GTR +I + C GTR + I + C ilvD 77 801 520 GTR + C GTR + C 16S rRNA 65 1494 1087 GTR + C GTR + C COI 71 690 283 TIM2 + C GTR + C trpB_dnaB_groEL_ilvD_16S 94 5079 2700 TIM3 + C GTR + C trpB_dnaB_groEL_ilvD_1st2nd 94 2390 1021 GTR +I + C GTR + I + C trpB_dnaB_groEL_ilvD_AA 94 1195 662 MtREV +I + C + F WAG + I + C

2005). We focused on Aphidinae, where clustering of Aspidophor- . On the contrary, most of the Buchnera sequences iso- odon longicaudus , Capitophorus hudsonicus and two Pterocomma lated from showed a higher GC content, averaging species would, if confirmed, contradict monophyly of the tribes 34.0% for coding genes. Macrosiphini and Pterocommatini. We first tested for effects of LBA on positions of A. longicaudus and C. hudsonicus in the context of the entire tree, particularly their possible affinity to long- 3.2. Phylogenetic analyses of single-gene matrices branched taxa of long-branched lineages from other subfamilies (e.g., Saltusaphidinae, ). Additional analyses were per- For all genes, analyses restricted to first and second codon posi- formed exclusively within Aphidinae between the four taxa men- tions yielded results similar to those obtained from full nucleotide tioned above. For each analysis, a single branch was removed sequences. However, measures of statistical support (i.e. bootstrap from the Aphidinae data sets. and posterior probabilities) as well as phylogenetic resolution be- came weaker with decreasing number of positions (i.e., complete DNA sequence, two positions only, and translation into amino 3. Results acids). This result, together with tests based on a few amino acid matrices (results not shown), indicates that amino acid sequences 3.1. Datasets would yield only poorly supported and largely unresolved topolo- gies. Consequently, amino acid sequences were not used in the sin- Altogether, this study generated 255 new sequences from 70 gle-gene matrices. Buchnera lineages. Characteristics of the resulting alignments are All topologies derived from the single-gene datasets placed detailed in Table 1. Because individual single-gene matrices differ most aphid species into clades corresponding to tribes of the cur- in taxon sampling, all of the concatenated matrices are partially rent classification. For higher taxonomic levels, results were less incomplete. Missing sequences were either not available in Gen- conclusive, and branch support values were weaker. However, Bank (for taxa from previous studies), or, for our samples, we were monophyly of Aphidinae, Lachninae, Greenideinae and Chaitophor- not able to amplify the genes due to PCR failures, most likely inae was recovered across different analyses and was well sup- caused by substitutions in the primer sites, loss of the gene or even ported. The topology inferred from the ilvD single-gene matrix possible absence of Buchnera in certain samples. For example, a (the dataset with the highest proportion of informative sites) is PCR failure apparently related to primer specificity was experi- shown in Fig. 1. Within Aphidinae, was monophyletic; enced for combinations of dnaB primers, which failed to amplify however, two species of Macrosiphini, Aspidophorodon longicaudus sequences for all samples from Calaphidinae and Saltusaphidinae and Capitophorus hudsonicus, grouped with Pterocomma in most of subfamilies. Also, only a few representatives of the subfamily Lach- the single-gene analyses. This grouping was recovered repeatedly ninae yielded PCR products for trpB. in all LBA-avoiding tests, performed on a simplified dataset with Individual matrices of the protein-coding genes contained com- one of the long branches removed in each analysis (an example parable proportions of variable characters, with the largest propor- topology shown in Fig. S1). tion found in trpB and ilvD (64% and 65%, respectively; Table 1). AU Monophyly of Hormaphidinae was recovered only in ML analy- tests with constrained topologies indicated different strengths of sis of the trpB dataset. Two tribes, Hormaphidini and Cerataphidini, phylogenetic signal for different matrices. Sequences for trpB, groEL, did not form a monophyletic group in any other single-gene anal- ilvD and the 16S rRNA gene contained sufficient information to reject ysis. Sequences derived from Pseudoregma koshuensis (Hormaphid- all of the tested artificial topologies (see Section 2). For the dnaB inae: Cerataphidini) had the lowest GC content and clustered dataset, the test was significant only when a major topological arti- artifactually with other sequences of markedly biased nucleotide fact was introduced (data not shown). In contrast, the mitochondrial content. For instance, in the dnaB based topology, Pseudoregma fell COI dataset contained few informative characters (Table 1), and the within the same cluster as Greenideinae and Lachninae, which dis- AU tests were repeatedly insignificant. This dataset was therefore played the lowest GC content values for dnaB (Figs. S2 and S3). The not used for phylogenetic inference; COI sequences were used only same pattern was found in the ilvD dataset, where Pseudoregma to verify species determination based on nucleotide similarity with clustered as a close relative to significantly AT-biased sequences sequences already available in several copies in GenBank. COI se- of the subfamily Drepanosiphinae (Fig. 1). Similar to results in Or- quences recovered in this study shared similarity above 99% with tiz-Rivas and Martínez-Torres (2010), Hormaphidini and Thelaxi- GenBank homologs from the same species, and thus as redundant nae were supported as sister taxa. This relationship was records were not deposited in the database. supported by all analyses based on single genes, except for ilvD. Nucleotide composition, measured as % GC content, varied by In most single-gene analyses, Calaphidinae was paraphyletic 10% across taxa. The lowest GC levels were found in Buchnera from with respect to Saltusaphidinae. The only exception was the topol- Pseudoregma koshuensis, which averaged 27.7% for protein-coding ogy derived from trpB (Fig. S4), in which Saltusaphidinae was not genes. A bias towards higher AT content was also observed in represented and Calaphidinae was monophyletic. In contrast to Buchnera sequences from Drepanosiphinae, Greenideinae and the results of Ortiz-Rivas and Martínez-Torres (2010), none of 46 E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54

Fig. 1. BI topology based on ilvD dataset. Solid vertical lines refer to monophyletic tribes and subfamilies. Dashed vertical lines designate paraphyletic taxa. The branch of Pseudoregma koshuensis is scaled to ½ of its actual length. The values at nodes show posterior probability support.

our single-gene derived topologies supported monophyly of Drep- placed at a distant position (Fig. 1). A similar result was previously anosiphidae sensu Heie (1980), i.e., including Calaphidinae, Chaito- reported for aphid nuclear gene-derived topologies by Ortiz-Rivas phorinae and Drepanosiphinae. and Martínez-Torres (2010). All other single-gene analyses recov- Eriosomatinae was polyphyletic or paraphyletic in all single- ered topologies in which the tribes were distant from each other gene analyses. For ilvD-derived topologies, Fordini and Pemphigini (e.g., Figs. S2 and S4). However, when a constraint of Eriosomatinae formed a monophyletic group while Eriosomatini species were monophyly was tested on individual gene sets, all AU test values E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54 47 were not significant; thus, the monophyly of Eriosomatinae could several deep nodes (Fig. 2). Particularly, the position of the subfam- not be confidently rejected (p = 0.08–0.47). ilies represented by a single specimen, namely coweni In most of the rooted single-gene topologies, Pemphigini (Tamalinae), Anoecia oenotherae (Anoecinae), and Neophyllaphis formed the sister group to all other taxa. However, branching of totarae (Neophyllaphidinae) appeared to be unstable under differ- the other deep nodes was inconsistent, varying with datasets and ent analysis settings. Bootstrap support for topologies calculated analytical methods, particularly for subfamilies represented by a under the ML criterion was low, possibly due to the incomplete- single specimen, namely Tamalia coweni (Tamalinae), Anoecia oeno- ness of concatenated matrices (Fig. 4). therae (Anoecinae), Neophyllaphis totarae (Neophyllaphidinae) and Because the outgroups selected for all analyses are relatively Mindarus obliquus (Mindarinae). The overall topological character- distant to Buchnera, unrooted analyses (i.e. analyses excluding out- istics sorted into three different arrangements. The first supported groups) were also performed with concatenated datasets to avoid three distinct lineages: Pemphigini, a monophyletic Aphidi- potentially misleading information introduced by very distant se- nae + Calaphidinae, and a cluster composed of all remaining sub- quences. An example of topologies consistently retrieved, regard- families and tribes. In the other two scenarios, aphid diversity less of method or gene composition of concatenates, is shown in was distributed into two major clusters, either Aphidinae versus Fig. 5. Taxa in this unrooted trichotomy fell into three main clus- a lineage of all other taxa, or (Aphidinae + Calaphidinae) versus a ters: (1) Lachninae, (2) Neophyllaphidinae, Chaitophorinae, Ano- lineage of all remaining taxa. eciinae, Cerataphidini, Drepanosiphinae, Thelaxinae, Greenideinae, and (3) Aphidinae, Calaphidinae, Saltusaphidinae, 3.3. Phylogenetic analyses of concatenated data sets Phyllaphidinae, Tamalinae, Hormaphidini, Eriosomatinae (para- phyletic). A close relationship between Aphidinae + Calaphidinae In order to test the potential impact of data incompletenessin is highly supported, as is paraphyly of Hormaphidinae (Hormaphi- concatenated matrices, an additional groEL + trpB dataset was con- dini and Cerataphidini positioned in two different clusters). structed from the taxa for which sequences of both genes were available. These genes were chosen because they were available for the largest number of taxa. The topology obtained from both 3.4. Tests of heterogeneity in evolutionary rate of symbiont genes BI and ML corresponded to the scenario described above with Aph- idinae versus a lineage of all other taxa, and did not differ substan- Comparisons of average dN/dS indicated heterogeneity in evolu- tially from the results obtained from analyses of the incomplete tionary rates across subfamilies and tribes. Elevated values of dN/ concatenated matrices. dS were found for Calaphidinae, Chaitophorinae, Greenideinae Similarly to the single-gene topologies, most results from con- and Saltusaphidinae (Table 2), all groups with simple life cycles catenated datasets placed Pemphigini as sister to the remaining containing few distinct female morphs and narrow host-plant taxa. Two main types of topology were repeatedly retrieved from ranges. Rates at nonsynonymous sites were not elevated in the the concatenated data. While most BI analyses tended to produce Lachninae, nor in Aphidini, Pemphigini, or Hormaphidinae, which topologies with three main clusters, e.g. Pemphigini, Aphidi- include species with complex, host-alternating life cycles. nae + Calaphidinae, and a clade comprising the rest of taxa (Figs. 2 and 3), a few ML analyses supported grouping into two major lin- eages with one consisting exclusively of the subfamily Aphidinae 4. Discussion (not shown). The topologies recovered in this study are partially congruent 4.1. Phylogenetic inference: comparison of Buchnera and aphid data with the recently published aphid phylogeny inferred from two nuclear and one mitochondrial genes (Ortiz-Rivas and Martínez- The results obtained here show that the Buchnera genes se- Torres, 2010). The overall similarity between the aphid-derived lected for this study are useful sources of data for corroboration and Buchnera-derived trees provides sufficient background for de- or rejection of particular phylogenetic hypotheses, although they tailed comparisions and evaluation of particular phylogenetic and cannot completely resolve aphid phylogeny. Here we summarize evolutionary patterns. Ortiz-Rivas and Martínez-Torres (2010) how our results compare to the most recent phylogeny for Aphid- recovered three major aphid clusters, namely Lachninae, a clade idae based on aphid genes (Ortiz-Rivas and Martínez-Torres, 2010). containing Eriosomatinae + Anoeciinae + Thelaxinae + Hormaphid- First, relationships strongly supported by both Buchnera and aphid inae + Mindarinae, and a clade containing Aphidinae + Calaphidi- sequences include monophyly of individual tribes and monophyly nae + Drepanosiphinae + Chaitophorinae. Lachninae formed the of Lachninae. Second, relationships congruent with those from sister group to a lineage comprising the latter two clades in their aphid genes but more strongly supported by Buchnera sequences results. In contrast, we consistently recovered Pemphigini as the include paraphyly of Macrosiphini with respect to Pterocommatini sister group to a lineage including all other taxa (Figs. 2–4). None and monophyly of Aphidinae (including Aphidini, Macrosiphini, of our analyses produced a topology in which Lachninae branched Pterocommatini). Third, relationships incongruent between analy- from the base of the tree, although this position was not rejected ses based on Buchnera versus aphid genes, with neither providing by the AU tests. Furthermore, only the ML analysis based on con- strong or stable resolution, were concentrated at the deeper nodes. catenation of nucleotides for 1st and 2nd positions produced a An example is the monophyly versus polyphyly of Eriosomatinae: cluster containing Aphidinae + Calaphidinae + Drepanosiphi- while Fordini was repeatedly recovered as a sister group to Erioso- nae + Chaitophorinae, as in Ortiz-Rivas and Martínez-Torres matini in Buchnera- based topologies, the aphid data supported a (2010). In this topology, the cluster also included Saltusaphidinae sister relationship of Fordini and Pemphigini. Other instances of and Phyllaphidinae, which were not sampled by Ortiz-Rivas and unstable resolution by either aphid or Buchnera genes involve rela- Martínez-Torres (2010). For other recovered topologies, only the tionships between Calaphidinae and Aphidinae, and among the close relationship between Aphidinae and Calaphidinae was highly Calaphidinae, Chaitophorinae, and Drepanosiphinae, and relation- supported. Thus, the third cluster from the Ortiz-Rivas and Martí- ships between Anoeciinae and Mindarinae. Most notably, we did nez-Torres study, Eriosomatinae +Anoeciinae +Thelaxinae +Hor- not find any clear example of strongly supported discrepancies be- maphidinae + Mindarinae, was never recovered (Figs. 2–4). tween analyses based on aphid genes versus Buchnera genes. We Similar to the topologies derived from aphid genes (Ortiz-Rivas note that differences or lack of differences between the studies also and Martínez-Torres, 2010 ), we obtained only low support for could be affected by the differences in taxon sampling. 48 E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54

Escherichia coli Salmonella enterica Serratia symbiotica Ishikawaella capsulata imbricator populiconduplifolius populicaulis caryae Tamalia coweni pistaciae Eriosomatinae Schlechtendalia chinensis americanum Eriosoma grossulariae tujafilina Cinara pergandei Cinara cedri Schizolachnus obscurus Cinara laricifex Eulachnus rileyi Lachninae Essigella pini Trama rara Tuberolachnus salignus Maculolachnus sijpkensis Maculolachnus submacula Anoecia oenotherae Pseudoregma koshuensis Drepanaphis sp. Drepanaphis utahensis Drepanosiphinae Mindarus obliquus Mindarus kinseyi Mindarinae Neophyllaphis totarae Sipha elegans Chaitophorus leucomelas Chaitophorinae Chaitophorus sp. Eutrichosiphum sinense Mollitrichosiphum nigrofasciatum Greenideinae sp. Greenideinae sp. Greenideinae sp. Greenideinae Greenideinae sp. Greenideinae sp. Greenidea okajimai Greenideinae sp. Hamamelistes spinosus Hormaphis hamamelidis Hormaphidinae californica Glyphina longisetosa Kurisakia querciphila Thelaxinae agrifolicola Myzocallis coryli Neosymydobius sp. trifolii arundicolens ulmifolii Calaphidinae kahawaluokalani Takecallis sp. Calaphis betulaecolens gillettei Phyllaphis grandifoliae Iziphya flabella Saltusaphis scirpi Saltusaphidinae Thripsaphis ballii Aspidophorodon longicaudus Capitophorus hudsonicus Pterocomma smithiae Pterocomma populeum Muscaphis stroyani Diuraphis noxia Uroleucon caligatum Uroleucon ambrosiae Artemisaphis artemisicola Hyperomyzus lactucae Acyrthosiphon kondoi 3 Acyrthosiphon pisum 4 Acyrthosiphon pisum 1 Acyrthosiphon pisum 2 Acyrthosiphon lactucae Gypsoaphis oestlundi Aphidinae Lipaphis pseudobrassicae Hyadaphis tataricae Brachycaudus cardui Brachycaudus tragopogonis Myzus persicae Macrosiphoniella sanborni Uroleucon jaceae Macrosiphoniella ludovicianae Aphis nerii Aphis nasturtii Aphis craccivora Rhopalosiphom maidis padi 0.1 graminum

Eriosomatinae:Tamaliinae PemphiginiEriosomatinae:Eriosomatinae: FordiniLachninae: EriosomatiniLachninae: EulachniniLachninae: LachniniAnoeciinae TraminiHormaphidinae:Drepanosiphinae CerataphidiniMindarinaeNeophyllaphidinaeChaitophorinae:Chaitophorinae: SiphiniGreenideinae ChaitophoriniHormaphidinae:Thelaxinae HomaphidiniCalaphidinae:Calaphidinae: PanaphidniPhyllaphidinae CalaphidiniSaltusaphidinae:Saltusaphidinae: SaltusaphidiniAphidinae: ThripsaphidiniAphidinae: PterocommatiniAphidinae: Macrosiphini Aphidini

Fig. 2. BI topology based on the concatenated matrix including 1st and 2nd codon positions. Thicker lines designate branches with a posterior probability above 95%. Solid vertical lines refer to monophyletic subfamilies. Dashed vertical line designates paraphyletic/polyphyletic taxa.

This overview demonstrates that, despite particular differences may differ for weakly supported nodes. This suggests that the among the many trees recovered in this study and in Ortiz-Rivas inconclusiveness of previous studies based on aphid DNA reflects and Martínez-Torres (2010), both data sources are highly real evolutionary processes (e.g., rapid diversification in some lin- congruent for strongly supported relationships, even though they eages) rather than methodological artifacts. E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54 49

Escherichia coli Salmonella enterica Serratia symbiotica Ishikawaella capsulata Grylloprociphilus imbricator Thecabius populiconduplifolius Pemphigus populicaulis Prociphilus caryae Tamalia coweni Baizongia pistaciae Eriosomatinae Schlechtendalia chinensis Melaphis rhois Eriosoma americanum Eriosoma grossulariae Cinara tujafilina Cinara pergandei Cinara cedri Schizolachnus obscurus Cinara laricifex Eulachnus rileyi Lachninae Essigella pini Trama rara Tuberolachnus salignus Maculolachnus sijpkensis Maculolachnus submacula Anoecia oenotherae Drepanaphis sp. Drepanaphis utahensis Drepanosiphinae Mindarus obliquus Mindarus kinseyi Mindarinae Neophyllaphis totarae Sipha elegans Chaitophorus leucomelas Chaitophorinae Chaitophorus sp. Eutrichosiphum sinense Mollitrichosiphum nigrofasciatum Greenideinae sp. Greenideinae sp. Greenideinae sp. Greenideinae Greenideinae sp. Greenideinae sp. Greenidea okajimai Greenideinae sp. Pseudoregma koshuensis Hamamelistes spinosus Hormaphidinae Hormaphis hamamelidis Thelaxes californica Glyphina longisetosa Thelaxinae Kurisakia querciphila Myzocallis agrifolicola Myzocallis coryli Neosymydobius sp. Takecallis arundicolens Tinocallis ulmifolii Calaphidinae Sarucallis kahawaluokalani Takecallis sp. Calaphis betulaecolens Euceraphis gillettei Phyllaphis grandifoliae Iziphya flabella Saltusaphis scirpi Saltusaphidinae Thripsaphis ballii Aspidophorodon longicaudus Capitophorus hudsonicus Pterocomma populeum Pterocomma smithiae Rhopalosiphom maidis Rhopalosiphum padi Aphis nerii Aphis nasturtii Aphis craccivora Muscaphis stroyani Diuraphis noxia Lipaphis pseudobrassicae Hyadaphis tataricae Myzus persicae Brachycaudus cardui Aphidinae Brachycaudus tragopogonis Gypsoaphis oestlundi Artemisaphis artemisicola Hyperomyzus lactucae Macrosiphoniella sanborni Uroleucon jaceae Uroleucon caligatum Uroleucon ambrosiae Acyrthosiphon kondoi Acyrthosiphon lactucae Acyrthosiphon pisum 3 Acyrthosiphon pisum 4 Acyrthosiphon pisum 1 0.1 Acyrthosiphon pisum 2

Eriosomatinae:Tamaliinae PemphiginiEriosomatinae:Eriosomatinae: FordiniLachninae: EriosomatiniLachninae: EulachniniLachninae: LachniniAnoeciinae TraminiHormaphidinae:Drepanosiphinae CerataphidiniMindarinaeNeophyllaphidinaeChaitophorinae:Chaitophorinae: SiphiniGreenideinae ChaitophoriniHormaphidinae:Thelaxinae HomaphidiniCalaphidinae:Calaphidinae: PanaphidniPhyllaphidinae CalaphidiniSaltusaphidinae:Saltusaphidinae: SaltusaphidiniAphidinae: ThripsaphidiniAphidinae: PterocommatiniAphidinae: Macrosiphini Aphidini

Fig. 3. A combined BI topology constructed to maximize the information content of amino acid and nucleotide data. The overall tree is based on concatenated amino acids, and the relationships within Aphidini, Macrosiphini and Pterocomatini are based on concatenated nucleotide sequences corresponding to the same matrix (including all three codon positions). Solid vertical lines refer to monophyletic subfamilies.

Apart from a general assessment of the suitability of Buchnera ment in some groups. For example, two genera classified as data for resolving aphid phylogeny, our study also found unex- Macrosiphini, Aspidophorodon longicaudus and Capitophorus hudso- pected relationships, suggesting the need for taxonomic reassess- nicus, grouped significantly with Pterocomma (Pterocommatini) 50 E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54

99 Escherichia coli 100 Salmonella enterica Serratia symbiotica Ishikawaella capsulata 90 Thecabius populiconduplifolius 99 Prociphilus caryae Eriosomatinae Pemphigus populicaulis Thelaxes californica 100 Kurisakia querciphila Thelaxinae Glyphina longisetosa Anoecia oenotherae Pseudoregma koshuensis 100 100 Mindarus kinseyi Mindarus obliquus Mindarinae 100 Drepanaphis utahensis Drepanosiphinae Drepanaphis sp. Sipha elegans 100 Chaitophorus leucomelas Chaitophorinae Chaitophorus sp. 98 Eriosoma grossulariae 99 Eriosoma americanum Eriosomatinae Phyllaphis grandifoliae 77 Euceraphis gillettei Calaphis betulaecolens 87 Thripsaphis ballii Iziphya flabella Saltusaphis scirpi 81 Neosymydobius sp. Myzocallis coryli Calaphidinae Myzocallis agrifolicola Sarucallis kahawaluokalani Takecallis arundicolens Therioaphis trifolii Takecallis sp. Tinocallis ulmifolii 100 Hamamelistes spinosus Hormaphis hamamelidis Hormaphidinae Grylloprociphilus imbricator 75 Baizongia pistaciae 100 Melaphis rhois Eriosomatinae Schlechtendalia chinensis 100 Greenideinae sp. 100 Greenidea okajimai 91 Greenideinae sp. 98 Greenideinae sp. 80 Mollitrichosiphum nigrofasciatum Greenideinae Eutrichosiphum sinense 100 Greenideinae sp. Greenideinae sp. Greenideinae sp. Tamalia coweni 100 Maculolachnus sijpkensis Maculolachnus submacula 97 Tuberolachnus salignus 94 Trama rara Cinara tujafilina 93 Essigella pini Lachninae Eulachnus rileyi Cinara cedri Cinara laricifex Cinara pergandei Schizolachnus obscurus Neophyllaphis totarae Diuraphis noxia Macrosiphoniella sanborni Muscaphis stroyani 87 Capitophorus hudsonicus 75 Aspidophorodon longicaudus 98 Pterocomma populeum Pterocomma smithiae Uroleucon caligatum Uroleucon ambrosiae 94 Aphis craccivora Aphis nasturtii Aphis nerii 77 Rhopalosiphom maidis 87 Schizaphis graminum Rhopalosiphum padi Aphidinae Hyadaphis tataricae Lipaphis pseudobrassicae Myzus persicae Brachycaudus tragopogonis Brachycaudus cardui Hyperomyzus lactucae Artemisaphis artemisicola Gypsoaphis oestlundi Uroleucon jaceae Acyrthosiphon lactucae Acyrthosiphon kondoi Acyrthosiphon pisum 2 Acyrthosiphon pisum 3 0.1 Acyrthosiphon pisum 4 Acyrthosiphon pisum 1

Eriosomatinae:Eriosomatinae: PemphiginiEriosomatinae: FordiniTamaliinae EriosomatiniLachninae:Lachninae: EulachniniLachninae: LachniniAnoeciinae TraminiHormaphidinae:Drepanosiphinae CerataphidiniMindarinaeNeophyllaphidinaeChaitophorinae:Chaitophorinae: SiphiniGreenideinae ChaitophoriniHormaphidinae:Thelaxinae HomaphidiniCalaphidinae:Calaphidinae: PanaphidniPhyllaphidinae CalaphidiniSaltusaphidinae:Saltusaphidinae: SaltusaphidiniAphidinae: ThripsaphidiniAphidinae: PterocommatiniAphidinae: Macrosiphini Aphidini

Fig. 4. ML topology based on the concatenated amino acid matrix. Bootstrap support values above 70% are shown. Solid vertical lines refer to monophyletic subfamilies. Dashed vertical lines designate paraphyletic/polyphyletic taxa. rather than with other Macrosiphini, thus rendering Macrosiphini Furthermore, a possible effect of LBA on the position of these taxa paraphyletic. Clustering of these three species was supported by was tested and discounted, suggesting that their relationship analyses of single-gene topologies and by concatenated matrices. reflects evolutionary history rather than phylogenetic artifact. A E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54 51

Chaitophorinae Lachninae Drepanosiphinae

Eulachnini 1.00

Lachnini Hormaphidinae:Cerataphidini

1.00 Anoeciinae Tramini Mindarinae 1.00

1.00

Neophyllaphidinae

0.96 Fordini 1.00

1.00 Thelaxinae

1.00 0.91 1.00 0.99 Greenideinae 1.00

Eriosomatini 1.00

1.00 1.00 Hormaphidinae: Hormaphidini 0.84 1.00 1.00 0.99 1.00 1.00 0.79 1.00 0.97 1.00 Pemphigini 1.00 Eriosomatinae 1.00 1.00 1.00 0.84 1.00

Tamaliinae 0.75

0.97

1.00 0.89 0.99 0.98 0.99 1.00 0.81 0.82 0.79 1.00 0.83 1.00 1.00 1.00 1.00 0.85 1.00 1.00 1.00 0.95 1.00 Aphidini 1.00 1.00 Panaphidni

Pterocommatini

Saltusaphidinae Calaphidini Macrosiphini

Phyllaphidinae

Aphidinae

Calaphidinae

0.01

Fig. 5. Unrooted BI topology based on the concatenated matrix including 1st and 2nd codon positions. Three main clusters are defined (shaded areas) to facilitate comparison with the Ortiz-Rivas and Martínez-Torres (2010) scheme, and are resolved as a trichotomy in this unrooted analysis. Solid lines refer to monophyletic taxa. Dashed lines designate paraphyletic/polyphyletic taxa. Numbers are posterior probabilities. previous study (von Dohlen et al., 2006 ) also recovered a paraphy- mark (2000) found previously that Tramini was nested within letic Macrosiphini, due to the sister relationshipof Pterocomma Lachnini, rendering the latter tribe paraphyletic. Our analysis with Cavariella (a genus not included in the present study); a later recovers the same pattern using Buchnera genes, and with high study (Kim et al., 2011) duplicated this finding. Together, these support. Not only do these results indicate that the tribal classifica- studies indicate the need for a re-examination and probable taxo- tion does not reflect phylogeny, they have implications for the nomic revision of Macrosiphini and Pterocommatini. interpretation of feeding strategies within Lachninae. Ortiz-Rivas Some of the relationships found in this study have implications and Martínez-Torres, 2010 proposed that a single transition from for the evolution of host-plant relationships and other interesting gymnosperm hosts to angiosperms may have occured in an ances- features of aphid biology. One example concerns Lachninae. Nor- tor of Lachnini + Tramini, with Eulachnini retaining the ancestral 52 E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54

Table 2 Saturation pattern and evolutionary rates found for major aphid lineages.

Lineage Iss a all sites Iss.cb all sites Issa resolved sites Iss.c b resolved sites Iss.c c dN/dS average Aphidinae 0.6124 0.8006 0.1488 0.7577 0.4817 0.0646 Calaphidinae 0.7304 0.8112 0.2120 0.7282 0.5740 0.1263d Fordini 1.0924 0.8426 0.1680 0.8368 0.8310 0.0691 Greenideinae 0.7071 0.8305 0.1358 0.8081 0.7079 0.1191d Hormaphidinae 0.5282 0.8426 0.1825 0.8402 0.8391 0.0864 Chaitophorinae 1.1997 0.8426 0.1447 0.8229 0.8133 0.1053 d Lachninae 0.6455 0.8267 0.1816 0.7626 0.6145 0.0609 Pemphigini 0.7388 0.8426 0.1902 0.8252 0.8152 0.0558 Saltusaphidinae 1.0317 0.8405 0.1530 0.8067 0.7985 0.1362d Thelaxinae 0.5449 0.8426 0.2386 0.8418 0.8445 0.0886

a Index of substitution saturation (Iss) calculated using DAMBE software. b Critical Iss values for calculations including all sites and resolved sites only. c Iss.cc stands for critical Iss values calculated for an asymmetrical tree. d Lineages demonstrated to have elevated dN/dS. state of gymnosperm feeding. However, they acknowledged that four proposed clusters (including between Lachninae and other this scenario was not fully supported by their data, because a line- taxa). This incongruence and uncertainty may reflect the different age comprising Lachnini + Tramini was only rarely present in their taxon sampling and outgroups used in the two studies. While for trees. The nesting of Tramini within Lachnini, supported strongly the Buchnera-based analyses the best available outgroup was the by our data, is compatible with the hypothesis of a single transition phylogenetically distant Ishikawaella capsulata symbiont, the between gymnosperms and angiosperms within the Lachninae. aphid-based reconstructions could be rooted with more closely re- Another example of a phylogenetic result with implications for lated taxa, e.g., Daktulosphaira in Phylloxeridae. However, it is clear the evolutionary history of host plant associations concerns the po- that neither study reliably identifies the root. sition of Saltusaphidinae. This subfamily was recovered previously as nested within Calaphidinae (von Dohlen and Moran, 2000 ); with 4.3. Tests of heterogeneity in evolutionary rate of symbiont genes greater taxon sampling in our study, this nested position was even more strongly supported. This relationship suggests that ancestors The dN/dS calculations reveal higher values for Buchnera of of sedge- and rush-feeding Saltusaphidinae shifted directly from aphid groups with simple life cycles and restricted to living on feeding on trees (as in extant Calaphidinae) to feeding on herba- trees, except for Saltusaphidinae, which lives on sedges but is de- ceous monocots. In contrast, most other aphid species living on rived from within tree-feeding Calaphidinae. These aphid groups herbaceous plants seem to have acquired these hosts through a are often considered to represent the ancestral aphid life cycle, host-alternating intermediate life-cycle stage including both trees involving fewer distinct female morphs and only a single host- and herbs (Moran, 1992). plant group. Higher dN/dS due to reduced efficacy of purifying One final example of a phylogenetic result with implications for selection is expected when the aphid population size is smaller the evolution of an unusual phenotype relates to the monophyly or when the number of symbiont cells inoculating each progeny versus paraphyly of Eriosomatinae, whose members all produce is smaller (Rispe and Moran, 2000 ). A previous phylogenetic study dwarf sexual forms. Two of the three tribes (Eriosomatini and For- based on aphid genes (Ortiz-Rivas and Martínez-Torres, 2010) pre- dini) are supported as sister groups, with weak support based on sented analyses of rate differences using relative-rate tests as the aphid data (Ortiz-Rivas and Martínez-Torres, 2010) and consid- implemented in RRTree (Robinson-Rechavi and Huchon, 2000 ). erably stronger support in our analyses. The third tribe (Pemphi- Those results cannot be meaningfully compared to the dN/dS re- gini) does not cluster with Eriosomatini + Fordini. Thus, both sults for Buchnera, as relative rate tests are designed to detect Buchnera and aphid data suggest that dwarf non-feeding sexuals changes in rates between a test lineage and related lineages, may have originated twice, through convergent adaptation to the whereas dN/dS comparisons reveal changes in efficacy of purifying host plant-alternating life style. A more thorough understanding selection on different branches. We do not report RRTree analyses of the evolution of this puzzling morph (i.e. multiple origins versus because the method is prone to several errors, particularly when repeated losses) could be investigated further with more phyloge- the closest known outgroup is distant (Bromham et al., 2000; Rob- netic studies and/or developmental data. While several other topo- inson et al., 1998), as is the case for I. capsulat a(Hosokawa et al., logical aspects presented here may influence interpretation of the 2006). evolution of host-plant associations or life cycles if proven correct, the resolution and support of the deeper nodes are too weak for 5. Conclusion definite conclusions.

The results obtained from Buchnera-derived sequences confirm 4.2. Rooting the Buchnera tree the overall suitability of this source of phylogenetic information, and further suggest that Buchnera genes may outperform host The phylogenies derived from Buchnera sequences and aphid genes for solving particular phylogenetic problems. A potential sequences appear to contradict one another regarding placement drawback that must be considered when using these markers is of the root node. In analyses based on aphid genes, the root was the well-known tendency of symbiont sequences to suffer from placed between Lachninae and all other taxa in all presented trees various phylogenetic artifacts, mostly due to LBA and composi- (Ortiz-Rivas and Martínez-Torres, 2010). In contrast, in the Buch- tional bias. However, in many groups, these issues also af- nera-derived topologies, the root is positioned between Pemphigini fect mitochondrial and nuclear sequences; in aphids in particular, and all other taxa in most cases. However, the AU tests indicate most nuclear genes are affected by extensive paralogy. Fortunately, that this root position cannot be considered as the only possibility. new computational techniques are in development that can avoid AU tests were not able to reject a root located between any of the artifacts due to LBA and compositional bias, provided that suffi- E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54 53 cient amounts of data are available (Lartillot et al., 2009 ; Husník Hajibabaei, M., Janzen, D.H., Burns, J.M., Hallwachs, W., Hebert, P.D.N., 2006. DNA et al., 2011). barcodes distinguish species of tropical Lepidoptera. Proc. Natl. Acad. Sci. USA 103, 968–971. Codiversification with obligate bacterial symbionts has now Hansen, A.K., Moran, N.A., 2011. Aphid genome expression reveals host symbiont been documented for many invertebrate taxa, including a wide cooperation in the production of amino acids. Proc. Natl. Acad. Sci. USA 108, range of insects such as cockroaches, ants, tsetse flies, weevils, lice, 2849–2854. Heie, O.E., 1980. The Aphidoidea (Hemiptera) of Fennoscandia and Denmark. I. and almost all insect groups feeding on plant sap (Allen et al., General Part. The families Mindaridae, Hormaphididae, Thelaxidae, Anoeciidae, 2007; Conord et al., 2008; Moran et al., 2008 ), as well as a diversity and Pemphigidae. Scandinavian Science Press Ltd., Klampenborg, Denmark. of other animal clades such as marine flatworms and some marine Heie, O.E., 1987. Palaeontology and phylogeny. In: Minks, A.K., Harrewijn, P. (Eds.), Aphids: Their biology, Natural Enemies, and Control, vol. 2A. Elsevier, clams (Gruber-Vodicka et al., 2011; Peek et al., 1998). Considering Amsterdam. the rapidly expanding bacterial genome databases, symbiont-de- Hosokawa, T., Kikuchi, Y., Nikoh, N., Shimada, M., Fukatsu, T., 2006. Strict host- rived data are likely to become a powerful tool for reconstructing symbiont cospeciation and reductive genome evolution in insect gut bacteria’’. PLoS Biol. 4, e337. host phylogenies in these strictly cospeciating associations. Husník, F., Chrudimsky´ , T., Hypša, V., 2011. Multiple origin of endosymbiosis within the Enterobacteriaceae (gamma-): convergency of complex phylogenetic approaches. BMC Biol. 9, 87. Acknowledgments Jousselin, E., Desdevises, Y., D’Acier, A.C., 2009. Fine-scale cospeciation between Brachycaudus and Buchnera aphidicola: bacterial genome helps define species We thank S. Aoki, T. Fukatsu, U. Kurosu, J. Ortego, D. A. J. Teulon and evolutionary relationships in aphids. Proc. R. Soc. Biol. Sci. 276, 187–196. Kim, H., Lee, S., Jang, Y., 2011. Macroevolutionary patterns in the aphidini aphids and E. Maw for donating samples. This work was supported by the (Hemiptera: Aphididae): diversification, host association, and biogeographic United States National Science Foundation (DEB-0723472, DEB- origins. PLoS ONE 6, e24749. 1106195 to N.A.M.). E.N. was supported on a Fulbright fellowship Kölsch, G., Pedersen, B.V., 2009. Can the tight co-speciation between reed beetles (Col., Crysomelidae, Donaciinae) and their bacterial endosymbionts providing and by the Grant Agency of the University of South Bohemia cocoon material clarify the deeper phylogeny of the hosts? Mol. Phylogen. Evol. (135/2010/P) and the Grant Agency of the Czech Republic (206/ 54, 810–821. 09/H026). C.D.v.D. was supported by the Utah Agricultural Experi- Lartillot, N., Lepage, T., Blanquart, S., 2009. PhyloBayes 3: a Bayesian software ment Station; approved as journal paper #8462. package for phylogenetic reconstruction and molecular dating. Bioinformatics 25, 2286–2288. Liu, L., Huang, X.L., Zhang, R.L., Jiang, L.Y., Qiao, G.X., 2013. Phylogenetic congruence between Mollitrichosiphum (Aphididae: Greenideinae) and Buchnera indicates Appendix A. Supplementary material insect-bacteria parallel evolution. Syst. Entomol. 38, 81–92. Maekawa, K., Park, Y.C., Lo, N., 2005. Phylogeny of endosymbiont bacteria harbored Supplementary data associated with this article can be found, in by the woodroach Cryptocercus spp. (Cryptocercidae: Blattaria): molecular clock evidence for a late cretaceous-early tertiary split of Asian and American the online version, at http://dx.doi.org/10.1016/j.ymp ev.2013.03. lineages. Mol. Phylogenet. Evol. 36, 728–733. 016. Martínez-Torres, D., Buades, C., Latorre, A., Moya, A., 2001. Molecular systematics of aphids and their primary endosymbionts. Mol. Phylogenet. Evol. 20, 437–449. Mateos, M., Catrezana, S.J., Nankivell, B.J., Estes, A., Markow, T.A., Moran, N.A., 2006. References Heritable endosymbionts of Drosophila. Genetics 174, 363–376. Moran, N.A., 1992. The evolution of life cycles in aphids. Ann. Rev. Entomol. 37, 321–348. Akman Gunduz, E., Douglas, A.E., 2009. Symbiotic bacteria enable insect to use a Moran, N.A., Degnan, P.H., 2006. Functional genomics of Buchnera and the ecology of nutritionally inadequate diet. Proc. Biol. Sci. 276, 987–991. aphid hosts. Mol. Ecol. 15, 1251–1261. Allen, J.M., Reed, D.L., Perotti, M.A., Braig, H.R., 2007. Evolutionary relationships of Moran, N.A., Munson, M.A., Baumann, P., Ishikawa, H., 1993. A molecular clock in ‘‘Candidatus Riesia spp.’’, endosymbiotic enterobacteriaceae living within endosymbiotic bacteria is calibrated using the insect hosts. Proc. R. Soc. Lond. B hematophagous primate lice. Appl. Environ. Microbiol. 73, 1659–1664. 253, 167–171. Bergsten, J., 2005. A review of long-branch attraction. Cladistics 21, 163–193. Moran, N.A., von Dohlen, C.D., Baumann, P., 1995. Faster evolutionary rates in Bromham, L., Penny, D., Rambaut, A., Hendy, M.D., 2000. The power of relative rates endosymbiotic bacteria than in cospeciating insect hosts. J. Mol. Evol. 41, 727– tests depends on the data. J. Mol. Evol. 50, 296–301. 731. Buchner, P., 1953. Endosymbiose der Tiere mit pflanzlichen Mikroorganismen. Moran, N.A., Degnan, P.H., Santos, S.R., Dunbar, H.E., Ochman, H., 2005. The players Birkhäuser, Basel, Stuttgart, 771pp. in a mutualistic : Insects, bacteria, viruses, and virulence genes. Proc. Castresana, J., 2000. Selection of conserved blocks from multiple alignments for Natl. Acad. Sci. USA 102, 16919–16926. their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552. Moran, N.A., McCutcheon, J.P., Nakabachi, A., 2008. Evolution and genomics of Clark, M.A., Moran, N.A., Baumann, P., Wernegreen, J.J., 2000. Cospeciation between heritable bacterial symbionts. Ann. Rev. Genet. 42, 165–190. bacterial endosymbionts (Buchnera) and a recent radiation of aphids Moran, N.A., McLaughlin, H.J., Sorek, R., 2009. The dynamics and time scale of (Uroleucon) and pitfalls of testing for phylogenetic congruence. Evolution 54, ongoing genomic erosion in symbiotic bacteria. Science 323, 379–382. 517–525. Moya, A., Pereto, J., Gil, R., Latorre, A., 2008. Learning how to live together: genomic Conord, C., Despres, L., Vallier, A., Balmand, S., Miquel, C., Zundel, S., Lemperiere, G., insights into prokaryote-animal symbioses. Nat. Rev. Gen. 9, 218–229. Heddi, A., 2008. Long-term evolutionary stability of bacterial endosymbiosis in Munson, M.A., Baumann, P., Clark, M.A., Baumann, L., Moran, N.A., Voegtlin, D.J., Curculionoidea: additional evidence of symbiont replacement in the Campbell, B.C., 1991. Evidence for the establishment of aphid eubacterium Dryophthoridae family. Mol. Biol. Evol. 25, 859–868. endosymbiosis in an ancestor of four aphid families. J. Bacteriol. 173, 6321– Darriba, D., Taboada, G.L., Doallo, R., Posada, D., 2011. ProtTest 3: fast selection of 6324. best-fit models of protein evolution. Bioinformatics 27, 1164–1165. Munson, M.A., Baumann, L., Baumann, P., 1993. Buchnera aphidicola (a prokaryotic Folmer, O., Black, M., Hoeh, W., Lutz, R., Vrijenhoek, R., 1994. DNA primers for endosymbiont of aphids) contains a putative 16S rRNA operon unlinked to the amplification of mitochondrial cytochrome c oxidase subunit I from diverse 23S rRNA-encoding gene: sequence determination and promoter and metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 3, 294–297. terminator analysis. Gene 137, 171–178. Fukatsu, T., Ishikawa, H., 1993. Occurrence of Chaperonin 60 and Chaperonin 10 in Nieto Nafría, J.M., Mier Durante, M.P., Remaudière, G., 1997. Les noms des taxa du primary and secondary bacterial symbionts of aphids: Implications for the groupe-famille chez les Aphididae (Hemiptera). Revue Fr. Ent. (NS) 19, 77–92. evolution of an endosymbiotic system in aphids. J. Mol. Evol. 36, 568–577. Normark, B.B., 2000. Molecular systematics and evolution of the aphid family Funk, D.J., Helbling, L., Wernegreen, J.J., Moran, N.A., 2000. Intraspecific Lachnidae. Mol. Phylogenet. Evol. 14, 131–140. phylogenetic congruence among multiple symbiont genomes. Proc. Biol. Sci. Nováková, E., Moran, N.A., 2012. Diversification of genes for carotenoid biosynthesis 267, 2517–2521. in aphids following an ancient transfer from a . Mol. Biol. Evol. 29, 313– Gadberry, M.D., Malcomber, S.T., Doust, A.N., Kellogg, E.A., 2005. Primaclade - a 323. flexible tool to find conserved PCR primers across multiple species. Ortiz-Rivas, B., Martínez-Torres, D., 2010. Combination of molecular data support Bioinformatics 21, 1263–1264. the existence of three main lineages in the phylogeny of aphids (Hemiptera: Gouy, M., Guindon, S., Gascuel, O., 2010. SeaView version 4: a multiplatform Aphididae) and the basal position of the subfamily Lachninae. Mol. Phylogenet. graphical user interface for sequence alignment and phylogenetic tree building. Evol. 55, 305–317. Mol. Biol. Evol. 27, 221–224. Peccoud, J., Simon, J.C., McLaughlin, H.J., Moran, N.A., 2009. Post-Pleistocene Gruber-Vodicka, H.R., Dirks, U., Leisch, N., Baranyi, C., Stoecker, K., Bulgheresi, S., radiation of the pea aphid complex revealed by rapidly evolving Heindl, N.R., Horn, M., Lott, C., Loy, A., Wagner, M., Ott, J., 2011. Paracatenula,an endosymbionts. Proc. Natl. Acad. Sci. USA 106, 16315–16320. ancient symbiosis between thiotrophic Alphaproteobacteria and catenulid Peek, A.S., Feldman, R.A., Lutz, R.A., Vrijenhoek, R.C., 1998. Cospeciation of flatworms. Proc. Natl. Acad. Sci. USA 108, 12078–12083. chemoautotrophic bacteria and deep sea clams. Proc. Natl. Acad. Sci. USA 95, Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate 9962–9966. large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704. 54 E. Nováková et al. / Molecular Phylogenetics and Evolution 68 (2013) 42–54

Posada, D., 2008. JModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25, Sandström, J.P., Russell, J.A., White, J.P., Moran, N.A., 2001. Independent origins 1253–1256. and horizontal transfer of bacterial symbionts of aphids. Mol. Ecol. 10, Remaudière, G., Remaudière, M., 1997. Catalogue des Aphididae du Monde. INRA, 217–228. Homoptera Aphidoidea. Paris, 473pp. Shimodaira, H., Hasegawa, M., 2001. CONSEL: for assessing the confidence of Rispe, C., Moran, N.A., 2000. Accumulation of deleterious mutations in phylogenetic tree selection. Bioinformatics 17, 1246–1247. endosymbionts: Muller’s ratchet with two levels of selection. Amer. Nat. 156, Swofford, D.L., 2003. PAUP Ã. Phylogenetic Analysis Using Parsimony (Ãand Other 425–441. Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts. Rispe, C., Kutsukake, M., Doublet, V., Hudaverdian, S., Legeai, F., Simon, J.C., Tagu, D., The International Aphid Genomics Consortium, 2010. Genome Sequence of the Pea Fukatsu, T., 2008. Large gene family expansion and variable selective pressures Aphid Acyrthosiphon pisum. PLoS Biol. 8, e1000313. for cathepsin B in aphids. Mol. Biol. Evol 25, 5–17. von Dohlen, C.D., Moran, N.A., 2000. Molecular data support a rapid radiation of Robinson, M., Gouy, M., Gautier, C., Mouchiroud, D., 1998. Sensitivity of the relative- aphids in the Cretaceous and multiple origins of host alternation. Biol. J. Linn. rate test to taxonomic sampling. Mol. Biol. Evol. 15, 1091–1098. Soc. 71, 689–717. Robinson-Rechavi, M., Huchon, D., 2000. RRTree: relative-rate tests between groups von Dohlen, C.D., Rowe, C.A., Heie, O.E., 2006. A test of morphological hypotheses for of sequences on a phylogenetic tree. Bioinformatics 16, 296–297. tribal and subtribal relationships of Aphidinae (Insecta: Hemiptera: Aphididae) Ronquist, F., Huelsenbeck, J.P., 2003. MRBAYES 3: Bayesian phylogenetic inference using DNA sequences. Mol. Phylogenet. Evol. 38, 316–329. under mixed models. Bioinformatics 19, 1572–1574. Whitfield, J.B., Lockhart, P.J., 2007. Deciphering ancient rapid radiations. Trends Rorabacher, D.B., 1991. Statistical treatment for rejection of deviant values: critical Ecol. Evol. 22, 258–265. values of Dixon Q parameter and related subrange ratios at the 95 percent Wojciechowski, W., 1992. Studies on the Systematic System of Aphids (Homoptera, confidence level. Anal. Chem. 63, 139–146. Aphidinea). Uniwersytet Slaski, Katowice. Russell, J.A., Latorre, A., Sabater-Muñoz, B., Moya, A., Moran, N.A., 2003. Side- Xia, X., Xie, Z., 2001. DAMBE: data analysis in molecular biology and evolution. J. stepping secondary symbionts: widespread horizontal transfer across and Heredity 92, 371–373. beyond the Aphidoidea. Mol. Ecol. 12, 1061–1075.