ABSTRACT

TRAUTWEIN, MICHELLE DENEE. Multi-gene Phylogenetics to Resolve Key Areas of the Tree of Life. (Under the direction of Brian M. Wiegmann.)

FLYTREE (an NSF Assembling the Tree of Life project) is a large collaborative project aimed at reconstructing relationships among major lineages of Diptera. Previous morphological and molecular work, along with preliminary analyses of phylogenomic and total evidence data sets from FLYTREE, has provided evidence that while much of the fly tree of life can be confidently resolved, some regions remain challenging to decipher. Flies are a species-rich lineage of insects that originated more than 240 MYA in the Mesozoic. Ancient radiations, particularly if they occurred rapidly, can be difficult to resolve, even with large amounts of data. Phylogenetic inference can be misled by both systematic and stochastic error. Rigorous data exploration to distinguish phylogenetic signal from noise can be crucial to the accurate recovery of evolutionary relationships. This study utilizes multiple nuclear genes and data exploration to address three persistently problematic regions of dipteran evolution. The first chapter evaluates the relationships of the lower brachyceran superfamily Asiloidea, the putative sister-group to Eremoneura (Cyclorrhapha + Empidoidea). CAD + 28S support traditional asiloid clades and recover multiple hypotheses for the sister group to higher flies, primarily due to the indeterminate placement of the family Bombyliidae (bee flies) and the enigmatic genus Hilarimorpha. The genus Apystomyia is strongly supported as sister to Cyclorrhapha. Taxon stability and the effects of additional genes are explored. The second chapter addresses the phylogenetics of the subfamilies of Bombyliidae by analyzing CAD + 28S alone and with morphology. The monophyly of 8 of 15 subfamilies is confirmed, along with the polyphyly of Bombyliinae. A hypothesis for the interrelationships of bee fly subfamilies is presented. Topological incongruence and the effect of the removal of conflict-inducing taxa are explored. The third chapter relies on six nuclear genes to identify the sister-group of Diptera by resolving the phylogeny of Holometabola. Traditional supraordinal groupings are confirmed. Mecoptera+Siphonaptera are sister to Diptera. Strepsiptera, previously hypothesized as the closest relative of Diptera, is confidently placed as the sister-group to Coleoptera. A thorough exploration to rule out the effects of long-branch attraction is presented.

Multi-gene Phylogenetics to Resolve Key Areas in the Fly Tree of Life

by Michelle Denee Trautwein

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Entomology

Raleigh, North Carolina

2009

APPROVED BY:

BIOGRAPHY

Michelle Trautwein was born in Philadelphia, Pennsylvania in 1976 and relocated to Austin, Texas six months later. She began her studies at the University of Texas in 1994 as an art major. Three years later, after an entomology class and a summer catching frogs in Costa Rica, she changed her major to science and graduated in 1999 with a BS in Biology. Before focusing on insects, she assisted with research on pigeon guillemots in Alaska, prairie dogs in Utah, and bottlenose dolphins in Florida. An internship at the Smithsonian studying Diptera brought her back to insects and reintroduced her to flies. Michelle began her graduate studies at North Carolina State University in Raleigh in 2003. Her work and interests are focused on the evolutionary relationships of flies and phylogenomics in general.

ACKNOWLEDGMENTS

Thank you to my professors and mentors who introduced me to entomology (flies in particular), evolution, and phylogenetics: Riley Nelson, Wayne Mathis, Amnon Friedberg, Lewis Deitz and Brian Wiegmann. Thank you for your time, your wisdom and your friendship. Additional thanks go to the other students and post-docs in the Wiegmann lab who have supported my work and made my years at NC State memorable.

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

Chapter 1: A multi-gene phylogeny of the superfamily Asiloidea (Insecta: Diptera): exploring the effects of taxon sampling and the resolving power of additional genes

Abstract
Introduction
Methods and Materials
Results
Discussion
Conclusions
Acknowledgments
References

Chapter 2: The evolutionary relationships of bee fly subfamilies: short branches, long branches and topological incongruence

Abstract
Introduction
Materials and Methods
Results and Discussion
Conclusions
Acknowledgments
References

Chapter 3: Identifying the sister-group to Diptera: a multi-gene phylogeny of the holometabolous insects

Abstract
Introduction
Results and Discussion
Materials and Methods
Acknowledgments
References

LIST OF TABLES

Chapter 1

Table 1. Sampled taxa
Table 2. Clade recovery results with varying methods of analysis, treatment of data and taxon inclusion
Table 3. Leaf stability values of all taxa, stable taxa and reduced taxa
Table 4. Testing of a priori hypotheses in parsimony

Chapter 2

Table 1. Sampled taxa

Chapter 3

Table 1. Taxa, genes and GenBank numbers for Holometabola and outgroups
Table 2. Clade recovery results from ML analyses with varied taxon and character inclusion used to counter LBA

LIST OF FIGURES

Chapter 1

Figure 1. Strict consensus of 6 maximum parsimony trees of 28S+CAD (bp 1+2). All taxa included
Figure 2. Bayesian tree of 28S+CAD (bp 1+2). All taxa included
Figure 3. Maximum-likelihood tree of 28S+CAD (bp 1+2). All taxa included
Figure 4. Maximum parsimony analysis of 28S, CAD, TPI, and CO1
Figure 5. Comparison of MP bootstrap consensus tree including all taxa and stable taxa only
Figure 6. MP, ML, and BI congruent topology of reduced taxa
Figure 7. Four-cluster likelihood mapping image testing the support for a monophyletic Asiloidea

Chapter 2

Figure 1. Strict consensus of 7 maximum parsimony trees of 28S+CAD (bp 1+2). All taxa included
Figure 2. Maximum-likelihood tree of 28S+CAD (bp 1+2). All taxa included
Figure 3. Bayesian tree of 28S+CAD (bp 1+2). All taxa included
Figure 4. Consensus network showing conflict between MP, ML, and BI topologies including all taxa
Figure 5. Consensus network revealing largely congruent topologies from MP, ML and BI when conflict-inducing taxa are removed
Figure 6. Bayesian tree of 28S+CAD (bp 1+2) with 5 conflict-inducing taxa removed
Figure 7. Total evidence tree. Bayesian analysis of 28S+CAD (bp 1+2) plus 155 morphological characters

Chapter 3

Figure 1. The phylogeny of the holometabolous insects based on 6 nuclear protein-coding genes
Figure 2. The congruent ML and BI topology with branch lengths and support values
Figure 3a-b. Likelihood-mapping images showing the strength of our phylogenetic signal and the conflicting data supporting the placement of Strepsiptera
Figure 4. Neighbor-Nets showing conflicting splits with all taxa included and with Strepsiptera excluded

Chapter 1:

A multi-gene phylogeny of the superfamily Asiloidea (Insecta: Diptera): exploring the effects of taxon sampling and the resolving power of additional genes

Abstract

Asiloidea are a group of nine lower brachyceran fly families made up of generally large-sized flower visitors, parasitoids and aerial predators of other insects.

Traditionally, the Asiloidea has been viewed as a monophyletic assemblage and the closest relative to the large, successful dipteran radiation Eremoneura (Cyclorrhapha + Empidoidea). The evidence for asiloid monophyly is limited, and previous morphological and molecular studies demonstrate that this region of fly evolution is marked by very few characters delimiting the relationships between the presumed families of Asiloidea and Eremoneura. Adding to the phylogenetic complexity are the enigmatic ‘asiloid’ genera Hilarimorpha and Apystomyia, currently united in the family Hilarimorphidae, that retain morphological characters of both asiloids and higher flies. In this study we use the nuclear protein-coding gene CAD and 28S rDNA to test the monophyly of the Asiloidea and resolve its relationship to the Eremoneura. To this end, we also explore the effects of taxon sampling on support values and topological stability, the resolving power of additional genes, and hypothesis-testing using four-cluster likelihood mapping.

Introduction

Brachyceran flies are a large Mesozoic radiation of approximately 100,000 described species that includes the great majority of species diversity within the order Diptera (Blagoderov et al., 2007; Yeates et al., 2007). More than 80% of brachyceran species, including the well-known flies Musca domestica (house fly) and Drosophila melanogaster (vinegar fly), occur within Eremoneura (Cyclorrhapha + Empidoidea), or higher flies. Eremoneura are among the best supported of all dipteran clades (Yeates and Wiegmann, 1999), with 13 morphological synapomorphies and the strong support of molecular evidence (Wiegmann et al., 2003). Divergence time estimates suggest that the origin and diversification of the Eremoneura began approximately 170 MYA and expanded with the origin and radiation of angiosperm plants (Wiegmann et al., 2003). The lower brachyceran superfamily Asiloidea, with 12,000 species, has been hypothesized to be the possible sister group to the Eremoneura (Hennig, 1973; Woodley, 1989). Because asiloid monophyly is not well established, however, it remains unclear whether Eremoneura and Asiloidea share a most recent common ancestor, or whether the Eremoneura originated from within the Asiloidea with closest relatives in one or more of the asiloid family-level lineages.

Asiloid flies are generally large, showy flower visitors as adults and almost exclusively substrate-dwelling predators as larvae. Two of the largest families are significant exceptions to this rule: the larvae of Bombyliidae, or bee flies, are insect parasitoids, and Asilidae, also known as robber flies, prey on insects as adults. Asiloid flies are distributed worldwide, with their greatest diversity occurring in arid, sandy areas.

There are nine families included in Asiloidea: Asilidae, Mydidae,

Bombyliidae, Scenopinidae, Apsilocephalidae, Apioceridae, Hilarimorphidae,

Therevidae, and Evocoidae. The interrelationships of these families are largely unconfirmed, and the evidence that they all share a recent common ancestor is tenuous. Morphologically, the families from which larvae are known are joined only by a larval spiracle in the penultimate abdominal segment (Woodley, 1989;

Yeates, 1994). Even this single unifying character is subject to homoplasy, appearing in , Xylophagomorpha, and , and may be an adaptation to terrestrial or parasitic habitats (Sinclair et al., 1994).

Molecular evidence from 28S ribosomal DNA in a broader study of brachyceran phylogeny does not support a monophyletic Asiloidea (Wiegmann et al., 2003).

In recent years, there have been several significant studies of asiloid families that have addressed some long-standing phylogenetic questions.

Molecular and morphological data have shown support for a therevoid clade including Therevidae, Scenopinidae, Apsilocephalidae, as well as the newly described monotypic family Evocoidae (Yeates et al., 2003). A proposed sister-group relationship between Mydidae and Apioceridae is supported by the presence of multiple rectal papillae and shared wing venation features (Woodley,

1989), and molecular support from 28S ribosomal DNA (Irwin and Wiegmann,

2001). The monophyly of Asilidae, the robber flies, is supported by ribosomal

DNA and morphology (Bybee et al., 2004; Dikow, 2009), although their position relative to the rest of Asiloidea is as yet unknown (Grimaldi, 2005). In contrast,

Bombyliidae are a heterogeneous assemblage with weak morphological support for monophyly (Yeates, 1994; 2002). They have been most recently hypothesized as sister group to (Woodley, 1989; Yeates, 2002) or paraphyletic with (Sinclair et al., 1994) all other Asiloidea.

Several enigmatic genera are also included in Asiloidea and these have contributed to the complexity of determining the superfamily’s monophyly and to difficulties in deciphering closest relatives of Eremoneura (Yeates and

Wiegmann, 1999; Yeates, 2002). Hilarimorpha and Apystomyia are little known asiloid flies that exhibit morphological characters shared with both the Asiloidea and the Eremoneura, and thus appear to be transitional species (Wiegmann et al., 1993). Both of these genera have previously been placed in Bombyliidae

(Hennig, 1973; Woodley, 1989), Therevidae (Sinclair et al., 1994), or their own separate families (Webb, 1974; Nagatomi and Liu, 1994); but the most current morphological evidence joins Apystomyia and Hilarimorpha together in a single family, Hilarimorphidae (Yeates, 1994; 2002). Hilarimorphidae has been considered the sister-group to Bombyliidae (Yeates, 1994) or sister to the entire

Eremoneura (Yeates, 2002; Grimaldi and Engel, 2005); however, both hypotheses lack convincing morphological support.

Compounding the difficulty of determining Apystomyia’s phylogenetic placement is the fact that these exceedingly rare flies have been difficult, if not

impossible, to acquire. Apystomyia was first described by Melander in 1950; only 12 specimens were known in museum collections, and many were damaged over time. Despite multiple collection attempts, Apystomyia was not captured again until 2005.

These new specimens have provided us with the opportunity to bring molecular data to bear on the hypothesis of Apystomyia’s inclusion in Asiloidea.

The most recent quantitative study that addressed the phylogeny of

Asiloidea in part was based solely on morphological characters (Yeates, 2002).

No previous phylogenetic work has focused primarily on the superfamily or a broad sampling of its taxa. In our molecular study of Asiloidea, we examine the use of the large, nuclear protein-encoding gene CAD and 28S ribosomal DNA to resolve further the relationships among the asiloid Diptera, to test the monophyly of the superfamily, and to determine the sister-group to Eremoneura. Both CAD and 28S have been shown to exhibit considerable phylogenetic signal for inferring Mesozoic-aged divergences amongst the Diptera (Moulton and

Wiegmann, 2004; 2007; Winterton et al., 2007). In addition, we examine the effects of taxon sampling on support values and topological stability and further test the placement of the anomalous genera Apystomyia and Hilarimorpha with additional sequence from 28S, the mitochondrial gene CO1 and the nuclear protein-coding gene TPI (Hardy, 2007) from a subsample of taxa. We explore our hypotheses through four-cluster likelihood mapping.

Materials and Methods

Taxa Sampled

A total of 50 species representing 49 genera of orthorrhaphous

Brachycera, Asiloidea and Eremoneura were sampled for nucleotide sequencing.

All 9 families of Asiloidea are represented, including the newly described monotypic family Evocoidae, and the enigmatic genera Hilarimorpha and

Apystomyia. Five taxa representing families from diverse non-asiloid lower brachyceran lineages were sampled as outgroups (Table 1).

To test further the placement of Apystomyia and Hilarimorpha, an 8-taxon subsample representing Asiloidea, Eremoneura and lower Brachycera was sampled to seek corroboration from additional genes. The 8-taxon subsample was primarily drawn from within the 50-taxon set.

DNA Extraction, Amplification and Sequencing

Genomic DNA was extracted using the DNeasy DNA extraction kit

(QIAGEN Inc., Valencia, CA). The standard protocol was altered by extending the length of time the specimen was in proteinase K solution to two days, in order to effectively break down chitin but to avoid grinding the specimen. Final elution was reduced to 30 µl to avoid diluting the DNA solution.

Sequence data were collected from all 50 taxa for two nuclear genes, the protein-coding gene CAD (carbamoyl phosphate synthetase-aspartate transcarbamoylase-dihydroorotase) and 28S ribosomal DNA. For CAD,

approximately 4000 bp from the carbamoyl phosphate synthetase (CPS) domain of the gene were amplified and sequenced using degenerate primers designed by

Moulton and Wiegmann (2004). To amplify approximately 1000 bp from the 3’ end of 28S rDNA, we used published Diptera primers (Yang, 2000). For 8 of the 50 taxa, 2000 additional base pairs of 28S rDNA were sequenced along with approximately 1400 bp of the mitochondrial gene cytochrome c oxidase subunit 1

(CO1) and 500 bp of the nuclear protein coding gene triose phosphate isomerase

(TPI) using primers developed by Hardy (2007) and Junwook Kim. PCR parameters varied for CAD, 28S, CO1 and TPI. PCR products were extracted from agarose gels and purified with the Qiaquick Gel Extraction kit (Qiagen,

Santa Clara, CA). Big Dye Sequencing kits (Applied Biosystems, Foster City,

CA) were used for sequencing reactions and sequencing was completed at the

North Carolina State University Genome Sequencing Laboratory (GSL).

Sequences were assembled into contigs and edited using Sequencher 4.1 (Gene Codes Corp.,

Ann Arbor, MI).

Alignment was carried out manually using Se-Al 2.0 (Rambaut, 2002).

CAD, TPI and CO1 were aligned according to the amino acid translation. Introns in CAD, hypervariable regions of 28S and other positions of ambiguous alignment were removed from the data set. To detect existing base compositional bias, a chi-square test for homogeneity of base frequencies across taxa was performed for each gene and each codon position of CAD

independently using Paup* 4.0b10 (Swofford, 2002). The test was repeated for

TPI, CO1 and the additional 28S sequence.
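The homogeneity test described above can be illustrated with a short sketch. It computes the Pearson chi-square statistic for a taxa-by-base contingency table. The function names and the toy sequences are hypothetical, gaps and ambiguity codes are simply ignored, and Paup* may handle such details differently; this is an illustration of the idea, not a reproduction of the analysis.

```python
def base_counts(sequence):
    """Count A, C, G and T in one aligned sequence (other symbols are ignored)."""
    sequence = sequence.upper()
    return [sequence.count(base) for base in "ACGT"]


def chi_square_homogeneity(sequences):
    """Return (statistic, degrees_of_freedom) for a taxa x base table.

    `sequences` maps a taxon name to its aligned nucleotide sequence.
    """
    table = [base_counts(seq) for seq in sequences.values()]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            # Expected count if every taxon shared the pooled base frequencies.
            expected = row_total * col_total / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom would indicate that base frequencies differ among taxa.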

Phylogeny Estimation

Maximum parsimony analyses were carried out using Paup* 4.0b10

(Swofford, 2002). Heuristic searches with TBR branch swapping and 100 random addition replicates were completed to find the shortest trees. Node support was obtained by acquiring bootstrap values from heuristic searches of 500 resampled data sets with 50 random addition replicates. Analyses including all taxa were conducted for the CAD and 28S rDNA as independent partitions, as well as with combined data sets that included all nucleotides equally weighted.

To overcome misleading phylogenetic signal due to potential saturation in the 3rd positions of CAD, analyses of the concatenated data set were also done with the

3rd positions of CAD removed and with CAD translated to amino acids.
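As a minimal illustration of the data treatment described above, the following sketch removes 3rd codon positions from an in-frame, aligned coding sequence. The function name is hypothetical, and the sketch assumes the alignment begins at a first codon position with introns already excluded.

```python
def drop_third_positions(aligned_cds):
    """Keep codon positions 1 and 2 of an in-frame coding sequence."""
    # Index 2, 5, 8, ... are the 3rd positions when the sequence starts in frame.
    return "".join(base for i, base in enumerate(aligned_cds) if i % 3 != 2)
```

Applied to every sequence in the CAD partition, this yields the "nucleotides 1+2" data treatment; the 28S partition is left untouched because it is not protein-coding.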

To further test the placement of Hilarimorpha and Apystomyia, parsimony analyses of the larger 28S rDNA fragment, CAD, CO1 and TPI were conducted for the 8-taxon set. Analyses were carried out with all sequence data combined, and with the 3rd positions of CAD, CO1 and TPI excluded or included, and translated to amino acids.

In order to proceed with a Bayesian analysis, an appropriate model of nucleotide evolution, in this case GTR + I + G, was chosen by using MrModeltest (Nylander, 2005) to assess the adequacy of substitution models based on the Akaike Information Criterion (AIC) and nested likelihood ratio tests. The choice of a model remained the same when genes were analyzed independently or as a concatenated data set. Using MrBayes (Huelsenbeck and Ronquist, 2001), Bayesian analyses were conducted for 20,000,000 generations, trees were sampled every 1000 generations, and the first 25% (5000 trees) were discarded as burn-in. Analysis was completed with all taxa included and the third positions of CAD removed.
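The burn-in arithmetic in the preceding paragraph can be checked with a few lines of code. The function name is hypothetical, and the sketch ignores any sample that may be recorded at generation 0.

```python
def burn_in_summary(generations, sample_every, burn_in_fraction):
    """Return (trees sampled, trees discarded as burn-in, trees retained)."""
    sampled = generations // sample_every
    discarded = int(sampled * burn_in_fraction)
    return sampled, discarded, sampled - discarded
```

With 20,000,000 generations, a sampling interval of 1000 and a burn-in fraction of 0.25, the function returns 20,000 sampled trees, 5000 discarded and 15,000 retained, in agreement with the 5000 trees stated above.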

Maximum likelihood analyses were performed on the combined 28S and

CAD data set using GARLI (Zwickl, 2006). A GTR+I+G model was implemented.

GARLI analyses were completed with all taxa included and the third positions of

CAD removed.

Stability Testing and Variations on Taxon Sampling

Because taxon choice had a strong effect on our preliminary estimates of tree topology, we sought to identify unstable or ‘rogue’ taxa by quantifying the instability of each taxon in parsimony analyses. The removal of ‘rogue’ taxa has been shown to increase resolution and support values in phylogenies

(Sanderson and Shaffer, 2002). Leaf stability indices were generated using

RadCon (Thorley and Page, 2000). RadCon calculates leaf stability for an individual taxon by analyzing bootstrap trees and determining the average support for the relationships of 3-taxon statements, or triplets, that include a particular taxon. We have relied on the ‘difference’ measurement, defined as the

difference between the bootstrap values of the two best-supported triplets that include a particular taxon. Using Paup* 4.0b10 (Swofford, 2002), a bootstrap analysis including all taxa and excluding the 3rd positions of CAD was completed with 500 replicates and 50 random addition sequences. The trees were imported into RadCon (Thorley and Page, 2000) and the stability of each leaf in the set of rooted bootstrap trees was then calculated. The taxa with leaf stability values lower than the overall average were deleted from the data set, resulting in a taxon set we refer to as ‘stable’. The parsimony analysis was then repeated, and new bootstrap values were obtained. The stability of each leaf of the pruned tree was then calculated.
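The ‘difference’ measure can be sketched as follows, under one reading of the description above: for every triplet containing the focal taxon, the frequencies of its alternative resolutions across the bootstrap trees are tallied, the second-highest frequency is subtracted from the highest, and the result is averaged over triplets. Representing rooted trees as nested tuples of taxon names is an assumption made for illustration, as are all function names; RadCon's own implementation may differ in detail.

```python
from itertools import combinations


def clades(tree):
    """Return the set of leaf sets (clades) of a rooted tree of nested tuples."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def triplet_resolution(tree_clades, a, b, c):
    """Return the sister pair among a, b, c, or None if the triplet is unresolved."""
    for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
        # x and y are sisters relative to z if some clade holds both but not z.
        if any(x in cl and y in cl and z not in cl for cl in tree_clades):
            return frozenset([x, y])
    return None


def leaf_stability_difference(trees, taxon):
    """Average, over triplets containing `taxon`, of best minus second-best support."""
    all_clades = [clades(tree) for tree in trees]
    taxa = sorted(max(all_clades[0], key=len))  # the root clade holds every leaf
    others = [name for name in taxa if name != taxon]
    differences = []
    for b, c in combinations(others, 2):
        counts = {}
        for tree_clades in all_clades:
            pair = triplet_resolution(tree_clades, taxon, b, c)
            if pair is not None:
                counts[pair] = counts.get(pair, 0) + 1
        support = sorted((n / len(trees) for n in counts.values()), reverse=True)
        support += [0.0, 0.0]  # pad so that two values always exist
        differences.append(support[0] - support[1])
    return sum(differences) / len(differences)
```

A taxon whose position never changes across the bootstrap trees scores 1.0, whereas a taxon that moves freely among alternative positions scores close to 0.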

The removal of unstable taxa involves deleting some taxa that are critical to our tests of specific phylogenetic hypotheses. Also, the removal of unstable taxa does not address the instability introduced by choice among possible outgroup taxa. Therefore, we compiled a ‘reduced’ data set that includes only two outgroups, a nemestrinid (Hirmoneura sp.), one of the most closely related outgroups to the asiloid flies, and a xylophagid (Heterostomus sp.). This reduced data set includes all families of Asiloidea. By experimenting with alternative samples of included taxa in preliminary analyses, we found that choice among sampled bombyliids, especially the varied Mythicomyiinae, had a strong effect on the placement of other taxa under all methods of analysis; consequently, our ‘reduced’ data set includes only 5 bombyliids and a single representative of Mythicomyiinae. The stability of the ‘reduced’ data set was calculated using the RadCon method described above.

Maximum likelihood and Bayesian analyses of the ‘stable’ and ‘reduced’ data sets were also completed.

Hypothesis testing in parsimony

To test the statistical significance of alternative topologies under parsimony, Templeton (Wilcoxon signed-ranks) tests and Kishino-Hasegawa tests were performed using Paup* 4.0b10 (Swofford, 2002). Hypothesis testing was done using the ‘reduced’ taxon set and with the 3rd positions of CAD removed. Due to its ambiguous placement, Hilarimorpha was removed from analyses that did not specifically address its placement, such as in tests regarding asiloid monophyly. Trees were estimated with topological constraints representing a priori hypotheses and were then compared to the most parsimonious tree.

Four-cluster likelihood mapping analysis

In order to visualize the strength of phylogenetic signal in our data set and to test for support for alternative hypotheses of relationships amongst asiloid flies, four-cluster likelihood mapping analyses were completed using the program

TREE-PUZZLE (Schmidt et al., 2002; Strimmer and Von Haeseler, 1997). This quartet-puzzling method allows the user to partition taxa into any four clusters, resulting in three possible tree topologies, and returns a visual summary of the phylogenetic information supporting each of the three topologies. The relative frequencies of the likelihoods for each topology are plotted in an equilateral triangle with each point of the triangle representing a different topology. Support for alternative hypotheses regarding the monophyly of Asiloidea and the placement of Apystomyia and Hilarimorpha was examined using the ‘reduced’ taxon set (Table 3), with the additional inclusion of the family Evocoidae in order to provide representation for all putative asiloid families. We tested asiloid monophyly by comparing support for alternate placements of Bombyliidae, the remaining asiloid flies, Eremoneura, and two lower brachyceran outgroups. Alternative placements for Apystomyia were evaluated by comparing support for placing Apystomyia as closest relative to: all asiloid flies, Eremoneura, and two lower brachyceran outgroups. The position of Hilarimorpha in Asiloidea was similarly tested.

Monophyly of the Hilarimorphidae, including Apystomyia, was tested by comparing support for the alternate placements of Hilarimorpha, Apystomyia, the asiloid flies, including Bombyliidae, and Eremoneura. We used exact parameter estimates and a neighbor-joining tree + quartet sampling. A GTR+I+G model of substitution was implemented with the rates of substitution input from estimations made in Paup* (Swofford, 2002). The gamma distribution alpha and the percentage of invariant sites were estimated from the data set.
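The plotting step of likelihood mapping can be sketched as follows. For a single quartet, the log-likelihoods of the three possible topologies are converted to weights that sum to one, and those weights are used as barycentric coordinates inside an equilateral triangle. The function names and corner coordinates are illustrative assumptions, and the sketch does not reproduce the quartet sampling or likelihood calculations of the program itself.

```python
import math

# Corners of an equilateral triangle with unit sides; each corner stands for
# one of the three possible topologies relating the four clusters.
CORNERS = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]


def posterior_weights(log_likelihoods):
    """Convert three quartet log-likelihoods into weights that sum to 1."""
    best = max(log_likelihoods)
    # Subtracting the maximum before exponentiating avoids numerical underflow.
    scaled = [math.exp(value - best) for value in log_likelihoods]
    total = sum(scaled)
    return [value / total for value in scaled]


def triangle_point(weights):
    """Place a weight triple inside the triangle using barycentric coordinates."""
    x = sum(w * cx for w, (cx, _) in zip(weights, CORNERS))
    y = sum(w * cy for w, (_, cy) in zip(weights, CORNERS))
    return x, y
```

A quartet that strongly favors one topology falls near the corresponding corner, while a quartet with no preference falls at the center of the triangle.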

Results

The final concatenated sequence alignment of all 50 taxa for CAD and

28S includes 4772 base pairs (28S=970, CAD=3801). A chi-square test for base composition homogeneity revealed significant heterogeneity among taxa for the full combined gene data set (p <0.0001). To identify the source of base composition heterogeneity within the data, each codon position of CAD, and the

28S gene were tested as independent partitions. Only the 3rd codon positions of

CAD show significant heterogeneity. The 3rd positions of CAD exhibit A/T bias

(70% average) in most taxa and G/T richness in other groups, specifically in the mythicomyiine bombyliids (57%) and in some members of the Cyclorrhapha, primarily Paraplatypeza (85%). Exclusion of 3rd positions of CAD restores homogeneity of base pair composition for the concatenated data set.

To test the placement of Hilarimorpha and Apystomyia, we evaluated an additional data set limited to eight taxa sampled from across the higher-level taxonomic diversity of our study. This taxon-limited data set includes 8674 base pairs (28S=2917, CO1=1490, TPI=486, CAD=3801). A chi-square test for base composition homogeneity shows significant heterogeneity among taxa (p <0.0001). Tested independently, the 3rd positions of CAD and CO1 reject the null hypothesis of homogeneity. The exclusion of the 3rd positions of these two genes restores homogeneity of base pair composition.

Phylogenetic analyses including all taxa

Table 2 summarizes all topologies obtained in alternative phylogenetic analyses. Parsimony analysis of 28S and CAD nucleotides 1+2 (3505 total characters, 2148 constant, 469 parsimony uninformative, 866 parsimony informative) yields six trees of length 5748, the strict consensus of which is shown in Fig. 1. All asiloid and eremoneuran families are monophyletic with the exception of Mydidae, which is unresolved with respect to Asilidae and the therevoid clade. Asiloidea, excluding Bombyliidae, and Eremoneura are also monophyletic, but the relationship between them is unresolved. Amongst the asiloid families, two higher-level clades are recovered: a therevoid clade, including Therevidae, Apsilocephalidae and Scenopinidae, and a second clade including Asilidae. Bootstrap support and resolution are generally low for most nodes of the tree.
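The character categories reported above (constant, parsimony uninformative, parsimony informative) can be illustrated with a short sketch. A variable site is treated as parsimony informative when at least two states each occur in at least two taxa. The function names are hypothetical, sequences are assumed to be upper-case, and gaps and ambiguity codes are simply ignored, which may differ from the handling in Paup*.

```python
from collections import Counter


def classify_site(column):
    """Classify one alignment column as constant, informative or uninformative."""
    counts = Counter(state for state in column if state in "ACGT")
    if len(counts) <= 1:
        return "constant"
    # Informative: at least two states, each present in at least two taxa.
    shared_states = sum(1 for n in counts.values() if n >= 2)
    return "informative" if shared_states >= 2 else "uninformative"


def summarize_alignment(sequences):
    """Tally site categories over equal-length aligned sequences."""
    return Counter(classify_site(column) for column in zip(*sequences))
```

Only the informative sites can favor one tree over another under parsimony, which is why they are reported separately from the other variable sites.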

Other higher-level phylogenetic studies using CAD have shown that parsimony analyses that include the 3rd positions can result in the spurious phylogenetic placement of taxa, likely due to saturation or base composition bias

(Moulton and Wiegmann, 2004; Bertone et al., 2008). An analysis of the combined data set including the 3rd positions resulted in a tree with erroneous placements of outgroups amongst cyclorrhaphans and bombyliids. Analyses using CAD + 28S, with CAD translated to amino acids, recovered all of the families found in the n1+n2 tree along with the Eremoneura; however, the relationships between these groupings were entirely unresolved.

In all parsimony analyses that achieve resolution, the genus Apystomyia is placed as a basal cyclorrhaphan. Hilarimorpha lacks a stable placement and appears in various analyses as unplaced or as the sister group to Asiloidea, Eremoneura, Asiloidea+Eremoneura, or Bombyliidae. In general, our parsimony trees are sensitive to taxon sampling and outgroup selection.

A Bayesian analysis of all taxa with the 3rd positions of CAD removed differs from the corresponding parsimony analysis in that the Asiloidea is monophyletic, though weakly supported (58%) (Fig. 2). The relationship between the Eremoneura, the Asiloidea and a clade including Evocoa and Hilarimorpha is unresolved. The monophyly of Therevidae, Asilidae, Eremoneura, Cyclorrhapha and Empidoidea is supported by 100 pp. The sister-group relationship between Asilidae and Mydidae plus Apioceridae is supported by 100 pp. The sister-group relationship between Apystomyia and Cyclorrhapha is also supported by 100 pp.

A maximum likelihood analysis of all taxa with 3rd positions of CAD excluded differs from the parsimony and Bayesian trees in that the bombyliid subfamily Mythicomyiinae is the sister-group to Eremoneura (Fig. 3). Similar to the parsimony and Bayesian trees, Asiloidea, excluding Bombyliidae, is made up of a monophyletic therevoid clade, including the new family Evocoidae, and a clade of asilids, mydids and apiocerids. The ML analyses also place Apystomyia as sister-group to Cyclorrhapha. Hilarimorpha is the sister-group to the asiloids, excluding the bee flies, plus Eremoneura (including Mythicomyiinae).

A parsimony analysis of 28S and CAD, TPI and CO1 nucleotides 1+2

(6754 total characters, 5327 constant, 686 parsimony informative, 741 parsimony uninformative) results in a single tree of length 2632 (Fig. 4). Asiloidea and the

Eremoneura (83 bp) are monophyletic. Apystomyia is the sister-group to the

Cyclorrhapha (67 bp). Hilarimorpha is the sister-group to Asiloidea +

Eremoneura. When analyzed as amino acids, the bootstrap support for the sister-group relationship between Apystomyia and Cyclorrhapha is 95.

Stability testing and variations in taxon sampling

Leaf stability values are shown in Table 3. The average value of the difference between the bootstrap values of the two best-supported triplets containing a particular taxon ( = ‘difference’) is 0.5646. Out of 51 taxa, 16 taxa were deemed unstable due to difference values below 0.5646. The unstable taxa included all 3 members of the Scenopinidae, 5 bombyliids, including the 4

Mythicomyiinae and Heterotropus, Hilarimorpha, a mydid, an apiocerid and outgroups belonging to Nemestrinoidea. Removing unstable taxa and recalculating the parsimony bootstrap values increases the average leaf stability to 0.8186.

Removing unstable taxa also dramatically increases the resolution of the bootstrap consensus tree, resulting in higher support for clades across the tree, including many that had not previously been supported. In contrast to the conflicting results retrieved when all taxa are included, the removal of unstable taxa produces largely congruent results from parsimony, Bayesian, and likelihood analyses. The ‘stable’-taxon tree exhibits a monophyletic Bombyliidae (61 bp, 92 pp), Asiloidea excluding Bombyliidae (78 bp, 100 pp) and Eremoneura (93 bp, 100 pp). Parsimony and model-based analyses of the stable-taxon data show no support for a higher-level relationship between Bombyliidae and all remaining asiloid flies plus Eremoneura. Fig. 5 compares the resolution of the bootstrap trees of analyses including all taxa and stable taxa only.

The removal of unstable taxa, although useful for increasing support values, also removed many taxa that are critical to our phylogenetic hypotheses. Furthermore, the removal of unstable taxa did not account for the topological effects due to outgroup selection. By adding several phylogenetically important taxa back into our analyses, removing several bombyliids and reducing the number of outgroups, we found a taxon set, referred to as ‘reduced’, that maintained a high average leaf stability (0.7024) and returned concordant results from parsimony, Bayesian and maximum likelihood analyses. The ‘reduced’ taxon tree (Fig. 6), like the ‘stable’ taxon tree, results in Bombyliidae as the sister group to the remaining asiloid flies + Eremoneura (in MP and ML, yet unresolved in BI). The trees resulting from different analysis methods conflict only in their placement of Hilarimorpha. Bootstrap values and posterior probabilities are not as high for the reduced data set as they are for the stable data set, but they are higher than those resulting from the inclusion of all taxa.

Under all methods of analysis, no variation of taxon sampling returns high support for the relationship between Eremoneura, Bombyliidae and the remaining Asiloidea.

Hypothesis testing

Under parsimony, Templeton and Kishino-Hasegawa tests of alternative a priori hypotheses of asiloid relationships using the ‘reduced’ taxon set fail to reject alternate placements of Bombyliidae (as part of a monophyletic Asiloidea, or as sister-group to Eremoneura) and Hilarimorpha. Nevertheless, the same tests significantly reject topologies that include Apystomyia in Asiloidea or that group Hilarimorpha + Apystomyia (Hilarimorphidae) as sister to Eremoneura (Table 4).
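The Templeton test is a Wilcoxon signed-ranks test applied to per-character differences in step counts between two competing topologies. The sketch below is a minimal illustration of that idea under stated assumptions: it uses the large-sample normal approximation with a tie correction (which is crude when few characters differ), the step counts are hypothetical, and the function name is our own. It is not the routine used to generate Table 4.

```python
import math


def templeton_test(steps_a, steps_b):
    """Two-sided Wilcoxon signed-ranks test on per-character step differences.

    Returns (W, p), where W is the smaller of the positive and negative rank sums
    and p comes from a normal approximation with a correction for tied ranks.
    """
    # Characters with identical step counts on both trees carry no information.
    diffs = [a - b for a, b in zip(steps_a, steps_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        group = j - i + 1
        tie_term += group ** 3 - group
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w - mean) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return w, min(p, 1.0)


# Hypothetical step counts for 20 characters on an unconstrained and a constrained tree.
unconstrained = [1] * 20
constrained = [2] * 20
w, p = templeton_test(unconstrained, constrained)
print(f"W = {w}, p = {p:.2e}")
```

A constrained topology is rejected at the 0.05 level when p falls below that threshold, as for the starred entries in Table 4.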

Four-cluster likelihood analysis

A four-cluster likelihood analysis testing asiloid monophyly shows a high percentage of non-phylogenetic signal present in the data set regarding this question (17.8%; 10-15% is considered high). Support for the relationship between asiloids (excluding bombyliids), Eremoneura, and bombyliids is divided, though the highest percentage of data points falls in favor of a monophyletic Asiloidea (33.4%) (Fig. 7). Apystomyia is strongly supported as the closest relative of the higher flies (90.4%). When the placement of Hilarimorpha is tested between its affinity to a monophyletic Asiloidea, Eremoneura, or the outgroups, it groups with the outgroups (50.2%, with 18.9% and 21.4% for the alternatives). However, when the outgroups are removed and Apystomyia is included in order to test for support for the family Hilarimorphidae (Apystomyia + Hilarimorpha), Hilarimorpha clusters with Asiloidea (81.8%) with much greater frequency than with Apystomyia (14.8%). Though these mixed results fail to resolve the placement of Hilarimorpha, they do demonstrate the lack of support for the family Hilarimorphidae.
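The percentages reported for the four-cluster analysis come from sorting quartets into regions of the likelihood-mapping triangle. A minimal sketch of that bookkeeping follows, assuming the scheme of Strimmer and von Haeseler (1997) in which each quartet's three log-likelihoods are converted to posterior weights and the quartet is assigned to the nearest of seven attractors (three corners, three edge midpoints, and the centre). The log-likelihood values, region labels and function names are hypothetical and are not taken from our TREE-PUZZLE runs.

```python
import math
from collections import Counter

# Attractors of the seven regions: corners are fully resolved topologies,
# edge midpoints are partly resolved, and the centre is star-like (unresolved).
ATTRACTORS = {
    "T1": (1.0, 0.0, 0.0),
    "T2": (0.0, 1.0, 0.0),
    "T3": (0.0, 0.0, 1.0),
    "T1/T2": (0.5, 0.5, 0.0),
    "T1/T3": (0.5, 0.0, 0.5),
    "T2/T3": (0.0, 0.5, 0.5),
    "star": (1 / 3, 1 / 3, 1 / 3),
}


def posterior_weights(log_likelihoods):
    """Convert three log-likelihoods into weights that sum to one."""
    best = max(log_likelihoods)
    relative = [math.exp(value - best) for value in log_likelihoods]
    total = sum(relative)
    return tuple(r / total for r in relative)


def region(weights):
    """Name of the attractor closest (in Euclidean distance) to the weight vector."""
    def squared_distance(name):
        return sum((w - a) ** 2 for w, a in zip(weights, ATTRACTORS[name]))
    return min(ATTRACTORS, key=squared_distance)


def map_quartets(quartet_log_likelihoods):
    """Percentage of quartets falling in each of the seven regions."""
    counts = Counter(region(posterior_weights(q)) for q in quartet_log_likelihoods)
    n = len(quartet_log_likelihoods)
    return {name: 100.0 * counts[name] / n for name in ATTRACTORS}


# Hypothetical log-likelihoods for the three groupings of four clusters.
quartets = [
    (-1000.0, -1012.0, -1011.0),
    (-950.0, -962.5, -958.0),
    (-1003.2, -1003.2, -1003.2),
    (-980.0, -980.0, -999.0),
]
percentages = map_quartets(quartets)
print(percentages)
```

Under this scheme, the share of quartets outside the three corner regions is what is read as unresolved or non-phylogenetic signal.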

Discussion

The more ancient and rapid a divergence, the more difficult it is to recover a well-resolved phylogenetic history with limited amounts of data (Rokas et al., 2005a; Whitfield and Lockhart, 2007; Whitfield and Kjer, 2008). Asiloidea likely originated close to 200 million years ago (Wiegmann et al., 2003), and the current systematic status of the relationship among Asiloidea, Bombyliidae (considered a member of Asiloidea), and Eremoneura exhibits some of the characteristics of a rapid radiation: namely, a topology lacking resolution or support, and short internal branches. Independent data sets, morphological and molecular, show agreement on which internal branches are short on our tree estimates, suggesting that rapid diversification is a key feature of early asiloid fly evolution (Poe and Chubb, 2004; Whitfield and Lockhart, 2007). Morphology alone has been unable to confirm the monophyly of Asiloidea or its relationship to the large majority of remaining Diptera due to a dearth of characters uniting the group (Yeates, 2002; Woodley, 1989) and the ambiguous placement of a collection of several asiloid-like genera (Yeates and Wiegmann, 1999). In a previous molecular analysis of Brachycera using 28S ribosomal DNA, the asiloid flies appear in a mostly unsupported arrangement in an otherwise well-supported topology (Wiegmann et al., 2003).

The results of our analyses echo previous morphological and molecular studies by presenting an unsubstantiated picture of the relationship between asiloid flies and Eremoneura. Methods of analysis, along with variations in outgroup selection and taxon sampling, yield conflicting topologies regarding a monophyletic Asiloidea. The incongruence in our results amongst higher-level groupings lies primarily with the placement of Bombyliidae, which appears either as sister group to the remaining asiloid flies, as proposed by Woodley (1989), or as the sister group to Eremoneura. When all asiloid families are represented in our analyses, these higher-level relationships are never accompanied by bootstrap support above 50 or resolution in BI, and hypothesis testing shows no significant difference between competing topologies; therefore a strongly supported estimate of relationships between asiloid flies and Eremoneura remains elusive.

Taxon sampling, the inclusion of unstable or ‘rogue’ taxa, and outgroup selection have a strong effect on our tree topologies, their stability and support values, a common finding that has been addressed in other phylogenetic studies (Lecointre et al., 1993; Sanderson and Shaffer, 2002; Rokas et al., 2005a). When we eliminated taxa that were deemed unstable through a leaf stability analysis, overall node support increased and results from different methods of analysis converged on similar topologies. By reintroducing key taxa while maintaining increased resolution, support values and stability, we find our best current estimates of asiloid relationships (Fig. 6). Though several analyses find resolution for the relationship between Bombyliidae, the remaining Asiloidea and Eremoneura, these results are not supported and are unstable with respect to taxon sampling and method of analysis. Lecointre et al. (1993) define a robust clade as ‘a node with a BP that is high and does not significantly vary whatever the species sample used to represent the corresponding group.’ The relationship between the asiloid flies (excluding the Bombyliidae) and the Eremoneura cannot be considered robust based on our current data.

The amount and type of sequence data we acquired have previously provided sufficient phylogenetic signal to resolve dipteran relationships in other analyses (Moulton and Wiegmann, 2004; 2007; Bertone et al., 2008; Winterton et al., 2007; Scheffer et al., 2007). Nevertheless, a four-cluster likelihood mapping analysis addressing the question of asiloid monophyly indicates that a large percentage of our data (17.8%) (Fig. 7) consists of nonphylogenetic information (greater than 10-15% is considered inadequate for tree reconstruction) (Thorley and Page, 2000). Similarly, Rokas et al. (2005a) took as evidence of a rapid radiation in Metazoa the observation that, based on the same sampled genes, relationships among major metazoan lineages remained unresolved and unsupported, while the sister kingdom, Fungi, exhibited a well-resolved and well-supported tree. A similar comparison can be made between Asiloidea and the closely related superfamily Empidoidea. The actual sister group of Asiloidea is unknown, but their close relationship to Eremoneura (Empidoidea + Cyclorrhapha) has been established along with evidence that they may have diverged at a similar time (Wiegmann et al., 2003). A phylogenetic analysis of the major empidoid lineages based on a large fraction of the CPSase region of CAD resulted in a highly supported, well-resolved tree for basal lineages (Moulton and Wiegmann, 2007). Yet contemporaneous, deeply diverging relationships in Asiloidea are not resolved with high support using CAD alone or in combination with 28S rDNA.

Increasing the number of characters in a data set is often cited as the best means to increase resolution and node support (Rosenberg and Kumar, 2001; Rokas and Carroll, 2005b). However, a parsimony analysis that included 2 additional protein-coding genes, CO1 and TPI, and an additional 2000 base pairs of 28S rDNA for a subsample of taxa did not improve the bootstrap support for Asiloidea or its relationship to Eremoneura. Furthermore, the resulting topology (Fig. 4), though lacking bootstrap support, yields a monophyletic Asiloidea as the sister group to Eremoneura, which stands in conflict with the topology that our previous analyses of fewer genes and more taxa had converged upon.

Though the broader issue of the higher-level relationships between asiloid flies and Eremoneura has not been clearly resolved by our data, they have provided corroborating evidence for some of the relationships amongst asiloid families. Our results suggest that Asiloidea could be defined in a more conservative, yet more reliably monophyletic arrangement than the currently accepted definition of the clade, by excluding the large, diverse family Bombyliidae. Hennig (1973) referred to this group as Asiliformia, but did not identify any morphological synapomorphies (Sinclair et al., 1994). This more conservative composition of Asiloidea is robust to analysis conditions and data sets and consistently comprises two monophyletic clades: the therevoid clade and a clade consisting of the Mydidae, Apioceridae and Asilidae.

In accord with other morphological and molecular hypotheses (Yeates et al., 2003), the majority of our analyses confirm the existence of a “therevoid” clade that includes Scenopinidae, Evocoidae and Apsilocephalidae + Therevidae as sister-groups. The instability of Scenopinidae in our data set prevents the therevoid clade from being accompanied by high support values; however, analyses of the ‘stable’ taxon set, which excludes the Scenopinidae, show Apsilocephalidae + Therevidae supported by 100 pp and 94 bp. This result, along with the morphological and molecular evidence used by Yeates (2002) and Yeates et al. (2003), contradicts the suggestion of Nagatomi et al. (1991) of a putative relationship between Apsilocephalidae and Eremoneura. The recently described monotypic family Evocoidae was placed in the therevoid clade in model-based analyses, as expected based on a previous study using morphological data and sequence from 28S rDNA (Yeates et al., 2003).

The clade joining Asilidae, Mydidae and Apioceridae, recovered in all of our analyses, is supported by 100 pp in all Bayesian analyses. This clade was previously proposed based on the potential synapomorphies of adults with a sunken vertex and overall larval similarity (Woodley, 1989), but the relationship of Asilidae to the rest of Asiloidea has remained contentious (Yeates, 2002). The Asilidae are consistently monophyletic in all methods of analysis of all taxon sets, in agreement with morphological and molecular hypotheses (Bybee et al., 2004; Dikow, 2009). The relationship between Mydidae and Apioceridae, though also supported by 100 pp in all Bayesian analyses, lacks definitive resolution. Mydids are most often found to be paraphyletic with respect to apiocerids (Irwin and Wiegmann, 2001).

The most surprising and strongly supported finding in our study is the placement of Apystomyia, the small, rare asiloid-like fly that has been collected only 3 or 4 times in southern California. The name Apystomyia, appropriate for this fly, in Greek means ‘a fly of which nothing is known’ (Melander, 1950). Since its discovery, the fly’s taxonomic placement has been controversial due to its unique morphology, including features that are common to both asiloid and eremoneuran flies. Apystomyia retains the synapomorphic male genitalia of the Asiloidea; however, its epandrium, reduced wing venation, and overall vestiture, with bristle-like setae, suggest affinity to the early cyclorrhaphan lineages. Melander initially described Apystomyia as a member of the bombyliid subfamily Heterotropinae, a catch-all grouping of hard-to-place flies. Melander’s reasoning for including Apystomyia in Heterotropinae reflects the ambiguity of its classification. He states, “This enigmatic little fly does not seem to be related to any other genus, and its assignment to the Heterotropinae is made because it does not conform to any other subfamily or family.” Over time, Apystomyia has been treated as an anomalous bombyliid (Hull, 1973; Woodley, 1989; Nagatomi et al., 1991) and as a therevid (Sinclair et al., 1994). In 1994, Yeates’ quantitative morphological analysis of Bombyliidae placed Apystomyia in the previously monogeneric family Hilarimorphidae as the sister group to Bombyliidae. His more recent morphological analysis of Brachycera, however, resulted in the family Hilarimorphidae, including Apystomyia, being placed as the sister group to Eremoneura (Yeates, 2002).

Our results do not reflect a close relationship between Hilarimorpha and Apystomyia, and remove Apystomyia even further from its traditional placement in Asiloidea. Our molecular data unambiguously place Apystomyia as the sister group to Cyclorrhapha. This surprising result appears both stable and strongly supported among all treatments of data and methods of analysis. Parsimony and Bayesian analyses show high support for this new placement of Apystomyia. A leaf stability test shows that under parsimony, Apystomyia is among the most stable of all taxa included in the data set and appears to be unaffected by outgroup selection or the inclusion of other ‘rogue taxa’. Alternative hypothesis testing in parsimony rejects the placement of Apystomyia in Bombyliidae or elsewhere in Asiloidea. A four-cluster maximum likelihood analysis shows that our data contain phylogenetic signal that favors the relationship between Apystomyia and Cyclorrhapha (90.4%) while rejecting its close affinity with Asiloidea (5.5%). The implications of this unexpected finding are that Apystomyia appears to be a relict or transitional lineage that has retained morphological similarity to the most recent common ancestor of both Asiloidea and Eremoneura, and is sister to the large successful cyclorrhaphan lineage, the major synapomorphies of which it lacks. This pattern of unique, depauperate, difficult-to-place lineages recovered as sister to major radiations is a recurring phylogenetic pattern in modern studies of many major clades of living organisms: flowering plants (Soltis et al., 1999; Davies et al., 2004), Lepidoptera (Wiegmann et al., 2000), Coleoptera (Maddison et al., 1999) and Diptera (Bertone et al., 2008). As yet, there are no convincing morphological synapomorphies uniting Apystomyia and the Cyclorrhapha; thus, amongst dipterists, this placement of Apystomyia will undoubtedly be subject to further detailed examination.

Like that of Apystomyia, the placement of Hilarimorpha within Asiloidea has varied; the genus has been considered a therevid (Sinclair et al., 1994), an empidid (Nagatomi et al., 1991), or more frequently a bombyliid (Woodley, 1989; Webb, 1974), or even a close relative of bombyliids (Nagatomi and Liu, 1994; Yeates, 1994). Our data failed to resolve the placement of Hilarimorpha. In all parsimony, Bayesian and likelihood trees, the location of Hilarimorpha lacks support and varies from tree to tree. In a leaf stability analysis, Hilarimorpha ranks as the most unstable taxon in the data set. Testing of a priori hypotheses does not reject the inclusion of Hilarimorpha in Asiloidea or its placement as the sister group to Eremoneura. A four-cluster likelihood mapping analysis shows that our data contain more phylogenetic signal supporting the hypothesis of a sister group relationship between Asiloidea and Hilarimorpha (81.8%) than of a sister group relationship of Hilarimorpha and Eremoneura (0.6%) or Apystomyia (14.8%). Due to the ambiguity of our results, we suggest that Hilarimorpha continue to be considered incertae sedis amongst Heterodactyla (Asiloidea + Eremoneura).

Conclusions

Our results, in concordance with previous molecular and morphological work, show that relationships in this region of the dipteran tree of life continue to defy attempts at resolution using standard phylogenetic methods. The low support values for clades on our trees lead us to seek heuristic estimates of topological stability and congruence across analysis methods and treatments of data to arrive at a current best estimate of asiloid relationships. If the relationships between Asiloidea, Bombyliidae and Eremoneura reflect an ancient, rapid radiation, as evidence currently indicates, recovering highly supported evolutionary relationships amongst them will likely remain a major challenge for dipteran systematics.

Acknowledgments

We are grateful to D.K. Yeates, M.E. Irwin and N.I. Evenhuis for the provision and identification of specimens. Additional thanks go to Brian Cassel for assistance in the collection of molecular data. This project was supported by US National Science Foundation (NSF) Assembling the Tree of Life (ATOL) grant EF-03394 to BMW and DKY.

References

Bertone, M. A., G. W. Courtney, and B. M. Wiegmann. 2008. Phylogenetics and temporal diversification of the earliest true flies (Insecta: Diptera) based on multiple nuclear genes. Syst. Entomol. 33:668–687.

Blagoderov, V., D. A. Grimaldi, and N. C. Fraser. 2007. How time flies for flies: diverse Diptera from the Triassic of Virginia and early radiation of the order. Am. Mus. Novit. 3572:1–40.

Bybee, S. M., S. D. Taylor, C. Nelson, and M. F. Whiting. 2004. A phylogeny of robber flies (Diptera: Asilidae) at the subfamilial level: molecular evidence. Mol. Phylogenet. Evol. 30:789–797.

Davies, T. J., T. G. Barraclough, M. W. Chase, P. S. Soltis, D. E. Soltis, and V. Savolainen. 2004. Darwin's abominable mystery: insights from a supertree of the angiosperms. Proc. Natl. Acad. Sci. 101:1904–1909.

Dikow, T. 2009. Phylogeny of Asilidae inferred from morphological characters of imagines (Insecta: Diptera: Brachycera: Asiloidea). Bull. Am. Mus. Nat. Hist. 319:1–175.

Grimaldi, D., and M. S. Engel. 2005. Evolution of the Insects. Cambridge University Press, New York.

Hardy, N. B. 2007. Phylogenetic utility of dynamin and triose phosphate isomerase. Syst. Entomol. 32:396–403.

Hennig, W. 1973. Diptera (Zweiflügler). In Helmcke, J. G., D. Starck, and H. Wermuth, eds., Handbuch der Zoologie. IV Band. 2 Hälfte: Insecta. 2 Teil: Spezielles. 31. Walter De Gruyter, Berlin. 337.

Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogeny. Bioinformatics 17:754–755.

Hull, F. M. 1973. Bee flies of the world: the genera of the family Bombyliidae. Bull. U.S. Nat. Mus. 286:1–687.

Irwin, M. E., and B. M. Wiegmann. 2001. A review of the southern African genus Tongamya (Diptera: Asiloidea: Mydidae: Megascelinae), with molecular assessment of the phylogenetic placement of Tongamya and the Megascelinae. Afr. Invert. 42:225–253.

Lecointre, G., H. Philippe, H. L. V. Lê, and H. Le Guyader. 1993. Species sampling has a major impact on phylogenetic inference. Mol. Phylogenet. Evol. 2:205–224.

Maddison, W. P., and D. R. Maddison. 1992. MacClade, Version 3.0. Sinauer Associates, Sunderland, MA.

Maddison, D. R., M. D. Baker, and K. A. Ober. 1999. Phylogeny of carabid beetles inferred from 18S ribosomal DNA (Coleoptera: Carabidae). Syst. Entomol. 24:103–138.

Melander, A. L. 1950. Taxonomic notes on some smaller Bombyliidae (Diptera). Pan-Pac. Entomol. 26:139–156.

Moulton, J. K., and B. M. Wiegmann. 2004. Evolution and phylogenetic utility of CAD (rudimentary) among Mesozoic-aged eremoneuran Diptera (Insecta). Mol. Phylogenet. Evol. 31:363–378.

Moulton, J. K., and B. M. Wiegmann. 2007. The phylogenetic relationships of flies in the superfamily Empidoidea (Insecta: Diptera). Mol. Phylogenet. Evol. 43:701–713.

Nagatomi, A., T. Saigusa, H. Nagatomi, and L. Lyneborg. 1991. The systematic position of the Apsilocephalidae, Rhagionempididae, Protempididae, Hilarimorphidae, Vermileonidae and some genera of Bombyliidae (Insecta, Diptera). Zool. Sci. 8:593–607.

Nagatomi, A., and N. Liu. 1994. Apystomyidae, a new family of Asiloidea (Diptera). Acta Zool. Acad. Sci. Hung. 40:203–218.

Nylander, J. A. A. 2004. MrModeltest v2.2. Program distributed by author. Evolutionary Biology Centre, Uppsala University.

Poe, S., and A. L. Chubb. 2004. Birds in a bush: five genes indicate explosive evolution of avian orders. Evolution 58:404–415.

Rambaut, A. 2002. Sequence alignment editor, Version 2.0. Available as shareware from http://evolve.zoo.ox.ac.uk/index.html.

Rokas, A., D. Kruger, and S. B. Carroll. 2005a. Animal evolution and the molecular signature of radiations compressed in time. Science 310:1933–1938.

Rokas, A., and S. B. Carroll. 2005b. More genes or more taxa? The relative contribution of gene number and taxon number to phylogenetic accuracy. Mol. Biol. Evol. 22:1337–1344.

Rosenberg, M. S., and S. Kumar. 2001. Incomplete taxon sampling is not a problem for phylogenetic inference. Proc. Natl. Acad. Sci. 98:10751–10756.

Sanderson, M. J., and H. B. Shaffer. 2002. Troubleshooting molecular phylogenetic analyses. Ann. Rev. Ecol. Syst. 33:49–72.

Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics 18:502–504.

Scheffer, S. J., I. S. Winkler, and B. M. Wiegmann. 2007. Phylogenetic relationships within the leaf-mining flies (Diptera: Agromyzidae) inferred from sequence data of multiple genes. Mol. Phylogenet. Evol. 42:756–775.

Sinclair, B. J., J. M. Cumming, and D. M. Wood. 1994. Homology and phylogenetic implications of male genitalia in Diptera-Lower Brachycera. Entomol. Scand. 24:407–432.

Soltis, P. S., D. E. Soltis, and M. W. Chase. 1999. Angiosperm phylogeny inferred from multiple genes as a tool for comparative biology. Nature 402:402–404.

Swofford, D. L. 2002. PAUP*. Phylogenetic Analysis Using Parsimony (* and Other Methods) Version 4. Sinauer Associates, Sunderland, MA.

Strimmer, K., and A. von Haeseler. 1997. Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc. Natl. Acad. Sci. 94:6815–6819.

Thorley, J. L., and R. D. Page. 2000. RadCon: phylogenetic tree comparison and consensus. Bioinformatics 16:486–487.

Thorley, J. L., and M. Wilkinson. 1999. Testing the phylogenetic stability of early tetrapods. J. Theor. Biol. 200:343–344.

Webb, D. W. 1974. A revision of the genus Hilarimorpha (Diptera: Hilarimorphidae). J. Kansas Entomol. Soc. 47:172–222.

Whitfield, J. B., and P. J. Lockhart. 2007. Deciphering ancient rapid radiations. Trends Ecol. Evol. 22:258–265.

Whitfield, J. B., and K. M. Kjer. 2008. Ancient rapid radiations of insects: challenges for phylogenetic analysis. Ann. Rev. Entomol. 53:449–472.

Wiegmann, B. M., C. Mitter, and F. C. Thompson. 1993. Evolutionary origin of the Cyclorrhapha (Diptera): test of alternative morphological hypotheses. Cladistics 9:41–81.

Wiegmann, B. M., S. C. Tsaur, D. W. Webb, D. K. Yeates, and B. K. Cassel. 2000. Monophyly and relationships of the Tabanomorpha (Diptera: Brachycera) based on 28S ribosomal gene sequences. Ann. Entomol. Soc. Am. 93:1031–1038.

Wiegmann, B. M., D. K. Yeates, J. L. Thorne, and H. Kishino. 2003. Time flies, a new molecular time-scale for brachyceran fly evolution without a clock. Syst. Biol. 52:745–756.

Winterton, S. L., B. M. Wiegmann, and E. I. Schlinger. 2007. Phylogeny and Bayesian divergence time estimations of small-headed flies (Diptera: Acroceridae) using multiple molecular markers. Mol. Phylogenet. Evol. 43:808–832.

Woodley, N. E. 1989. Phylogeny and classification of the “Orthorrhaphous” Brachycera. Man. Nearct. Diptera 3:1371–1392.

Yang, L., B. M. Wiegmann, D. K. Yeates, and M. E. Irwin. 2000. Higher-level phylogeny of the Therevidae (Diptera: Insecta) based on 28S ribosomal and elongation factor-1α gene sequences. Mol. Phylogenet. Evol. 15:440–451.

Yeates, D. K. 1994. Cladistics and classification of the Bombyliidae (Diptera: Asiloidea). Bull. Am. Mus. Nat. Hist. 219:1–19.

Yeates, D. K. 2002. A quantitative phylogenetic analysis of the Brachycera (Diptera). Zoologica Scripta 31:105–121.

Yeates, D. K., M. E. Irwin, and B. M. Wiegmann. 2003. Ocoidae, a new family of asiloid flies (Diptera: Brachycera: Asiloidea), based on Ocoa chilensis from Chile, South America. Syst. Entomol. 28:417–431.

Yeates, D. K., and B. M. Wiegmann. 1999. Congruence and controversy: toward a higher-level phylogeny of Diptera. Ann. Rev. Entomol. 44:397–428.

Yeates, D. K., B. M. Wiegmann, G. W. Courtney, R. Meier, C. Lambkin, and T. Pape. 2007. Phylogeny and systematics of Diptera: Two decades of progress and prospects. Zootaxa 1668:565–590.

Zwickl, D. J. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence data sets under the maximum likelihood criterion. Ph.D. dissertation, The University of Texas at Austin.

Table 1. Sampled taxa

Taxon Genus species Outgroups TABANAMORPHA Pelecorhyncus personatus Vermelionidae Leptynoma sp. XYLOPHAGAMORPHA Heterostomus sp. STRATIOMYOMOPRPHA Actina sp. Pantopthalmus sp. NEMESTRINOIDEA Hirmoneura sp. Ingroup ASILOIDEA Asilidae Diogmites Dasypogon diadem Machimus Ommatius Apioceridae haruspex Mydidae Tongamyia Scenopinidae Prorates sp. Scenopinus sp. Stenomphrale sp. Therevidae Ectinorhynchus sp. Effatouniella sp. Lysilinga sp. Bombyliidae Heterotropus sp. Sericosoma sp. Geminaria sp. Antonia sp. Pantarbes sp. Neosardus sp. Neosardus sp. sp. Aphoebantus sp. major Pteraulax sp. Epacmus sp. Tomomyza sp. Mythicomyiinae Cyrtosiini Genus sp.


Mnemomyia sp. Glabellula sp. Mythicomyia sp. Hilarimorphidae Hilarimorpha sp. Apystomyia sp. EMPIDOIDEA Empis sp. Empididae Hilara sp. Dolichopus sp. Atelestus pulicarus Atelestidae Mehyperus sudeticus CYCLORRHAPHA Lonchoptera fusca Pipunculus houghi Syrphidae Rhingia sp. Paraplatypeza atra


Table 2. Clade recovery under MP, ML and BI with varying taxon and character inclusion

Table 3. Leaf stability values generated from MP bootstrap trees for all taxa, stable taxa only and our reduced taxa data set.

All taxa                 Stable taxa              Reduced taxa
Leaf            Diff.    Leaf            Diff.    Leaf            Diff.
Hilarimorpha    0.3306   Mydidae         0.7617   Stenomphrale    0.5291
Stenomphrale    0.4076   Sericosoma      0.7884   Scenopinus      0.5291
Scenopinus      0.4076   Actina          0.7904   Hirmoneura      0.5372
Ogcodes         0.4139   Bombylius       0.7923   Mythicomyia     0.5373
Mesophysa       0.4268   Diogmites       0.7988   Hilarimorpha    0.5617
Heterotropus    0.4278   Dasypogon       0.7988   Prorates        0.6172
Evocoa          0.4329   Pantopthalmus   0.7994   Heterostomus    0.6287
Mnemomyia       0.4866   Apsilocephala   0.8032   Apiocera        0.6425
Cyrtosiini      0.4866   Efflatouniella  0.8037   Tongomyia       0.6552
Mythicomyia     0.4873   Ectinorhyncus   0.8038   Mydidae         0.6598
Glabellula      0.4873   Laxotela        0.8038   Apsilocephala   0.7079
Leptynoma       0.4876   Tomomyza        0.8047   Bombylius       0.7101
Hirmoneura      0.5146   Pteralaux       0.8050   Lordotus        0.7109
Tongomyia       0.5485   Ommatius        0.8050   Neosardus       0.7123
Prorates        0.5552   Machimus        0.8050   Epacmus         0.7128
Apiocera        0.5616   Pantarbes       0.8060   Diogmites       0.7215
Average         0.5646   Neosardus       0.8154   Dasypogon       0.7215
Mydidae         0.5728   Neosardus       0.8154   Efflatouniella  0.7271
Dolichopus      0.5772   Epacmus         0.8175   Ectinorhyncus   0.7297
Apystomyia      0.5793   Aphoebantus     0.8175   Laxotela        0.7297
Atelestus       0.5797   Lordotus        0.8196   Ommatius        0.7347
Meghypherus     0.5797   Amphicosmus     0.8196   Machimus        0.7347
Empis           0.5797   Geminaria       0.8196   Dolichopus      0.7846
Hilara          0.5797   Heterostomus    0.8325   Apystomyia      0.7860
Lonchoptera     0.5830   Dolichopus      0.8484   Lonchoptera     0.7930
Paraplatypeza   0.5832   Lonchoptera     0.8495   Paraplatypeza   0.7934
Pipunculus      0.5835   Paraplatypeza   0.8500   Atelestus       0.7948
Rhingia         0.5835   Apystomyia      0.8501   Meghypherus     0.7948
Diogmites       0.5849   Rhingia         0.8506   Empis           0.7950
Dasypogon       0.5849   Pipunculus      0.8507   Hilara          0.7950
Apsilocephala   0.5963   Empis           0.8517   Rhingia         0.7952
Ommatius        0.5988   Hilara          0.8517   Pipunculus      0.7953
Machimus        0.5988   Atelestus       0.8518   Average         0.7024
Efflatouniella  0.6011   Meghypherus     0.8518
Ectinorhyncus   0.6015   Average         0.8186
Laxotela        0.6015
Bombylius       0.6042
Tomomyza        0.6057
Sericosoma      0.6084
Pteralaux       0.6089
Pantarbes       0.6101
Neosardus       0.6137
Neosardus       0.6137
Epacmus         0.6146
Aphoebantus     0.6146
Lordotus        0.6163
Amphicosmus     0.6163
Geminaria       0.6163
Actina          0.6198
Pantopthalmus   0.6582
Heterostomus    1

Table 4. Tests of a priori hypotheses. Based on the taxa set 'reduced', CAD (bp 1+2) and 28S rDNA. * Significant difference at p<0.05.

Hypotheses                                  Maximum parsimony:                    Likelihood based:
                                            Wilcoxon signed-ranks test p*         Kishino-Hasegawa test p*
Asiloidea+Eremoneura                        0.7905                                0.5931
Bombyliidae+(Asiloidea+Eremoneura)          1                                     1
Asiloidea+(Bombyliidae+Eremoneura)          0.6171, 0.5716, 0.3877                0.5051, 0.4498, 0.2483
Asiloidea+(Hilarimorphidae+Eremoneura)      0.0137*, 0.0339*, 0.0184*, 0.0426*    0.0367*, 0.0881, 0.0641, 0.0704
Asiloidea+Hilarimorpha                      0.4497                                0.3779
Asiloidea+Apystomyia                        0.0240*, 0.0291*                      0.0258*, 0.0353*

Figure 1. MP analysis of 28S and CAD (bp 1+2) (3505 total characters, 469 parsimony-uninformative, 866 parsimony-informative). Strict consensus of 6 trees of length 5748. All taxa included.

Figure 2. Bayesian analysis of 28S + CAD (bp 1+2) including all taxa.

Figure 3. ML tree of 28S+CAD (bp 1+2) including all taxa.

Figure 4. MP analysis of 28S, CAD, TPI, and CO1. 6915 characters (nt 1+2) and 717 parsimony-informative characters.

Figure 5. The MP bootstrap majority-rule consensus tree for stable taxa only exhibits greater resolution in comparison to the bootstrap consensus tree with all taxa included.

Figure 6. MP, ML and BI congruent topology including ‘reduced taxa’. The placement of the genus Hilarimorpha is the only point of conflict between all 3 methods of analysis.

Figure 7. Four-cluster likelihood mapping image showing distribution of phylogenetic signal supporting various arrangements of asiloids (ASIL), bombyliids (BOM), Eremoneura (ERE) and outgroups (OUT). 17.8% of quartets remain unresolved, indicating a large percentage of non-phylogenetic signal.

Chapter 2:

The evolutionary relationships of the bee fly subfamilies: short branches, long branches and topological incongruence

Abstract

Bombyliidae, or bee flies, are a lower brachyceran family of flower-visiting flies that, as larvae, act as parasitoids of other insects. Including almost 5000 spp., the bee flies are both species-rich and extremely morphologically diverse. The subfamily relationships are primarily known from a single, previous morphological analysis that yielded minimal support for higher-level groupings. We use the protein-coding gene CAD and 28S rDNA in combination with existing morphological data to test the monophyly of the existing subfamilies and of the higher-level divisions Tomophthalmae and ‘the sand chamber subfamilies,’ and to determine the intersubfamilial relationships. We explore the topological incongruence between analysis methods and the effects of the excision of conflict-inducing taxa. We find 8 subfamilies to be monophyletic, with the interrelationships of most subfamilies represented by short branches with low support. Our data do not support the monophyly of Tomophthalmae or the ‘sand chamber subfamilies’.

Introduction

Bombyliidae, commonly known as bee flies, are among the most species-rich (more than 4500 species) and morphologically diverse families of Diptera. Their name derives from their frequent appearance as hymenopteran mimics. Bee flies have a dynamic life history, existing as flower visitors as adults and as parasitoids (ectoparasitoids, endoparasitoids and hyperparasitoids) of insects as larvae. They are known to be parasitoids of crop pests, such as Noctuidae, but they also parasitize beneficial insects, such as solitary bees (Hull, 1973; Yeates and Greathead, 1997). Their economic relevance as pollinators and parasitoids of pests has not been explored. Bee flies occur across all zoogeographic regions, with the greatest diversity in arid, sandy habitats. The morphological diversity within the family Bombyliidae is extreme, ranging from microbombyliids as small as the head of a pin to large, speckled-winged anthracines with a wingspan greater than 3 cm. Though many common bee flies can be recognized by a unique combination of characters, no non-homoplastic diagnostic synapomorphy unites the family (Yeates, 1994). This striking diversity has contributed to the complexity of determining their evolutionary relationships using morphology alone.

Bombyliids are lower brachyceran flies that have long been considered members of the superfamily Asiloidea, along with robber flies (Asilidae), mydas flies (Mydidae) and stiletto flies (Therevidae). They have also been previously grouped with Nemestrinidae and based on the parasitic nature of the larvae of all three of these lower brachyceran families. Recent analyses, however, provide evidence for placing the bee flies as the sister group to the asiloids + the higher flies (Trautwein et al., in prep). Putative bombyliid fossils have been found that are millions of years older than any other asiloid fossil (Grimaldi and Engel, 2005; Wedmann and Yeates, 2008), indicating the possible early divergence of this family.

The classification of bee fly tribes and genera into subfamilies, and the relationships between these subfamilies, have been even more confounding than the higher-level placement of bombyliids. Yeates (1994) and Hull (1973) provide the most thorough reviews of the complex history of bee fly classification and phylogenetics. The two largest bee fly subfamilies, and Anthracinae, include the vast majority of bee flies and were once the only two subfamilies in the classification (Hull, 1973). Bee fly classification later became 'top heavy with subfamilies' (Hull, 1973; Yeates, 1994). An early phylogenetic analysis of the family by Muhlenberg (1971) was based on morphology and laid the ground plan for the higher-level relationships within bee flies. Only one other, more recent quantitative analysis has been completed across the family (Yeates, 1994). Yeates' thorough analysis, also based on morphology, resulted in the maintenance of 15 bee fly subfamilies that were largely established by Hull (1973). Yeates' classification, based on phylogeny, is the foundation for our molecular study.

The bee flies are currently divided into two major groups, Homeophthalmae and Tomophthalmae (Bezzi, 1924; Evenhuis, 1991). The defining features of Homeophthalmae and Tomophthalmae have been altered over the years, and the current definition was established by Evenhuis in 1991. 'Homeophthalmae' is a paraphyletic group that includes a ladder of basal bombyliids that have one occipital foramen corresponding to a flat or bulbous postcranium. Tomophthalmae form a monophyletic group of more derived bee flies that all possess two occipital foramina and varying degrees of concavity of the postcranium. Another important morphological character that divides the bombyliids is a female genitalic feature called a sand chamber. The sand chamber allows bee flies to protectively coat their eggs in sand before flicking them onto the substrate in the close vicinity of a potential host for their larvae.

The most species-rich subfamilies possess sand chambers, and it has been hypothesized that this feature ensures a specialization to ground-dwelling hosts that has allowed bee flies to avoid direct competition with more efficient parasitoids, such as hymenopterans and tachinids (Yeates, 1994; Yeates and Greathead, 1997; Wedmann and Yeates, 2008). Although a loss of the sand chamber appears in some derived bee flies, subfamilies that possess a sand chamber form a monophyletic group, previously known as Psammomorphidae, now termed 'sand chamber subfamilies' (Yeates, 1994; Muhlenberg, 1971). This group includes the members of Tomophthalmae along with their closest relatives, Bombyliinae and Crocidiinae.

Our study is the first to utilize molecular data to address the evolutionary relationships across the family Bombyliidae. Using the nuclear protein-coding gene CAD and 28S ribosomal DNA, we aim to test the monophyly of currently accepted subfamilies, to determine the relationships between these subfamilies, and to test the monophyly of the large divisions Tomophthalmae and the sand chamber subfamilies. Yeates' (1994) morphological phylogeny is the foundation for our work, as well as a source for independent comparison of our findings. We explore the topological conflict in our results, examine the effects of taxon sampling, and take a 'total evidence' approach by analyzing our concatenated molecular data combined with Yeates' (1994) morphological data set.

Materials and Methods

Taxon sampling

Fifty-nine taxa, including 55 genera, representing 13 out of 15 bee fly subfamilies and 4 lower brachyceran outgroups, were sampled for this study (Table 1). The two subfamilies not represented are small monogeneric subfamilies with limited zoogeographical distribution. Oligodraninae includes a small genus restricted to the Palearctic that was previously placed in Phthiriinae or Usiinae (Evenhuis, 1990; Yeates, 1994). Oniromyiinae, with only two species, is found only in South Africa.

DNA Extraction, Amplification and Sequencing

Genomic DNA was extracted using the DNeasy DNA extraction kit (QIAGEN Inc., Valencia, CA). The standard protocol was altered by extending the time the specimen spent in proteinase K solution to two days, in order to break down chitin effectively without grinding the specimen. The final elution volume was reduced to 30 µl to avoid diluting the DNA solution.

Sequence data were collected from all 59 taxa for two nuclear genes, the protein-coding gene CAD (carbamoyl phosphate synthetase-aspartate transcarbamoylase-dihydroorotase) and 28S ribosomal DNA. For CAD, approximately 4000 bp from the carbamoylphosphate synthase (CPS) domain of the gene were amplified and sequenced using Diptera-specific degenerate primers designed by Moulton (Moulton and Wiegmann, 2004), in addition to primers designed to amplify across bee flies and primers designed for specific bee fly taxa. To amplify approximately 1000 bp from the 3' end of 28S rDNA, we used published Diptera primers (Yang, 2000). PCR parameters varied for CAD and 28S. PCR products were extracted from low-melt agarose gels and purified with the QIAquick Gel Extraction kit (Qiagen, Santa Clara, CA). Big Dye Sequencing kits (Applied Biosystems, Foster City, CA) were used for sequencing reactions, and sequencing was completed at the North Carolina State University Genome Sequencing Laboratory (GSL). Sequences were assembled into contigs and edited using Sequencher 4.1 (Gene Codes Corp., Ann Arbor, MI).

Alignment was carried out manually using Se-Al 2.0 (Rambaut, 2002). CAD was aligned according to the amino acid translation. Introns in CAD, hypervariable regions of 28S and other positions of ambiguous alignment were removed from the data set. To detect base compositional bias, a chi-square test for homogeneity of base frequencies across taxa was performed for the concatenated CAD+28S data set and for each gene and codon position of CAD independently using PAUP* 4.0b10 (Swofford, 2002) and TREE-PUZZLE (Schmidt et al., 2002). Unlike PAUP*, TREE-PUZZLE identifies the particular taxa that fail the homogeneity test, allowing one to remove taxa as a means to rectify biases due to compositional heterogeneity.
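The homogeneity test described above can be illustrated with a minimal sketch. This is not the PAUP* or TREE-PUZZLE implementation; it assumes only the textbook chi-square statistic on a taxon-by-base contingency table, and the function name and toy sequences are hypothetical.

```python
from collections import Counter

BASES = "ACGT"


def base_homogeneity_chi2(alignment):
    """Return (chi-square statistic, degrees of freedom) for a taxon-by-base table.

    `alignment` maps taxon name -> aligned sequence. Gaps and ambiguity
    codes are ignored because only A, C, G and T are counted.
    """
    counts = {}
    for taxon, seq in alignment.items():
        tally = Counter(seq.upper())
        counts[taxon] = [tally[base] for base in BASES]

    grand_total = sum(sum(row) for row in counts.values())
    column_totals = [sum(row[i] for row in counts.values()) for i in range(len(BASES))]

    chi2 = 0.0
    for row in counts.values():
        row_total = sum(row)
        for observed, column_total in zip(row, column_totals):
            expected = row_total * column_total / grand_total
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    dof = (len(counts) - 1) * (len(BASES) - 1)
    return chi2, dof


# Hypothetical toy sequences, not data from this study.
toy = {
    "taxon_A": "ACGTACGTACGT",
    "taxon_B": "ACGTACGTACGT",
    "taxon_C": "AAAAAAAAACGT",  # A-rich, so it inflates the statistic
}
statistic, dof = base_homogeneity_chi2(toy)
```

The statistic would then be compared against a chi-square distribution with `dof` degrees of freedom. Sites in an alignment are not independent, so a test of this kind is usually treated as a rough screen for compositional bias.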

Phylogenetic Analyses

Maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI) analyses were completed for both genes combined and for each gene independently. In addition, we conducted MP and BI 'total evidence' analyses that included 154 morphological characters generated by Yeates (1994) combined with our concatenated molecular data. To avoid the effects of saturation and systematic bias due to base composition in the 3rd codon position, all analyses including CAD were completed with and without 3rd positions.
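As an illustration of what excluding 3rd positions means in practice, the following sketch drops every third site from an in-frame coding sequence and lists the columns belonging to each codon position. It assumes the alignment starts at codon position 1 with introns already removed; both function names are hypothetical.

```python
def drop_third_positions(seq):
    """Keep codon positions 1 and 2 of an in-frame coding sequence."""
    return "".join(base for i, base in enumerate(seq) if i % 3 != 2)


def codon_position_columns(length):
    """Map codon position (1, 2, 3) to the 0-based alignment columns it covers."""
    return {pos: list(range(pos - 1, length, 3)) for pos in (1, 2, 3)}


# Hypothetical in-frame fragment: ATG GCC AAA
reduced = drop_third_positions("ATGGCCAAA")
columns = codon_position_columns(9)
```

The same column lists could be used to define one partition per codon position when 3rd positions are retained.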

Parsimony Analyses

Maximum parsimony analyses were done using PAUP* 4.0b10 (Swofford, 2002). Heuristic searches with TBR branch swapping and 500 random addition replicates were completed to find the shortest trees. Node support was assessed with bootstrap values obtained from heuristic searches of 500 resampled data sets, each with 50 random addition replicates.
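The resampling step behind the bootstrap values can be sketched as follows. This is only an illustration of the nonparametric bootstrap, not PAUP* itself: the tree search on each pseudoreplicate is omitted, and the function names and toy alignment are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Build one pseudoreplicate by resampling alignment columns with replacement."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in columns) for taxon, seq in alignment.items()}


def bootstrap_support(clade_found_per_replicate):
    """Percentage of pseudoreplicate trees in which a given clade was recovered."""
    return 100.0 * sum(clade_found_per_replicate) / len(clade_found_per_replicate)


# Hypothetical toy alignment, not data from this study.
toy = {"taxon_A": "ACGTAC", "taxon_B": "ACGTTT"}
replicate = bootstrap_alignment(toy, random.Random(1))
support = bootstrap_support([True, True, False, True])
```

In a full analysis, each of the 500 pseudoreplicates would be searched heuristically, and the support for a node would be the share of replicate trees containing it.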

Bayesian Analyses

An appropriate model of nucleotide evolution, in this case GTR+I+G, was chosen using MrModeltest (Nylander, 2004). Using MrBayes (Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003), analyses were run for 20,000,000 generations, with trees sampled every 1000 generations and the first 25% discarded as burn-in. For nucleotide analyses, the GTR+I+G model was used with each gene treated as a separate partition; when 3rd positions were included, however, each codon position of CAD was treated as a separate partition. For our total evidence analysis, we used a standard discrete (morphology) model (Lewis, 2001) for the morphological partition.
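The sampling scheme above implies a fixed number of retained trees per run, which the following short sketch works out. Whether the starting state is counted as a sample is an assumption here (exposed as the hypothetical `include_start` flag), as is rounding the burn-in down.

```python
def retained_samples(generations, sample_every, burnin_fraction, include_start=True):
    """Return (total samples, discarded as burn-in, retained) for one MCMC run."""
    total = generations // sample_every + (1 if include_start else 0)
    discarded = int(total * burnin_fraction)  # assumption: burn-in count is rounded down
    return total, discarded, total - discarded


with_start = retained_samples(20_000_000, 1000, 0.25)
without_start = retained_samples(20_000_000, 1000, 0.25, include_start=False)
```

Under either convention, roughly 15,000 sampled trees per run would contribute to the posterior summary.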

Maximum Likelihood Analyses

Maximum likelihood analyses were performed using GARLI (Zwickl, 2006) with a GTR+I+G model. Node support was assessed with 500 bootstrap replicates.

Conflict visualization

In order to visualize the conflicting topologies returned from MP, ML and BI analyses of our molecular data (3rd positions removed), we generated a consensus network in SplitsTree (Huson and Bryant, 2006). We used the consensus network to identify taxa that appeared to be central to the conflict between topologies. We then removed these taxa and reanalyzed our concatenated data set in MP, ML, and BI (with and without 3rd positions of CAD, with and without morphology).
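A consensus network displays the splits (bipartitions of the taxon set) found in the input trees, including splits that contradict one another. The sketch below is not the SplitsTree algorithm; it only extracts the non-trivial splits from trees written as nested tuples and reports the incompatible pairs, which is the kind of conflict such a network makes visible. The tree encoding, function names and taxon labels are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial splits of a tree, each stored as the side
    that does not contain the alphabetically first taxon."""
    all_taxa = leaves(tree)
    anchor = min(all_taxa)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        side = all_taxa - clade if anchor in clade else clade
        if 1 < len(side) < len(all_taxa) - 1:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def compatible(a, b, all_taxa):
    """Two splits are compatible if at least one of the four intersections
    of their sides is empty."""
    return any(not (x & y) for x in (a, all_taxa - a) for y in (b, all_taxa - b))


def conflicting_splits(trees):
    """List every pair of incompatible splits found across the input trees."""
    all_taxa = leaves(trees[0])
    pool = sorted(set().union(*(splits(t) for t in trees)), key=sorted)
    return [(a, b) for i, a in enumerate(pool) for b in pool[i + 1:]
            if not compatible(a, b, all_taxa)]


# Two hypothetical topologies that disagree on the placement of B and C.
tree_1 = (("A", "B"), "C", ("D", "E"))
tree_2 = (("A", "C"), "B", ("D", "E"))
conflicts = conflicting_splits([tree_1, tree_2])
```

In this toy case the only conflict is between the splits AB|CDE and AC|BDE, so B and C would be flagged as the taxa whose placement differs, while the split DE|ABC is shared by both trees.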

Results and Discussion

Molecular findings

MP, ML and Bayesian analyses of the combined 28S rDNA (1136 bp) + CAD data set (2536 bp excluding 3rd positions) yield similar topologies showing high support for the placement of early branching lineages and the monophyly of various tribes and subfamilies (Fig. 1-3). Our trees generally lack support for higher-level groupings and exhibit very short branches depicting most of the inter-relationships of bee fly subfamilies. Out of the 14 subfamilies included in our data set, 9 are represented by more than one taxon and thus were sampled to test for monophyly. Six of these nine subfamilies were found to be monophyletic in all methods of analysis, including Mythicomyiinae, Toxophorinae, , Tomomyzinae, Usiinae (sensu Evenhuis, 1990; excluding the Phthiriinae) and Lordotinae. Under some methods, Cythereinae (MP and ML) and , with the inclusion of the subfamily Antoniinae (ML and BI), were also found to be monophyletic. Bombyliinae exhibited far-ranging polyphyly in all methods of analysis. Tomophthalmae, the bee flies with two occipital foramina corresponding to a concave postcranium, Homeophthalmae and the 'sand chamber subfamilies' were not found to be monophyletic under any method of analysis.
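Testing a subfamily for monophyly amounts to asking whether some clade of the rooted tree contains exactly the sampled members of that subfamily and nothing else. A minimal sketch follows, assuming a tree rooted on an outgroup and written as nested tuples; the function names and taxon labels are hypothetical and do not represent our results.

```python
def clade_sets(tree):
    """Return (leaf set of the tree, leaf sets of all of its internal nodes)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    here = frozenset()
    internal = []
    for child in tree:
        child_leaves, child_internal = clade_sets(child)
        here |= child_leaves
        internal.extend(child_internal)
    internal.append(here)
    return here, internal


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the taxa in `group`."""
    group = frozenset(group)
    all_leaves, internal = clade_sets(tree)
    if len(group) == 1:
        return group <= all_leaves  # a single sampled taxon cannot fail the test
    return group in internal


# Hypothetical rooted tree: subfamily "a" forms a clade, subfamily "b" does not.
toy_tree = ("outgroup", (("a1", "a2"), ("b1", "c1")), "b2")
a_is_clade = is_monophyletic(toy_tree, {"a1", "a2"})
b_is_clade = is_monophyletic(toy_tree, {"b1", "b2"})
```

The single-taxon case shows why only subfamilies represented by more than one taxon can be tested in this way.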

Molecular analyses with conflict-inducing taxa removed

Though all three methods of analysis primarily agree on the monophyly of the subfamilies, their placement within the family and their relationships to each other vary. The majority of the incongruence exists between the MP tree and topologies from the model-based methods of analysis (ML and BI). Parsimony is more likely to succumb to systematic biases such as long-branch attraction; therefore, conflicting results from MP and model-based methods can indicate such potential systematic biases at work (Felsenstein, 1978; Hendy and Penny, 1989; Kennedy et al., 2005; Bergsten, 2005). When evaluating the phylogeny of ancient radiations, such as this one, it has been suggested that model-based methods may outperform MP due to their ability to compensate for multiple changes per nucleotide site and rate heterogeneity between sites (Whitfield and Kjer, 2008). A consensus network of all three topologies (Fig. 4) allows for the direct visualization of the conflict between the MP, ML and BI trees and aids in the identification of taxa potentially generating the conflict and the regions of the tree affected.

The network displays many conflicting splits in the topology; however, close investigation reveals that the conflict centers around relatively few taxa: Sericosoma, Eusurbus, Thevenetimyia, Corsomyza and Phthiria. All of these taxa have an ambiguous phylogenetic history and are members of anomalous subfamilies or tribes. In our current analyses, these taxa are mostly long-branched. The removal of long-branched or unstable 'rogue' taxa has been shown to increase support values (Sanderson and Shaffer, 2002; Rokas et al., 2005) and may rectify artifactual placements in topologies (Bergsten, 2005). Sericosoma is an enigmatic genus that was left incertae sedis in Yeates' morphological study (1994). Corsomyza and Phthiria are single representatives of the subfamilies Mariobezziinae and Phthiriinae. Thevenetimyia and Eusurbus are members of the problematic subfamily Bombyliinae, included in the tribes Eclimini and Bombyliini, respectively. The removal of these taxa results in topologies from MP, ML and BI that have increased support values and are largely congruent with respect to the relationships between bee fly subfamilies (Fig. 5). The backbone of the tree remains replete with short internodes, representing a dearth of molecular characters to support the higher-level relationships among the bee flies, but the congruence between methods of analysis indicates the retrieval of decisive signal. Considering the superior performance of model-based methods for old lineages, in addition to their providing the means to estimate model parameters for individual data partitions, the BI tree with 'rogue' taxa removed is our best current DNA-based estimate of the evolutionary relationships of bee flies (Fig. 6).
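Removing the conflict-inducing taxa before reanalysis is a simple operation on the data matrix. The sketch below drops the named taxa from an alignment and discards any columns left with only gaps or missing data. The function name and toy sequences are hypothetical, and the sketch does not reproduce how the matrices were actually edited for this study.

```python
ROGUE_TAXA = {"Sericosoma", "Eusurbus", "Thevenetimyia", "Corsomyza", "Phthiria"}


def remove_taxa(alignment, rogue, gap_chars="-?"):
    """Drop the named taxa, then drop columns that hold only gaps or missing data."""
    kept = {taxon: seq for taxon, seq in alignment.items() if taxon not in rogue}
    n_sites = len(next(iter(kept.values())))
    columns = [i for i in range(n_sites)
               if any(seq[i] not in gap_chars for seq in kept.values())]
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in kept.items()}


# Hypothetical toy sequences: the middle column is informative only in Sericosoma.
toy = {"Sericosoma": "ACC", "Usia": "A-G", "Mythicomyia": "A-T"}
reduced = remove_taxa(toy, ROGUE_TAXA)
```

The reduced matrix would then be analyzed again under MP, ML and BI so that the topologies can be compared with those from the full taxon set.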

The topology of the Bombyliidae subfamilies in our molecular phylogeny differs greatly from the morphological topology of Yeates (1994) and is more resolved. Both morphology and molecules do agree, however, that the subfamily of microbombyliids, Mythicomyiinae, is an early branching lineage of bee flies. Fossil evidence has previously demonstrated that microbombyliids are a much older lineage than the remaining bee fly radiation (Lamas and Nihei, 2007; Grimaldi and Engel, 2005; Wedmann and Yeates, 2008). Additional evidence of their early divergence is their predatory feeding as larvae: unlike other bee fly subfamilies, they share this plesiomorphic life history with other lower brachyceran and asiloid flies. Though there has been some consensus that the microbombyliids are the sister group to the remaining bee flies, there is disagreement over whether the differences between microbombyliids and other bee flies are sufficient to merit family status as Mythicomyiidae (Evenhuis, 2002; Yeates, 1994; Zaytsev, 1992; Rohdendorf, 1974).

In our molecular phylogeny, the sister-group to Mythicomyiinae is not the rest of Bombyliidae but, instead, the monogeneric subfamily Heterotropinae. Heterotropinae was previously a catch-all subfamily that included misplaced scenopinids and other hard-to-place asiloid-like flies. It has since been revised, and only the namesake genus Heterotropus remains (Theodor, 1983; Evenhuis, 1991). Afrotropical larval Heterotropus, identified by Yeates and Irwin (1992), were found to lack a penultimate spiracle, the synapomorphy unifying Asiloidea, and thus it was hypothesized that the genus may belong outside of Bombyliidae at the base of the asiloid radiation. Yeates' 1994 analysis, however, maintained Heterotropus within the bee flies in a more derived placement than our data suggest. Heterotropus, like mythicomyiids and in contrast to the rest of Bombyliidae, are minute, enigmatic flies with a predatory larval stage. These similarities lend additional credence to our novel and well-supported molecular results.

A fundamental contrast between our molecular phylogeny and the morphological hypothesis set forth by Yeates (1994) is that while the morphological topology exhibits a ladder of subfamilies, with each successive subfamily being the sister-group of the remainder of the family (reflecting the fact that there is little information regarding how the subfamilies relate to one another), our molecular trees exhibit more complex inter-subfamilial relationships. A new basal grouping of subfamilies is the sister-group relationship between Tomomyzinae and Lomatiinae. This is an unexpected placement considering that morphology placed Tomomyzinae among the two most derived bee fly subfamilies; however, there are some morphological features shared between specific members of Tomomyzinae and Lomatiinae, such as the lack of prealar bristles (Yeates, 1994), that could be reinvestigated. Another supra-subfamily clade recovered in all analyses is the grouping of the large subfamily Anthracinae with Antoniinae, Cythereinae, and the tribe Bombyliini of Bombyliinae. This clade, often appearing as a polytomy in our analyses, is similar to the polytomy recovered in Yeates' reanalysis of the initial cladistic data gathered by Muhlenberg (1971). Anthracinae, Antoniinae and Cythereinae were also previously united as part of a monophyletic Tomophthalmae, though the inclusion of a tribe of Bombyliinae and the exclusion of Tomomyzinae, Lomatiinae, Crocidiinae and Sericosoma was unexpected.

The large subfamily Bombyliinae is not recovered as a monophyletic assemblage in our analyses. Yeates states that 'the Bombyliinae itself is a weak clade supported by two apomorphies replete with homoplasy...' (1994). In the World Catalog of Bee Flies, Evenhuis and Greathead (1999) express that despite the high degree of homoplasy, 'it is a readily recognizable subfamily- usually robust flies with dense hair on the body and a well developed sand chamber guarded by dense hair'. Three tribes of Bombyliinae were sampled: Bombyliini, Conophorini, and Eclimini. None of these tribes was recovered in close vicinity of the others, and only one, Conophorini, was found to be monophyletic (though 3 out of 5 members of the tribe Bombyliini also generally formed a clade). It is significant that the tribe Conophorini was found to be the sister-group to Lordotinae, a subfamily whose included genera were previously members of the tribe Conophorini before being raised to subfamily status by Yeates' morphological revision. Lordotinae and Conophorini share some morphological similarities, including the possession of one midtibial spur, and our molecular data indicate that there is evidence for considering their reunification. The tribe Eclimini was found to be distantly polyphyletic. The Australian genus Marmasoma, interestingly, was placed as the sister-group to the subfamily Toxophorinae, in which it was previously included in Hull's “Bee Flies of the World” (1973).

Anthracinae, the most species-rich subfamily of bee flies, does exhibit monophyly in some analyses, though it includes the single taxon representative of the subfamily Antoniinae. Other analyses (BI, ML with reduced taxa) show the anthracines as unresolved or rendered polyphyletic by the tribe Aphoebantini, which separates itself from the remainder of the subfamily. Our analyses do confirm, in agreement with the phylogenetic analyses completed by Yeates (1994) and Muhlenberg (1971), and in contrast to Becker (1913), that the anthracines are a relatively derived subfamily of bee flies.

Tomophthalmae and the 'sand chamber subfamilies', both previously considered monophyletic groups, are not recovered in our analyses. This indicates either that CAD and 28S do not convey a sufficient amount of phylogenetic signal for the accurate resolution of the relationships between bee fly subfamilies, or that the morphological characters defining these two groups, the presence of two occipital foramina and the sand chamber, exhibit a great deal of plasticity in their gain and loss across subfamily-level divergences. The 'sand chamber subfamilies' as defined by Yeates (1994) did already include apomorphic instances of secondary loss of this female genitalic character, in members of Antoniinae, for example. Evenhuis and Greathead's 'phylogenetic considerations' in the “World Catalog of Bee Flies” (1999) also discuss the variations and modifications of the sand chamber among the members of the 'sand chamber subfamilies'. It seems that homoplasy was already evident within this group, yet the scale of the homoplasy, now seen across virtually the entire family tree, will have to be revisited.

Total Evidence

Our total evidence Bayesian analysis, CAD+28S combined with 155 morphological characters, results in a tree that is largely similar to our Bayesian molecular tree including all taxa, yet the total evidence tree is fully resolved and achieves higher support values. The five 'rogue taxa' removed from our previously discussed analyses are included here; three of the five exhibit branches that are longer than average and have phylogenetically questionable placements. The high support values found across this tree generate confidence in many of the hypothesized relationships; however, simulations have shown that posterior probabilities can overstate support for short branches (Alfaro et al., 2003). Considering that the backbone of inter-relationships of bee fly subfamilies is built of short internodes with posterior probabilities of less than 95%, we recognize that phylogenetic uncertainty remains across the family, and our data do not provide robust evidence for a new classification of higher-level divisions of bee flies.

Congruence, branch lengths, and future work

Though, in general, the 'true' tree of evolutionary relationships cannot be confirmed, the evaluation of congruence between independent character sets can establish an appropriate level of confidence in phylogenetic results. Although a great deal of taxonomic work has been done on the bee flies, only one other modern, quantitative phylogenetic analysis has been completed for the family, and our findings show very limited congruence with this existing morphological tree, which itself showed limited resolution and a backbone of short branches. Thus we have limited independent confirmation of our findings, and a recent study of incongruence between morphological and molecular trees provides us no indication that molecular trees should be preferred a priori (Pisani et al., 2007).

The 28S gene, alone and in combination with CAD, has provided resolution to many areas of the fly tree of life (Bertone et al., 2008; Moulton and Wiegmann, 2004; Moulton and Wiegmann, 2007; Wiegmann et al., 2003; Winterton et al., 2007; Scheffer et al., 2007); however, these genes have not proven effective for the bee flies and their superfamily, Asiloidea (Trautwein et al., in prep). Morphological and molecular analyses, including the current study, establish that the evolutionary relationships in this region of the fly tree are particularly challenging to resolve, exhibiting both short branches and low support values. Ancient radiations, such as the bee flies, can be difficult to recover with limited data, particularly if taxon diversification took place in rapid succession, leaving few polarizing characters to discern the relationships among lineages (Rokas et al., 2005; Whitfield and Lockhart, 2007; Whitfield and Kjer, 2008). To rectify these short branches and low support values, we will need the resolving power of additional genes. Further investigation may determine that this region of the fly tree actually experienced a rapid radiation 200 million years ago, in which case only limited phylogenetic information will exist even if large amounts of data are added. Thus, our short branches may be an accurate reflection of evolutionary history (Whitfield and Kjer, 2008). Another problematic feature of short branches is that they are often indicative of regions of a species tree that exhibit incongruent gene histories (Wiens et al., 2008).

A few enigmatic lineages, represented in our analyses by single 'rogue taxa' that often exhibit long branches, will require more extensive taxon representation to achieve resolution in future studies. Short branches, as exhibited in our trees, are known to exacerbate the problem of long-branch attraction (Felsenstein, 1978; Wiens et al., 2008). The addition of taxa has been shown to break up long branches and increase phylogenetic accuracy (Graybeal, 1998; Heath et al., 2008). This will likely help determine the accurate placement of Sericosoma, Mariobezziinae, and the Bombyliinae tribe Eclimini, among others.

Conclusions

Our molecular study of the evolutionary relationships among the subfamilies of the bee flies confirms the monophyly of eight of fifteen subfamilies with varying levels of support and demonstrates wide-ranging polyphyly of the large subfamily Bombyliinae. We also introduce novel hypotheses of inter-subfamilial relationships, such as the well-supported sister-group relationship between the enigmatic subfamily Heterotropinae and the microbombyliids, Mythicomyiinae. We confirm, in agreement with morphology and fossil data, that Mythicomyiinae is an early branching lineage of bee flies. Although our analyses do not provide strongly supported resolution of the evolutionary relationships of all bee fly subfamilies, they do provide a molecular foundation for future work, along with a phylogenetic framework to guide future lines of inquiry.

Acknowledgments

We are grateful to D.K. Yeates, M.E. Irwin and N.L. Evenhuis for the provision and identification of specimens. Additional thanks to Brian Cassel for assistance in the collection of molecular data. This project was supported by US National Science Foundation (NSF) Assembling the Tree of Life (ATOL) grant EF-03394 to BMW and DKY.

References

Alfaro, M. E., S. Zoller, and F. Lutzoni. 2003. Bayes or bootstrap? A simulation study comparing the performance of Bayesian Markov Chain Monte Carlo sampling and bootstrapping in assessing phylogenetic confidence. Mol. Biol. Evol. 20:255-266.

Becker, T. 1913. Genera Bombyliidarum. Ezheg. Zool. Muz. 17:421-502.

Bergsten, J. 2005. A review of long-branch attraction. Cladistics. 21:163-193.

Bertone, M. A., G. W. Courtney, and B. M. Wiegmann. 2008. Phylogenetics and temporal diversification of the earliest true flies (Insecta: Diptera) based on multiple nuclear genes. Syst. Entomol. 33:668-687.

Bezzi, M. 1924. The Bombyliidae of the Ethiopian Region. London: British Museum, Natural History.

Evenhuis, N. L. 1990. Systematics and evolution of the genera in the subfamilies Usiinae and Phthiriinae (Diptera: Bombyliidae) of the world. Entomonograph. 11:1-72.

Evenhuis, N. L. 1991. World catalog of the genus-group names of bee flies (Diptera: Bombyliidae). Bishop Mus. Bull. Entomol. 5:1-105.

Evenhuis, N. L., and D. J. Greathead. 1999. World catalog of bee flies. Leiden: Backhuys Publishers.

Evenhuis, N. L. 2002. Catalog of the Mythicomyiidae of the World (Insecta: Diptera). Bishop Mus. Bull. Entomol. 10:1-85.

Felsenstein, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27:401-410.

Graybeal, A. 1998. Is it better to add taxa or characters to a difficult problem? Syst. Biol. 47:9-17.

Grimaldi, D., and M. S. Engel. 2005. Evolution of the Insects. New York: Cambridge University Press. 491-547.

Heath, T. A., S. M. Hedtke, and D. M. Hillis. 2008. Taxon sampling and the accuracy of phylogenetic analyses. J. Syst. Evol. 46:239-257.

Hendy, M. D., and D. Penny. 1989. A framework for the quantitative study of evolutionary trees. Syst. Zool. 38:297-309.

Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogeny. Bioinformatics. 17:754-755.

Hull, F. M. 1973. Bee flies of the world: the genera of the family Bombyliidae. Bull. U.S. Nat. Mus. 286:1-687.

Huson, D. H., and D. Bryant. 2006. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23:254-267.

Kennedy, M., B. R. Holland, R. D. Gray, and H. G. Spencer. 2005. Untangling long branches: identifying conflicting signals using spectral analysis, neighbor-net, and consensus networks. Syst. Biol. 54:620-633.

Lamas, C. J. E., and S. S. Nihei. 2007. Biogeographical analysis of Crocidiinae (Diptera: Bombyliidae): finding congruence among morphological, molecular, fossil and paleogeographical data. Rev. Brasil. Entomol. 51:267-274.

Moulton, J. K., and B. M. Wiegmann. 2004. Evolution and phylogenetic utility of CAD (rudimentary) among Mesozoic-aged Eremoneuran Diptera (Insecta). Mol. Phylogenet. Evol. 31:363-378.

Moulton, J. K., and B. M. Wiegmann. 2007. The phylogenetic relationships of flies in the superfamily Empidoidea (Insecta: Diptera). Mol. Phylogenet. Evol. 43:701-713.

Muhlenberg, M. 1971. Phylogenetisch-systematische Studien an Bombyliiden (Diptera). Zool. Morphol. Tiere. 70:73-102.

Nylander, J. A. A. 2004. MrModeltest v2.2. Program distributed by author. Evolutionary Biology Centre, Uppsala University.

Pisani, D., M. Benton, and M. Wilkinson. 2007. Congruence of morphological and molecular phylogenies. Act. Biotheor. 55:269-281.

Rambaut, A. 2002. Sequence alignment editor, Version 2.0. Available as shareware from http://evolve.zoo.ox.ac.uk/index.html.

Rohdendorf, B. 1974. The historical development of Diptera. Edmonton, Alberta, Canada: University of Alberta Press.

Rokas, A., D. Kruger, and S. B. Carroll. 2005. Animal evolution and the molecular signature of radiations compressed in time. Science. 310:1933-1938.

Ronquist, F., and J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 19:1572-1574.

Sanderson, M. J., and H. B. Shaffer. 2002. Troubleshooting molecular phylogenetic analyses. Ann. Rev. Ecol. Syst. 33:49-72.

Scheffer, S. J., I. S. Winkler, and B. M. Wiegmann. 2007. Phylogenetic relationships within the leaf-mining flies (Diptera: Agromyzidae) inferred from sequence data of multiple genes. Mol. Phylogenet. Evol. 42:756-775.

Schmidt, H. A., K. Strimmer, M. Vingron, and A. von Haeseler. 2002. TREE-PUZZLE: maximum likelihood phylogenetic analysis using quartets and parallel computing. Bioinformatics. 18:502-504.

Swofford, D. L. 2002. PAUP*. Phylogenetic Analysis Using Parsimony (* and Other Methods) Version 4. Sinauer Associates, Sunderland, MA.

Theodor, O. 1983. The genitalia of Bombyliidae (Diptera). The Israel Academy of Sciences and Humanities, Jerusalem.

Trautwein, M. D., D. K. Yeates, and B. M. Wiegmann. Multi-gene phylogenetics of the superfamily Asiloidea (Diptera). In prep.

Wedmann, S. and D. K. Yeates. 2008. Eocene records of bee flies

(Insecta, Diptera, Bombyliidae, Compostia): Their palaeobiogegraphic implications and remarks on the evolutionary history of bombyliids. Paleontol.

51:231-240.

Whitfield, J. B., and K. M. Kjer. 2008. Ancient Rapid Radiations of Insects:

Challenges for Phylogenetic Analysis. Ann. Rev. Entomol. 53:449-472.

Whitfield, J. B., and P. J. Lockhart. 2007. Deciphering ancient rapid radiations. Trends Ecol. Evol. 22:258-265.

Wiens, J. J., C. A. Kuczynski, S. A. Smith, D. G. Mulcahy, J. W. Sites, T. M. Townsend, and T. W. Reeder. 2008. Branch lengths, support, and congruence: testing the phylogenomic approach with 20 nuclear loci in snakes. Syst. Biol. 57:420-431.

Wiegmann, B. M., D. K. Yeates, J. L. Thorne, and H. Kishino. 2003. Time flies, a new molecular time-scale for brachyceran fly evolution without a clock. Syst. Biol. 52:745-756.

Winterton, S. L., B. M. Wiegmann, and E. I. Schlinger. 2007. Phylogeny and Bayesian divergence time estimations of small-headed flies (Diptera: Acroceridae) using multiple molecular markers. Mol. Phylogenet. Evol. 43:808-832.

Yeates, D. K. 1994. Cladistics and classification of the Bombyliidae (Diptera: Asiloidea). Bull. Am. Mus. Nat. Hist. 219:1-19.

Yeates, D. K., and D. Greathead. 1997. The evolutionary pattern of host use in the Bombyliidae (Diptera): a diverse family of parasitoid flies. Biol. J. Linn. Soc. 60:149-185.

Yeates, D. K., and M. E. Irwin. 1992. Three new species of Heterotropus Loew (Diptera: Bombyliidae) from South Africa with descriptions of the immature stages and a discussion of phylogenetic placement of the genus. Am. Mus. Nov. 3036:1-25.

Zaytsev, V. F. 1992. Contribution to the phylogeny and systematics of the superfamily Bombylioidea (Diptera). . 71:94-114.

Zwickl, D. J. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence data sets under the maximum likelihood criterion. Ph.D. dissertation, The University of Texas at Austin.

Table 1. Lower brachyceran outgroups and bombyliids sampled for analysis.

Taxa sampled Genus species

Outgroups

XYLOPHAGIDAE Heterostomus

NEMESTRINIDAE Hirmoneura

THEREVIDAE Ectinorhynchus

ASILIDAE Machimus

Ingroup

BOMBYLIIDAE

Mythicomyiinae

Cyrtosiini undescribed species

Mythicomyiini Mythicomyia

Glabellulini Glabellula

Mnemomyia

Usiinae

Apolysini Apolysis

Usiini Usia

Phthiriini Phthiria

Toxophorinae

Gerontini

Systropodini

Systropus flavoornatus

Zaclava

Dolichomyia

Toxophorini

Toxophora compta

Lordotinae Lordotus

Geminaria

Amphicosmus

Heterotropinae Heterotropus

Bombyliinae

Bombyliini

Bombylius

Eusurbus crassilabris

Heterostylum

Meomyia albiceps

Conophorini

Eclimini Thevenetimyia

Marmosa

Crocidiinae Desmatomyia

Mariobezziinae Corsomyza

Sericosoma Sericosoma chilensis

Cythereinae Neosardus

Pantarbes

Neosardus principius

Lomatiinae

Lomatiini Aleucosia hemiteles

Comptosia quadripennis

Comptosia moretoni

Lyophlaeba

Antoniinae Antonia

Tomomyzinae Paracosmus

Tomomyza

Anthracinae

Anthracini Anthrax

Xenox tigrinus

Aphoebantini Pteraulax

Epacmus

Aphoebantus

Exoprosopini Hyperalonia

Exoprosopa

Exoprosopa

Ligyra satyrus

Pseudopenthes fenestrata

Epacomoides

Villini Rhynchanthrax

Chrysanthrax

Lepidanthrax

Villa

Figure 1. MP analysis of 28S+CAD (3672 bp) with 3rd positions removed: 836 parsimony-informative, 2300 constant, and 536 parsimony-uninformative characters. Strict consensus of 7 trees with length 4962 shown. Monophyletic subfamilies are highlighted. Starred taxa are removed in subsequent analysis to explore the effects of conflict-reduction to attain a supported, congruent topology.

Figure 2. ML analysis of 28S+CAD (3672 bp) with 3rd positions removed. GTR+I+G model implemented. Monophyletic subfamilies are highlighted. Starred taxa are removed in subsequent analysis to explore the effects of conflict-reduction to attain a supported, congruent topology.

Figure 3. BI analysis of 28S+CAD (3672bp) with 3rd positions removed. GTR+I+G model implemented with genes in independent partitions. 20,000,000 generations with the initial 25% of trees deleted as burn-in. Monophyletic subfamilies are highlighted. Starred taxa are removed in subsequent analysis to explore the effects of conflict-reduction to attain a supported, congruent topology.

Figure 4. Consensus network showing the conflicting splits in MP, ML and BI topologies based on 28S+CAD (3672bp) with 3rd positions removed.

Figure 5. Consensus network showing greater congruence between MP, ML and BI topologies based on 28S+CAD (3672 bp) with 3rd positions and conflict-inducing taxa removed.

Figure 6. BI analysis of 28S+CAD (3672 bp) with 3rd positions and conflict-inducing taxa removed. GTR+I+G model implemented with genes in independent partitions. 20,000,000 generations with the initial 25% of trees deleted as burn-in. Monophyletic subfamilies are highlighted.

Figure 7. Total evidence tree. BI analysis of 28S+CAD (3672bp) with 3rd positions removed plus 155 morphological characters. GTR+I+G model implemented with genes in independent partitions and a standard discrete model for the morphology partition. 20,000,000 generations with the initial 25% of trees deleted as burn-in. Monophyletic subfamilies are highlighted. Starred families represent those removed in exploratory analyses.

Chapter 3:

Identifying the sister-group to Diptera: a multi-gene phylogeny of the holometabolous insects

Abstract

Phylogenetic relationships among the 11 extant orders of holometabolous insects remain either unresolved or contentious, but are extremely important as a context for accurate comparative biology of insect model organisms. Here we present evidence from nucleotide sequences of six single-copy nuclear protein-coding genes used to reconstruct phylogenetic relationships for all extant order-level lineages of Holometabola. Our results strongly support a sister-group relationship for Hymenoptera and all other Holometabola, the monophyly of the extant orders, and the monophyly of traditionally recognized groupings of Coleoptera + Neuroptera, Lepidoptera + Trichoptera, and Siphonaptera + Mecoptera. Most significantly, we find strong support for Coleoptera (beetles) + Strepsiptera (twisted-wing insects), a previously proposed but analytically controversial relationship. Our analyses reveal robust support for this relationship that cannot be explained by long-branch attraction (LBA) or other systematic biases. These findings provide the most complete evolutionary framework for future comparative studies on model organisms and contribute strong evidence for the resolution of the 'Strepsiptera problem,' a long-standing and hotly debated issue in insect phylogenetics.

Introduction

Insects that undergo complete metamorphosis, collectively known as the Holometabola or Endopterygota (~850,000 species), represent the vast majority of animal life on Earth (Wilson, 1992). The life history of these insects is divided into discrete developmental stages, including distinct larval (feeding) and pupal (quiescent) stages. Most of the species richness of this group is found in the four largest orders of insects, the Coleoptera (beetles), Hymenoptera (bees, ants and wasps), Diptera (true flies), and Lepidoptera (moths and butterflies). The remaining 7 smaller orders are the Neuroptera (lacewings), Megaloptera (dobsonflies and alderflies), Raphidioptera (snakeflies), Trichoptera (caddisflies), Mecoptera (scorpionflies), Siphonaptera (fleas), and Strepsiptera (twisted-wing parasites). Important model species such as Drosophila melanogaster, Apis mellifera (honey bee), Bombyx mori (silkworm) and Tribolium castaneum (flour beetle) are members of the Holometabola, and understanding the evolutionary relationships within this diverse insect radiation is critical for comparative studies.

The monophyly of the orders included in the Holometabola is well established, with the exception of the Mecoptera, which in some molecular analyses is rendered paraphyletic due to the inclusion of the fleas (Whiting, 2002; Whiting et al., 2003). There is, however, less unanimity regarding the relationships among the orders. Traditional morphological hypotheses and emerging molecular results have converged on the division of the Holometabola into two major lineages, the Neuropteroidea, which includes Coleoptera + the Neuropterida (Neuroptera, Megaloptera, Raphidioptera), and the Mecopterida (= Panorpida), including Lepidoptera, Trichoptera, Diptera, Mecoptera, and Siphonaptera. The identification of the basal branching lineage and the placement of the Hymenoptera and the unusual order Strepsiptera remain the most disputed issues in holometabolan phylogeny. Hymenoptera and Strepsiptera have been placed in various positions, the former most often as sister to the Mecopterida and the latter traditionally either within or as sister to Coleoptera (Crowson, 1960; Beutel and Gorb, 2001). The consensus view is that most morphological features of the Hymenoptera and the Strepsiptera are too highly modified to unequivocally resolve their phylogenetic positions (Hornschenmeyer, 2002; Hunefeld and Beutel, 2005). Thus the placement of these two orders will necessarily rely on molecular data; however, the conflicting results of molecular studies to date contribute to the indeterminate nature of their evolutionary relationships.

Two recent phylogenomics projects, with limited taxon sampling but large numbers of genes, addressed the placement of the Hymenoptera; mitochondrial genomes provide evidence for a sister group relationship between Hymenoptera and Mecopterida (Castro and Dowton, 2005), while combined analysis of 185 nuclear genes shows strong support for the Hymenoptera as the earliest branching holometabolan lineage, sister to all other orders (Savard et al., 2006a).

Most other molecular analyses of holometabolan phylogeny have relied on ribosomal DNA, and the results have been highly dependent on taxon sampling, alignment, and method of analysis (Kjer, 2004; Whiting, 2002; Carmean et al., 1992; Chalwatzis et al., 1996; Pashley et al., 1993). The most provocative rDNA results involve the Strepsiptera, a small (600 spp.), enigmatic order that maintains a degree of phylogenetic ambiguity unique among insects. Their affinity to any other order is unconfirmed and, until relatively recently, even their inclusion in the Holometabola was questioned (Kristensen, 1981; 1999). Strepsipterans are endoparasites of other insects, with free-living males and eyeless, larviform, viviparous females that remain inside their host (exception: females of the family Mengenillidae). Ribosomal DNA analyses show support for a sister group relationship between the Diptera and the Strepsiptera, united in a clade called Halteria (Whiting and Wheeler, 1994). Dipterans and strepsipterans both have halteres, paired knob-like structures that are homologous to wings, although dipteran halteres are found on the 3rd thoracic segment in the place of hind wings, while strepsipteran halteres are on the 2nd thoracic segment in place of forewings. The initial 18S findings implied that a homeotic mutation resulted in the divergence of these two orders, though no supporting genetic evidence for this has since been found (Rokas et al., 1999; Kristensen, 1999). Additionally, all of the morphological characters that unite Mecopterida and Antliophora (in which Halteria would be included) are lacking or inapplicable in Strepsiptera (Kristensen, 1999; Wheeler et al., 2001; Beutel and Pohl, 2006). The Halteria concept also contradicts traditional interpretations of morphological characters uniting Strepsiptera and Coleoptera based on structural modifications due to posteromotorism, or hind-wing powered flight (Beutel and Pohl, 2006).

Subsequent reanalyses of 18S and additional ribosomal DNA resulted in the 'Strepsiptera problem' becoming the first proposed empirical example of long branch attraction (Carmean and Crespi, 1995; Huelsenbeck, 1997; 1998; Hwang et al., 1998). No additional multi-gene phylogenetic analyses have been completed to address the Strepsiptera question, but 3 other molecular studies, one that examined an engrailed homeobox intron (Rokas et al., 1999) and 2 that analyzed the structure and evolutionary rate of the ecdysone receptor and ultraspiracle proteins (Hayward et al., 2005; Bonneton et al., 2006), failed to find any evidence of a close relationship between Diptera and Strepsiptera.

To further resolve the evolutionary relationships of the Holometabola and to specifically clarify the sister group to the Diptera, we provide the first phylogenetic analysis to include multiple nuclear genes and representative taxa from all eleven holometabolous orders.

Results and Discussion

We analyzed 6 nuclear genes (AATS, CAD, TPI, SNF, PGD and RNA POL II) comprising 5736 bp to infer the phylogeny of 29 species representing all 11 holometabolan orders and 2 hemimetabolous outgroups (see Table 1). Maximum likelihood and Bayesian analyses yield congruent trees with high posterior probabilities and mixed bootstrap values (Fig. 1, Fig. 2). All orders are found to be monophyletic, including the Mecoptera, with the Siphonaptera as its sister group. The Hymenoptera are the basal branching lineage, concordant with the phylogenomic findings of Savard et al. (2006a). The enigmatic Strepsiptera are unequivocally placed as the sister group to Coleoptera, providing additional evidence for the traditional morphological placement of the twisted-wing parasites. In accordance with previous morphological and molecular hypotheses, our study finds Holometabola to be divided into two major lineages, Neuropteroidea and Mecopterida. Within these two lineages, the traditional respective supraordinal groupings are recovered: Neuropteroidea includes Coleoptera, Strepsiptera and Neuropterida (Neuroptera, Megaloptera and Raphidioptera), and Mecopterida includes Amphiesmenoptera (Lepidoptera and Trichoptera) + Antliophora (Diptera, Mecoptera and Siphonaptera).

It appears that the use of nuclear genes, 6 in our study and 185 in Savard et al. (2006a), has brought decisive and robust results to the previously obscure phylogenetic placement of the Hymenoptera. Most previous morphological hypotheses favored a sister group relationship between the Hymenoptera and the Mecopterida (Beutel and Pohl, 2006; Kristensen, 1999), though strong supporting evidence was lacking. Mitochondrial genomes also favor the Hymenoptera + Mecopterida relationship, although not definitively, as the authors suggest that another 'plausible alternative placement is at the base of the Holometabola' (Castro and Dowton, 2005). 18S rDNA paradoxically supports both previously mentioned Hymenoptera hypotheses depending on alignment and taxon sampling (Kjer, 2004; Whiting et al., 1997; Whiting, 2002). Our results constitute the tipping point of the accumulating evidence (extensive sample of nuclear genes, fossil evidence, wing characters and introns of ef1alpha) that the Hymenoptera are evolutionarily the earliest branching lineage of the holometabolan radiation (Savard et al., 2006a; Rohdendorf and Rasnitsyn, 1980; Rasnitsyn, 2002; Kuckalova-Peck and Lawrence, 2004; Krauss et al., 2005; Zdobnov and Bork, 2006).

Recently, the hypothesis that fleas are actually members of the scorpionfly order Mecoptera has gained wide acceptance (Whiting, 2002; Whiting et al., 2003; Beutel and Pohl, 2006). Analyses based on morphology, ribosomal and mitochondrial DNA have strongly supported the collapse of the Siphonaptera and their inclusion within the Mecoptera as the sister group to the wingless family of snow scorpionflies, the Boreidae (Whiting, 2002; Whiting et al., 2003; Beutel and Pohl, 2006). Our data provide no indication of a close relationship between fleas and boreids. We find the traditional grouping of Mecoptera, with the exclusion of the fleas, to be highly supported in our analyses. No variation of taxon sampling, character inclusion, or methodology resulted in the placement of the fleas within Mecoptera. Our results suggest that the morphological characters grouping the fleas and the boreids, such as wing reduction and characters of oogenesis, be reevaluated (Beutel and Pohl, 2006).

The controversial hypothesis that the Diptera and the Strepsiptera (Halteria) are each other's closest relatives has been the subject of much debate, particularly in regard to whether the sister group relationship is the result of a methodological artifact known as long branch attraction (LBA), the phylogenetic phenomenon of rapidly evolving sequences clustering in analyses counter to their true evolutionary history. Halteria, as supported by 18S rDNA, is often cited as the first empirical evidence for LBA and initiated the development and use of parametric simulation as a statistical test for detecting LBA (Huelsenbeck, 1997). Both flies and strepsipterans have exhibited long branches in previous 18S analyses. Similarly, in our current study one strepsipteran has a uniquely long branch, and the taxon with the next longest branch is the coleopteran Tribolium. To address the possibility that in our analyses the Strepsiptera-Coleoptera relationship is a spurious artifact due to LBA, we thoroughly examined our data and modified our analyses to detect and potentially rectify effects of LBA.

The detection of LBA is challenging and, until somewhat recently, it was still questioned whether LBA occurs in real data sets (Huelsenbeck, 1997; 1998). Currently, the retrieval of conflicting results from parsimony and ML, parametric simulation, and the visualization of conflict in a data set can all provide suggestive evidence that LBA may be affecting an analysis (Kennedy et al., 2005; Bergsten, 2005). Parsimony, in particular, is more likely to succumb to LBA than model-based methods (Felsenstein, 1978; Hendy and Penny, 1989). Our parsimony trees agree with the topology generated by both ML and BI, a finding not suggestive of long-branch attraction.

Parametric simulation, a method developed by Huelsenbeck (1998) to test the rDNA-based Halteria findings for LBA, can provide statistical support for the existence of LBA. In a procedure similar to a parametric bootstrap, simulated data sets are generated according to a tree in which taxa with elevated rates of evolution are separated in the topology; in this case, the strepsipterans are separated from the coleopterans and constrained to the base of Holometabola. The simulated data sets are then analyzed to determine whether the putative long-branched taxa will cluster counter to their placement in the tree on which the data were simulated. If Strepsiptera and Coleoptera consistently form a clade in analyses of the simulated data sets, we would conclude that grouping to be the result of LBA. None of our 100 ML analyses of the simulated data resulted in the attraction of the long-branched strepsipterans and coleopterans to each other. This finding signifies that in our data set, in contrast to the original rDNA data, there is no statistical evidence to suggest that the rates of evolution in the strepsipteran and coleopteran branches are sufficiently elevated to attract each other counter to their accurate (simulated) evolutionary placement.
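The tally at the end of this test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the trees are represented as rooted nested tuples of taxon names, the helper names (leaves, has_clade, clade_frequency) are hypothetical, and the example taxa are placeholders rather than the sampled species. It counts how often a chosen set of taxa forms a clade across a set of replicate trees.

```python
def leaves(node):
    """Return the set of taxon names below a node (a name or a nested tuple)."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def has_clade(tree, taxa):
    """True if some node of the rooted tree subtends exactly the given taxa."""
    target = set(taxa)
    if leaves(tree) == target:
        return True
    if isinstance(tree, str):
        return False
    return any(has_clade(child, target) for child in tree)


def clade_frequency(trees, taxa):
    """Fraction of replicate trees in which the taxa form a clade."""
    hits = sum(has_clade(tree, taxa) for tree in trees)
    return hits / len(trees)


# Two hypothetical replicate trees (placeholder taxa, not the real data).
tree_a = ((("Strepsiptera", "Coleoptera"), "Neuropterida"), ("Diptera", "Lepidoptera"))
tree_b = (("Strepsiptera", ("Coleoptera", "Neuropterida")), ("Diptera", "Lepidoptera"))
frequency = clade_frequency([tree_a, tree_b], {"Strepsiptera", "Coleoptera"})
```

Under this sketch, a frequency of zero across all replicates would correspond to the outcome reported above, in which none of the 100 analyses grouped the two long-branched lineages.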

In contrast to other methods that are implemented post-analysis, visualizing conflict in a data set can be used to identify the potential for LBA prior to analysis (Kennedy et al., 2005). A data set likely to be affected by LBA should exhibit conflicting signal supporting both the artifactual relationship and the actual evolutionary relationship. We utilized two visualization methods, likelihood mapping and Neighbor-Nets, and our results were not definitive. Likelihood mapping, a quartet puzzling method, shows little conflict (only 0.4% of quartets are unresolved, while 10-15% is considered high) (Fig. 3a). However, our Neighbor-Net analysis, a network showing all compatible and incompatible splits, does show conflicting signal throughout our data set (Fig. 4). The conflicting splits exist across many regions of the tree, not just regarding Strepsiptera, indicating that there is no reason to suspect LBA with regard to Strepsiptera more than other clades. Yet when a network including Strepsiptera is directly compared to a network with Strepsiptera excluded, it is evident that the conflict in this data set is substantially alleviated by the absence of the strepsipterans, particularly with respect to the reticulation at the base of Diptera. This is not a clear sign of LBA, but it does suggest that there is conflicting support for the placement of Strepsiptera and their relationship to Diptera.

To explore further the potential for LBA identified by the Neighbor-Net, we used a four-cluster likelihood mapping analysis to again visualize the degree of conflicting signal regarding the placement of Strepsiptera. We divided the taxa into 4 clusters: 1) Neuropteroidea (which includes Coleoptera), 2) Mecopterida (which includes Diptera), 3) Hymenoptera, and 4) Strepsiptera. The possible relationships among these four clades generate three possible topologies, each represented by a tip of the triangle. This quartet puzzling method plots the probability of each possible quartet closest to the topology that it favors. Each region of the triangle or 'basin of attraction' contains a percentage of quartets that support a particular topology. This analysis again reveals the conflicting signal in our data set and shows that we have signal supporting all three hypotheses regarding the placement of Strepsiptera, with the most support in this analysis for a close relationship of Strepsiptera and Mecopterida (including the flies) (Fig. 3b).
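The way quartets are assigned to the three tips of the triangle can be illustrated with a minimal Python sketch. It assumes that each quartet comes with three log-likelihoods, one per topology, that these are converted to weights summing to one, and that a quartet is counted toward the topology with the largest weight. This three-way partition is a simplification chosen for illustration, the function names are hypothetical, and the sketch is not claimed to reproduce the output of Tree Puzzle.

```python
import math


def posterior_weights(log_likelihoods):
    """Convert three quartet-topology log-likelihoods into weights summing to 1."""
    peak = max(log_likelihoods)
    # Subtract the maximum before exponentiating to avoid numerical underflow.
    scaled = [math.exp(value - peak) for value in log_likelihoods]
    total = sum(scaled)
    return [value / total for value in scaled]


def favoured_topology(log_likelihoods):
    """Index (0, 1, or 2) of the topology with the largest weight."""
    weights = posterior_weights(log_likelihoods)
    return weights.index(max(weights))


def basin_percentages(quartets):
    """Percentage of quartets falling in each of the three basins of attraction."""
    counts = [0, 0, 0]
    for log_likelihoods in quartets:
        counts[favoured_topology(log_likelihoods)] += 1
    return [100.0 * count / len(quartets) for count in counts]


# Four made-up quartets, each with three log-likelihoods (illustrative values only).
example_quartets = [[-1.0, -2.0, -3.0], [-3.0, -1.0, -2.0],
                    [-1.0, -5.0, -9.0], [-9.0, -5.0, -1.0]]
percentages = basin_percentages(example_quartets)
```

In the four-cluster setting described above, the three indices would stand for the three possible groupings of Neuropteroidea, Mecopterida, Hymenoptera and Strepsiptera.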

Though our concatenated data set clearly results in the placement of Strepsiptera with Coleoptera in MP, ML and BI, there is evidence that some signal supports a closer relationship between Strepsiptera and Diptera. To determine the source of this conflicting signal, we examined ML analyses of the six individual gene trees. Data contributing phylogenetic information for the placement of Strepsiptera are available for five out of six genes, and three out of those five genes place Strepsiptera within the close vicinity of Coleoptera or Neuropterida. The gene tree for CAD, however, recovers Halteria, with Strepsiptera as the sister group to Diptera. At 2000 bp, CAD is the longest gene in the data set and in recent years has become a staple for resolving Mesozoic-age divergences among flies. The topology of the CAD ML tree reveals that Diptera and Strepsiptera have the longest branches in the tree, similar to the initial 18S findings, suggesting the possibility that LBA may play a role in the CAD recovery of Halteria. It has been hypothesized that Diptera have experienced accelerated evolution in comparison to other insects (Savard et al., 2006b), and by observing their long branches in various data sets we can surmise that Strepsiptera may have as well. Rapid evolution in specific loci, such as 18S and CAD, could lead to LBA and the erroneous grouping of Diptera and Strepsiptera. The reliance on a single locus for phylogenetic resolution, though useful in some circumstances, can clearly result in inaccurate conclusions. No single gene in our data set recovers our well-supported phylogeny that is congruent with morphological hypotheses. Our phylogeny relies on the concatenation of all six genes to overcome the misleading signal in CAD placing Strepsiptera as the sister group to Diptera.

Our findings are robust over multiple phylogenetic methods intended to counter LBA, including the removal of 3rd positions, RY coding of 1st and 3rd positions, the removal of outgroups and long branches, and the use of a conservative alignment with fast-evolving positions removed by the program Gblocks (Castresana, 2000; Kennedy et al., 2005; Bergsten, 2005) (Table 2). Our many attempts to identify or ameliorate LBA did not result in a positive detection of LBA or a change in our results; thus we conclude that the Strepsiptera + Coleoptera relationship is not a clear case of systematic error due to LBA. Our study is the first to rely on multiple genes to re-address the placement of Strepsiptera, and our robust findings should weaken the case for the morphologically dissimilar orders Strepsiptera and Diptera as sister groups. In light of our findings, upcoming work involving much larger genomic data sets (S. Longhorn, pers. comm.), and the re-examination of existing morphological characters shared by strepsipterans and beetles (Beutel and Pohl, 2006), we anticipate that the phylogenetic placement of Strepsiptera will cease to be considered the most controversial issue in holometabolan phylogenetics.

Materials and Methods

Taxa Sampled, DNA Extraction, Amplification and Sequencing

A total of 29 taxa representing the eleven holometabolous orders and 2 hemimetabolous outgroups were sampled for sequence data from 6 nuclear protein-coding genes: CAD, AATS, TPI, RNA POL II, SNF and PGD (Table 1). Taxonomic information and GenBank accession numbers are available in the Supplementary materials accompanying this paper. Sequence alignments and trees are available for download from Treebase.org. Genomic DNA was extracted using the DNeasy DNA extraction kit (QIAGEN Inc., Valencia, CA). The standard protocol was altered by extending the length of time the specimen was in proteinase K solution to two days in order to allow enzymes to penetrate the cuticle without grinding the specimen. Final elution was reduced to 30 µl to avoid diluting the DNA solution. Genes were amplified and sequenced using degenerate primers designed by Moulton for CAD (Moulton and Wiegmann, 2003) and by Jungwook Kim for the remaining five genes. PCR parameters varied for the six genes, but followed typical three-step reaction protocols (available on request from the first author). PCR products were extracted from agarose gels and purified with the QIAquick Gel Extraction kit (Qiagen, Santa Clara, CA). Big Dye Sequencing kits (Applied Biosystems, Foster City, CA) were used for sequencing reactions and sequencing was completed at the North Carolina State University Genome Research Laboratory (GRL). Sequences were assembled and edited using Sequencher 4.1 (Gene Codes Corp., Ann Arbor, MI). Alignment was carried out manually according to the amino acid translation using Se-Al 2.0 (Rambaut, 2002). Introns and other positions of ambiguous alignment were removed from the analysis. To detect existing base compositional bias, a chi-square test of homogeneity of base frequencies across taxa was performed for the concatenated data set using Tree Puzzle (Strimmer and Von Haeseler, 1997).
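The idea behind a homogeneity test of base frequencies can be shown with a short Python sketch. It uses the textbook contingency-table form of the chi-square statistic (taxa as rows, the four bases as columns) and is an illustration only: the function names are hypothetical, ambiguity codes and gaps are simply ignored, and the calculation is not claimed to match the test as implemented in Tree Puzzle.

```python
def base_counts(sequence):
    """Count A, C, G and T in a sequence; gaps and ambiguity codes are ignored."""
    seq = sequence.upper()
    return [seq.count(base) for base in "ACGT"]


def chi_square_homogeneity(sequences):
    """Return (statistic, degrees_of_freedom) for a taxa-by-base contingency table.

    Assumes every base occurs at least once in the data set, so that the
    degrees of freedom are (number of taxa - 1) * (4 - 1).
    """
    table = [base_counts(seq) for seq in sequences.values()]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            if expected > 0:
                statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, degrees_of_freedom


# Two made-up taxa with identical composition give a statistic of zero.
stat_equal, dof_equal = chi_square_homogeneity({"taxon1": "ACGT", "taxon2": "ACGT"})
```

A large statistic relative to the chi-square distribution with the stated degrees of freedom would indicate that base composition differs among taxa.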

Phylogenetic Analyses

Maximum parsimony, maximum likelihood and Bayesian analyses were completed on the following treatments of the data: all positions included, 3rd positions excluded, 3rd positions RY coded (purine/pyrimidine coding), 1st and 3rd positions RY coded, amino acid translations, and each gene individually with 3rd positions removed. In addition, a conservative alignment was generated in the program Gblocks (Castresana, 2000), which identifies and removes areas of ambiguous alignment from the data set, and was analyzed with MP, ML and BI. The following exploratory analyses with adjusted taxon sampling were completed with 3rd positions removed: the removal of taxa with base composition bias, strepsipterans removed, coleopterans removed, and outgroups removed. These variations on character and taxon inclusion have all been suggested as means to rectify LBA (Bergsten, 2005; Brinkmann et al., 2005).
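Two of the character treatments listed above, exclusion of 3rd codon positions and RY coding, are simple string transformations. The following Python sketch shows one way to express them, assuming an in-frame coding alignment that starts at codon position 1; the function names are hypothetical and the example sequence is made up.

```python
# Purines (A, G) become R; pyrimidines (C, T) become Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def ry_code(base):
    """RY-code one base; gaps and ambiguity codes are returned unchanged."""
    return RY_MAP.get(base.upper(), base)


def remove_third_positions(sequence):
    """Drop every 3rd codon position from an in-frame sequence."""
    return "".join(base for index, base in enumerate(sequence) if index % 3 != 2)


def ry_code_positions(sequence, positions):
    """RY-code only the listed codon positions (1, 2 or 3), leaving the rest as is."""
    return "".join(
        ry_code(base) if (index % 3) + 1 in positions else base
        for index, base in enumerate(sequence)
    )


# Made-up two-codon example: ATG GCC.
example = "ATGGCC"
without_third = remove_third_positions(example)   # "ATGC"
third_ry = ry_code_positions(example, {3})        # "ATRGCY"
first_third_ry = ry_code_positions(example, {1, 3})  # "RTRRCY"
```

RY coding keeps transversion information while discarding transitions, which is why it is often applied to the fastest-evolving codon positions.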

Parsimony Analyses

Maximum parsimony analyses were done using PAUP* 4.0b10 (Swofford, 2002). Heuristic searches with TBR branch swapping and 100 random addition replicates were completed to find the shortest trees. Node support was obtained by acquiring bootstrap values from heuristic searches of 500 re-sampled data sets and 10 random addition replicates.
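The re-sampling step behind bootstrap support values can be sketched as follows. This Python example only illustrates how pseudoreplicate alignments are drawn (columns sampled with replacement, same length as the original); it does not perform any tree search, the function names are hypothetical, and the toy alignment is made up.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    # The same column indices are applied to every taxon so that sites stay intact.
    return {taxon: "".join(seq[i] for i in columns) for taxon, seq in alignment.items()}


def bootstrap_replicates(alignment, n_replicates, seed=0):
    """Generate n_replicates pseudoreplicate alignments from one seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


# Toy alignment with two taxa and four sites (illustrative only).
toy = {"taxon1": "ACGT", "taxon2": "ACGA"}
replicates = bootstrap_replicates(toy, 5)
```

Each pseudoreplicate would then be analyzed with the same search strategy as the original data, and the proportion of replicates recovering a node gives its bootstrap value.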

Bayesian Analyses

An appropriate model of nucleotide evolution, in this case GTR + I + G, was chosen by using MrModeltest (Nylander, 2004). Using MrBayes (Huelsenbeck and Ronquist, 2001; Ronquist and Huelsenbeck, 2003), analyses were conducted for 5,000,000 generations, with trees sampled every 1000 generations and the first 25% discarded as burn-in. For nucleotide analyses, the model GTR+I+G was used with each gene treated as a separate partition; however, when 3rd positions were included, each codon position was treated as a separate partition. For amino acid analyses, the WAG model (Whelan and Goldman, 2001) and a mixed model (Lartillot and Philippe, 2004) were used, with each gene treated as a separate partition.
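The sampling scheme above implies a fixed number of retained trees, which a short Python calculation makes explicit. The function name is hypothetical, and the sketch ignores the starting tree that some programs also write to the tree file, so the real counts may differ by one.

```python
def retained_samples(generations, sample_every, burnin_fraction):
    """Return (total sampled trees, trees discarded as burn-in, trees retained)."""
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


# 5,000,000 generations, sampled every 1000, first 25% discarded.
total, discarded, retained = retained_samples(5_000_000, 1000, 0.25)
```

Under these assumptions the run yields 5000 sampled trees, of which 1250 are discarded and 3750 contribute to the posterior summary.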

Maximum Likelihood Analyses

Maximum likelihood analyses were performed using GARLI (Zwickl, 2006) with a GTR+I+G model for nucleotides and the WAG model for amino acids. To obtain bootstrap values, 500 bootstrap replicates were performed.

Conflict visualization

In order to visualize conflicting phylogenetic signal in our data set, likelihood mapping and four-cluster likelihood mapping analyses were completed using the program Tree Puzzle (Strimmer and Von Haeseler, 1997). Analyses of 10,000 quartets were completed using quartet sampling and a neighbor-joining tree with exact parameter estimates and a GTR+I+G model of substitution. To generate Neighbor-Nets, we analyzed a PAUP* (Swofford, 2002) generated matrix of ML inferred distances in the program SplitsTree (Huson and Bryant, 2006). Neighbor-Nets were generated with all taxa included and with the Strepsiptera excluded.

Parametric simulation

In an effort to statistically determine whether our recovery of the clade Halteria was the result of long-branch attraction, we carried out a parametric simulation similar to that described by Huelsenbeck (1998). First, an input tree topology was constructed on which the Diptera and Strepsiptera were separated (the strepsipterans were constrained to group with the Coleoptera). Branch lengths, substitution rates, base frequencies, and gamma parameters for both 1st and 2nd positions independently were calculated using PAUP* 4.0b10 (Swofford, 2002). Using the program Mesquite v. 2.5 (Maddison and Maddison, 2008), 100 individual data sets of 1912 bp were simulated according to the constraint tree and with ML branch lengths and model parameters for both positions 1 and 2 of our empirical data. These position 1 and 2 data sets were then concatenated, resulting in 100 data sets of 3824 bp. The 100 data sets were analyzed using GARLI (Zwickl, 2006). To determine if, and how frequently, the strepsipterans grouped with the dipterans, a consensus network showing all splits in the 100 resulting trees was generated in SplitsTree (Huson and Bryant, 2006).
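The concatenation step in this procedure can be sketched in Python as follows. The example assumes that each simulated data set is held as a dictionary mapping taxon names to sequences and that position 1 and position 2 data sets share the same taxa; the function name and the toy sequences are hypothetical, and simple end-to-end joining is an assumption about how the matrices were combined.

```python
def concatenate(first, second):
    """Join two alignments taxon by taxon; both must contain the same taxa."""
    if set(first) != set(second):
        raise ValueError("taxon sets differ between the two data sets")
    return {taxon: first[taxon] + second[taxon] for taxon in first}


# Toy position-1 and position-2 data sets (illustrative sequences only).
position1 = {"taxon1": "ACG", "taxon2": "ACT"}
position2 = {"taxon1": "TTA", "taxon2": "TCA"}
combined = concatenate(position1, position2)
```

With two simulated matrices of 1912 bp each, the same operation gives the 3824 bp data sets described above.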

Acknowledgments

We thank the following individuals and laboratories for providing insect specimens used in this study: G. Kennedy, F. Gould, M. Scoble, C. Mitter, A. Borkent, T. Rogers and W. Watson. Our appreciation goes to Brian Cassel, Junwook Kim and Shaun Winterton for the data collection. We also thank M. Bertone, A. Deans, J. Regier, and I. Winkler for comments on an early draft of the MS.

References

Bergsten, J. 2005. A review of long-branch attraction. Cladistics 21:163-193.

Beutel, R. G., and S. Gorb. 2001. Ultrastructure of attachment specializations of hexapods (Arthropoda): evolutionary patterns inferred from a revised ordinal phylogeny. J. Zool. Syst. Evol. Res. 39:177-207.

Beutel, R. G., and H. Pohl. 2006. Endopterygote systematics - where do we stand and what is the goal (Hexapoda, Arthropoda)? Syst. Entomol. 31:202-219.

Bonneton, F., F. G. Brunet, J. Kathirithamby, and V. Laudet. 2006. The rapid divergence of the ecdysone receptor is a synapomorphy for Mecopterida that clarifies the Strepsiptera problem. Insect Mol. Biol. 15:351-362.

Brinkmann, H., M. van der Giezen, Y. Zhou, G. P. de Raucourt, and H. Philippe. 2005. An Empirical Assessment of Long-Branch Attraction Artifacts in Deep Eukaryotic Phylogenomics. Syst. Biol. 54:743-757.

Carmean, D., L. S. Kimsey, and M. L. Berbee. 1992. 18S rDNA sequences and the Holometabolous Insects. Mol. Phylogenet. Evol. 1:270-278.

Carmean, D., and B. J. Crespi. 1995. Do long branches attract flies? Nature 373:666.

Castresana, J. 2000. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17:540-552.

Castro, L. R., and M. Dowton. 2005. The position of the Hymenoptera within the Holometabola as inferred from the mitochondrial genome of Perga condei (Hymenoptera: Symphyta: Pergidae). Mol. Phylogenet. Evol. 34:469-479.

Chalwatzis, N., J. Hauf, Y. Van de Peer, R. Kinzelbach, and F. K. Zimmermann. 1996. 18S ribosomal RNA genes of insects: primary structure of the genes and molecular phylogeny of the Holometabola. Ann. Entomol. Soc. Am. 89:788-803.

Crowson, R. A. 1960. The phylogeny of the Coleoptera. Ann. Rev. Entomol. 5:111-134.

Felsenstein, J. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27:401-410.

Hayward, D. C., J. W. H. Trueman, M. J. Bastiani, and E. E. Ball. 2005. The structure of the USP/RXR of Xenos pecki indicates that Strepsiptera are not closely related to Diptera. Dev. Genes Evol. 215:213-219.

Hendy, M. D., and D. Penny. 1989. A framework for the quantitative study of evolutionary trees. Syst. Zool. 38:297-309.

Hörnschemeyer, T. 2002. Phylogenetic significance of the wing base of Holometabola (Insecta). Zool. Scripta 31:17-29.

Huelsenbeck, J. P. 1997. Is the Felsenstein zone a fly trap? Syst. Biol. 46:69-74.

Huelsenbeck, J. P. 1998. Systematic bias in phylogenetic analysis: is the Strepsiptera problem solved? Syst. Biol. 47:519-537.

Huelsenbeck, J. P., and F. Ronquist. 2001. MRBAYES: Bayesian inference of phylogeny. Bioinformatics 17:754-755.

Hunefeld, R., and R. G. Beutel. 2005. The sperm pumps of Strepsiptera and Antliophora (Hexapoda). J. Zool. Syst. Evol. Res. 43:297-306.

Huson, D. H., and D. Bryant. 2006. Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23:254-267.

Hwang, U. W., W. Kim, D. Tautz, and M. Friedrich. 1998. Molecular phylogenetics at the Felsenstein zone: approaching the Strepsiptera problem using 5.8S and 28S rDNA sequences. Mol. Phylogenet. Evol. 9:470-480.

Kennedy, M., B. R. Holland, R. D. Gray, and H. G. Spencer. 2005. Untangling long branches: identifying conflicting signals using spectral analysis, neighbor-net, and consensus networks. Syst. Biol. 54:620-633.

Kjer, K. M. 2004. Aligned 18S and insect phylogeny. Syst. Biol. 53:506-514.

Krauss, V., M. Pecyna, K. Kurz, and H. Sass. 2005. Phylogenetic mapping of intron positions: a case study of translation initiation factor eIF2. Mol. Biol. Evol. 22:74-84.

Kristensen, N. P. 1981. Phylogeny of insect orders. Ann. Rev. Entomol. 26:135-157.

Kristensen, N. P. 1999. Phylogeny of endopterygote insects, the most successful lineage of living organisms. Eur. J. Entomol. 96:237-253.

Kukalova-Peck, J., and J. F. Lawrence. 2004. Relationships among coleopteran suborders and major endoneopteran lineages: Evidence from hind wing characters. Eur. J. Entomol. 101:95-144.

Lartillot, N., and H. Philippe. 2004. A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21:1095-1109.

Maddison, W. P., and D. R. Maddison. 2008. Mesquite: a modular system for evolutionary analysis. Version 2.5 http://mesquiteproject.org

Moulton, J. K., and B. M. Wiegmann. 2003. Evolution and phylogenetic utility of CAD (rudimentary) among Mesozoic-aged eremoneuran Diptera (Insecta). Mol. Phylogenet. Evol. 31:363-378.

Nylander, J. A. A. 2004. MrModeltest v2. Program distributed by the author. Evolutionary Biology Centre, Uppsala University.

Pashley, D. P., B. A. McPheron, and E. A. Zimmer. 1993. Systematics of the holometabolous insect orders based on 18S ribosomal RNA. Mol. Phylogenet. Evol. 2:132-142.

Rambaut, A. 2002. Sequence Alignment Editor. Ver. 2.0. Available via http://evolve.zoo.ox.ac/software/se-Al/main.html.

Rasnitsyn, A. P. 2002. Cohors Scarabaeiformes Laicharting, 1781. The Holometabolans. In: A. P. Rasnitsyn and D. L. J. Quicke, eds. History of Insects. Dordrecht, The Netherlands: Kluwer Academic Publishers. pp. 157-159.

Rokas, A., J. Kathirithamby, and P. W. H. Holland. 1999. Intron insertion as a phylogenetic character: the engrailed homeobox of Strepsiptera does not indicate affinity with Diptera. Insect Mol. Biol. 8:527-530.

Rohdendorf, B. B., and A. P. Rasnitsyn. 1980. Historical development of the class Insecta. Moscow: Nauka Press.

Ronquist, F., and J. P. Huelsenbeck. 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19:1572-1574.

Savard, J., D. Tautz, S. Richards, G. M. Weinstock, R. A. Gibbs, J. H. Werren, H. Tettelin, and M. J. Lercher. 2006a. Phylogenomic analysis reveals bees and wasps (Hymenoptera) at the base of the radiation of holometabolous insects. Genome Res. 16:1334-1338.

Savard, J., D. Tautz, and M. Lercher. 2006b. Genome-wide acceleration of protein evolution in flies (Diptera). BMC Evol. Biol. 6:7.

Strimmer, K., and A. Von Haeseler. 1997. Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc. Nat. Acad. Sci. 94:6815-6819.

Swofford, D. L. 2002. PAUP* Phylogenetic analysis using parsimony (*and other methods), Version 4. Sunderland, Massachusetts: Sinauer Associates.

Wheeler, W. C., M. Whiting, Q. D. Wheeler, and J. M. Carpenter. 2001. The phylogeny of the extant hexapod orders. Cladistics 17:113-169.

Whelan, S., and N. Goldman. 2001. A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol. Biol. Evol. 18:691-699.

Whiting, M. F. 2002. Phylogeny of the holometabolous insect orders: molecular evidence. Zool. Scripta 31:3-15.

Whiting, M. F., and W. C. Wheeler. 1994. Insect homeotic transformation. Nature 368:696.

Whiting, M. F., J. M. Carpenter, W. C. Wheeler, and Q. D. Wheeler. 1997. The Strepsiptera problem: phylogeny of the holometabolous insect orders inferred from 18S and 28S ribosomal DNA sequences and morphology. Syst. Biol. 46:1-68.

Whiting, M. F., A. S. Whiting, and M. W. Hastriter. 2003. A comprehensive phylogeny of Mecoptera and Siphonaptera. Entomol. Abhandlungen. 61:169.

Wilson, E. O. 1992. The Diversity of Life. Cambridge, Massachusetts: Belknap Press.

Zdobnov, E. M., and P. Bork. 2006. Quantification of insect genome divergence. Trends Genet. 23:16-20.

Zwickl, D. J. 2006. Genetic algorithm approaches for the phylogenetic analysis of large biological sequence data sets under the maximum likelihood criterion. Ph.D. dissertation, The University of Texas at Austin.

Table 1. Genes sampled for Holometabola and out-groups. Gene fragments that were unobtainable for this analysis are indicated by a horizontal line. Asterisks denote portions of CAD amplified in separate non-overlapping fragments.

Genes                                                 Number of base pairs
AATS        alanyl-tRNA synthetase                    915
CAD         carbamoylphosphate synthase domain        2057
PGD         6-phosphogluconate dehydrogenase          802
SNF         sans fille                                560
TPI         triosephosphate isomerase                 498
RNA Pol II  RNA polymerase II 215 kDa subunit         899

Taxa
Order          Genus species                   GenBank numbers
Dictyoptera    Blatella germanica              GQ265573, GQ265596, GQ265621, GQ265633, GQ265647, GQ265663
Thysanoptera   Frankliniella fusca             GQ265566, GQ265588, GQ265614, -----, GQ265641, GQ265657
Hymenoptera    Ametastegia equiseti            GQ265565, GQ265586*, GQ265587*, GQ265613, GQ265628, GQ265640, GQ265656
Hymenoptera    Muscidifurax raptorellus        GQ265578, GQ265604*, GQ265605*, GQ265606*, GQ265624, GQ265634, GQ265650, GQ265668
Hymenoptera    Apis mellifera                  XM_395392, XM_393888, XM_625087, XM_393440, XR_014889, XM_623278
Coleoptera     Tribolium castaneum             XM_970534, EU677538, XM_966958, XM_963178, XM_970400, XM_968377
Coleoptera     Strangalia bicolor              GQ265574, GQ265599, -----, -----, -----, GQ265664
Neuroptera     Austronevrorthus brunneipennis  GQ265575, GQ265600, -----, -----, GQ265649, GQ265665
Neuroptera     Kempynus sp.                    GQ265567, GQ265589, GQ265615, -----
Neuroptera     Platystoechotes sp.             GQ265568, GQ265590, GQ265616, GQ265629, GQ265642, GQ265658
Raphidioptera  Mongoloraphidia martynovae      -----, GQ265597, GQ265622, -----, -----, -----
Megaloptera    Nigronia sp.                    -----, GQ265598, GQ265623, -----, GQ265648, -----
Trichoptera    Hydropsyche phalerata           GQ265569, GQ265591, GQ265617, GQ265630, GQ265643, GQ265659
Lepidoptera    Heliothis virescens             GQ265570, GQ265592, GQ265618, -----, GQ265644, GQ265660
Lepidoptera    Bombyx mori                     M55993, EU032656, NM_001047060, DQ202313, NM_001126258, -----
Diptera        Anopheles gambiae               XM_318757, XM_310823, XM_313091, XM_320869, XM_321467, XM_313929
Diptera        Tipula abdominalis              GQ265563, GQ265584, GQ265611, GQ265626, -----, -----
Diptera        Musca domestica                 GQ265564, GQ265585, GQ265612, GQ265627, GQ265639, -----
Diptera        Drosophila melanogaster         NM_205934, X04813, M80598, NM_078490, NM_176587, NM_078569
Strepsiptera   Halictophagidae sp.             GQ265562, GQ265583, GQ265610, -----, GQ265638, GQ265655
Strepsiptera   Mengenilla sp.                  -----, GQ265580, -----, -----, -----, GQ265651
Mecoptera      Nannochorista sp.               GQ265571, GQ265593*, GQ265594*, GQ265619, GQ265631, GQ265645, GQ265661
Mecoptera      Panorpa sp.                     GQ265572, GQ265595, GQ265620, GQ265632, GQ265646, GQ265662
Mecoptera      Boreus brumalis                 GQ265576, GQ265601, -----, -----, -----, GQ265666
Mecoptera      Australobittacus sp.            GQ265577, GQ265602*, GQ265603*, -----, -----, GQ265667
Mecoptera      Microchorista philpotti         GQ265560, -----, GQ265608, -----, GQ265635, GQ265652
Mecoptera      Boreus sp.                      -----, GQ265582, -----, -----, GQ265637, GQ265654
Siphonaptera   Neotyphloceras sp.              GQ265579, GQ265607, -----, -----, -----, GQ265669
Siphonaptera   Ctenocephalides felis           GQ265561, GQ265581, GQ265609, GQ265625, GQ265636, GQ265653

Table 2. Clade recovery results from ML analyses with varied taxon and character inclusion used to counter LBA. These variations on taxon and character inclusion (with the exception of the inclusion of 3rd positions) are cited as a means to rectify the effects of long-branch attraction. Clade recovery under these various treatments is substantially in agreement across treatments and with our final results. This finding suggests that LBA does not play a clearly detectable role in our analyses.


Figure 1. The phylogeny of the holometabolous insects. Posterior probabilities/ML bootstrap values are shown at each node. NEU = Neuropterida, AMP = Amphiesmenoptera, ANT = Antliophora.


Figure 2. The congruent ML and BI topology, drawn with ML branch lengths. Posterior probabilities are shown above branches and ML bootstrap values below. Though one strepsipteran, Halictophagidae sp., has an exceptionally long branch, the Tribolium branches are only slightly longer than average.


Figure 3. Conflict visualization using likelihood mapping in Tree Puzzle (Strimmer and Von Haeseler 2002). a. The tips of the triangle are considered ‘basins of attraction’ that contain the percentage of quartets that are fully resolved. The center of the triangle represents the percentage (0.5%) of quartets that are unresolved. 0.4% indicated that there is not substantial conflict within our data set (Strimmer and Von Haeseler 2002).

b. Four-cluster likelihood mapping analysis of Mecopterida, Neuropteroidea, Strepsiptera and Hymenoptera indicates that there are conflicting data supporting the affinity of Strepsiptera with each of the other three groups.


Figure 4. Neighbor-Nets showing conflicting splits when all taxa are included compared to when Strepsiptera is excluded. The decreased level of conflict in the data set exhibited when the fast-evolving Strepsiptera is excluded may be considered indicative of long-branch attraction.
