DNA-based Species Delimitation of the Agriculturally Important , Ravinia (Diptera: Sarcophagidae)

A dissertation submitted to the

Graduate School Of the University of Cincinnati

In partial fulfillment of the requirements for the degree of Doctor of Philosophy

In the Department of Biological Sciences

of the McMicken College of Arts and Sciences

By

Evan Scott Wong

B.S., Biological Sciences with Honors, University of Cincinnati, September 2010

Committee chair: Ronald W. DeBry

Abstract Species, and their constituent populations, are the units upon which evolution acts.

The ability to accurately identify and delimit species boundaries is crucial for research, but is also important for public policy. Traditionally, species were identified and delimited based on morphological characteristics, yet sometimes species are cryptic, meaning no diagnostic morphological traits can be found. Consequently, DNA sequence data are increasingly being used in the areas of species delimitation and identification. As a model to explore species delimitation and identification I use members of the flesh family

Sarcophagidae, specifically the genus Ravinia. Using Bayesian inference and maximum likelihood methods, I infer a molecular phylogeny, using mitochondrial DNA. Several paraphyletic relationships were inferred among species of Ravinia that are discordant with current morphological . I investigate whether this conflict between genetic lineages and morphological identification within the species Ravinia anxia and Ravinia querula is due to introgressive gene flow, retained ancestral polymorphism, or indicative of the presence of unrecognized species. Using a combination of concatenated phylogenetic analyses, population genetic analyses, migration analyses, and coalescent methodologies I delimit the species Ravinia anxia and Ravinia querula. I find three morphologically cryptic lineages that comprise the current morphologically-identified species R. anxia and R. querula. With the findings of this species delimitation research, I then re-evaluate the phylogenetic relationships of Ravinia using increased sampling of both species and loci.

ii

Copyright © 2015

by

Evan S. Wong

All Rights Reserved

iii

iv

Acknowledgments

The majority of specimen collection work and a minor portion of molecular sequencing were supported by Grant No. 2005-DA-BX-K102 awarded by the National

Institute of Justice, Office of Justice Programs, U.S. Department of Justice to Ronald W.

DeBry (P.I.). The opinions, findings, and conclusion or recommendations expressed in this publication are the author(s) and do not necessarily reflect the views of the Department of

Justice.

Additional molecular work and a portion of specimen collection work were funded by multiple Wiemenn/Wendel/Benedict Summer Fellowships to Evan S. Wong. The majority of molecular work was supported by the GSGA Summer Research Fellowship of the Graduate Student Governance Association (University of Cincinnati). Specimen collection in the state of Ohio was supported by The Ohio Biological Survey’s Small

Research Grants, awarded to Dr. Gregory A. Dahlem (P.I.) and Evan S. Wong (co-P.I.).

General laboratory costs and summer support was supported by Scholarship America:

Genzyme’s Chart Your Own Course national scholarship for people living with lysosomal storage disorders; and University of Cincinnati’s Research Council SUMR Mentor program.

Conference and travel support was awarded by the Graduate Student Governance

Association for the public presentation of portions of this research.

I would like to thank my advisor, Dr. Ronald W. DeBry for his help and advice on this project. I first came to the DeBry lab as a doe-eyed undergraduate student, with very little laboratory experience and a small amount of critical thinking capability. Without Ron’s support, this dissertation, my research, and many areas of my life would be very different than they are today. I would also like to thank my wonderful committee (Drs. Gregory

v

Dahlem, Bryan Carstens, Theresa Culley, and Ken Petren), who helped in the formation of my ideas and their scientific soundness, and the many hours of their time that they gave me. Specifically, I would like to thank Dr. Carstens, who graciously invited me to The Ohio

State University where I was able to learn under the guidance of Mr. Jordan Satler and Dr.

Carstens the specifics of the coalescent-based programs and migration analyses that were used in this dissertation. I thank Dr. Theresa Culley for her constant availability, support, and insight. Without fail, I could go to talk to Dr. Culley without an appointment and she would freely give of her time. I thank Dr. Ken Petren for his sound mind and available ear

(Dr. Petren would listen to my rants about things such as chairs, desks, offices, and door locks). I also thank Dr. Petren for his knowledge in evolutionary ecology which greatly improved this research. Finally, I thank Dr. Gregory Dahlem who I am deeply indebted to and without whom I would not have been able to start or complete this research. Dr.

Dahlem has been a constant mentor and is an amazing scientist from whom I have learned so much. The many, many hours of collecting and the time spent on nature trails are experiences that will stay with me for the rest of my life. Dr. Dahlem invested many hours into me; teaching proper collection techniques and identifications.

I would also like to thank Dr. Trevor Stamper. Dr. Stamper was the first person that I met in the DeBry Lab and to whom started me on the path of science. Dr. Stamper was a great mentor who always was available to me for support. These acknowledgements are far too short a place to put all my thanks. I also thank Dr. Dave Hauber who helped train me in microbiology and who I consider a good friend. I thank Mrs. Stephanie Gierek who came to the DeBry lab as an undergraduate researcher and later became a graduate student in our lab. I only wish that she could have joined the lab sooner so I could have spent more time in

vi the lab researching with her. I also thank the many undergraduate students I was able to mentor over these past several years: Gabrielle Lombardo (2011-2012), Annie Roesler

(2011), Andrea Fuitem (2012-2013), Brandon Baker (2012-2013), Jessica Harrington

(2013), Stephanie Gierek (2014-2015), Dane Larson (2014-2015), and Jacqui Yarman

(2014-2015).

I thank the Department of Biological Sciences at the University of Cincinnati.

Specifically, I thank the office staff and support: Sandra McGeorge, Charmaine Mitchell,

Mary Wischer, Heather Wischer, Paul McKenzie, and Roger Ruff. I would like to thank the wonderful Ms. Julie Stacey and Mrs. Ellen Cordonnier. Julie made teaching a blast and I am very glad that I was able to teach Microbiology with her.

Finally, I want to thank my family and friends for their patience and unwavering support during this time. My friend, Elizabeth Brown, was a constant source of support and encouragement over these past five years. She saw me at the best of times and the worst of times and accepted me for who I am. Thank you for being my life-long friend. To my brothers, Elliot and Ean, thank you for your encouragement during this period of my life. I thank you for being understanding and accepting. I thank my parents, Perry and Sandy

Wong, for their support. It has taken a while to get here, and I thank you for being there along the way. I thank my Aunt Vivian who impressed upon me the life philosophy of:

“Good, Better, Best. I will not rest until my good is better, and my better is best.”

I dedicate this dissertation to two groups of people. The first group to whom I dedicate this dissertation to are people living with rare genetic diseases. The second group to whom I dedicate this dissertation to is the LGBTQ community. Both of these groups have had a significant impact on my life and I am a better person because of them.

vii

Table of Contents

Abstract ...... ii

Acknowledgments...... v

Table of Contents ...... viii

List of Tables ...... xi

List of Figures ...... xii

Chapter One: Introduction & Dissertation Organization ...... 1

Chapter Two: Discordance between morphological species identification and mtDNA

in the genus, Ravinia (Diptera: Sarcophagidae) ...... 14

2.1 Abstract ...... 15

2.2 Introduction ...... 16

2.3 Materials & Methods ...... 19

2.4 Results ...... 27

2.5 Discussion...... 29

2.6 Conclusions ...... 34

2.7 Tables ...... 37

2.8 Figures ...... 41

2.9 References ...... 43

viii

Chapter Three: Coalescent DNA-based species delimitation within the flesh fly genus

Ravinia Robineau-Desvoidy, 1863 (Diptera: Sarcophagidae) ...... 50

3.1 Abstract ...... 51

3.2 Introduction ...... 52

3.3 Materials & Methods ...... 56

3.4 Results ...... 66

3.5 Discussion...... 70

3.6 Conclusions ...... 75

3.7 Tables ...... 79

3.8 Figures ...... 83

3.9 References ...... 87

3.10 Supplemental Materials ...... 98

Appendix S3.1 Primer Information and Amplification Procedure ...... 98

Appendix S3.2 Best Model Selection and Partition Schemes ...... 101

Supplemental Tables ...... 101

Supplemental Figures ...... 107

Chapter Four: Phylogenetic relationships of the flesh fly genus Ravinia (Diptera:

Sarcophagidae) ...... 125

4.1 Abstract ...... 126

4.2 Introduction ...... 127

4.3 Materials & Methods ...... 130

ix

4.4 Results ...... 136

4.5 Discussion...... 139

4.6 Conclusions ...... 143

4.7 Tables ...... 144

4.8 Figures ...... 150

4.9 References ...... 159

4.10 Supplemental Materials ...... 165

Appendix S4.1 Best Model Selection and Partition Schemes ...... 165

Supplemental Tables ...... 167

Supplemental Figures ...... 171

Chapter Five: General Conclusions ...... 173

x

List of Tables

Chapter Two

Table 2.1. Specimen locality data and references for COI and COII sequences

included in this study...... 37

Table 2.2. Tamura-Nei + G corrected distances within and between indicated

groups...... 40

Chapter Three

Table 3.1. Gene information for multilocus analyses...... 79

Table 3.2. *BEAST marginal likelihoods...... 80

Table 3.3. Migration results using Migrate-n...... 81

Table 3.4. Mean estimates of population parameters from IMa2 analyses...... 82

Table S3.1. Primer sequence information...... 98

Table S3.2. List of specimens, specimen identification (I.D.) codes, collection data, and

GenBank Accession numbers (Chapter 3)...... 104

Table S3.3 BP&P results with guide tree parameters...... 106

Chapter Four

Table 4.1. List of specimens, specimen identification (I.D.) codes, sequenced genes

and GenBank accession numbers (Chapter 4) ...... 144

Table 4.2. Gene information in multilocus analyses...... 149

Table S4.1. Specimen sex and locality data...... 167

xi

List of Figures

Chapter Two

Figure 2.1. The inferred phylogenetic hypothesis of Ravinia species relationships

of North America...... 41

Chapter Three

Figure 3.1. Maximum likelihood tree, inferred using RAxML, of the concatenated

(2 mtDNA loci + 5 nuDNA loci) data set for all samples...... 83

Figure 3.2. Results from species delimitation analyses, represented on BEAST tree

from bGMYC of all sampling localities...... 85

Figure S3.1. Inferred COI gene-tree using maximum likelihood and Bayesian

inference...... 107

Figure S3.2. Inferred COII gene-tree using maximum likelihood and Bayesian

inference ...... 109

Figure S3.3. Inferred EF-1a gene-tree using maximum likelihood and Bayesian

inference ...... 111

Figure S3.4. Inferred White gene-tree using maximum likelihood and Bayesian

inference...... 113

Figure S3.5. Inferred Period gene-tree using maximum likelihood and Bayesian

inference ...... 115

Figure S3.6. Inferred G6PD gene-tree using maximum likelihood and Bayesian

inference ...... 117

Figure S3.7. Inferred PEPCK gene-tree using maximum likelihood and Bayesian

inference ...... 119

xii

Figure S3.8. bGMYC analysis heat map and topology...... 121

Figure S3.9. *BEAST tree selected as the most supported species delimitation

scenario ...... 123

Chapter Four

Figure 4.2. Sampling map of Ravinia species used for phylogenetic analyses in

this study ...... 150

Figure 4.2. Inferred phylogenetic tree from concatenated 12 loci using the program

RAxML ...... 152

Figure 4.3. Wing diagram showing location/absence of setae on Chaetoravinia and Ravinia

individuals ...... 155

Figure 4.4. Species-tree estimation from five mitochondrial and seven nuclear loci

using the program *BEAST...... 157

Figure S4.1. Species-tree estimation from seven nuclear loci using the program

*BEAST...... 171

xiii

Chapter One: Introduction & Dissertation Organization

1

INTRODUCTION

Species delimitation is not just an abstract problem in biology, but is a crucial first step for many kinds of biological research and a vitally important part of the relationship between science and public policy. For example, the ability to accurately define a species is fundamental for conservation (e.g., Lowenstein et al. 2009) and control of pest and invasive species (e.g., Spillings et al. 2009). Indeed, delimiting species boundaries and determining phylogenetic relationships have been described as the primary goals of systematics (Chen et al. 2014). Wiens (2007) describes species delimitation as the process through which species boundaries are determined, new species are discovered, and currently recognized species are found to be synonymous. Despite considerable debate, there appears to be consensus that species are independently evolving metapopulation lineages (de Queiroz

2007). The variation among different species concepts can be attributed to the point in time on the speciation continuum when a lineage becomes recognized as a species. This provides a justification for the use of genetic data as the primary data for delimiting species, as genetic data reflect the historical processes that give rise to these independent metapopulation lineages (i.e., species). Consequently, DNA sequence data are increasingly being used in species delimitation studies, yet few studies utilize DNA data as the primary tool in species delimitations.

Traditionally, morphological characteristics are the first and primary tools used in the delimitation of species. In such situations, highly trained scientists conduct identifications and develop keys of hypothesized characteristics that could be used to distinguish between species. The presence of diagnostic characters can provide evidence indicating that there is an absence of (or perhaps minimal) gene flow between species and

2 other populations in the environment (Wiens and Servedio 2000). However, there are certain limitations to this methodology that include the possibility that no distinguishing factor can be provided from morphological data alone, for example in cryptic species

(Hebert et al. 2004a, 2004b; Lowenstein et al. 2009; Spillings et al. 2009). However, even when morphological characters successfully delimit species, the use of genetic data can speed the process (Sites and Marshall 2004). Furthermore, using an integrative species delimitation approach (e.g., morphology and genetic data) pushes taxonomy beyond conventional naming to understanding the evolutionary processes that bring species about

(Schlick-Steiner et al. 2010).

With the efficient and rapid availability of DNA sequence data there has been an increase in new methods used in species delimitation. One of the most prominent approaches is the use of DNA sequence data in phylogenetic analyses, followed by determination of species boundaries from the taxonomic status and phylogenetic relationships (Catleka and McAllister 2004). Typically these methods have utilized arbitrary cutoffs as indicators of species status such as sequence divergence and migration rates. For example, the “10x rule” proposed by Hebert et al. (2004b) requires between- species divergence to be at least 10 times that as seen within species. Other methods, based upon the genealogical species concept (Baum and Shaw 1995), require that gene-trees at all loci exhibit reciprocal monophyly. This imposes an unrealistic requirement on speciation because of the very long time required to achieve reciprocal monophyly at neutral loci (Hickerson et al. 2006). In addition, conflicting gene-trees are a natural phenomenon and a common result of stochastic changes in the coalescent, or the alleles of

3 a gene shared by all members in the population, of ancestral species (Hudson and Coyne

2002; Rannala and Yang 2003).

Some recent examples of species delimitation have used mitochondrial DNA

(mtDNA) to infer species limits (Serb et al. 2001; Wiens and Penkrot 2002; Ballard and

Whitlock 2004). Mitochondrial DNA has been shown to be beneficial in species delimitation studies, particularly in species discovery studies using a single locus (e.g., barcodes). One useful property of mtDNA is that the rate at which haplotypes of smaller effective population sizes (Ne) coalesce is four times faster than nuclear markers (Moore 1995). As a result, newly formed species generally will become reciprocally monophyletic for their mtDNA sequences before most nuclear markers (Wiens and Penkrot 2002), although this is not always the case (Hudson and Turelli 2003). The expected time to coalescence assumes that the sampled haplotypes are adaptively neutral and will depend on the genetic data used (i.e., autosomal loci, mitochondrial loci, allosomal loci; Moore 1995).

The exclusive use of mtDNA for the discovery, validation, and identification of a species is an area of controversy and it has been convincingly argued that species should not be delimited based on these data alone (Moritz et al. 1992; Moritz 1994; Sites and

Crandall 1997). The mitochondrial genome produces only a single gene-tree regardless of the number of base pairs sequenced (Wiens and Penkrot 2002; Amaral et al. 2010). Thus, independent replicated data cannot be obtained regardless of how many genes are sequenced. In addition, mtDNA has the propensity to introgress across species boundaries when nuclear DNA (nuDNA) does not (Ballard and Whitlock 2004). This problem becomes significant at the early stages of speciation, where both morphological and molecular markers will likely show low magnitudes of divergence across species (de Queiroz 2007).

4

Problems that make the delimitation of recently diverged species difficult include incomplete lineage sorting (ILS) and current gene flow (i.e., migration) between populations or species (Maddison 1997; Hudson and Coyne 2002; Degnan and Rosenberg

2006). Although gene flow and ILS may lead to a similar topology on the inferred trees, these two processes have different implications in species delimitation (Harrington and

Near 2012). Gene flow produces polyphyly in the gene-tree through the intrusion of allelic diversity across species boundaries. Introgressive hybridization, as a result of gene flow from one species into the gene pool of another, may lead to perceived genetic similarity among otherwise phenotypically divergent individuals (Keck and Near 2010).

ILS is a natural phenomenon and results from a stochastic sorting of ancestral alleles. ILS causes gene genealogies to vary across the genome, resulting in a gene-tree that is often different from the actual species phylogeny (Pamilo and Nei 1988; Takahata 1989;

Hobolth et al. 2011). At speciation, and a considerable time after, the random sorting of ancestral alleles will cause some daughter species to possess alleles that are more closely related to a sister species, than to conspecifics. Thus, daughter species are expected to produce polyphyletic gene-trees for some time after the speciation event. Eventually, the diversity in allelic lineages will be lost through genetic drift resulting in a single allelic lineage that is representative of each daughter species. However, this progression from polyphyly to monophyly in the gene-trees can take a long amount of time (i.e., 4Ne generations; Moore 1995).

Species delimitation using genetic data (nuclear and mitochondrial loci) necessitates that one consider both the process of species divergence and the contribution of stochastic genetic processes that may lead to discordance among gene-trees and species

5 boundaries (Knowles and Carstens 2007). It cannot be determined if an inferred phylogeny, based upon a single locus, is experiencing introgression or ILS unless it is analyzed with other independent loci. These problems encountered when exclusively using mtDNA highlight the need to use multiple nuclear loci in species delimitation studies

(Ballard and Whitlock 2004). When multiple nuclear and mitochondrial loci are analyzed together more consistent and accurate relationships can be inferred when using DNA- based methods (Wang et al. 1997).

Study system: Ravinia (Diptera: Sarcophagidae)

Ravinia provides a model genus for exploration into the aspects of species delimitation using DNA-based techniques. This study system was chosen because of its: 1) broad geographic and ecological range (Wong et al. 2015), 2) importance in other fields

(i.e., agriculture; Pickens 1981), 3) longstanding taxonomic disagreements (e.g., Hall 1928;

Aldrich 1930; Dodge 1956), and 4) ambiguous species limits of some members based on morphology (Wong et al. 2015).

The genus Ravinia is a common group of primarily coprophagous ; as larvae they consume and digest feces of (Wong et al. 2015). Currently, there are 34 recognized species worldwide, of which 17 occur in North America, 16 are Neotropical, and

1 species is from the Old World (Pape 1996; Pape and Dahlem 2010). Many species of

Ravinia are closely associated with human activity and have a direct impact on the environment through their role in vertebrate waste decomposition. In addition, Ravinia species have been investigated for their potential use as biological control agents, having been shown to be effective against introduced pests such as the face fly, Musca autumnalis

(De Geer), and the horn fly Haematobia irritans (Linnaeus) (Pickens 1981; Thomas and

6

Morgan 1972). While Ravinia may be among the most common of all sarcophagid flies collected in the U.S. based on its presence in institutional collections, little is still known about the biology and evolution of many species within this genus. As of yet, there has been no phylogenetic study that establishes relationships for species within this genus.

DISSERTATION ORGANIZATION

Chapter Two presents a phylogeny of the genus Ravinia and compares the recovered lineages with the morphologically identified species. I analyze data from two mitochondrial genes in order to better understand the phylogenetic relationships among species in this genus. Using both Bayesian inference and maximum likelihood methods, I estimate the mitochondrial phylogeny and find several novel relationships. Among species of Ravinia, I find several highly supported paraphyletic relationships. However, this conflict between the morphological species definitions and the mtDNA phylogeny could be indicative to the presence of cryptic species, due to introgression, a result of incomplete lineage sorting, or a combination of the three. To further investigate the causes of this discordance a species delimitation study using multilocus data would need to be conducted.

In Chapter Three, I conduct a species delimitation study focusing on two species within the genus Ravinia: Ravinia anxia (sensu lato) and Ravinia querula (sensu lato). Wong et al. (2015) have shown that these species demonstrate discordance between the mtDNA phylogeny and the morphologically identifications. However, this discordance could be the result of incomplete lineage sorting, introgression and migration, or the presence of unrecognized species. In order to understand the cause of this discordance, I delimit these two species using a multilocus data set and several different species delimitation methods,

7 including migration analyses, population genetic analyses, coalescent-based species-tree analyses, and standard phylogenetic analyses.

Chapter Four brings these two previously mentioned studies together, and using a multilocus data set (5 mtDNA + 7 nuDNA loci) I investigate the phylogenetic status of the three cryptic species recovered from the species delimitation of R. anxia and R. querula (R. anxia (sensu stricto), R. querula (sensu stricto), Ravinia n.sp.), and the status of R. floridensis and R. lherminieri. Bayesian inference, maximum likelihood, and coalescent-based species- tree methods using these markers strongly supported the distinctiveness of each of the

Ravinia species, including showing the three cryptic species as monophyletic lineages. This study is the first to employ a coalescent-based species-tree approach within the

Sarcophagidae using a large data set.

The final chapter, Chapter Five: General Conclusions, provides a synopsis of the findings of the three studies found in Chapters Two, Three, and Four. In Chapter Five, I also discuss the potential for future work within Ravinia and suggest possible avenues for future research in this flesh fly genus.

8

REFERENCES

Aldrich, J.M. (1930). Notes on the types of American two-winged flies of the genus

Sarcophaga and a few related forms, described by early authors. Proc. U.S. Nat. Mus.

78:1-39.

Amaral, A.R., Silva, M.C., Möller, L.M., Beheregaray, L.B., Coelho, M.M. (2010). Anonymous

nuclear markers for cetacean species. Conserv. Genet. 11:1143-1146.

Ballard, J.W.O., Whitlock, M.C. (2004). The incomplete natural history of mitochondria. Mol.

Ecol. 13:729-744.

Baum, D.A., Shaw, K.L. (1995). Genealogical perspectives on the species problem. In: Hoch,

P.C., Stephenson, A.G., editors. Molecular and experimental approaches to plant

biosystematics. St. Louis, MO: Missouri Botanical Garden. p. 289-303.

Catleka, B.C., McAllister, B.F. (2004). A genealogical view of chromosomal evolution and

species delimitation in the Drosophila virilis species subgroup. Mol. Phyl. Evol.

33:664-670.

Chen, X., Jiang, K., Guo, P., Huang, S., Rao, D., Ding, L., Takeuchi, H., Che, J., Zhang, Y., Myers,

E.A., Burbrink, F.T. (2014). Assessing species boundaries and the phylogenetic

position of the rare Szechwan ratsnake, Euprepiophis perlacea (Serpentes:

Colubridae), using coalescent-based methods. Mol. Phyl. Evol. 70:130-136.

Degnan, J.H., Rosenberg, N.A. (2006). Discordance of species trees with their most likely

gene trees. PLoS Genetics 2:e68. de Queiroz, K. (2007). Species concepts and species delimitation. Syst. Biol. 56:879-886.

Dodge, H.R. (1956). New North American Sarcophagidae, with some new synonymy

(Diptera). Ann. Entomol. Soc. Am. 49:182-190.

9

Hall, D.G. (1928). Sarcophaga pallinervis and related species in the Americas. Ann. Entomol.

Soc. Am. 21:331-352.

Harrington, R.C., Near, T.J. (2012). Phylogenetic and coalescent strategies of species

delimitation in snubnose darters (Percidae: Etheostoma). Syst. Biol. 61:63-79.

Hebert, P.D.N., Penton, E.H., Burns, J.M., Janzen, D.H., Hallwachs, W. (2004a). Ten species in

one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly

Astraptes fulgerator. Proc. Natl. Acad. Sci. USA 101:14812-14817.

Hebert, P.D.N., Stoeckle, M.Y., Zemlak, T.S., Francis, C.M. (2004b). Identification of birds

through DNA barcodes. PLoS Biol. 2:e312.

Hickerson, M.J., Meyer, C.P., Moritz, C. (2006). DNA barcoding will often fail to discover new

species over broad parameter species. Syst. Biol. 55:729-739.

Hobolth, A., Dutheil, J.Y., Hawks, J., Schierup, M.H., Mailund, T. (2011) Incomplete lineage

sorting patterns among human, chimpanzee, and orangutan suggest recent

orangutan speciation and widespread selection. Genome Res. 21:349-356.

Hudson, R.R., Coyne, J.A. (2002). Mathematical consequences of the genealogical species

concept. Evolution 56:1557-1565.

Hudson, R.R., Turelli, M. (2003). Stochasticity overrules the “Three-times rule”: genetic

drift, genetic draft, and coalescence times for nuclear loci versus mitochondrial DNA.

Evolution 57:182-190.

Keck, B.P., Near, T.J. (2010). Geographic and temporal aspects of mitochondrial

replacement in Nothonotus darters (Teleostei: Percidae: Etheostomatinae).

Evolution 63:1410-1428.

10

Knowles, L.L., Carstens, B.C. (2007). Delimiting species without monophyletic gene trees.

Syst. Biol. 56:887-895.

Lowenstein, J.H., Amato, G., Kolokontronis, S.-O. (2009). The real maccoyii: identifying tuna

sushi with DNA barcodes – constraining characteristic attributes and genetic

distances. PLoS ONE. 4:e7866.

Maddison, W.P. (1997). Gene trees in species trees. Syst. Biol. 58:289-311.

Moore, W.S. (1995). Inferring phylogenies from mtDNA variation: mitochondrial gene trees

versus nuclear-gene tree. Evolution. 49:718-726.

Moritz, C. (1994). Application of mtDNA in conservation: a critical review. Mol. Ecol. 3:401-

411.

Moritz, C., Schneider, C.J., Wake, D.B. (1992). Evolutionary relationships within the Ensatina

escholtzii complex confirm the ring species interpretation. Syst. Biol. 41:273-291.

Pamilo, P., Nei, M. (1988). Relationships between gene trees and species trees. Mol. Biol.

Evol. 5:568-583.

Pape, T. (1996). Genus Ravinia Robineau-Desvoidy. In: Catalogue of the Sarcophagidae of

the world (Insecta: Diptera). Gainesville, FL, USA, Associated Publishers. p 284-290.

Pape, T., Dahlem, G.A. (2010). Sarcophagidae (Flesh Flies). In: Brown, B.V. et al., editors.

Manual of Central American Diptera, Volume 2. Ottawa, Ontario, CAN. p. 1313-1335.

Pickens, L.G. (1981). The life history and predatory efficiency of Ravinia lherminieri

(Diptera: Sarcophagidae) on the face fly (Diptera: Muscidae). Canad. Entomol.

113:523-526.

Rannala, B., Yang, Z. (2003). Bayes estimation of species divergence times and ancestral

population sizes using DNA sequences from multiple loci. Genetics 164:1645-1656.

11

Schlick-Steiner, B.C., Steiner, F.M., Seifert, B., Stauffer, C., Christian, E., Crozier, R.H. (2010).

Integrative taxonomy: a multisource approach to exploring biodiversity. Annu. Rev.

Entomol. 55:421-438.

Serb, J.M., Phillips, C.A., Iverson, J.B. (2001). Molecular phylogeny and biogeography of

Kinosternon flavescens based on complete mitochondrial control region sequences.

Mol. Phylo. Evol. 18:149-162.

Sites, J.W., Jr., Marshall, J.C. (2004). Operational criteria for delimiting species. Annu. Rev.

Ecol. Syst. 35:199-227.

Spillings, B.L., Brooke, B.D., Koekemoer, L.L., Chiphwanya, J., Coetzee, M., Hunt, R.H. (2009).

A new species concealed by Anopheles funestus Giles, a major malaria vector in

Africa. Am. J. Trop. Med. Hyg. 81:510-515.

Takahata, N. (1989). Gene genealogy in three related populations: consistency probability

between gene and population trees. Genetics. 122:957-966.

Thomas, G.D., Morgan, C.E. (1972). Parasites of the horn fly in Missouri. J. Econ. Entomol.

65:169-74.

Wang, R.L., Wakely, J., Hey, J. (1997). Gene flow and natural selection in the origin of

Drosophila pseudoobscura and close relatives. Genetics. 147:1091-1106.

Wiens, J.J. (2007). Species delimitation: new approaches for discovering diversity. Syst. Biol.

56:875-878.

Wiens, J.J., Penkrot, T.A. (2002). Delimiting species based on DNA and morphological

variation and discordant species limits in spiny lizards (Sceloporus). Syst. Biol.

51:69-91.

12

Wiens, J.J., Servedio, M.R. (2000). Species delimitation in systematics: inferring diagnostic

differences between species. Proc. R. Soc. Lond. [Biol]. 267:631-636.

Wong, E.S., Dahlem, G.A., Stamper, T.I., DeBry, R.W. (2015). Discordance between

morphological species identification and mtDNA phylogeny in the flesh fly genus

Ravinia (Diptera: Sarcophagidae). Invert. Syst. 29:1-11.

13

Chapter Two:

Discordance between morphological species identification and mtDNA in the flesh fly genus Ravinia (Diptera: Sarcophagidae).1

1 This article is the pre-publication version of the published article: Wong, E.S., Dahlem, G.A., Stamper, T.I., & DeBry, R.W. (2015) Discordance between morphological species identification and mtDNA phylogeny in the flesh fly genus Ravinia (Diptera: Sarcophagidae). Invertebrate Systematics 29, 1-11. The definitive version of this article can be accessed via Invertebrate Systematics.

14

2.1 ABSTRACT

In order to better understand the phylogenetic relationships among species in the genus Ravinia Robineau-Desvoidy, 1863, we analyzed data from two mitochondrial gene fragments: cytochrome oxidase I (COI) and cytochrome oxidase II (COII). We used Bayesian inference and maximum likelihood methods to infer phylogenetic relationships. Our results indicate that the genera Ravinia and Chaetoravinia, previously synonymized into the genus

Ravinia (sensu lato) are each likely to be monophyletic (posterior probability 1; bootstrap support 85%). We found highly supported paraphyletic relationships among species of

Ravinia, with relatively deep splits in the phylogeny. This conflict between the morphological species definitions and the mtDNA phylogeny could be indicative of the presence of cryptic species in Ravinia anxia (Walker, 1849), Ravinia floridensis (Aldrich,

1916), Ravinia lherminieri (Robineau-Desvoidy, 1830), and Ravinia querula (Walker, 1849).

Keywords: COI, COII, Chaetoravinia, mitochondrial DNA, molecular, morphology,

phylogenetics, North America

15

2.2 INTRODUCTION

The genus Ravinia Robineau-Desvoidy, 1863 represents an agriculturally important lineage of flesh flies in the Dipteran family Sarcophagidae. Ravinia species are primarily coprophagous flies that feed on a variety of mammalian species’ dung, including common livestock animals such as cattle, horses, and pigs. This genus has a broad distribution with

34 species recognized worldwide: 1 Old World species and 33 species with a New World distribution (Pape 1996; Pape and Dahlem 2010). Ravinia may be among the most common of all flesh flies collected in North America, based on the abundant presence of these flies in insect collections. Although larvae of Ravinia species are believed to be primarily coprophagous, R. anxia (Walker, 1849) has been shown to be a facultative predator of the face fly, Musca autumnalis De Geer, 1776, larvae in bovine feces (Pickens 1981).

Interestingly, Ravinia do not pose a nuisance to humans. They do not normally enter houses and other human built structures and have little interest in landing on exposed skin.

These flies are not known to drink livestock eye secretions and are not attracted to wounds. As such, an increase in Ravinia populations could have a beneficial effect for both urban areas and rural farms in the reduction of accumulated mammalian waste.

The currently defined genus, Ravinia, has been separated into three genera at various times and by various authors: Ravinia, Adinoravinia Townsend, 1917, and

Chaetoravinia Townsend, 1917 (Lopes 1969). These groups have since been fused into the modern Ravinia (sensu lato) (Pape 1996). To date, no phylogenetic relationships of species within this genus have been proposed. Ravinia has proven to be a taxonomically difficult group, in that member species exhibit limited numbers of diagnostic morphological characters. Robineau-Desvoidy (1863) originally established Ravinia based on the

16 hemispherical shape of the abdomen, with no hairs on the first segments of the abdomen

(female), and lack of tufts of hairs on the hind tibia (male). However, these types of characters cannot be used to distinguish modern sarcophagid genera. Adults of the genus

Ravinia have the following identifying characteristics: postalar wall setulate; tegulae orangish in ground color; frontal vitta of female at midpoint more than two times the width of parafrontal plate; male lacking apical scutellar bristles and with a reddish-orange genital capsule.

Morphological characteristics, in general, are the first and primary tools used in the identification of species. In the modern context, trained taxonomists carry out identifications and develop keys of diagnostic character traits that could be used to distinguish between species. Some argue that the presence of cryptic species limits the ultimate utility of morphological taxonomy (Hebert et al. 2004a, 2004b; Spillings et al.

2009). The identification of diagnostic characters is important because it indicates that there is an absence of (or perhaps minimal) gene flow between species (Wiens and

Servedio 2000). The degree of intra-specific variation in morphological traits present in

Ravinia has made it difficult to make precise and accurate identifications of species members. This limited number of distinctive species-level morphological characters makes evolutionary reconstruction of species relationships difficult. Integrating DNA sequence data into taxonomy provides new methods for species identification.

One of the most prominent approaches is the use of DNA sequence data in analyses of taxonomic status and phylogenetic relationships (e.g., Catleka and McAllister 2004).

DNA-based approaches are likely to be particularly important in groups like Ravinia that are characterized by limited numbers of discriminatory morphological characteristics at

17 the species-level. The ability to group individuals into separate genetic groups using DNA sequence data may allow for the discovery of new morphological characters that would enable easier and more reliable separation of these species by morphological means.

Mitochondrial DNA (mtDNA) sequence data are useful tools for species discovery and identification studies (Hebert et al. 2004a, 2004b; Smith et al. 2007; Reid and Carstens

2012). Mitochondrial DNA is easily obtained due to the high abundance in cells relative to nuclear DNA (Wells and Stevens 2009) and much of the sequence data available for the class Insecta is for mitochondrial genes (Caterino et al. 2000). Currently, the vast majority of described species are non-model organisms where in-depth molecular data is lacking. As a result, molecular markers are often chosen based on their use in previous molecular studies, not necessarily because they are the most adequate. One useful property of mtDNA is that with smaller effective populations sizes (Ne) the haplotypes will coalesce four times faster than nuclear markers (Moore 1995). Consequently, newly formed species will become distinct in their mtDNA sequences before they become distinct for most nuclear markers (Wiens and Penkrot 2002). This time to coalescence, however, assumes that the sampled haplotypes are adaptively neutral and, thus, will depend on the markers chosen

(Moore 1995).

To understand if disagreement exists between the mitochondrial genealogy and the morphological characteristics currently used in species identification, we inferred a phylogenetic tree using two mitochondrial loci: cytochrome oxidase I (COI) and cytochrome oxidase II (COII). We also discuss the presence of two distinct clades,

Chaetoravinia and Ravinia, within the genus Ravinia (sensu lato) and suggest future studies to determine if subgeneric or generic status is warranted.

18

2.3 MATERIALS AND METHODS

Specimen collection and sampling

Flies were collected by net, Malaise trap, and bait traps using dog dung or aged chicken thigh and liver as an attractant. Specimens were frozen in bulk and taken to the lab for morphological identification by GAD and molecular analysis (based on Stamper et al.

2013). DNA was extracted from 1-3 legs for use in analyses, thereby preserving all morphological structures important for species identification. Legs were removed and transferred to a vial of 95% EtOH and stored at -80C until DNA extraction. The remainder of the specimen was then pinned or stored in a separate vial of EtOH at -80C. All specimens used in this study are retained as vouchers (currently held by GAD, these will be placed in an established natural history collection upon completion of current studies).

Kutty et al. (2010) found the genus Blaesoxipha Loew, 1861 to be sister to Ravinia, while other analyses, particularly those based on larval and adult morphological characters, have more often supported the genus Oxysarcodexia Townsend, 1917 as being the sister to Ravinia (Lopes 1982; Pape 1994; Giroux et al. 2010). A recent analysis based on two mitochondrial genes, COI and COII, found Oxysarcodexia to be the closest relative to

Ravinia (Stamper et al. 2013). Therefore, we included in this study four outgroups, two from each of these genera: Oxysarcodexia ventricosa (Wulp, 1895); O. cingarus (Aldrich,

1916); Blaesoxipha arizona Pape, 1994; B. cessator (Aldrich, 1916).

We sampled several individuals from multiple populations for 10 of the 17 species of Ravinia in North America. We were not successful in obtaining material for the following species, despite collection efforts in regions where they have been reported to occur:

Ravinia acerba (Walker, 1849), R. anandra (Dodge, 1956); R. coachellensis (Hall, 1931); R.

19 effrenata (Walker, 1861); R. pectinata (Aldrich, 1916); R. sueta (van der Wulp, 1895); and R. tancituro Roback, 1952. Individuals representing two of the three previously synonymized genera, Chaetoravinia and Ravinia were included in this study. The specimens included in this study, their locality information, and GenBank accession numbers can be found in

Table 2.1.

Nomenclatural Notes

The nomenclature involved with R. anxia has been rather confused in the past and much of the confusion deals with the separation of this species from R. querula and the mistaken identity of R. lherminieri. The following nomenclatural history of this species should illustrate some of the problems involving the morphological separation of species in this group of Ravinia. The first name that was commonly applied to this species was

Parker’s R. communis (Parker, 1914). Specimens of both R. anxia and R. querula were determined under this name generally between 1914 and 1928. The revision provided by

Hall (1928) placed R. communis as a synonym of R. pallinervis. For several years, specimens of R. anxia and R. querula were placed under this name. Aldrich (1930) examined types of

American Sarcophagidae in European museums and he provided the mistaken synonymy of R. anxia and R. querula with R. lherminieri. Both R. anxia and R. querula were identified as

R. lherminieri until Dodge (1956) was able to provide characters to separate R. querula from R. anxia. Therefore, references before 1956 concerning one of the earlier names must be considered as potentially referring to R. anxia, unless otherwise indicated by the author’s figures or comments.

Unfortunately, Dodge (1956) continued the improper usage of R. lherminieri as the senior name of R. anxia. Discussions with Dr. W. L. Downes, Jr. (deceased) by one of us

20

(GAD) provided some insight into how this mistake occurred. Both Downes and Dodge independently discovered characters to separate what was considered R. lherminieri into two species. The type of R. lherminieri was not actually seen by either of these specialists, but specimens representing the two species were independently sent to E. Seguy at the

National Museum of Natural History in Paris for comparison with the type. The best comparison that Seguy could make indicated that R. lherminieri = R. anxia. Unfortunately, neither specialist thought to include a third species, R. ochracea, which was originally considered just a variation of R. communis by Aldrich. In 1986, one of us (GAD) was able to examine the type of R. lherminieri at the Paris Museum and established from the holotype that R. lherminieri = R. ochracea. During the same year GAD was able to examine the types of Walker at the British Museum (Natural History). Direct examination of the types resulted in the currently recognized synonymy given in Pape (1996).

Ravinia anxia shows a great deal of intra-specific variation in the shape of the male and female genitalia. A close study of this species over its wide geographical range has led to suspicion that this high amount of variation may indicate that this name may apply to more than one species. Two larval forms are easily recognized from puparia pinned with reared flies and this is mentioned by Dodge (1956). The first form, which is very common in eastern North America, has very darkly pigmented dorsal tubercles. The second form lacks these conspicuous tubercles. Many genitalic dissections of specimens from pivotal areas, such as Arizona, Oregon, and Mexico, have led us to believe that R. anxia may represent a species complex. However, no definitive morphological characters have been found that will allow consistent separation of this species into smaller units. Genetic studies, like the one reported here, in association with multiple rearings of this species

21

(especially from western localities) should be very useful for determining the range of variation present in this species.

Morphological diagnoses for R. anxia, R. floridensis, R. lherminieri,

and R. querula

Ravinia anxia is easily separated from R. floridensis by its gray legs and palps. It can be separated from R. lherminieri by the gray coloration of its head and abdomen. Males can usually be separated from R. querula and R. lherminieri by the lesser developed lateral scutellar setae, as compared to the subapical scutellar setae. The trapezoidal appearance of sternite 7 of the female separates this species from the more rectangular sternite 7 of R. querula. The lack of golden pruinosity on the genital sternites will easily separate females from R. lherminieri.

Ravinia floridensis has been separated from other members of this group of species based on its distinctive coloration. This species exhibits bright and conspicuous orange palpi and legs.

Ravinia lherminieri is separated from the other species by the golden pruinose gena and apical abdominal sternites, but with gray leg coloration.

Ravinia querula is separated from all the other species except R. anxia by its gray pollinose gena and tergite 5. It can be separated from R. anxia in eastern North America by the presence of two pairs of well-developed lateral scutellar setae adjacent to the preapical scutellar setae. In western North America, the diagnostic shape of the male and female genitalia are more useful in separating this species from R. anxia, as the size of the scutellar setae are more variable in western individuals.

22

DNA extraction and amplification

DNA extractions using Qiagen’s DNeasy kit (Qiagen Cat. No. 69506) were performed on a whole fly leg cut into thirds. A region of the mitochondrial genome encoding COI and

COII, spanning position 1460-3775 of the Drosophila yakuba Burla, 1954 (Clary and

Wolstenholme 1985; GenBank accession number NC_0013322), was amplified in fragments using multiple primer pairs, listed in Stamper et al. (2013). PCR amplification solutions consisted of: 200 μM dNTPs (Promega), 1μM of each primer, 1.5mM MgCl2, 0.25 units

Platinum Pfx DNA polymerase (Invitrogen), buffer (final concentration: 50mM Tris, pH 8.5,

4 mM MgCl2, 20 mM KCl, 500 μg/ml bovine serum albumin, 5% DMSO) (Idaho Technology), and 3μl template DNA. Amplification was conducted using Rapid Cycler (Idaho

Technologies); thermocycling conditions are as follows: 95°C for 3min, followed by 35 cycles of 94 °C for 1s, 56-63°C for 1s, and 68°C for 20s. Forward strands only were sequenced with the amplification primer using BigDye chemistry (PE Applied Biosystems) on an automated ABI 3730 XL Instrument. Sequencing was performed by High Throughput

Genomics Unit at the Department of Genome Sciences, University of Washington.

Sequence assembly and characterization

Assembled sequences were edited in FinchTV v.1.4.0. (Geospiza, Inc. www.geospiza.com) and aligned in Mesquite v.2.73 (Maddison and Maddison 2011). The leucine tRNA gene separating COI and COII was excluded from analyses. The alignment program Muscle v.3.8 (Edgar 2004) plug-in was used in Mesquite v.2.73 and the subsequent alignment re-checked as both the translated amino acid sequence and the nucleotide sequence. No indels were present among sequences, as such the final alignment was identical to the preliminary Muscle v.3.8 alignment. We believe that nuclear insertions

23 of mitochondrial fragments (numts) were not erroneously sequenced due to the absence of indels, absence of in-frame stop codons, taxon specific COI + COII primers used, and careful examination of sequence characteristics such as amplicon length and quality score (phred) of peaks (Song et al. 2008). Average Tamura-Nei corrected distances (Tamura and Nei

1993) within species-level groups and smallest Tamura-Nei corrected distances between species-level groups were calculated in MEGA 5 (Tamura et al. 2011). The rate variation among sites was modeled with a gamma distribution (shape parameter =1). The Tamura-

Nei + gamma corrected distance was selected as the best-fit model according to

PartitionFinder from among models implemented in the program MEGA 5. Smallest inter- specific distances were used due to studies showing exaggeration of the divergence of species sequence variability (i.e., barcoding gap; Meier et al. 2006).

Phylogenetic inference and model selection

To infer relationships, two methods were used in the phylogenetic analyses: maximum likelihood (ML) using the program GARLI v2 (Genetic Algorithm for Rapid

Likelihood Inference; Zwickl 2006), and Bayesian inference (BI) using MrBayes v3.2.1

(Ronquist and Huelsenbeck 2003). Because both COI and COII make up a single linkage group, the fragments are expected to share a single tree topology. However, there might be differences in the evolutionary constraints across codon positions and possibly between the two genes (Shapiro et al. 2006; Bofkin and Goldman 2007). To accommodate gene/codon-position heterogeneity, DNA sequence data were partitioned into subsets, and model parameters estimated across subsets.

PartitionFinder (Lanfear et al. 2012) was used to select the substitution models and partitioning scheme for the ML and BI analyses. We compared 4 partition schemes:

24 unpartitioned; by gene (2 subsets); by codon position (3 subsets); and by both gene and codon position (6 subsets). We calculated the log-likelihood scores under 56 models of nucleotide evolution for each subset, including with and without gamma-distributed rates

(+G) and invariant sites (+I) for the ML analyses. For the Bayesian method, model selection was limited to those that could be implemented in MrBayes, using the function “models = mrbayes” in PartitionFinder. Models and partition schemes were selected using the Akaike information criterion (AIC; Akaike 1973, 1974) test (with and without AICc correction).

For BI, we used the model + partitioning scheme chosen in PartitionFinder, and sampled the posterior distribution by MCMC using MrBayes. Two independent Metropolis- coupled Markov chain Monte Carlo (MC3) (Geyer 1991; Gilks and Roberts 1996) analyses were run, each with one cold and three heated chains (temperature set to the default = 0.1) for 1x106 generations and sampled every 500 generations; the first 25% of samples were discarded as “burn-in.” Convergence on a stable posterior distribution was assessed by using the program Tracer v1.5 (Rambaut and Drummond 2007) to visually examine the long-term stability of the individual parameter estimates and the log-likelihood values, the harmonic mean likelihood values of chains, the similarity of trees, branch lengths, and the average standard deviation of split frequencies being less than 0.01 across runs. Posterior probabilities (PP) were used to assess nodal support.

The model and partition schemes selected by PartitionFinder as best supported by

AIC (with and without AICc correction) were as follows: ML analysis: COI first: TIM + I + G;

COI second: F81 + I; COI third: TrN + G; COII first: TrN + I; COII second: HKY + I; COII third:

HKY + G; BI analyses: COI +COII first: GTR + I + G; COI + COII second: HKY + I; COI + COII third: GTR + G.

25

The BI prior distributions were set as follows. 1st-position sites: substitution rates across partition expressed as proportions of the overall rate sum, Dirichlet (1, 1, 1, 1, 1, 1); substitution rates across site within partition, gamma distribution approximated with 4 categories with prior distribution for shape parameter  uniform (0,200); proportion of invariable sites uniformly distributed (0,1); nucleotide frequencies, Dirichlet (1, 1, 1, 1), likelihood summarized over all rate categories in each gene. 2nd-position sites: transition and transversion rates expressed as proportions of the rate sum with a  (1,1) prior; proportion of invariable sites, uniform (0, 1); nucleotide frequencies, Dirichlet (1, 1, 1, 1).

3rd-position sites: substitution rates across partition expressed as proportions of the overall rate sum, Dirichlet (1, 1, 1, 1, 1, 1); substitution rates across site within partition, gamma distribution approximated with 4 categories with prior distribution for shape parameter  uniform (0,200); nucleotide frequencies, Dirichlet (1, 1, 1, 1), likelihood summarized over all rate categories in each gene.

ML analyses were conducted using the maximum likelihood function in Garli v2. The optimal model + partition scheme was selected for the concatenated sequences with

PartitionFinder, using the AIC. Bootstrap resampling (Felsenstein 1985) was used to assess nodal support with 1000 pseudoreplicates. Program defaults were used for stopping the search. Maximum likelihood bootstrap probabilities (BPML) for the splits were mapped onto the bootstrap consensus tree using SumTrees of the DendroPy v 3.12.0 (Sukumaran and

Holder 2010) program.

26

2.4 RESULTS

DNA sequence data were obtained for 106 individuals (including the 4 specimens used as outgroups) comprising 10 of the 17 species of Ravinia that are found in North

America (Pape 1996; Pape and Dahlem 2010). The final molecular alignment consisted of

2239 base pairs (bp) of mtDNA sequence data: COI 1539 bp, with 350 parsimony- informative characters and COII 700 bp, with 136 parsimony-informative characters. Data matrices and trees used for this study have been uploaded to TreeBase

(http://purl.org/phylo/treebase/phylows/study/TB2:S15566). The phylogenetic hypotheses inferred using both ML and BI analyses resulted in the same general topology, with the BI consensus tree fully congruent with the best ML tree shown in Fig. 2.1. Average

Tamura-Nei corrected within-group pairwise (PW) distances and smallest Tamura-Nei corrected between-group PW distances are shown in Table 2.2.

We find strong support for two clades within Ravinia that correspond to the groups

Ravinia and Chaetoravinia (Fig. 2.1; BPML=85; PP=1.0). This grouping is also supported by the absence or presence of setae on the R1 wing vein in Ravinia and Chaetoravinia, respectively (Lopes 1969). Within Chaetoravinia, our single specimen of R. errabunda

(Wulp, 1895) is sister to the other three species (Fig. 2.1; BPML=97; PP=1). Ravinia stimulans (Walker, 1849), which is distinctive within Chaetoravinia based on male genital morphology, showed very little intra-specific variation and was strongly supported as sister to a clade comprising R. derelicta (Walker, 1853) + R. vagabunda (Wulp, 1895) (Fig.

2.1; BPML=99; PP=1.0). Our two specimens of R. vagabunda (from a single locality in New

Mexico) were monophyletic, but were weakly supported as nested within a paraphyletic R. derelicta (Fig. 2.1; BPML=62; PP=0.87).

27

Within the Ravinia group, R. pusiola (Wulp, 1895) is sister to the other five species:

R. anxia and R. querula, R. floridensis and R. lherminieri, and Ravinia planifrons (Aldrich,

1916). Our 10 specimens of R. pusiola (all of which are from New Mexico) are monophyletic and distinct from the other species (Fig. 2.1; BPML=99; PP=1.0). Within R. pusiola are two distinct haplotype groups. Ravinia planifrons was distinct and strongly supported as monophyletic (Fig. 2.1; BPML100; PP=1.0), and was sister to a clade consisting of R. anxia, R. querula, R. floridensis, and R. lherminieri (Fig. 2.1; BPML=54; PP=0.65). Ravinia lherminieri and R. floridensis formed a sister group to the R. anxia + R. querula group (Fig. 2.1; BPML=92;

PP=1.0). Our nine specimens of R. lherminieri sorted onto two distinct haplotype groups: the R. lherminieri from Florida and Tennessee formed one highly supported clade (Fig. 2.1;

BPML=100; PP=1.0), which was sister to a paraphyletic group comprised of all of our R. floridensis plus the R. lherminieri specimens from New York and West Virginia (Fig. 2.1;

BPML=100; PP=1.0). The R. lherminieri (FL+TN) haplotype group is distinct from the R. lherminieri (NY+WV) haplotype group (smallest PW inter-group distance = 5.2%). The average PW intra-group distances within each of these two R. lherminieri haplotype groups are low (0.1% and 0.4%), and similar to the intra-group distances of other Ravinia species

(Table 2.2). Our six R. floridensis individuals had a haplotype that was nearly identical to that of the R. lherminieri individuals collected from New York and West Virginia (smallest

PW inter-group distance = 0.0%), but distinct from that of the R. lherminieri collected from

Florida and Tennessee (smallest PW inter-group distance = 5.4%). In the group comprising

R. anxia and R. querula, we found highly supported paraphyletic relationships. Our 12 specimens of R. anxia and 16 specimens of R. querula sorted onto three distinct haplotype groups. The R. anxia from Minnesota and New Mexico form a highly supported clade (Fig.

28

2.1; BPML=98; PP=1.0), that is sister to a smaller clade with two terminal nodes comprised of: 1) R. querula from Oregon, Wisconsin, Georgia, and Virginia; and 2) R. anxia from

California and Oregon plus R. querula from California, Oregon, and New Mexico. Both of these terminal nodes were highly supported in both the ML and BI analyses (Fig. 2.1;

BPML>95; PP=1.0). Over the large range of sampling for the species of both R. anxia and R. querula, there are high levels of inter-, and intra-specific variation (Table 2.2).

2.5 DISCUSSION

Our analysis of the mtDNA phylogeny of Ravinia resulted in strong support for a number of inter-specific relationships, while also suggesting that several species may be either paraphyletic at the level of mtDNA genealogy or poorly delimited by current morphological characters. At deeper phylogenetic levels, our data strongly support separation of the species included in this study into two clades, which correspond to previously recognized species groupings at the generic or subgeneric level (Ravinia and

Chaetoravinia). The separation of these two groups is further supported morphologically based on the absence or presence of setae on the R1 wing vein. Together, the molecular and morphological data suggest that Ravinia and Chaetoravinia are likely to be natural groups, and merit recognition as subgenera of Ravinia or even as separate genera. However, formal recognition of a change in taxonomic status should await inclusion of additional species, particularly those expected to fall within the Adinoravinia group.

We sampled four species from the Chaetoravinia group: R. derelicta, R. errabunda, R. stimulans, and R. vagabunda. Two of these species, R. derelicta and R. stimulans, are both encountered very commonly and are both distributed throughout the continental United

States. Our collections include >40 specimens from widely distributed locations for each of

29 these two species. In contrast, R. errabunda (1 specimen in this study; Midwest and

Southern parts of the U.S. and Mexico) and R. vagabunda (2 specimens in this study;

Southwestern U.S. and Mexico) both have more restricted distributions and are both encountered much less frequently. Ravinia derelicta, R. stimulans, and R. vagabunda form a strongly supported clade (Fig. 2.1; BPML=99; PP=1.0), with R. errabunda as sister to that group. Ravinia stimulans shows very limited mtDNA variation across the entire range encompassed by our sampling and is sister to R. derelicta + R. vagabunda. Ravinia derelicta may be paraphyletic with respect to mtDNA, because the two specimens of R. vagabunda nested within R. derelicta. This is surprising, because the male genitalia of R. derelicta and

R. vagabunda are distinctly different from one another and these species are easily separated by morphology. In terms of morphology, R. vagabunda and R. stimulans exhibit much more similar male genitalia. However, it is not clear that this is a case of species-level paraphyly because the branches that place R. vagabunda within, rather than as sister to, R. derelicta are short, and the statistical support is weak (BPML=62; PP=0.87). Both of our specimens of R. vagabunda came from a single locality in New Mexico, so additional sampling would be important for fully understanding the relationship between these two species. Regardless of the placement of R. vagabunda, it is clear that R. derelicta has more genetic variability and intra-specific phylogenetic structure at the COI and COII loci, compared to R. stimulans (Table 2.2).

Six species were sampled from the Ravinia group: R. anxia, R. floridensis, R. lherminieri, R. querula, R. planifrons, and R. pusiola. Three of these species, R. anxia, R. pusiola, and R. querula, are widely distributed and commonly encountered throughout the continental United States and southern Canada. Ravinia lherminieri is mainly found in the

30 southeastern United States, while R. planifrons is more common in the Great Plains area of the central United States. Ravinia floridensis has a distribution restricted to the states of

Florida and Georgia, but is commonly collected in these areas. Ravinia pusiola is strongly supported as monophyletic (Fig. 2.1; BPML=99; PP= 1.0), and is strongly supported as the first branching within the Ravinia group (Fig. 2.1; BPML=95; PP=1.0). Ravinia planifrons is weakly supported as the next branching (Fig. 2.1; BPML=54; PP=0.65), with strong support for its sister clade comprising R. anxia, R. floridensis, R. lherminieri, and R. querula (Fig. 2.1;

BPML=92; PP=1.0). In this clade, two groups were inferred to be monophyletic with strong support, R. anxia + R. querula (Fig. 2.1; BPML=97; PP=1.0), and R. floridensis + R. lherminieri

(BPML=100; PP=1.0). Within each of these two latter clades, however, relationships are more complex, with R. anxia, R. lherminieri, and R. querula all showing paraphyly in the mtDNA tree (Fig. 2.1).

Ravinia floridensis and Ravinia lherminieri

Our results show that R. floridensis plus R. lherminieri form a strongly supported mtDNA clade. Within this group are two distinct and strongly supported clades, but these smaller clades do not correspond to the species as currently diagnosed by morphology. One clade contains all of our R. floridensis specimens plus several R. lherminieri specimens from

New York and West Virginia, while the other clade comprises all remaining R. lherminieri samples, including those captured within the geographic range of R. floridensis. The discrepancy between the mtDNA tree and the current morphology-based species identifications could have arisen in several ways. First, the current morphological species delimitation could be correct, in which case the mtDNA genealogy would reflect either old, retained polymorphism (incomplete lineage sorting; ILS) or post-speciation hybridization

31 followed by introgression of the R. floridensis mitochondrial genome into R. lherminieri. The hybridization hypothesis seems the less likely of the two possibilities, due to the large geographic distance between the flies that have R. floridensis-like mtDNA but were morphologically identified as R. lherminieri and the known range of R. floridensis (Florida and Georgia). Future data could be consistent with the hybridization/introgression hypothesis if it turns out the R. lherminieri from localities that are geographically intermediate between Florida and West Virginia, such as Georgia and the Carolinas, carry the R. floridensis form of mtDNA while having nuclear genomes that are typical for R. lherminieri. Second, R. floridensis and R. lherminieri could be a single species with both a color polymorphism and a relatively deep intra-species divergence between two mtDNA lineages. Finally, the deep division in the mtDNA phylogeny could reflect the actual species boundary. This would have two implications: 1) the traits used to separate R. floridensis and R. lherminieri under the current morphological definitions do not reflect species boundaries (this would mean, in particular, that there are populations of R. floridensis that lack the orange pedicel, palpi, and legs that are currently considered diagnostic for the species); and 2) the true geographic range of R. floridensis extends much farther north than is currently believed. Morphological variation within characters of both the male and female genitalia in Ravinia lherminieri makes it difficult to use those traits to evaluate species-level hypotheses. In our own collections it is difficult to find two male R. lherminieri that exactly match one another in the fine structures of the aedeagus. Even collections from a single locality on a single date showed morphological variability. Given this morphological ambiguity, the most important future data for inferring species boundaries in this clade might be expected to come from nuclear DNA.

32

Ravinia anxia and Ravinia querula

Ravinia anxia plus R. querula form another sister-species pair that shows divergent mtDNA lineages that are discordant with current morphological species definitions. There are 3 strongly supported mtDNA clades in this group: one comprised of only R. anxia collected in two widely separated localities (Minnesota and New Mexico); one comprised of only R. querula collected across a wide geographic area (e.g., Oregon, Wisconsin, Virginia); and a third clade containing individuals assigned to both species according to the morphological criteria. The mixed-species clade includes flies from the western U.S.

(California, Oregon, New Mexico). It is notable that our collections from Oregon include members of both the R. querula-only and the mixed-species mtDNA lineages.

The morphological distinction between R. anxia and R. querula is well established, so it is very unlikely that they would represent a single species. Otherwise, the same set of possible explanations for discordance between mtDNA genealogy and morphological species identifications apply to the R. anxia + R. querula clade: ILS, hybridization, or incorrect morphological delimitation. There has been speculation in the past that R. anxia may contain more than one species (see 2.3 Material and Methods: Nomenclatural Notes) and the mitochondrial genealogy inferred in the present study (Fig. 2.1) would be consistent with the presence of more than one species within what is now called R. anxia.

However, no adult traits have been identified that correspond to the two larval morphotypes described by Dodge (1956) and we do not yet have specimens available with which to test the hypothesis that the larval traits correspond to the division seen in the mtDNA genealogy. As adults, R. anxia exhibit morphological variation over the wide geographic range it inhabits, but more extensive molecular sampling will be required to

33 determine if molecular variation is associated with specific morphological characters that could objectively identify smaller “species” units, or if R. anxia is a single, geographically variable species. Mitochondrial data alone are likely not adequate for a full study of species limits, so future studies of this potential complex should include multiple loci (both mitochondrial and nuclear) and further morphological analysis of individuals (both larvae and adults) sampled from across a larger geographic range.

Mitochondrial/Morphological Discordance and Molecular Species Identification

The goal of this study was to explore phylogenetic relationships within Ravinia, and to that end all species identifications were performed using morphological characteristics.

Mitochondrial DNA data has been used for species identification in dipteran taxa (and sarcophagid taxa in particular), dating back to Sperling et al. (1994). In recent years, several studies have demonstrated the utility of mitochondrial data for species identification in different dipteran taxa, using COI (Meiklejohn et al. 2011, 2012, 2013a;

Jordaens et al. 2013), COII (Guo et al. 2010), COI and COII (Tan et al. 2010), and sequence data from other mtDNA genes and morphology (i.e., COI and CAD and morphology;

Meiklejohn et al. 2013b). Our results showing discordance between morphology and the mtDNA phylogeny within the Ravinia group suggest that mtDNA might not be appropriate for species identification purposes in at least some Ravinia unless future data support delimitation of species along the lines suggested by the tree in Fig. 2.1.

2.6 CONCLUSIONS

This study is the first to analyze the phylogenetic relationships within Ravinia using

DNA sequence data. We find strong support for a number of interspecies relationships within Ravinia. At the deepest level, the mtDNA phylogeny is consistent with a hypothesis

34 that the previously recognized subdivisions Ravinia and Chaetoravinia are natural groups.

If this result holds for future studies that include more species (especially those expected to be part of an Adinoravinia lineage), then formal recognition of subgeneric or generic status for these three groups would be warranted. Molecular phylogenetic studies are often characterized by limited sampling, and this study is no exception. We have sampled only those species of Ravinia found in North America, but there is an extensive diversity of

Ravinia in Central and South America, plus one species found only in the Old World. Even within North America there were a number of species for which we obtained no specimens and others for which our sampling was limited, due to the inherent rarity of several species

(c.f. Lim et al. 2012). Nevertheless, an important result of this study – the discordance between morphological species criteria and the mitochondrial phylogeny of 1) R. anxia and

R. querula, and 2) R. floridensis and R. lherminieri – was not substantially affected by limited availability of specimens. The present data cannot, however, distinguish between the various hypotheses (incomplete lineage sorting, introgression, undescribed species) to explain those discordant patterns. Further research focusing on increased geographic sampling of the species that exhibit discordance, plus in-depth sampling of the nuclear genome and morphological analysis of those same species will be needed in order to address these species delimitation problems. Further examination of the larger picture of

Ravinia phylogeny will require increased species sampling, especially including the Old

World and Neotropical species of Ravinia.

35

ACKNOWLEDGMENTS

We thank Amanda J. Stein (New York), Tom Lyons (New York), Ed Quin (Minnesota),

Kristel Peters (Minnesota), Eric Lobner (Wisconsin), Donna Watkins (Florida) and Gary

Steck (Florida) for their help in obtaining collection permits in their respective state’s parks. We thank Rudolf Meier and two anonymous reviews for comments that greatly improved this manuscript.

This project was supported by Grant No. 2005-DA-BX-K102 awarded by the

National Institute of Justice (NIJ), Office of Justice Programs, US Department of Justice. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the Department of Justice.

The authors recognize no conflicts of interest.

36

2.7 TABLES

Table 2.1. Specimen locality data and references for COI and COII sequences included in

this study. Localities are in the U.S.A. and are reported as county and state.

Species Locality Sex GenBank Accession Voucher/Citation COI/COII Blaesoxipha arizona Pape Grant, NM M JQ806809/JQ806788 AW32/Stamper et al. 2013 Blaesoxipha cessator (Aldrich) Grant, NM M JQ806810/JQ806789 AW27/Stamper et al. 2013 Oxysarcodexia cingarus (Aldrich) Schuyler, NY M JQ806816/JQ806795 AP68/Stamper et al. 2013 Oxysarcodexia ventricosa (Wulp) Hamilton, OH M GQ223321 E8/Stamper et al. 2013 Ravinia anxia (Walker) Siskiyou, CA M JQ807069/KJ586403 AE81 Ravinia anxia (Walker) Jefferson, OR F KJ586343/KJ586404 AG51 Ravinia anxia (Walker) Jefferson, OR F KJ586344/KJ586405 AH52 Ravinia anxia (Walker) Jefferson, OR M KJ586345/KJ586406 AJ18 Ravinia anxia (Walker) Jefferson, OR F KJ586346/KJ586407 AK25 Ravinia anxia (Walker) Kandiyohi, MN F KJ586347/KJ586408 AT44 Ravinia anxia (Walker) Kandiyohi, MN F JQ807068/KJ586409 AT45 Ravinia anxia (Walker) Kandiyohi, MN F KJ586348/KJ586410 AT46 Ravinia anxia (Walker) Kandiyohi, MN F KJ586349/KJ586411 AT47 Ravinia anxia (Walker) Kandiyohi, MN M KJ586350/KJ586412 AT49 Ravinia anxia (Walker) Kandiyohi, MN M JQ807070/KJ586413 AT50 Ravinia anxia (Walker) Grant, NM F KJ586351/KJ586414 AW48 Ravinia derelicta (Walker) Columbia, FL F KJ586352/KJ586415 AA10 Ravinia derelicta (Walker) Columbia, FL F KJ586353/KJ586416 AA13 Ravinia derelicta (Walker) Liberty, FL F KJ586354/KJ586417 AA25 Ravinia derelicta (Walker) Liberty, FL F KJ586355/KJ586418 AA26 Ravinia derelicta (Walker) Madison, FL F KJ586356/KJ586419 AA38 Ravinia derelicta (Walker) Walton, FL F KJ586357/KJ586420 AB05 Ravinia derelicta (Walker) Washington, FL F KJ586358/KJ586421 AB41 Ravinia derelicta (Walker) Highland, FL F KJ586359/KJ586422 AC75 Ravinia derelicta (Walker) Highland, FL F KJ586360/KJ586423 AC76 Ravinia derelicta (Walker) Highland, FL M KJ586361/KJ586424 AC81 Ravinia derelicta (Walker) Highland, FL M KJ586362/KJ586425 AD01 Ravinia derelicta (Walker) Highland, FL M KJ586363/KJ586426 AD02 Ravinia derelicta (Walker) Highland, FL M KJ586364/KJ586427 AD06 Ravinia derelicta (Walker) Highland, FL F KJ586365/KJ586428 AD07 Ravinia derelicta (Walker) Highland, FL F KJ586366/KJ586429 AD11 Ravinia derelicta (Walker) Martin, FL M KJ586367/KJ586430 AD40 Ravinia derelicta (Walker) Martin, FL M KJ586368/KJ586431 AD42 Ravinia derelicta (Walker) Glenn, CA F KJ586369/KJ586432 AE57 Ravinia derelicta (Walker) Jefferson, TN F KJ586370/KJ586433 AN13 Ravinia derelicta (Walker) Jefferson, TN M KJ586371/KJ586434 AN18 Ravinia derelicta (Walker) Hamilton, OH F JQ807072/KJ586435 AZ09 Ravinia derelicta (Walker) Bland, VA F JQ807073/KJ586436 AZ62 Ravinia derelicta (Walker) Hamilton, OH M JQ807075/KJ586437 BA29 Ravinia derelicta (Walker) Hamilton, OH M JQ806817/JQ806796 BA30/Stamper et al. 2013 Ravinia errabunda (Wulp) Glenn, CA M KJ586372/KJ586438 AE51 Ravinia floridensis (Aldrich) Duval, FL M JQ807076/KJ586439 AA01 Ravinia floridensis (Aldrich) Duval, FL M JQ807077/KJ586440 AA03 Ravinia floridensis (Aldrich) Duval, FL F KJ586373/KJ586441 AA04 Ravinia floridensis (Aldrich) Sarasota, FL M JQ807078/KJ586442 AA60 Ravinia floridensis (Aldrich) Washington, FL M JQ807079/KJ586443 AB15

37

Ravinia floridensis (Aldrich) Martin, FL F KJ586374/KJ586444 AD41 Ravinia lherminieri (Robineau-Desvoidy) Madison, FL M JQ807080/KJ586445 AA36 Ravinia lherminieri (Robineau-Desvoidy) Jefferson, TN M JQ807081/KJ586446 AN14 Ravinia lherminieri (Robineau-Desvoidy) Jefferson, TN M JQ807082/KJ586447 AN15 Ravinia lherminieri (Robineau-Desvoidy) Jefferson, TN M JQ806818/JQ806797 AN16/Stamper et al. 2013 Ravinia lherminieri (Robineau-Desvoidy) Jefferson, TN M JQ807083/KJ586448 AN17 Ravinia lherminieri (Robineau-Desvoidy) Saratoga, NY F JQ807099/KJ586449 AQ58 Ravinia lherminieri (Robineau-Desvoidy) Saratoga, NY F KJ586375/KJ586450 AR49 Ravinia lherminieri (Robineau-Desvoidy) Kanawha, WV F JQ807084/KJ586451 AZ44 Ravinia lherminieri (Robineau-Desvoidy) Kanawha, WV M JQ807085/KJ586452 AZ58 Ravinia planifrons (Aldrich) Siskiyou, CA M JQ807088/KJ586453 AF06 Ravinia planifrons (Aldrich) Jefferson, OR M KJ586376/KJ586454 AG34 Ravinia planifrons (Aldrich) Jefferson, OR M KJ586377/KJ586455 AG49 Ravinia planifrons (Aldrich) Jefferson, OR M JQ807089/KJ586456 AK08 Ravinia pusiola (Wulp) Grant, NM F JQ807094/KJ586457 AW64 Ravinia pusiola (Wulp) Grant, NM M JQ807095/KJ586458 AW73 Ravinia pusiola (Wulp) Grant, NM M JQ807093/KJ586459 AX61 Ravinia pusiola (Wulp) Grant, NM M JQ807096/KJ586460 AX62 Ravinia pusiola (Wulp) Grant, NM M JQ806819/JQ806798 AX63/Stamper et al. 2013 Ravinia pusiola (Wulp) Grant, NM F KJ586378/KJ586461 AY38 Ravinia pusiola (Wulp) Grant, NM M JQ807090/KJ586462 AY39 Ravinia pusiola (Wulp) Grant, NM M JQ807091/KJ586463 AZ05 Ravinia pusiola (Wulp) Grant, NM M JQ807092/KJ586464 BA18 Ravinia pusiola (Wulp) Grant, NM M JQ807098/KJ586465 BA19 Ravinia querula (Walker) Siskiyou, CA M KJ586379/KJ586466 AF03 Ravinia querula (Walker) Clackamas, OR M KJ586380/KJ586467 AF16 Ravinia querula (Walker) Jefferson, OR M KJ586381/KJ586468 AG28 Ravinia querula (Walker) Jefferson, OR M KJ586382/KJ586469 AG30 Ravinia querula (Walker) Jefferson, OR M KJ586383/KJ586470 AG31 Ravinia querula (Walker) Jefferson, OR M KJ586384/KJ586471 AG32 Ravinia querula (Walker) Jefferson, OR M KJ586385/KJ586472 AG33 Ravinia querula (Walker) Crook, OR M KJ586386/KJ586473 AH42 Ravinia querula (Walker) Crook, OR F KJ586387/KJ586474 AH51 Ravinia querula (Walker) Jefferson, OR M KJ586388/KJ586475 AJ19 Ravinia querula (Walker) Jefferson, OR F KJ586389/KJ586476 AJ23 Ravinia querula (Walker) Jefferson, OR M KJ586390/KJ586477 AK05 Ravinia querula (Walker) Rabun, GA M KJ586391/KJ586478 AL32 Ravinia querula (Walker) Marathon, WI F KJ586392/KJ586479 AS25 Ravinia querula (Walker) Bland, VA M JQ807101/KJ586480 AZ61 Ravinia querula (Walker) Grant, NM M JQ807102/KJ586481 BA16 Ravinia stimulans (Walker) Volusia, FL F KJ586393/KJ586482 AD60 Ravinia stimulans (Walker) Volusia, FL M KJ586394/KJ586483 AD64 Ravinia stimulans (Walker) Volusia, FL M KJ586395/KJ586484 AD65 Ravinia stimulans (Walker) Volusia, FL M KJ586396/KJ586485 AD66 Ravinia stimulans (Walker) Volusia, FL M KJ586397/KJ586486 AD67 Ravinia stimulans (Walker) Volusia, FL M KJ586398/KJ586487 AD68 Ravinia stimulans (Walker) Newberry, SC F KJ586399/KJ586488 AE11 Ravinia stimulans (Walker) Newberry, SC M KJ586400/KJ586489 AE12 Ravinia stimulans (Walker) Saratoga, NY M JQ807105/KJ586490 AR33 Ravinia stimulans (Walker) Kittson, MN F KJ586401/KJ586491 AU03 Ravinia stimulans (Walker) Hamilton, OH F JQ807109/KJ586492 AZ10 Ravinia stimulans (Walker) Hamilton, OH F JQ807103/KJ586493 AZ11 Ravinia stimulans (Walker) Kanawha, WV F JQ807110/KJ586494 AZ45 Ravinia stimulans (Walker) Kanawha, WV F JQ807111/KJ586495 AZ46 Ravinia stimulans (Walker) Bland, VA F JQ807112/KJ586496 AZ60

38

Ravinia stimulans (Walker) Hamilton, OH M JQ807108/KJ586497 BA27 Ravinia stimulans (Walker) Hocking, OH F KJ586402/KJ586498 D45 Ravinia stimulans (Walker) Hamilton, OH M GQ223320 E7/Stamper et al. 2013 Ravinia vagabunda (Wulp) Grant, NM M JQ807106/KJ586499 BA15 Ravinia vagabunda (Wulp) Grant, NM M JQ807107/KJ586500 BA17

39

Table 2.2. Tamura-Nei + G corrected distances within and between indicated groups.

Average Intra-Specific Distances, % Ravinia anxia (CA + MN + NM+ OR) 1.6 R. anxia (CA + OR) 0.1 R. anxia (MN + NM) 0.1 Ravinia derelicta 0.9 Ravinia floridensis 0.2 Ravinia lherminieri (FL + NY + TN +WV) 3.2

R. lherminieri (NY+WV) 0.1 R. lherminieri (FL + TN) 0.4 Ravinia planifrons 0.2 Ravinia pusiola 1.0 Ravinia querula 1.0 Ravinia stimulans 0.4 Ravinia vagabunda 0.1 Smallest Inter-Group Distances, % R. anxia (CA + OR) vs. R. querula (Para) 0.0 R. anxia (NM +MN) vs. R. querula (Para) 2.6 R. anxia: CA + OR vs. NM + MN 2.6 R. querula (Mono) vs. R. querula (Para) 2.4 R. floridensis vs. R. lherminieri (FL + TN) 5.4 R. floridensis vs. R. lherminieri (NY + WV) 0.0 R. lherminieri: NY + WV vs. FL + TN 5.2 R. derelicta vs. R. vagabunda 1.1

40

2.8 FIGURES

Figure 2.1. The inferred phylogenetic hypothesis of Ravinia species relationships of North

America, using maximum likelihood analysis of mitochondrial COI and COII genes.

Support values for nodes are given in the following order: bootstrap support (%)

from ML analysis using GARLI/Bayesian posterior probabilities using MrBayes.

41

2.9 REFERENCES

Akaike, H. (1973). Information theory and an extension of the maximum likelihood

principle. In ‘Second International Symposium on Information Theory’. (Eds. B. N.

Petrov and F. Caski.) pp. 267-281. (Akademiai Kiado: Budapest, Hungary).

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on

Automatic Control 19, 716-723.

Aldrich, J. M. (1916). Sarcophaga and allies in North America, Entomological Society of

America. (Murphy-Divins Co. Press: Lafayette, IN, USA.)

Aldrich, J. M. (1930). Notes on the types of American two-winged flies of the genus

Sarcophaga and a few related forms, described by the early authors. Proceedings of

the United States National Museum 78, 1-39.

Bofkin, L., and Goldman, N. (2007). Variation in evolutionary processes at different codon

positions. Molecular Biology and Evolution 24, 513-521.

Caterino, M. S., Cho, S., and Sperling, F. A. H. (2000). The Current State of Insect Molecular

Systematics: A Thriving Tower of Babel. Annual Review of Entomology 45, 1-54.

Catleka, B. C., and McAllister, B. F. (2004). A genealogical view of chromosomal evolution

and species delimitation in the Drosophila virilis species subgroup. Molecular

Phylogenetics and Evolution 33, 664-670.

Clary, D. O., and Wolstenholme, D. R. (1985). The mitochondrial DNA molecule of

Drosophila yakuba: nucleotide sequence, gene organization, and genetic code.

Journal of Molecular Evolution 22, 252-271.

Dodge, H. R. (1956). New North American Sarcophagidae, with some new synonymy

(Diptera). Annals of the Entomological Society of America 49, 182-190.

43

Edgar, R. C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Research 32, 1792-1797.

Felsenstein, J. (1985). Phylogenies and the comparative method. American Naturalist 125,

1-15.

Geyer, C. J. (1991). Markov chain Monte Carlo maximum likelihood. In ‘Computing Science

and Statistics: Proceedings of the 23rd Symposium on the Interface’. (Ed. E. M.

Keramides.) pp.156-163. (Interface Foundation: Fairfax Station, USA).

Gilks, W. R., and Roberts, G. O. (1996). Strategies for improving MCMC. In ‘Markov chain

Monte Carlo in Practice’. (Eds. W. R. Gilks, S. Richardson, and D. J. Spiegelhalter.) pp.

89-114. (Chapman & Hall: London, UK).

Giroux, M., Pape, T., and Wheeler, T. (2010). Towards a phylogeny of the flesh flies (Diptera:

Sarcophagidae): Morphology and phylogenetic implications of the acrophallus in the

subfamily . Zoological Journal of the Linnaean Society 158, 740-778.

Guo, Y. D., Cai, J. F., Li, X., Xiong, F., Su, R. N., Chen, F. L., Liu, Q. L., Wang, X. H., Chang, Y. F.,

Zhong, M., Wang, X., and Wen, J. F. (2010). Identification of the forensically

important sarcophagid flies Boerttcherisca peregrina, Parasarcophaga albiceps and

Parasarcophaga dux (Diptera: Sarcophagidae) based on COII gene in China. Tropical

Biomedicine 27, 451-460.

Hall, D. G. (1928). Sarcophaga pallinervis and related species in the Americas. Annals of the

Entomological Society of America 21, 331-352.

Hebert, P. D. N., Penton, E. H., Burns, J. M., Janzen, D. H., and Hallwachs, W. (2004a). Ten

species in one: DNA barcoding reveals cryptic species in the neotropical skipper

44

butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences 101,

14812-14817.

Hebert, P. D. N., Stoeckle, M. Y., Zemlak, T. S., and Francis, C. M. (2004b). Identification of

birds through DNA barcodes. PLoS Biology 2, 1657-1663.

Jordaens, K., Sonet, G., Richet, R., Dupont, E., Braet, Y., and Desmyter, S. (2013).

Identification of forensically important Sarcophaga species (Diptera:

Sarcophagidae) using the mitochondrial COI gene. International Journal of Legal

Medicine 127, 491-504.

Kutty, S. J., Pape, T., Wiegmann, B. M., and Meier, R. (2010). Molecular phylogeny of the

Calyptratae (Diptera: Cyclorrhapha) with an emphasis on the superfamily

Oestroidae and the position of Mystacinobiidae and McAlpine’s fly. Systematic

Entomology 35, 614-635.

Lanfear, R., Calcott, B., Ho, S. Y. W., and Guindon, S. (2012). PartitionFinder: Combined

selection of partitioning schemes and substitution models for phylogenetic analyses.

Molecular Biology and Evolution 29, 1695-1701.

Lim, G. S., Balke, M., and Meier, R. (2012). Determining species boundaries in a world full of

rarity: singletons, species delimitation methods. Systematic Biology 61, 165-169.

Lopes, H. S. (1969). Family Sarcophagidae. In ‘A catalogue of the Diptera of the Americas

South of the United States, Vol. 103’. (Ed. N. Papavero.) pp. 1-88. (Departmento de

Zoologia, Secretaria da Agricultura: São Paulo, Brazil).

Lopes, H. S. (1982). The importance of the mandible and clypeal arch of the first instar

larvae in the classification of the Sarcophagidae (Diptera). Revista Brasileira de

Biologia 26, 293-326.

45

Maddison, W. P., and Maddison, D. R. (2011). ‘Mesquite 2.73: a modular system for

evolutionary analysis.’ Available at http://mesquiteproject.org.

Meier, R., Shiyang, K., Vaidya, G., and Ng, P. K. L. (2006). DNA Barcoding and Taxonomy in

Diptera: A Tale of High Intraspecific Variability and Low Identification Success.

Systematic Biology 55, 715-728.

Meiklejohn, K. A., Wallman, J. F., and Dowton, M. (2011). DNA-based identification of

forensically important Australian Sarcophagidae (Diptera). International Journal of

Legal Medicine 125, 27-32.

Meiklejohn, K. A., Wallman, J. F., Cameron, S. L., and Dowton, M. (2012). Comprehensive

evaluation of DNA barcoding for the molecular species identification of forensically

important Australian Sarcophagidae (Diptera). Invertebrate Systematics 26, 515-

525.

Meiklejohn, K. A., Wallman, J. F., and Dowton, M. (2013a). DNA Barcoding Identifies all

Immature Life Stages of a Forensically Important Flesh Fly (Diptera:

Sarcophagidae). Journal Forensic Sciences 58, 184-187.

Meiklejohn, K. A., Wallman, J. F., Pape, T., Cameron, S. L., and Dowton, M. (2013b). Utility of

COI, CAD and morphological data for resolving relationships within the genus

Sarcophaga (sensu lato) (Diptera: Sarcophagidae): A preliminary study. Molecular

Phylogenetics and Evolution 69, 133-141.

Moore, W. S. (1995). Inferring phylogenies from mtDNA variation: Mitochondrial gene trees

versus nuclear gene trees. Evolution 49, 718-726.

Pape, T. (1994). The world Blaesoxipha Loew, 1861 (Diptera: Sarcophagidae). Entomologica

Scandinavica Supplement 45, 1-247.

46

Pape, T. (1996). Genus Ravinia Robineau-Desvoidy. In ‘Catalogue of the Sarcophagidae of

the world (Insecta: Diptera)’. pp. 284-290. (Associated Publishers: Gainesville, FL,

USA.)

Pape, T., and Dahlem, G.A. (2010). Sarcophagidae (Flesh Flies). In ‘Manual of Central

American Diptera, Volume 2’. (Eds. Brown, B.V. et al.) pp. 1313-1335. (NRC Research

Press: Ottawa, Ontario, CAN.)

Parker, R. R. (1914). Sarcophagidae of New England: males of the genera Ravinia and

Boettcheria. Proceedings of the Boston Society of Natural History 35, 1-77.

Pickens, L. G. (1981). The life history and predatory efficiency of Ravinia lherminieri

(Diptera: Sarcophagidae) on the face fly (Diptera: Muscidae). Canadian Entomologist

113, 523-526.

Rambaut, A., and Drummond, A. J. (2007). ‘Tracer (Version 1.4).’ Available at

http://beast.bio.ed.ac.uk/TRACER.

Reid, N. M., and Carstens, B.C. (2012). Phylogenetic estimation error can decrease the

accuracy of species delimitation: a Bayesian implementation of the general mixed

Yule-coalescent model. BMC Evolutionary Biology 12, 196-206.

Robineau-Desvoidy, J. B. (1863). Histoire naturelle des Dipteres des environs de Paris. Vol.

2, pp. 920. (Paris, France).

Ronquist, F., and Huelsenbeck, J. P. (2003). MrBayes 3: Bayesian phylogenetic inference

under mixed models. Bioinformatics 19, 1572-1574.

Shapiro, B., Rambaut, A., and Drummond, A. J. (2006). Choosing appropriate substitution

models for phylogenetic analysis of protein-coding sequences. Molecular Biology and

Evolution 23, 7-9.

47

Smith, M. A., Wood, D. A., Janzen, D. H., Hallwacks, W., and Hebert, P. D. N. (2007). DNA

barcodes affirm that 16 species of apparently generalist tropical parasitoid flies

(Diptera, Tachinidae) are not all generalists. Proceedings of the National Academy of

Sciences 104, 4967-4972.

Song, H., Buhay, J. E., Whiting, M. F., and Crandall, K. A. (2008). Many species in one: DNA

barcoding overestimates the number of species when nuclear mitochondrial

pseudogenes are coamplified. Proceedings of the National Academy of Sciences 105,

13486-13491.

Sperling, F. A. H., Anderson, G. S., and Hickey, D. A. A. (1994). DNA-based approach to the

identification of insect species used for post-mortem interval estimation. Journal of

Forensic Sciences 39, 418-427.

Spillings, B. L., Brooke, B. D., Koekemoer, L. L., Chiphwanya, J., Coetzee, M., and Hunt, R. H.

(2009). A new species concealed by Anopheles funestus Giles, a major malaria vector

in Africa. American Journal of Tropical Medicine and Hygiene 81, 510-515.

Stamper, T., Dahlem, G. A., Cookman, C., and DeBry, R. W. (2013). Phylogenetic relationships

of flesh flies in the subfamily Sarcophaginae based on three mtDNA fragments

(Diptera: Sarcophagidae). Systematic Entomology 38, 35-44.

Sukumaran, J., and Holder, M. T. (2010). DendroPy: A Python library for phylogenetic

computing. Bioinformatics 26, 1569-1571.

Tamura, K., and Nei, M. (1993). Estimation of the number of nucleotide substitutions in the

control region of mitochondrial-DNA in humans and chimpanzees. Molecular Biology

and Evolution 10, 512-526.

48

Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., and Kumar, S. (2011). MEGA5:

Molecular Evolution Genetics Analysis using Maximum Likelihood, Evolutionary

Distance, and Maximum Parsimony Methods. Molecular Biology and Evolution 28,

2731-2739.

Tan, S. H., Rizman-Idid, M., Mohd-Aris, E., Kurahashi, H. and Mohamed, Z. (2010). DNA-

based characterization and classification of forensically important flesh flies

(Diptera: Sarcophagidae) in Malaysia. Forensic Science International 199, 43-49.

Wells, J. D., and Stevens, J. R. (2009). Molecular Methods for Forensic Entomology. In

‘Forensic Entomology: The Utility of in Legal Investigations, 2nd Ed’.

(Eds. J. H. Byrd, and J.L. Castner.) pp. 439-455. (CRC Press: London, UK.)

Wiens, J. J., and Penkrot, T. L. (2002). Delimiting species based on DNA and morphological

variation and discordant species limits in spiny lizards (Sceloporus). Systematic

Biology 51, 69-91.

Wiens, J. J., and Servedio, M. R. (2000). Species delimitation in systematics: Inferring

diagnostic difference between species. Proceedings of the Royal Society of London B

267, 631-636.

Zwikl, D. J. (2006). ‘Genetic algorithm approaches for the phylogenetic analysis of large

biological sequence datasets under the maximum likelihood criterion.’ PhD thesis.

(The University of Texas: USA.)

49

Chapter Three:

Coalescent DNA-based species delimitation within the flesh fly genus Ravinia

Robineau-Desvoidy, 1863 (Diptera: Sarcophagidae).1

1 This manuscript has been formatted for submission to Systematic Biology

50

3.1 ABSTRACT

Species and their constituent populations are the units upon which evolution acts.

Traditionally morphological characters are the first tools used in the identification of species boundaries. However, the increase in availability and decrease in cost of obtaining genetic data in non-model organisms has allowed researchers to go beyond the conventional aspect of naming species to understanding the processes that occur during speciation events. Additionally, using genetic data is helpful in situations where other lines of evidence (e.g., morphology) are not available or are in disagreement. In this study we investigate species limits in a complex of two morphologically variable flesh fly species

(Sarcophagidae: Ravinia) found in the United States. Multiple approaches were used in inferring species boundaries; including multilocus coalescent-based species-tree methods, population genetic analyses, migration analyses, and standard phylogenetic inference. We find three morphologically cryptic lineages that are consistently recovered in analyses. To prevent continued confusion within this complex we suggest the names Ravinia anxia

(sensu stricto), Ravinia n.sp., and Ravinia querula (sensu stricto) be used when referring to these individuals until formal species descriptions can be conducted. Through this study we clarify the species limits in this morphologically cryptic group and provide a backbone for future species descriptions using morphological characters.

Keywords: delineation, cryptic species, speciation, multilocus coalescent, species-tree,

molecular, phylogenetics, North America

51

3.2 INTRODUCTION

Species, and their constituent populations, are the units upon which evolution acts.

The ability to accurately identify and delimit species boundaries is crucial for research, but is also important for public policy (e.g., conservation: Lowenstein et al. 2009; disease vectors: Spillings et al. 2009). Species delimitation can be described as the process through which species boundaries are determined, new species are discovered, and currently recognized species are found to be synonymous (Wiens 2007). Here we delimit members of the flesh fly genus, Ravinia, using genetic data as the primary data for species delimitation and provide evidence for recent (or ongoing) interspecific gene flow. To accomplish this we utilize DNA-based species-tree and coalescent-based methods, phylogenetic inference, population structure analyses, and migration analyses between identified species groups.

Despite considerable debate, a widely held view is that species are independently- evolving metapopulation lineages, as described by the General Lineage Concept (GLC; de

Queiroz 2007). This concept has had a direct impact on our understanding of species, in that multiple lines of evidence (i.e., reproductive isolation, monophyly, behavior, morphology) can be used as secondary criteria to identify lineage divergence. Under the

GLC, conflict among alternative species criteria can be attributed to the variation of when in time the lineage divergence, or speciation event, is estimated to have occurred. Recognizing species as lineages provides a justification for the use of genetic data in the delimitation of species, as genetic data reflect processes that were involved in the speciation event (i.e., lineage divergence). Speciation events and lineage divergences are recorded through the sorting of ancestral alleles in the genome of an organism. The genetic record can then be used to delimit species without the restrictive requirement of monophyletic loci (Knowles

52 and Carstens 2007). This sorting of ancestral alleles is the basis of coalescent theory

(Klingman 1982; Hudson 1991), which models species as lineages, and allows one to estimate the probability of allele fixation within said lineages. Consequently, through the application of the GLC, multilocus DNA sequence data are increasingly being used in species delimitation studies.

The increased availability of multilocus data sets and the recent advent of analytical methods capable of estimating migration rates have provided the ability to analyze species both as independent lineages and as dynamic entities. While recent (or ongoing) interspecific gene flow is a highly debated topic and can further complicate the recognition of lineage divergence, several studies have provided examples of interspecific gene flow occurring in a variety of taxa (e.g., Shaw 2002; Emelianov et al. 2004; Llopart et al. 2005;

Niemiller et al. 2008, 2012). A primary goal in species delimitation studies is discovering characters (e.g., molecular, morphological, etc.) that accurately and uniquely describe a species. However, migration and consequent gene flow from one population into another is of great concern in species delimitations. Gene flow, whether a result of introgression, migration, or hybridization, complicates the recognition of species boundaries by increasing interspecific similarity. Gene flow may lead to genetic similarity among species that are phenotypically divergent (Keck and Near 2010). Gene flow can also produce phenotypic similarities (i.e., hybrids) among genetically divergent species depending upon which genes recombine and which genes are sampled for analyses. This can be a significant problem if working under the GLC; defining species as independently-evolving metapopulation lineages (de Queiroz 2007). In fact, some of the most commonly used coalescent-based methods incorporate an explicit assumption of no recent or ongoing

53 migration (e.g., *BEAST, Heled and Drummond 2010; Grummer et al. 2014; BP&P, Yang and

Rannala 2010). In addition, the effects caused by incomplete lineage sorting (ILS) are difficult to separate from those effects resulting from gene flow across species boundaries

(Funk and Omland 2003). Although gene flow and ILS may produce a similar topology in the inferred trees, these two processes have different implications in species delimitation

(Harrington and Near 2012). Taking a multilocus approach to species delimitation allows for: 1) the correction of the stochastic nature of alleles through the coalescent caused by

ILS; 2) estimating the rates of gene flow between species units to determine if migration has taken place.

The flesh fly family Sarcophagidae contains approximately 2600 described species

(Pape 1996), that can be found on every continent except Antarctica. A large number of sarcophagid species are carrion-feeding , with some species’ life histories being important to forensic entomology in estimations of time since death (post-mortem interval:

Greenberg 1991; Kutty et al. 2010). Many flesh fly species play an important role in the recycling of nutrients in the environment. Typically, sarcophagid flies are identified to species by an expert using morphological characters that are believed to be unique to that group of organisms. A preponderance of the molecular studies on sarcophagid flies have focused on using DNA data to provide a means to identify individuals to species, with emphasis on the forensically important species (e.g., Meiklejohn et al. 2011, 2012, 2013a;

Jordaens et al. 2013). These studies have used either a single locus (COI, mitochondrial

DNA (mtDNA): Guo et al. 2010; Tan et al. 2010; Meiklejohn et al. 2011, 2012, 2013a;

Jordaens et al. 2013), or concatenated gene fragments (e.g., Meiklejohn et al. 2013b). The data sets generated by these studies provide a framework within which genetic evidence

54 can be used to place an unknown specimen (a larva, for example) into a morphologically defined species. Such studies presuppose that our current understanding of species boundaries based on morphological traits studies are correct. In some Sarcophagidae, however, suggestions have been made in the literature that there might be more species than are currently recognized (e.g., Meiklejohn et al. 2011; Braga et al. 2013; Wong et al.

2015). In these studies, lineages that had been discovered by molecular methods could then potentially be used as the backbone for identifying morphological traits useful in the identification of the cryptic species (Meiklejohn et al. 2011; Braga et al. 2013). The application of genetic data to questions of species limits has had success in a variety of different taxa, where genetic data were used in both the discovery and validation of cryptic species (e.g., Salter et al. 2013; Dumas et al. 2015). To date, studies of sarcophagid flies have not used DNA-based species-tree approaches to species delimitation.

The genus Ravinia (Robineau-Desvoidy, 1863) is a common group of primarily New

World flesh flies, with 34 species recognized worldwide: 1 Palearctic species; 17 Nearctic species; and 16 Neotropical species (Pape 1996; Pape and Dahlem 2010). Based on their abundance in insect natural history collections, Ravinia are among the most commonly collected flies of the family Sarcophagidae in North America (Wong et al. 2015).

Interestingly, unlike other sarcophagid genera, Ravinia larvae are primarily coprophagous.

A recent study found discordance between the mtDNA phylogeny and the morphological characters used to identify some species within this genus (Wong et al.

2015). Two species within this genus, Ravinia anxia (Walker, 1849) and Ravinia querula

(Walker, 1849), can be found throughout much of continental North America in high abundance. The taxonomic history of these two species is complex, with several taxonomic

55 revisions having occurred due to mistaken identifications by various authors (Wong et al.

2015). The mtDNA phylogeny given by Wong et al. (2015), is consistent with the hypothesis that more than two species are present within the R. anxia – R. querula complex.

In order to test this hypothesis with multiple unlinked markers, we obtain DNA sequence data from seven gene fragments. We then apply a combination of several species delimitation and population genetic methods to test for lineage divergence at the species level within R. anxia and R. querula. Furthermore, the presence of recent or ongoing gene flow is estimated between species to determine if isolation with migration has occurred.

3.3 MATERIALS AND METHODS:

Specimen collection and sampling

Flies were collected using a net or by baited traps (using dog dung or chicken liver).

Specimens were frozen and taken to the lab for morphological identification, and molecular analysis (based on Stamper et al. 2013). To preserve morphological structures important for species identifications, DNA was extracted from 1-3 legs for use in analyses. Leg tissue was transferred to a vial of 95% EtOH stored at -80C until DNA extraction. Specimens were then pinned and are retained as vouchers (currently held by GAD, these will be placed in an established natural history collection upon completion of current studies).

Specimens were collected from multiple localities throughout the continental United

States. At each locality we attempted to collect multiple individuals, although at some localities this was not possible and only a single individual could be collected. Ravinia has a broad distribution across North America, with some species (e.g., Ravinia anxia) found throughout the continental United States, Canada, and northern parts of Mexico. Population sampling of flying insects is also complicated by the large range that individuals can travel

56 over the course of their lifetime (e.g., Cochliomyia hominivorax; Mayer and Atzeni 1993). As such, we attempted to collect individuals from different geographic localities for those species that delimitation efforts would be focused on for this study: Ravinia anxia and R. querula. Specimens in this study have been assigned a unique specimen number that can be found in Table S3.2.

Based upon both morphological and molecular evidence (Lopes 1982; Pape 1994;

Giroux et al. 2010; Stamper et al. 2013; Wong et al. 2015), phylogenetic trees of R. anxia and R. querula, were rooted using sequences from an outgroup taxon: Ravinia floridensis

(Aldrich, 1916).

DNA extraction and amplification

Genomic DNA extractions were performed on whole fly legs, cut into thirds, using

Qiagen’s DNeasy kit (Qiagen Cat. No. 69506). Seven separate gene fragments were PCR- amplified, including two mtDNA and five nuDNA gene fragments (cytochrome oxidase subunit I (COI), cytochrome oxidase subunit II (COII) mtDNA; elongation factor 1 alpha (EF-

1a), glucose-6-phosphate dehydrogenase (G6PD), phosphoenolpyruvate carboxykinase

(PEPCK), period (Per), white (Wt) nuDNA). PCR amplification solutions consisted of: 200

μM dNTPs (Promega), 1μM of each primer, 1.5mM MgCl2, 0.25 units Platinum Pfx DNA polymerase (Invitrogen), buffer (final concentration: 50mM Tris, pH 8.5, 4 mM MgCl2, 20 mM KCl, 500 μg/ml bovine serum albumin, 5% DMSO) (Idaho Technology), and 3μl template DNA. PCR thermocycling was conducted on Rapid Cycler (Idaho Technologies) instruments. A region of the mitochondrial genome encoding COI and COII was amplified using primer pairs with protocols listed in Stamper et al. (2013). Nested PCR was used to amplify sequencing products for the nuclear genes: EF-1a, G6PD, PEPCK, Per, Wt. Primer

57 construction information and protocols for amplification (EF-1a, G6PD, PEPCK, Per, Wt) are given in the Supplemental Materials (Appendix S3.1). Forward and reverse strands were sequenced with the amplification primers using BigDye Terminator v.3.1, post reaction dye terminator removal using Agencourt CleanSEQ and sequence delineation on an ABI prism

3730x/ with base calling and data compilation. Sequencing was performed by Beckman

Coulter Genomics (Danvers, MA).

Sequence Assembly and Characterization

Sequences were viewed in FinchTV v.1.4.0 (Geospiza, Inc. www.geospiza.com), and assembled and edited in CLC Main Workbench v.7.0.3 (CLC Bio www.clcbio.com).

Assembled sequences were aligned in Mesquite v.2.73 (Maddison and Maddison 2011) using the plug-in Muscle v.3.8 (Edgar 2004). The resulting alignment was re-checked as both the translated amino acid sequence and the nucleotide sequence. No insertions or deletions (indels) were present among sequences obtained for the genes (COI, COII, EF-1a,

G6PD, PEPCK, Per, Wt) as such the final alignment was identical to the preliminary Muscle alignment. Based on the use of taxon specific primers, absence of in-frame stop codons, and examination of sequence characteristics (i.e., amplicon length, quality score (phred) of peaks; Song et al. 2008), we believe that nuclear insertions of mitochondrial fragments

(numts) and silenced genes were not erroneously sequenced.

Nuclear gene sequences were further analyzed using the program PHASE v.2.1

(Stephens et al. 2001; Stephens and Donnelly 2003) to resolve heterozygous haplotypes.

The web-based program SeqPHASE (Flot 2010) was used to convert DNA sequence data into PHASE input format. Phased sites with 0.90 pp (posterior probability) and higher were retained; heterozygous sites below this threshold use standard ambiguity codes. TOPALi

58 v.2.5 (Milne et al. 2009) was used to test for recombination with default program settings using the Difference of Sums of Squares (DSS) method. The program DNasp v.5 (Librado and Rozas 2009) was used to evaluate genetic variability of sequenced loci, calculating number of unique alleles (k), segregating sites (S), parsimony informative sites (PI), number of haplotypes (h), haplotype diversity (Hd) and nucleotide diversity (π).

Individual and Concatenated Gene-Tree Analyses

Models of DNA sequence evolution and gene partition were estimated with

PartitionFinder (Lanfear et al. 2012) using Akaike Information Criterion (AIC; Akaike 1973,

1974), with and without AICc correction. Due to restriction in model choice/partition, model selection was limited to those that could be implemented in MrBayes (for Bayesian inference) and RAxML (for maximum likelihood), using the function “models=mrbayes” or

“models=raxml”, respectively. To infer relationships, we analyzed mitochondrial and nuclear loci with maximum likelihood (ML) using the program RAxML v.8.0.20 (Stamatakis

2014), and Bayesian inference (BI) using MrBayes v.3.2.1 (Ronquist and Huelsenbeck

2003).

For BI, two independent Metropolis-coupled Markov chain Monte Carlo (MC3)

(Geyer 1991; Gilks and Roberts 1996) analyses were used to sample the posterior distribution using default settings in MrBayes. Each analysis had one cold and three heated chains, sampling every 500 generations for 1x107 generations with the first 40% of samples discarded as burn-in. Convergence on a stable posterior distribution was determined using the program TRACER v.1.6. (Rambaut et al. 2014) to examine the stability of parameter estimates, the harmonic mean likelihood values of chains, similarity of trees, branch lengths, and average standard deviation of split frequencies being less than

59

0.01 between runs. Maximum likelihood analyses were conducted in RAxML v.8.0.20

(Stamatakis 2014). Due to restrictions in model choice when using RAxML, gene-trees and the concatenated data set analyses were analyzed using subsets of the GTRGAMMA model

(Appendix S3.2). Nodal support was assessed with 1000 bootstrap replicates. The 70% majority rule consensus tree of replicates was constructed using SumTrees of the

DendroPy v.3.12.0 (Sukumaran and Holder 2010) program.

Two concatenated data sets were used to perform the phylogenetic reconstruction.

We concatenated the phased nuclear loci (EF-1a, G6PD, PEPCK, Per, Wt) into a single data set and analyzed these data separately before adding the mitochondrial data (COI, COII) and repeating analyses. This was done to determine the influence that the mitochondrial data have on the phylogenetic inferences. For the BI analyses of the concatenated data sets, the number of generations was extended to 2x107, with the first 40% discarded as burn-in to help with measuring convergence of runs. For the concatenated ML analyses, the number of bootstrap replicates was increased to 5000. Individual gene-tree analyses were inferred using the same parameters, except with the model + partitioning scheme selected by PartitionFinder changed.

Coalescent-based Species Delimitation

A Bayesian implementation of the GMYC model (Pons et al. 2006), bGMYC (Reid and

Carstens 2012) was used as a species discovery method using the COI and COII data set.

This method, which does not require the a priori assignment of individuals to species categories, identifies the species boundary through shifts in branching rates on a tree of multiple species and populations (Monaghan et al. 2009). Since branching rates differ between species and within species, the GMYC assesses the point of highest likelihood of

60 the transition between these two modes of lineage evolution (Pons et al. 2006; Fontaneto et al. 2007). The independent lineages recovered from the GMYC analyses are the putative species. The bGMYC is a Bayesian implementation of the GMYC model that samples the posterior distribution of sampled gene-trees to incorporate gene-tree uncertainty (Reid and Carstens 2012). BEAST v.1.8.0. (Drummond et al. 2012) was used to estimate ultrametric gene-trees under a lognormal relaxed molecular clock using a MCMC run of

5x107 generations, sampling every 5000 generations with the first 40% of sampled trees discarded as burn-in. The posterior distribution was trimmed to 100 trees, by evenly sampling the posterior, and used as input for the bGMYC. Default settings were used; starting number of species was set to half the total number of terminals. The analysis was run on R. anxia and R. querula individuals + R. floridensis as outgroup. For all analyses, bGMYC was run for 5 x 104 generations, with first 50% discarded as burn-in, and sampled every 100 generations.

The program Bayesian Phylogenetics and Phylogeography (BP&P; Yang and Rannala

2010) uses a species validation-based approach implemented in a Bayesian coalescent framework that is able to assign individuals to species using multilocus genetic sequence data. BP&P calculates the posterior probabilities of species delimitation models using reversible-jump Markov chain Monte Carlo (rjMCMC) algorithms. This method is able to use the concordance of gene-trees across multiple loci but does not rely on reciprocal monophyly as the determinant for multiple species (Zhang et al. 2011). Model assumptions in BP&P include free recombination between loci, no gene flow or migration, neutral evolution of loci, and no recombination within loci. BP&P also requires the use of a guide- tree to decrease computation difficulty, and an accurate guide-tree is critical to the

61 accuracy of the species delimitation (Leaché and Fujita 2010). Therefore several species hypotheses were generated that would be tested within BP&P to insure all possible species would be represented in the guide-tree. First, morphologically identified species were treated as species (R. anxia and R. querula). Second, the mtDNA gene-tree was used as a guide for treating inferred terminal groups (Clade 1, 2, and 3; Fig. 3.1) as species. Third, morphologically identified species and mtDNA gene-tree terminal groups were nested together and treated as species. This third species hypothesis resulted in 4 hypothetical species groups: Clade 1, Clade 2, Clade 3-R. querula, Clade 3-R. anxia.

Multiple analyses were run with varying priors ( and ) to determine how these influence results, as suggested by Leaché and Fujita (2010). Parameters of ancestral population size (ϴ) and root age (τo) were selected in order to mimic: large ancestral populations and deep divergences (BP1); small population size and shallow divergences

(BP2); large populations with shallow divergence (BP3); small population size and deep divergences (BP4). The BP&P analyses used the following settings: species delimitation was set to 1, algorithm set to 0, and the fine tune parameter () set to 10. Analyses were run for 5x105 generations, discarding the first 1x104 as burn-in, with a sampling interval of

5. Analyses were then repeated with starting seed change to check consistency between runs. Strong support (pp>0.95) was required across both runs to retain a given node, indicating lineage splitting.

The program *BEAST (Heled and Drummond 2010) implemented in the program

BEAST (Drummond et al. 2012) was used to estimate species-trees under the multispecies coalescent model directly from the sequence data set containing the 2 mtDNA loci and 5 nuDNA loci. *BEAST was not originally developed for use as a species delimitation method

62

– but as a program meant for estimation of a species-tree under the multispecies coalescent model. A method for species delimitation can be adopted, however, through the incorporation of marginal likelihood estimation (MLE) to quantify the fit of each species delimitation hypothesis (Grummer et al. 2014). Each *BEAST analysis requires the a priori assignment of individuals to species categories to reconstruct the topology of the species- tree. By conducting multiple *BEAST analyses, varying a priori species assignments according to the species delimitation hypotheses, and quantifying the MLE for each analysis, a best-fitting scenario can be determined. Therefore, species groups were defined using the same species hypotheses generated for BP&P analyses. This resulted in a total of three species delimitation hypotheses: a morphologically identified species scenario (R. anxia and R. querula); a genetically identified species scenario (Clade 1, 2, and 3; Fig 3.1); a morphologically identified species and genetically identified species scenario (Clade 1,

Clade 2, Clade 3-R. anxia, Clade 3-R. querula; Fig 3.1). Marginal likelihood estimation (MLE) was performed on each analysis using path sampling (PS: Gelman and Meng 1998; Lartillot and Phillippe 2006)/stepping-stone sampling (SS: Xie et al. 2011) as implemented in

*BEAST. The PS/SS analyses have been shown to provide a more accurate MLE than the harmonic mean method for comparing different evolutionary models (Xie et al. 2011).

Analyses were conducted using both a nuclear only data set and a nuclear + mitochondrial data set. The uncorrelated lognormal relaxed clock (Drummond et al. 2006) was assumed and a Yule model prior used for divergence times in the species-tree. The MCMC algorithm was run for 5x107 generations, sampling every 5000 generations with a 40% burn-in.

Convergence was assessed using TRACER v.1.6.0., checking for stationary in likelihood

63 scores and tree lengths. The model of nucleotide substitution was selected using

PartitionFinder, reducing selection to models that could be implemented in *BEAST.

Population Structure

The program STRUCTURE (Pritchard et al. 2000), which is a model-based, genetic clustering method, was used to infer population structure from allele frequencies by identifying clusters that are in Hardy-Weinberg and linkage equilibrium, without the a priori assignment of population groups. STRUCTURE runs were conducted assuming between 2 and 27 populations (K=2 through K=27), with each value of K replicated 5 times.

Analyses were conducted with a burn-in of 1x106 steps and 2x106 MCMC steps after burn- in, used an admixture model, and allele frequencies were considered independent between populations. The Evanno method (Evanno et al. 2005) implemented in STRUCTURE

HARVESTER (Earl and vonHoldt 2012) was used to estimate the optimal K value. The program CLUMPP (Jakobsson and Rosenberg 2007) was used to summarize the data using the “FullSearch” algorithm. The data were visualized using the program DISTRUCT

(Rosenberg 2004).

Migration & Isolation-with-migration models

To better understand the effect that migration could be having within this species complex, migration analyses were performed using the programs Migrate-n v.3.6.4 (Beerli and Palczewski 2010) and Isolation-with-Migration (IMa2; Hey 2010). Because both coalescent-based analyses (*BEAST, bGMYC), population structure analyses (STRUCTURE), and phylogenetic analyses (RAxML, MrBayes) found high support for the putative species group made up of the individuals AW48 and AW70, these individuals were excluded from downstream migration analyses. As such, these analyses included all individuals of R. anxia

64 and R. querula, excluding individuals AW70 and AW48. All data sets were analyzed using the two-population assumption. In total, two data sets were used where the first data set divided the putative species into morphologically identified R. querula + R. anxia, and the second data set divided putative species into the individuals comprising Clade 2 + Clade 3

(Fig 3.1).

The program Migrate-n was used to estimate the long-term migration rates

(Mquerulaanxia, Manxiaquerula; MClade 2Clade 3, MClade 3Clade 2) and the effective population sizes (anxia, querula; Clade 2, Clade 3) under a Bayesian inference model. Three independent runs were performed to verify convergence, and uniform priors were used for all parameters. Each run involved 1 long-chain, and four heated chains with 1.25x107 steps following a burn-in of 2.5x104 steps, and a sampling increment of 25 steps. The four gene flow models compared for the R. querula + R. anxia data set were: 1) Full 2-population gene flow model; 2) R. querula to R. anxia migration; 3) R. anxia to R. querula migration; 4) Single panmictic population. The four gene flow models compared for the Clade 2 + Clade 3 data set were: 1) Full 2-population gene flow model; 2) Clade 2 to Clade 3 migration; 3) Clade 3 to Clade 2 migration; 4) Single panmictic population. Model choice was determined using

Bayes Factors and log marginal likelihood (Kass and Raftery 1995; Beerli and Palczewski

2010). Migrate-n is able to analyze gene flow among populations without the use of a known tree topology, however it assumes that migration has been occurring between two independent populations and cannot estimate divergence of populations through time.

The program IMa2 was used to estimate the migration rates (m1, m2), effective population sizes (Nanxia, Nquerula, Nancestor), population divergence time (T), and the splitting parameter (s) in the R. anxia/R. querula data set and the Clade 2/Clade 3 data set.

65

Preliminary runs were conducted to optimize the priors used in analyses. Final analyses were run with an HKY mutation model (Hasegawa et al. 1985), a geometric heating scheme

(h1=0.96, h2=0.9), 15 chains, and a chain length of 2x107 with first 5 million steps discarded as burn-in. To assess convergence, each run was repeated using the same prior parameters with different random number seeds. The analyses were then repeated using different starting priors for population size and migration rates. These tests were compared to determine the similarity of results between runs. To ensure the proper mixing of the Markov chain, the effective sample size (ESS) was monitored for each run.

3.4 RESULTS

Molecular Data

List of specimens, locality information, and GenBank accession numbers augmenting those published in Wong et al. (2015) are provided in Table S3.2. Information for all loci

(unique alleles, segregating sites, nucleotide diversity) are found in Table 3.1. The final concatenated molecular alignment consisted of 4905 base pairs (bp) of sequence data for 2 mtDNA (COI + COII) and 5 nuDNA (EF-1a, G6PD, PEPCK, Per, Wt) loci with the full data set

(R. anxia and R. querula + outgroup) including 31 individuals. TOPALi analyses indicated possible recombination within the PEPCK nuDNA gene, but this is likely an artifact of very small pairwise distances within regions of this gene where for some windows there is no difference between sequences (McGuire et al. 1997).

Phylogenetic Reconstruction

The model of evolution and partitioning scheme selected as best by PartitionFinder are given in the supplementary materials (Appendix S3.2). The BI and ML analyses, using the mtDNA + nuDNA, and the nuDNA only data sets, resulted in the same general topology

66 and inferred a phylogeny congruent with previously inferred relationships using only mtDNA (Fig 3.1; Wong et al. 2015). Ravinia anxia and R. querula are inferred as paraphyletic sister species under current morphological-based assignments with high nodal support (BPML=100; PP=1.0). The 31 individuals of R. anxia and R. querula sorted into three distinct groups (Fig 3.1: Clade 1, Clade 2, & Clade 3). Ravinia anxia collected from

Minnesota and New Mexico form a highly supported clade (Fig 3.1, Clade 1: BPML=100;

PP=1.0), that is sister to a smaller clade with two terminal groups comprised of: 1) R. querula from Oregon, Wisconsin, Georgia, and Virginia (Fig 3.1, Clade 2: BPML=100;

PP=1.0); 2) R. querula from California, Oregon, and New Mexico and R. anxia from

California and Oregon (Fig 3.1, Clade 3: BPML=100; PP=1.0). The inferred ML and BI phylogenies obtained from single gene-trees showed varying degrees of resolution for R. anxia and R. querula (Figs. S3.1-S3.7).

Test of Species Limits

Multiple approaches, including both discovery- and validation-based methods were used to test species limits within the clades comprising all of the individuals assigned to R. anxia and R. querula as currently defined by morphology. For the validation-based methods, a priori partitions of individuals were generated based on competing hypotheses of species delimitation scenarios. Species delimitation hypotheses were validated using

*BEAST and BP&P. For *BEAST, the marginal likelihood scores (with and without mtDNA) estimated using PS/SS for the competing hypotheses are provided in Table 3.2 and the inferred best species-tree is given in Fig S3.9. Results from *BEAST were congruent for both the mitochondrial + nuclear data set and the nuclear-only data set (Table 3.2). There was very strong support for the four-species delimitation model (DNA + Morphology Guide

67

Tree) between both data sets and PS/SS analyses (10< {2ln]>14.0, Kass and Rafferty 1995;

Table 3.2). However, nodal support for the best-fitting species-tree was weak for two out of the three nodes (Fig. S3.9; PP<68). For BP&P, all guide trees regardless of ancestral population sizes and divergence type priors returned equally supported, conflicting species delimitation scenarios (Table S3.2).

For the discovery-based methods, bGMYC results using the mitochondrial data (COI

+ COII) detected more putative lineages than the other species delimitation methods (Fig.

S3.8). However, the majority of these putative lineages were found within a single clade

(Clade 2; Fig 3.1, Fig. S3.8) As a conservative measure to prevent over-splitting of potential species groups, these lineages were merged resulting in a single putative species with a posterior probability 0.5

PP≥0.95, Clade 3≥0.90; Fig. 3.2).

Population Structure

The program STRUCTURE was used as a genetic clustering method that estimates population structure based on Hardy-Weinberg and linkage equilibrium. The STRUCTURE results estimated three species that are congruent with those obtained from the bGMYC method (Fig. 3.2). The Evanno method (Evanno et al. 2005) was used to determine the best partition, and a visual assessment of the STRUCTURE results corroborate the presence of these three groups (Fig. 3.2). Due to the large geographic range that R. anxia and R. querula inhabit, it is possible that isolation-by-distance may be a driving factor in the composition of these three groups (Pritchard et al. 2000; 2010). However, given the large geographic

68 overlap of samples from Clades 2 and 3 (collected in California and Oregon), it is expected that the effect of isolation-by-distance would have minimal impact.

Migration Analyses

Migration analyses using Migrate-n found high levels of uni-directional migration between the inferred Ravinia anxia + Ravinia querula lineages and between the Clade 2 +

Clade 3 (Fig. 3.1) lineages. These levels of gene flow are suggestive that the BP&P analyses could infer erroneous results in regards to delimitation models (Zhang et al. 2011).

Migrate-n results from the independent runs were all congruent, as such only results from one run are shown (Table 3.3). For the R. anxia + R. querula data set, Model 2 (R. querula →

R. anxia migration) was selected as best according to the Bayes Factors with high probability (P=0.997; Table 3.3). The number of migrants per generation based on 2 mtDNA loci and 5 nuDNA loci for R. querula → R. anxia was 1.02 individuals/generation

(95% highest posterior distribution; HPD = 0.08-3.17). For the Clade 2 + Clade 3 data set,

Model 2 (Clade 2 → Clade 3 migration) was selected as best according to the Bayes Factors with high probability (P=0.997; Table 3.3). The number of migrants per generation for

Clade 2 → Clade 3 migration was 0.70 individuals/generation (95% highest posterior distribution; HPD = 0.21-1.45).

Isolation-with-migration analyses using IMa2 also provided evidence of gene flow between inferred R. anxia + R. querula lineages and between the Clade 2 + Clade 3 (Fig. 3.1) lineages. All repeated runs produced similar results indicating significant migration events; as such, only results from a single run are shown. ESS values for the time parameter ranged from 74-29169 for the R. anxia + R. querula data set and 151-42102 for the Clade 2 + Clade

3 data set. Both data sets produced ESS values above the recommended cutoff value (Hey

69

2011). However, despite manipulation of prior parameters and extension of run times, analyses did not converge on a stable ancestral population size. This is believed to be a result of the data not containing enough information due possibly to a lack of depth in phylogenetic signal for nuclear loci (Hey 2011). For the R. anxia + R. querula data set results showed a statistically significant immigration event for both R. anxia → R. querula (M anxia > querula; Table 3.4) and R. querula → R. anxia (M querula > anxia; Table 3.4). For the Clade 2 +

Clade 3 data set results showed a statically significant uni-directional immigration event for Clade 2 → Clade 3 (MClade2 > Clade3; Table 3.4).

3.5 DISCUSSION

The main goal of this study was to test for evolutionary lineage divergence (i.e., speciation) at the species level within two currently recognized, Ravinia species. We find a high level of genetic diversity and distinct lineage divergence within R. anxia and R. querula that is not congruent with the current morphology-based taxonomy. Using DNA-based coalescent species-tree methods, phylogenetic inference, population structure analyses, and migration analyses, our result consistently indicate cryptic lineages within R. anxia and

R. querula, but the composition of these lineages varied across methods.

Comparison of coalescent-based species delimitation methods

Validation-based approaches are desirable because they can be computationally tractable due to the a priori assignment of individuals to species. This can also be a potential limitation of validation-based approaches because the accuracy of the species- trees is constrained by the a priori hypothesized species groups. *BEAST analyses require that individuals be partitioned to species groups before analyses. As such, several species assignment models were generated from competing hypotheses based on groups obtained

70 from phylogenetic analyses (BI and ML), morphology-based assignments, or a combination of the two. The best species delimitation model selected from this validation-based approach recovered four evolutionary independent lineages (Fig. S3.9). This model splits both morphologically identified species R. anxia and R. querula into each being composed of two cryptic species. The lineage split that is seen in Clade 3 (Fig. S3.9) is between individuals that for the most part occur in sympatry in California and Oregon and lack any obvious geographic barriers. Leaché et al. (2014) conducted a recent simulation study using *BEAST to look at how gene flow influences the topology and support of the species- tree. This simulation study showed that paraphyletic gene flow, regardless of whether it occurred at shallow or deep levels in the species-tree, can lead to strong support for the incorrect topology and a sharp decline in posterior probability of clades (Leaché et al.

2014). The 4-species model, selected as best-fit by the PS/SS analyses implemented in

*BEAST, would have intermediate levels of paraphyletic gene flow and as such potential to infer the incorrect species-tree with high support (Leaché et al. 2014). What is evident from the *BEAST results is that when comparing the 4-species scenario and 3-species scenario there is a relatively small difference between likelihoods, but a rather large difference is seen between the 3-species scenario and the 2-species scenario (Table 3.2).

This suggests that while *BEAST may have difficulty choosing between a 4-species or 3- species scenario, the elimination of the 2-species scenario as most likely appears to be probable.

Approaches that are capable of delimiting lineages without the requirement of a priori species assignments can be critical when there is a lack of evidence for pre-existing divisions. Two discovery-based methods were used to identify the presence of evolutionary

71 lineage divergence: bGMYC and BP&P. The inferred lineages recovered using the multilocus

Bayesian method BP&P gave ambiguous results. Regardless of the prior parameters and guide trees used, all species delimitations scenarios were equally and highly supported for each model (PP=1.0; Table S3.2). An explicit assumption of BP&P is an absence of gene flow between lineages (Yang and Rannala 2010). At low rates of migration (~0.1 individuals/generation), there is virtually no impact on BP&P analyses (Zhang et al. 2011).

At high migration rates (i.e., ≥10 individuals/generation), BP&P tends to infer only a single species (Zhang et al. 2011). In contrast, at intermediate levels of migration (1.0-10 individuals/generation) BP&P is found to be indecisive (Zhang et al. 2011). Due to the fact that intermediate levels of migration (0.7-1.02 individuals/generation) were estimated within R. anxia – R. querula, BP&P is not a suitable multilocus species delimitation analysis for this study.

The second discovery-based species delimitation method that we used was bGMYC.

The bGMYC results showed multiple potential coalescent units, particularly present among

Clade 2 individuals (Fig. 3.2, Fig. S3.8). Species delimitation is complicated within Ravinia, and in systems that are genetically and geographically fragmented in general, with the potential to oversplit species. While the bGMYC model has been used previously to delimit insect species (Pons et al. 2011; Keith and Hedin 2012), it is most reliable among species that exhibit a measurable difference in DNA similarity (i.e., barcoding-gap; Pons et al. 2006;

Satler et al. 2013). The bGMYC method identifies the species boundaries from the change in branching rates in the gene-tree between the levels of species and populations (Pons et al.

2006). The branching rates are expected to differ at the speciation event, where lengths between species are a result of speciation and extinction events (Nee et al. 1994), and

72 interspecific branch lengths a results of the coalescent (Rosenberg and Nordborg 2002;

Wakely 2006). However, coalescent events in the gene-tree are expected to be found within clades where genetic structure is lacking because it is assumed that the individuals in these clades are panmictic (Pons et al. 2006; Satler et al. 2013). The high amount of structure that is seen within Clade 2 (Fig. 3.2) suggests that the demarcation inferred by the GMYC model may be biased toward the present, and thus many coalescent units (i.e., species) would be recognized. This problem in GMYC analyses has been recognized to exist in other taxa also exhibiting conspicuous structure (Lohse 2009; Hamilton et al. 2011; Satler et al. 2013). We suggest that this is what is occurring among individuals that make up Clade 2 (Fig. 3.2, Fig.

S3.8) and is likely the result of the large geographic ranges for which these particular individuals were sampled. If the multiple coalescent units within Clade 2 (Fig. 3.2, Fig. S3.8) are collapsed, they form a single unit with a relatively high nodal support (0.5< PP >0.9). In addition, the merging of these coalescent units in Clade 2 would result in a species delimitation scenario congruent with the population structure analyses, the lineages recovered from concatenated ML and BI analyses, and the migration analyses.

Population Structure and Migration Analyses

As an alternative to coalescent-based analyses, we used the program STRUCTURE, a genetic clustering algorithm that is able to incorporate gene flow between groups, utilize multilocus data, and does not require a priori assignments. We wanted to further test the lineages (Clades 1, 2, and 3; Fig. 3.2) recovered from the ML and BI analyses and see if corresponding groups would be inferred from the STRUCTURE results. STRUCTURE analyses resulted in three groups that corroborate with the Clades 1, 2, and 3 obtained

73 from ML and BI analyses (Fig. 3.2). A visual examination of the obtained STRUCTURE plot

(Fig. 3.2) shows little overlap between inferred groups.

Migration analyses (Migrate-n and IMa2) yielded results showing migration between hypothetical species units regardless of the data set used. For the Migrate-n analyses, the migration rate between putative species and the highest-posterior distribution was decreased when the Clade 2 + Clade 3 data set was used, relative to analyses using the morphologically identified R. querula + R. anxia data set (Table 3.4).

While migration between the putative species comprising Clade 2 + Clade 3 is still present

(0.7 individuals/generation), these results indicate there is likely to be more barriers to gene flow than the species delimitation scenario present in the current taxonomy (R. anxia

+ R. querula; 1.02 individuals/generation). Isolation-with-migration analyses using IMa2 showed comparable levels of migration to the Migrate-n analyses. In the IMa2 analyses, the migration rate between putative species was decreased with the Clade 2 + Clade 3 data set, the migration was uni-directional, and the High/Low 95% HPD also decreased relative to the R. anxia + R. querula data set (Table 3.4).

In both migration analyses, two species delimitation scenarios were considered. The first, using the R. anxia + R. querula data set, assumes that those individuals morphologically identified are the putative species. The second, using the Clade 2 + Clade 3 data set, assumes that those individuals defined by the Clade 2 lineage and Clade 3 lineage are the putative species. In order to accept this first species delimitation scenario, one would also have to accept a large migration rate between the putatively defined species.

The scenario that delimits the species into Clade 2 and Clade 3 putative species has much lower levels of migration. It is expected that the more likely delimitation scenario would

74 have lower levels of migration between putative species (i.e., more barriers to gene flow)

(Mayr 1942, 1963). In addition, Wright (1931) showed that species cohesion breaks down when gene flow is reduced below one migrant per generation. Thus, the migration rate between the morphologically identified R. anxia and R. querula, according to Wright (1931), would be sufficiently high enough to prevent lineage divergence.

3.6 CONCLUSIONS

The criteria that are used in determining if a population is a species, is an ongoing, controversial topic. The use of genetic data for species discovery and delimitation has been proposed as an approach that is objective and uses methods that are reproducible (Fujita and Leaché 2011). There is debate though on the primary use of genetic data in species delimitation studies and its ultimate utility (Bauer et al. 2011). However, it becomes obvious in cryptic lineages with high levels of genetic diversity that, at least on some level, accuracy in species delimitation will be a product of the data available and decisions that are made by the researchers (Satler et al. 2013). This is a particular challenge when there is incongruence among methods; where the researcher is left the task of choosing the most appropriate species delimitation scenario. As such, it is important to employ a wide range of methods to the data and, in short, select those scenarios that are most congruent across methods (Carstens et al. 2013). Additionally, it is important to take a conservative approach to species delimitation studies to prevent the oversplitting of species (Carstens et al. 2013).

Across all methods, the species delimitation scenario that uses the current morphological taxonomy, R. anxia and R. querula, is the least supported. This is particularly evident in *BEAST analyses where the morphologically-based species scenario is separated

75 from the next most likely scenario by well over a hundred Bayes factors regardless of the data set used (Table 3.2). The results obtained from bGMYC suggest that this particular method, which uses a single locus, may not be suitable in the species delimitation of

Ravinia, where high levels of genetic diversity are present. However, a conservative approach can be taken, where the multiple coalescent units of Clade 2 (Fig. 3.2, Fig. S3.8) are merged, and obtain the 3-species delimitation scenario (0.5

MrBayes, RAxML) is taken to be evidence that important methodological assumptions (i.e., migration/gene flow) have been violated (Carstens et al. 2013).

Given the strong support of the phylogenetic, population genetic, and migration- with-isolation analyses we feel that these three species are distinct. This distinction, however, is currently only found within the molecular data. Future additional evidence obtained using other techniques (e.g., morphology) would strengthen the status of these species, and additional efforts will be dedicated to such research. At this time, however, the authors defer to the current conventions adopted in the International Code of Zoological

Nomenclature concerning the naming of new species (Articles 11-20; 1999). As such, to prevent the use of nomina nuda the taxonomic rank and names will remain until further

76 research provides morphological descriptions of these new species. However, to decrease confusion when referencing these three species in subsequent research, we suggest these conventions be used when referring to these genetically delimited species: Ravinia anxia

(sensu stricto) = individuals comprising Clade 1; Ravinia querula (sensu stricto) = individuals comprising Clade 2; Ravinia n.sp. = individuals comprising Clade 3.

77

ACKNOWLEDGMENTS

We thank Amanda J. Stein (New York), Tom Lyons (New York), Ed Quin (Minnesota),

Kristel Peters (Minnesota), Eric Lobner (Wisconsin), Donna Watkins (Florida) and Gary

Steck (Florida) for their help in obtaining collection permits in their respective state’s parks.

This project was supported by Grant No. 2005-DA-BX-K102 awarded by the

National Institute of Justice (NIJ), Office of Justice Programs, US Department of Justice. This project was supported by the Department of Biological Sciences and The Graduate Student

Governance Association of The University of Cincinnati. The opinions, findings, and conclusions or recommendations expressed in this publication are those of the authors and do not necessarily reflect the views of the Department of Justice or The University of

Cincinnati. The authors recognize no conflicts of interest.

78

3.7 TABLES

Table 3.1. Gene information for multilocus analyses. Includes name of locus, length (bp),

number of sequences (# Seq.), number of segregating sites (S), number of

parsimony informative sites (PI), nucleotide diversity (), number of haplotypes (h),

haplotype diversity (Hd), and model of sequence evolution for the R. anxia + R.

querula data set.

Locus bp # Seq. S PI  h Hd Model COI/COII mtDNA 2241 29 31 31 0.01508 6 0.707 GTR + I + G EF-1a nuDNA 712 58 18 6 0.00307 10 0.699 SYM + I White nuDNA 589 60 34 28 0.01427 39 0.973 GTR + I + G Period nuDNA 512 58 22 17 0.00873 19 0.852 HKY + G G6PD nuDNA 485 60 92 87 0.01674 29 0.892 TrN + I + G PEPCK nuDNA 400 60 13 11 0.01174 6 0.615 SYM + G

79

Table 3.2. *BEAST marginal likelihoods. Estimated using mitochondrial and nuclear data

for each species delimitation model tested.

Mitochondrial and Nuclear Data Path-Sampling Stepping-Stone ln(Marginal 2ln(Bayes ln(Marginal 2ln(Bayes Delimitation Model Likelihood) Factor) Likelihood) Factor) DNA + Morphology Guide Tree -10544.1* - -10544.7* Morphology Guide Tree -10724.5 180.4++ -10724.7 180.0++ Mitochondrial Guide Tree -10558.4 14.3++ -10558.7 14.0++

Nuclear Data ln(Marginal 2ln(Bayes ln(Marginal 2ln(Bayes Delimitation Model Likelihood) Factor) Likelihood) Factor) DNA + Morphology Guide Tree -6362.0* - -6362.3* Morphology Guide Tree -6511.4 149.4++ -6511.9 149.6++ Mitochondrial Guide Tree -6383.5 21.5++ -6383.8 21.5++

Note: Marginal Likelihoods estimated using the Path-Sampling and Stepping-Stone Methods implemented in *BEAST

*best-fitting hypothesis, under comparison with best-fitting hypothesis and alternative species delimitation hypothesis

++very strong support for best-fitting hypothesis (10< 2ln[BF]; Kass and Raftery 1995)

80

Table 3.3. Migration results using Migrate-n

Migration excluding Clade 1 (AW70 + AW48) individuals R. anxia/R. querula Bezier Harmonic LBF Choice Model data set lmL lmL (Bezier) (Bezier) Probability Full Model -9260.58 -8920.46 30.36 3 2.550*10-7 M querula -> anxia -9245.40 -8913.86 0.00 1 (BEST) 0.997 M anxia -> querula -9251.44 -8949.16 12.08 2 2.376*10-3 Panmictic -9314.11 -8913.53 137.42 4 1.44*10-30 Clade 2/Clade 3 data set Full Model -9226.16 -8913.39 45.08 3 2.549*10-7 M Clade 2 -> Clade 3 -9203.62 -8933.50 0.00 1 (BEST) 0.997 M Clade 3 -> Clade 2 -9217.54 -8921.34 27.84 2 2.376*10-3 Panmictic -9313.68 -8927.61 220.12 4 1.441*10-30

81

Table 3.4. Mean estimates of population parameters from IMa2 analyses. Highest (H) and

Lowest (L) 95% HPD (highest posterior density) are given. Note: “M anxia > querula”

read as: immigration into R. anxia; “MClade2 > Clade3” read as: immigration into Clade 2.

†Indicates significant LLR test >0 (Nielsen and Wakeley 2001).

R. anxia + R. querula Data set Parameter Mean L(95%) H(95%) ϴanxia 1.032 0.448 1.683 ϴquerula 4.315 3.458 4.997 ϴancestral 4.549 3.732 4.997 M anxia > querula 0.310† 0.000 0.867 M querula > anxia 0.497† 0.165 0.881 Clade 2 + Clade 3 Data set Parameter Mean L(95%) H(95%) ϴClade 2 1.632 0.933 2.398 ϴClade 3 3.586 2.518 4.773 ϴancestral 4.488 3.583 4.997 MClade2 > Clade3 0.113† 0.008 0.248 M Clade3 > Clade2 0.027 0.248 0.080

82

3.8 FIGURES

Figure 3.1. Maximum likelihood tree, inferred using RAxML, of the concatenated (2mtDNA

loci + 5nuDNA loci) data set for all samples. Bayesian analyses, using MrBayes,

returned the same topology. The same topology was seen with the reduced data set

(with and without inclusion of mtDNA loci). Support values for nodes are ML-

bootstrap support (%)/BI-posterior probabilities.

83

Figure 3.2. Results from species delimitation and STRUCTURE analyses, represented on

BEAST tree from bGMYC of all sampling localities.

85

3.9 REFERENCES

Akaike H. 1973. Information theory and an extension of the maximum likelihood principle.

In: Petrov B.N., Caski, F., editors. Second International Symposium on Information

Theory. Budapest: Akademiai. p. 267-281.

Akaike H. 1974. A new look at the statistical model identification. IEEE T. Automat. Contr.

19:716-723.

Bauer A.M., Parham J.F., Brown R.M., Stuart B.L., Grismer L., Papenfuss T.J., Böhme W.,

Savage J.M., Carranza S., Grismer J.L., Wagner P., Schmitz A., Ananjeva N.B., Inger R.F.

2011. Availability of new Bayesian-delimited gecko names and the importance of

character-based species descriptions. Proc. Royal Soc. B. 278:490-492.

Beerli P., Palczewski M. 2010. Unified framework to evaluate panmixia and migration

direction among multiple sampling locations. Genetics. 185:313-326.

Braga M.V., Pinto Z.T., de Carvalho Queiroz M.M., Matsumoto N., Blomquist G.J. 2013.

Cuticular hydrocarbons as a tool for the identification of insect species: Puparial

cases from Sarcophagidae. Acta Trop. 128:479-485.

Carstens B.C., Pelletier T.A., Reid N.M., Satler J.D. 2013. How to fail at species delimitation.

Mol. Ecol. 22:4369-4383. de Queiroz K. 2007. Species concepts and species delimitation. Syst. Biol. 56:879-886.

Drummond A.J., Ho S.Y.W., Phillips M.J., Rambaut A. 2006. Relaxed phylogenetics and dating

with confidence. PLoS Biol. 4:e88.

Drummond A.J., Suchard M.A., Xie D., Rambaut A. 2012. Bayesian phylogenetics with BEAUti

and BEAST 1.7 Mol. Biol. Evol. 29:1969-1973.

87

Dumas P, Barbut J., Ru B.L., Silvain J.-F., Clamens A.L., d’Alençon E., Kergoat, G.J. 2015.

Phylogenetic molecular species delimitations unravel potential new species in the

pest genus Spodoptera Guenée, 1852 (, Noctuidae). PLoS ONE. 10(4):

e0122407.

Earl D.A., vonHoldt B.M. 2012. STRUCTURE HARVESTER: a website and program for

visualizing STRUCTURE output and implementing the Evanno method. Conserv.

Genet. Resour. 4:359-361.

Edgar R.C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acids Res. 32:1792-1797.

Emelianov I., Marec F., Mallet J. 2004. Genomic evidence for divergence with gene flow in

host races of the larch budmoth. Proc. Royal Soc. B. 271:97-105.

Evanno G., Regnaut S., Goudet J. 2005. Detecting the number of clusters of individuals using

the software STRUCTURE: a simulation study. Mol. Ecol. 14:2611-2620.

Flot J.-F. 2010. SEQPHASE: a web tool for interconverting PHASE input/output files and

FASTA sequence alignments. Mol. Ecol. Resour. 10:162-166.

Fontaneto D., Herniou E.A., Boschetti C., Capriolo M., Melone G., Ricci C., Barraclough T.G.

2007. Independently Evolving Species in Asexual Bdelloid Rotifers. PLoS Biol. 5:e87.

Fujita M.K., Leaché A.D. 2011. A coalescent perspective on delimiting and naming species: a

reply to Bauer et al. Proc. Royal Soc. B. 278:493-495.

Funk D.J., Omland K.E. 2003. Species-level paraphyly and polyphyly: frequency, causes, and

consequences, with insights from animal mitochondrial DNA. Annu. Rev. Ecol. Evol.

Syst. 34:397-423.

88

Gelman A., Meng X. 1998. Simulating normalizing constants: from importance sampling to

bridge sampling to path sampling. Statist. Sci. 13:163-185.

Geyer C.J. 1991. Markov chain Monte Carlo maximum likelihood. In: Keramides E.M., editor.

Computing Science and Statistics: Proceedings of the 23rd Symposium on the

Interface. Fairfax Station, USA, Interface Foundation. p. 156-163.

Gilks W.R., Roberts G.O. 1996. Strategies for improving MCMC. In: Gilks W.R., Richardson S.,

Spiegelhalter, editors. Markov chain Monte Carlo in Practice. London, UK, Chapman

and Hall. p. 89-114.

Giroux M., Pape T., Wheeler T. 2010. Towards a phylogeny of the flesh flies (Diptera:

Sarcophagidae): Morphology and phylogenetic implications of the acrophallus in the

subfamily Sarcophaginae. Zool. J. Linnean Soc. 158:740-778.

Greenberg B. 1991. Flies as forensic indicators. J. Med. Entomol. 28:565-577.

Grummer J.A., Bryson R.W., Jr., Reeder T.W. 2014. Species delimitation using Bayes factors:

simulation and application to the Sceloporus scalaris species group (Squamata:

Phrynosomatidae). Syst. Biol. 63:119-133.

Guo Y.D., Cai J.F., Li X., Xiong F., Su R.N., Chen F.L., Liu Q.L., Wang X.H., Chang Y.F., Zhong M.,

Wang X., Wen J.F. 2010. Identification of the forensically important sarcophagid flies

Boerttcherisca peregrina, Parasarcophaga albiceps and Parasarcophaga dux

(Diptera: Sarcophagidae) based on COII gene in China. Trop. Biomed. 27:451-460.

Hamilton C.A., Formanowicz D.R., Bond J.E. 2011. Species delimitation and phylogeography

of Aphonopelma hentzi (Araneae, Mygalomorphae, Theraphosidae): cryptic diversity

in North American tarantulas. PLoS ONE. 6:e26207.

89

Harrington R.C., Near T.J. 2012. Phylogenetic and Coalescent Strategies of Species

Delimitation in Snubnose Darters (Percidae: Etheostoma). Syst. Biol. 61:63-79.

Hasegawa M., Kishino H., Yano T. 1985. Dating of the human-ape splitting by a molecular

clock of mitochondrial DNA. J. Mol. Evol. 22:160-174.

Heled J., Drummond A.J. 2010. Bayesian inference of species trees from multilocus data.

Mol. Biol. Evol. 27:570-580.

Hey J. 2010. Isolation with migration models for more than two populations. Mol. Biol. Evol.

27:905-920.

Hey J. 2011. Documentation for IMa2. .

Hudson R.R. 1991. Gene genealogies and the coalescent process. Oxf. Surv. Evol. Biol. 7:1-

44.

ICZN (International Commission on Zoological Nomenclature) (1999) International code of

zoological nomenclature, 4th edn. London, UK: International Trust for Zoological

Nomenclature.

Jakobsson M., Rosenberg N.A. 2007. CLUMPP: a cluster matching and permutation program

for dealing with label switching and multimodality in analysis of population

structure. Bioinformatics. 23:1801-1806.

Jordaens K., Sonet G., Richet R., Dupont E., Braet Y., Desmyter S. 2013. Identification of

forensically important Sarcophaga species (Diptera: Sarcophagidae) using the

mitochondrial COI gene. Int. J. Legal. Med. 127:491-504.

Kass R.E., Raftery A.E. 1995. Bayes Factors. J. Am. Statist. Assoc. 90:773-795.

90

Keck B.P., Near T.J. 2010. Geographic and temporal aspects of mitochondrial replacement in

Nothonotus darters (Teleostei: Percidae: Etheostomatinae). Evolution. 64:1410-

1428.

Keith R., Hedin M. 2012. Extreme mitochondrial population subdivision in southern

Appalachian paleoendemic spiders (Araneae: Hypochilidae: Hypochilus), with

implications for species delimitation. J. Arachnology. 40:167-181.

Klingman J.F.C. 1982. The Coalescent. Stoch. Proc. Appl. 13:235-248.

Knowles L.L., Carstens B.C. 2007. Delimiting species without monophyletic gene trees. Syst.

Biol. 56:887-895.

Kutty S.N., Pape T., Wiegmann B.M., Meier R. 2010. Molecular phylogeny of the Calypratae

(Diptera: Cyclorrhapha) with an emphasis on the superfamily and the

position of the Mystacinobiidae and McAlpine’s fly. Syst. Entomol. 35:614-635.

Lanfear R., Calcott B., Ho S.Y.W., Guindon S. 2012. PartitionFinder: Combined selection of

partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol.

Evol. 29:1695-1701.

Lartillot N., Philippe H. 2006. Computing Bayes factors using thermodynamic integration.

Syst. Biol. 55:195-207.

Leaché A.D., Fujita M.K. 2010. Bayesian species delimitation in West African forest geckos

(Hemidactylus fasciatus). Proc. Royal Soc. B. 277:3071-3077.

Leaché A.D., Harris R.B., Rannala B., Yang Z. 2014. The influence of gene flow on species tree

estimation: a simulation study. Syst. Biol. 63:17-30.

Librado P., Rozas J. 2009. DnaSP v5: A software for comprehensive analysis of DNA

polymorphism data. Bioinformatics 25:1451-1452.

91

Llopart A., Lachaise D., Coyne J.A. 2005. Multilocus analysis of introgression between

sympatric sister species of Drosophila: Drosophila yakuba and D. santomea. Genetics.

171:197-210.

Lohse K. 2009. Can mtDNA barcodes be used to delimit species? A response to Pons et al.

(2006). Syst. Biol. 58:439-442.

Lopes H.S. 1982. The importance of the mandible and clypeal arch of the first instar larvae

in the classification of the Sarcophagidae (Diptera). Rev. Bras. Biol. 26:293-326.

Lowenstein J.H., Amato G., Kolokotronis S.-O. 2009. The Real maccoyii: Identifying Tuna

Sushi with DNA Barcodes – Contrasting Characteristic Attributes and Genetic

Distances. PLoS ONE. 4: e7866.

Maddison W.P., Maddison D.R. 2011. ‘Mesquite 2.73: a modular system for evolutionary

analysis.’ Available from [http://mesquiteproject.org].

Mayr E. 1942. Systematics and the Origin of Species. Columbia University Press, New York.

Mayr E. 1963. Animal Species and Evolution. Harvard University Press, Cambridge,

Massachusetts.

Mayer D.G., Atzeni M.G. 1993. Estimation of Dispersal Distances for Cochliomyia

hominivorax (Diptera: Calliphoridae). Environ. Entomol. 22:368-374.

McGuire G., Wright G., Prentice M.J. 1997. A graphical method for detecting recombination

in phylogenetic data sets. Mol. Biol. Evol. 14:1125-1131.

Meiklejohn K.A., Wallman J.F., Dowton M. 2011. DNA-based identification of forensically

important Australian Sarcophagidae (Diptera). Int. J. Leg. Med. 125:27-32.

92

Meiklejohn K.A., Wallman J.F., Cameron S.L., Dowton M. 2012. Comprehensive evaluation of

DNA barcoding for the molecular species identification of forensically important

Australian Sarcophagidae (Diptera). Invert. Syst. 26:515-525.

Meiklejohn K.A., Wallman J.F., Dowton M. 2013a. DNA Barcoding Identifies all Immature

Life Stages of a Forensically Important Flesh Fly (Diptera: Sarcophagidae). J.

Forensic. Sci. 58:184-187.

Meiklejohn K.A., Wallman J.F., Pape T., Cameron S.L., Dowton M. (2013b) Utility of COI, CAD

and morphological data for resolving relationships within the genus Sarcophaga

(sensu lato) (Diptera: Sarcophagidae): A preliminary study. Mol. Phylo. Evol.

69:133-141.

Milne I., Lindner D., Bayer M., Husmeier D., McGuire G., Marshall D.F., Wright F. 2009.

TOPALi v2: a rich graphical interface for evolutionary analyses of multiple

alignments on HPC clusters and multi-core desktops. Bioinformatics. 25:126-127.

Monaghan M.T., Wild R., Elliot M., Fujisawa T., Balke M., Inward D.J.G., Lees D.C.,

Ranaivosolo R., Eggleton P., Barraclough T.G., Vogler A.P. 2009. Accelerated Species

Inventory on Madagascar Using Coalescent-Based Models of Species Delineation.

Syst. Biol. 58:298-311.

Nee S., May R.M., Harvey P.H. 1994. The reconstructed evolutionary process. Phil. Trans. R.

Soc. B. 344:305-11.

Nielsen R., Wakeley J. 2001. Distinguishing migration from isolation: a Markov chain Monte

Carlo approach. Genetics. 158:885-896.

93

Niemiller M.L., Fitzpatrick B.M., Miller B.T. 2008. Recent divergence-with-gene-flow in

Tennessee cave salamanders (Plethonodontidae: Gyrinophilus) inferred from gene

genealogies. Mol. Ecol. 17:2258-2275.

Niemiller M.L., Near T.J., Fitzpatrick B.M. 2012. Delimiting species using multilocus data:

diagnosing cryptic diversity in the southern cavefish, Typhlichthys subterraneus

(Teleostei: Amblyopsidae). Evolution. 66:846-866.

Pape T. 1994. The world Blaesoxipha Loew, 1861 (Diptera: Sarcophagidae). Entomol.

Scand. Suppl. 45:1-247.

Pape T. 1996. Catalogue of the Sarcophagidae of the World (Insecta: Diptera), Memoirs on

Entomology International, Vol. 8. Associated Publishers, Gainesville, FL.

Pape T., Dahlem G.A. 2010. Sarcophagidae (Flesh Flies). In: Brown B.V., et al., editors.

Manual of Central American Diptera, Volume 2. Ottawa, Ontario, CAN., NRC Research

Press p. 1313-1335.

Pons J., Barraclough T.G., Gomez-Zurita J., Cardoso A., Duran D.P., Hazell S., Kamoun S.,

Sumlin W.D., Vogler A.P. 2006. Sequence-based species delimitation for the DNA

taxonomy of undescribed insects. Syst. Biol. 55:595-609.

Pons J., Fujisawa T., Claridge E.M., Savill R.A., Barraclough T.G., Vogler A.P. 2011. Deep

mtDNA subdivision within Linnean species in an endemic radiation of tiger beetles

from New Zealand (genus Neocicindela). Mol. Phyl. Evol. 59:251-262.

Pritchard J.K., Stephens M., Donnelly P. 2000. Inference of population structure using

multilocus genotype data. Genetics. 155:945-959.

94

Pritchard J.K., Wen X., Falush D. 2010. Documentation for structure software: Version 2.3,

Available from [http://pritchardlab.stanford.edu/structure_software/

release_versions/v2.3.4/structure_doc.pdf] Last accessed 01/27/2015.

Rambaut A., Suchard M.A., Xie D., Drummond A.J. 2014. Tracer v1.6, Available

from [http://beast.bio.ed.ac.uk/Tracer].

Reid N.M., Carstens B.C. 2012. Phylogenetic estimation error can decrease the accuracy of

species delimitation: a Bayesian implementation of the general mixed Yule-

coalescent model. BMC Evol. Biol. 12:196.

Robineau-Desvoidy J.B. 1863. Histoire naturelle des Dipteres des environs de Paris. Vol. 2,

pp. 920. (Paris, France).

Ronquist F., Huelsenbeck J.P. 2003. MrBayes 3: Bayesian phylogenetic inference under

mixed models. Bioinformatics. 19:1572-1574.

Rosenberg N.A., Nordborg M. 2002. Genealogical trees, coalescent theory and the analysis

of genetic polymorphisms. Nat. Rev. Genet. 3:380-390.

Rosenberg N.A. 2004. Distruct: a program for the graphical display of population structure.

Mol. Ecol. Notes. 4:137-138.

Satler J.D., Carstens B.C., Hedin M. 2013. Multilocus Species Delimitation in a Complex of

Morphologically Conserved Trapdoor Spiders (Mygalomorphae, Antrodiaetidae,

Aliatypus). Syst. Biol. 62:805-823.

Shaw K.L. 2002. Conflict between nuclear and mitochondrial DNA phylogenies of a recent

species radiation: what mtDNA reveals and conceals about modes of speciation in

Hawaiian crickets. Proc. Royal Soc. B. 99:16122-16127.

95

Song H., Buhay J.E., Whiting M.F., Crandall K.A. 2008. Many species in one: DNA barcoding

overestimates the number of species when nuclear mitochondrial pseudogenes are

coamplified. Proc. Natl. Acad. Sci. USA. 105:13486-13491.

Spillings B.L., Brooke B.D., Koekemoer L.L., Chiphwanya J., Coetzee M., Hunt R.H. 2009. A

new species concealed by Anopheles funestus Giles, a major malaria vector in Africa.

Am. J. Trop. Med. Hyg. 81:510-515.

Stamatakis A. 2014. RAxML Version 8: A tool for Phylogenetic Analysis of Large

Phylogenies. Bioinformatics. 30:1312-1313.

Stamper T., Dahlem G.A., Cookman C., DeBry R.W. 2013. Phylogenetic relationships of flesh

flies in the subfamily Sarcophaginae based on three mtDNA fragments (Diptera:

Sarcophagidae). Syst. Entomol. 38:35-44.

Stephens M., Donnelly P. 2003. Ancestral inference in population genetics models with

selection. Aust. NZ. J. Stat. 45:901-931.

Stephens M., Smith N.J., Donnelly P. 2001. A new statistical method for haplotype

reconstruction from population data. Am. J. Hum. Genet. 68:978-989.

Sukumaran J., Holder M.T. 2010. DendroPy: A Python library for phylogenetic computing.

Bioinformatics. 26:1569-1571.

Tan S.H., Rizman-Idid M., Mohd-Aris E., Kurahashi H., Mohamed Z. 2010. DNA-based

characterization and classification of forensically important flesh flies (Diptera:

Sarcophagidae) in Malaysia. Forensic Sci. Int. 199:43-49.

Wakely J. 2006. Coalescent theory: an introduction. Roberts and Co., Greenwood Village,

Colorado.

96

Wiens J.J. 2007. Species Delimitation: New Approaches for Discovering Diversity. Syst. Biol.

56:875-878.

Wright S. 1931. Evolution in Mendelian populations. Genetics. 16:97-159.

Wong E.S., Dahlem G.A., Stamper T.I., DeBry R.W. 2015. Discordance between

morphological species identification and mtDNA phylogeny in the flesh fly genus

Ravinia (Diptera: Sarcophagidae). Invert. Syst. 29:1-11.

Xie W., Lewis P.O., Fan Y., Kuo L., Chen M.-H. 2011. Improving marginal likelihood

estimation for Bayesian phylogenetic model selection. Syst. Biol. 60:150-160.

Yang Z., Rannala B. 2010. Bayesian species delimitation using multilocus sequence data.

Proc. Natl. Acad. Sci. USA. 107:9264-9269.

Zhang C., Zhang D.X., Zhu T., Yang Z. 2011. Evaluation of a Bayesian coalescent method of

species delimitation. Syst. Biol. 60:747-761.

97

3.10 SUPPLEMENTAL MATERIALS

APPENDIX: S3.1. Primer Information and Amplification Procedure:

Mitochondrial gene fragments COI and COII were amplified using primers and specifications given in DeBry et al. (2013) and Stamper et al. (2013), respectively.

Nuclear gene fragments for elongation factor 1 alpha (EF-1a), white (Wt), period

(Per), glucose-6-phosphate dehydrogenase (G6PD), phosphoenolpyruvate carboxykinase

(PEPCK). All fragments except those listed were generated directly from published fly sequences*. We have listed GenBank accession numbers and citations of the flies used to generate these new primers.

Table S-3.1. Primer sequence information

Gene Name Sequence Citation EF-1a EF1a -55f 5'-GGT-ATC-ACC-ATT-GAT-ATT-GCT-TTG-TGG-3' Stamper 2008 EF1aEWF 5'-ATC-ATT_GAT-GCT-CCT-GGT-CA-3' This Study EF1aEWR 5'-TTG-AAA-CCA-ACG-TTG-TCA-CC-3' This Study EF1a836R 5'-CAG-CAG-CAC-CTT-TAG-GTG-GGC-TAG-CCT-T-3' Stamper 2008 White WECF21 5'-GTT-TGT-GGC-GTA-GCC-TAT-CC-3' Ready et al. 2009 WECF195 5'-CGC-GCA-CTA-TGA-CTC-AAA-AA-3' This Study WECR459 5'-TTG-CCA-CGT-TGG-GAT-AAT-TT-3' This Study WECR12 5'-AAT-GTC-ACT-CTA-CCY-TCG-GC-3' Ready et al. 2009 Period Per1607F 5'-GCG-AGA-TCT-CCC-CGC-AYC-AYG-AYT-A-3' This Study PerEWF 5'-TAT-AGT-GGC-CCA-GGG-CAT-3' This Study 2007F 5'-TCG-CGA-ATC-CCG-TGG-ACG-AA-3' This Study PerEWR 5'-CCA-GCC-ATA-TAT-TGA-AGA-3' This Study 2574R 5'-CAT-GGC-AGC-GGC-AGG-ATG-AGT-3' This Study G6PD G6PD40F 5'-TCC-ACA-TGG-AAT-CGT-GAA-AA-3' This Study G6PD87F 5'-GGA-ACC-TTT-CGG-TAC-TGA-GG-3' This Study G6PD599R 5'-GGC-GAT-TTG-GTC-ATC-ATT-TT-3' This Study G6PD690R 5'-ACG-TTC-ATA-GGC-ATC-GGG-TA-3' This Study PEPCK PEPCK37F 5'-AGT-TTG-GCC-GAA-GGT-GTT-C-3' This Study PEPCK427R 5'-GTA-CTT-GGC-CAC-GAG-ATT-CC-3' This Study PEPCK455R 5'-ACG-AAC-CAG-TTG-ACG-TGG-A-3' This Study

98

Fly sequences used for generating primers*: Elongation factor 1 alpha: -Amplified sequences were obtained from those specimens that worked with Stamper (2008) primer set EF1a55f – EF1a836R. Subsequent primers were then generated from sequenced Ravinia individuals. White: -Amplified sequences were obtained from those specimens that worked with Ready et al. (2009) primer set WECF21-WECR12. Subsequent primers were then generated from sequenced Ravinia individuals. Period: -Cochliomyia macellaria (GenBank Accession Number: KC178068; Wiegmann et al. 2011). -Musca domestica (GenBank Accession Number: KC178071; Wiegmann et al. 2011). -Sarcophaga bullata (GenBank Accession Number: KC178072; Wiegmann et al. 2011). Glucose-6-phosphate dehydrogenase: -Cochliomyia macellaria (GenBank Accession Number: KC178100; Wiegmann et al. 2011). -Musca domestica (GenBank Accession Number: KC178102; Wiegmann et al. 2011). -Sarcophaga bullata (GenBank Accession Number: KC178103; Wiegmann et al. 2011). Phosphoenolpyruvate carboxykinase: -Cochliomyia macellaria (GenBank Accession Number: KC177343; Wiegmann et al. 2011). -Musca domestica (GenBank Accession Number: KC177346; Wiegmann et al. 2011). -Sarcophaga bullata (GenBank Accession Number: KC177347; Wiegmann et al. 2011).

Amplification Protocols: All nuclear gene fragments were first amplified using standard PCR, if this failed then nested PCR was used. All primers for both the first and the second rounds of PCR were able to be amplified using the following parameters: 95˚C for 3 min, followed by 35 cycles of: 94˚C for 1 s, 60˚C for 1 s, 68˚C for 20 s. PCR schemes were as follows: EF-1a:

1st Attempt PCR: EF1aEWF-EF1a836R @ Annealing 60˚C 2nd Attempt PCR: EF1a55f-EF1a836R @ Annealing 60˚C 3rd Attempt Nested PCR: [EF1a55f-EF1a836R] → EF1aEWF-EF1a836R @ Annealing 60˚C 4th Attempt Nested PCR: [EF1a55f-EF1a836R] → EF1a55f-EF1aEWR @ Annealing 60˚C

White:

1st Attempt PCR: WECF21-WECR12 @ Annealing 60˚C 2nd Attempt Nested PCR: [WECF21-WECR12] → WECF21-WECR459 @ Annealing 60˚C 3rd Attempt Nested PCR: [WECF21-WECR12] → WECF195-WECR12 @ Annealing 60˚C

99

Period:

1st Attempt Nested PCR: [1607-2574] → PerEWF-2574 @ Annealing 60˚C 2nd Attempt Nested PCR: [1607-2574] → 2007-2574 @ Annealing 60˚C 3rd Attempt Nested PCR: [1607-2574] → PerEWF-PerEWR @ Annealing 60˚C

G6PD:

1st Attempt Nested PCR: [G6PD40F-G6PD690R] → G6PD87F-G6PD599R @ Annealing 60˚C

PEPCK:

1st Attempt PCR: PEPCK37F-PEPCK455R @ Annealing 60˚C 2nd Attempt Nested PCR: [PEPCK37F-PEPCK455R] → PEPCK37F-PEPCK427R @ Annealing 60˚C

DeBry R.W., Timm A., Wong E.S., Stamper T., Cookman C., Dahlem G.A. 2013. DNA-Based Identification of Forensically Important Lucilia (Diptera: Calliphoridae) in the Continental United States*. J. Forensic Sci. 58(1):73-78. Ready P.D., Testa J.M., Wardhana A.H., Al-Izzi M., Khalaj M., Hall M.J.R. 2009. Phylogeography and recent emergence of the Old World screwworm fly, Chrysomya bezziana, based on mitochondrial and nuclear gene sequences. Med. Vet. Entomol. 23:43-50. Stamper T. 2008. Improving the Accuracy of Postmortem Interval Estimations Using Carrion Flies (Diptera: Sarcophagidae, Calliphoridae and Muscidae). Ph.D. Dissertation. University of Cincinnati, Cincinnati, Ohio, USA. Stamper T., Dahlem G.A., Cookman C., DeBry R.W. 2013. Phylogenetic relationships of flesh flies in the subfamily Sarcophaginae based on three mtDNA fragments (Diptera: Sarcophagidae). Syst. Entomol.. 38(1):35-44. Wiegmann B.M., Trautwein M.D., Winkler I.S., Barr N.B., Kim J.-W., Lambkin C., Bertone M.A., Cassel B.K., Bayless K.M., Heimberg A.M., Wheeler B.M., Peterson K.J., Pape T., Sinclair B.J., Skevington J.H., Blagoderov V., Caravas J., Kutty S.N., Schmidt-Ott U., Kampmeier G.E., Thompson C.F., Grimaldi D.A., Beckenbach A.T., Courtney G.W., Friedrich M., Meier R., Yeates D.K. 2011. Episodic radiations in the fly tree of life. Proc.Natl. Acad. Sci.USA. 108(14):5690-5695.

100

APPENDIX: S3.2. Best Model Selection and Partition Schemes:

Bayesian Inference Analyses (MrBayes)

R. anxia and R. querula data set COI: COI_pos1: GTR; COI_pos2: HKY; COI_pos3: HKY, Scheme lnL = -2496.96783; Scheme AIC = 5271.93566 R. anxia and R. querula data set COII: COII_pos1: HKY + I; COII_pos2: HKY; COII_pos3: HKY, Scheme lnL = -1155.25889; Scheme AIC = 2582.51778 R. anxia and R. querula data set EF1a: EF1a_pos1: HKY; EF1a_pos2: F81 + I; EF1a_pos3: HKY, Scheme lnL = -1146.63463; Scheme AIC = 2563.26926 R. anxia and R. querula data set White: White_pos1: K80 + I; White_pos2: HKY + I + G; White_pos3: HKY + I + G, Scheme lnL = -1293.58635; Scheme AIC = 2861.1727 R. anxia and R. querula data set Period: (Per_pos1, Per_pos2): GTR + G; Per_pos3: HKY, Scheme lnL = -935.49325; Scheme AIC = 2140.9865 R. anxia and R. querula data set PEPCK: (PEPCK_pos1, PEPCK_pos2): K80 + I; PEPCK_pos3: HKY + G, Scheme lnL = -724.60091, Scheme AIC = 1707.20182 R. anxia and R. querula data set G6PD: G6PD_pos1: HKY; G6PD_pos2: F81; G6PD_pos3: HKY + G, Scheme lnL = -1273.5593; Scheme AIC = 2817.1186 R. anxia and R. querula All Genes data set: PartitionFinder Input: All genes, all codon position. Greedy algorithm run to determine best-fit models and partitions; resulted in 19 subsets. Scheme lnL = -10311.25024; Scheme AIC = 2122.50048 Subset 1: COI_pos1 = GTR + I Subset 2: COI_pos2 = HKY + I Subset 3: (COI_pos3, COII_pos3) = HKY Subset 4: COII_pos1 = HKY + I Subset 5: COII_pos2 = HKY + I Subset 6: EF1a_pos1 = HKY + I Subset 7: EF1a_pos2 = GTR + I + G Subset 8: EF1a_pos3 = HKY + I + G Subset 9: White_pos1 = SYM + I Subset 10: White_pos2 = HKY + I + G Subset 11: White_pos3 = GTR + I + G Subset 12: (Per_pos1, Per_pos2) = GTR + G Subset 13: Per_pos3 = HKY + G Subset 14: G6PD_pos1 = HKY Subset 15: G6PD_pos2 = F81 Subset 16: G6PD_pos3 = GTR + G Subset 17: PEPCK_pos1 = HKY + I + G Subset 18: PEPCK_pos2 = JC Subset 19: PEPCK_pos3 = GTR + I + G

101

Maximum Likelihood Analyses (RAxML)

R. anxia and R. querula COI: 1st Codon Pos., 2nd Codon Pos., 3rd Codon Pos. = GTR + G. Scheme lnL = -2491.96537; Scheme AIC = 5283.93074 R. anxia and R. querula COII: (1st Codon Pos., 2nd Codon Pos.), 3rd Codon Pos. = GTR + G. Scheme lnL = -1159.64217; Scheme AIC = 2467.28434 R. anxia and R. querula EF1a: 1st Codon Pos., (2nd Codon Pos., 3rd Codon Pos.) = GTR + G. Scheme lnL = -1151.39379; Scheme AIC 2570.78758 R. anxia and R. querula G6PD: 1st Codon Pos., 2nd Codon Pos., 3rd Codon Pos. = GTR + G. Scheme lnL = -1258.15657; Scheme AIC = 2812.31314 R. anxia and R. querula PEPCK: (1st Codon Pos., 2nd Codon Pos.), 3rd Codon Pos. = GTR + G. Scheme lnL = -717.65079; Scheme AIC = 1711.30158 R. anxia and R. querula Period: (1st Codon Pos., 2nd Codon Pos.), 3rd Codon Pos. = GTR + G. Scheme lnL =-923.40787; Scheme AIC = 2114.81574 R. anxia and R. querula White: (1st Codon Pos., 2nd Codon Pos.), 3rd Codon Pos. = GTR + I + G. Scheme lnL = -1177.41104; Scheme 2634.82208

R. anxia and R. querula All Genes: PartitionFinder Input: All genes, all codon position. Greedy algorithm run to determine best-fit model and partitions; resulted in 15 subsets. Scheme lnL = -10303.77703; Scheme AIC = 21173.55406 Subset 1: (COII_pos1, COI_pos1) = GTR + I + G Subset 2: (COII_pos2, COI_pos2) = GTR + I + G Subset 3: (COII_pos3, COI_pos3) = GTR + I + G Subset 4: (EF1a_pos1, White_pos2) = GTR + I + G Subset 5: EF1a_pos2 = GTR + I + G Subset 6: EF1a_pos3 = GTR + I + G Subset 7: (PEPCK_pos2, White_pos1) = GTR + I + G Subset 8: White_pos3 = GTR + I + G Subset 9: (Per_pos1, Per_pos2) = GTR + I + G Subset 10: Per_pos3 = GTR + I + G Subset 11: G6PD_pos1 = GTR + I + G Subset 12: G6PD_pos2 = GTR + I + G Subset 13: G6PD_pos3 = GTR + I + G Subset 14: PEPCK_pos1 = GTR + I + G Subset 15: PEPCK_pos3 = GTR + I + G

*BEAST Analysis

COICOII: (COI_pos1, COI_pos2, COI_pos3, COII_pos1, COII_pos2, COII_pos3) = GTR + I + G, Scheme lnL = -7857.12668; Scheme AIC = 16184.25336 EF1a: (EF1a_pos1, EF1a_pos2, EF1a_pos3) = SYM + I, Scheme lnL = -1436.67395; Scheme AIC = 3335.3479 G6PD: (G6PD_pos1, G6PD_pos2, G6PD_pos3) = TN93 (TrN + I + G), Scheme lnL = - 2167.08558; Scheme AIC = 4798.17116 PEPCK: (PEPCK_pos1, PEPCK_pos2, PEPCK_pos3) = SYM + G, Scheme lnL = - 1289.59556; Scheme AIC = 3041.19112 102

Period: (Period_pos1, Period_pos2, Period_pos3) = HKY + G, Scheme lnL = - 1902.40222; Scheme AIC = 4264.80444 White: (White_pos1, White_pos2, White_pos3) = GTR + I + G, Scheme lnL = - 2327.89807; Scheme AIC = 5125.79614

103

SUPPLEMENTAL TABLES

Table S3.2. List of specimens, specimen identification (I.D.) codes, collection data, and GenBank Accession numbers.

Species I.D. Locality Sex COI COII EF-1a White Period G6PD PEPCK Ravinia anxia (Walker) AE79 Siskiyou, CA M KR608307 KR677021 KR608324 KR676632 KR676814 KR676761 KR676708 Ravinia anxia (Walker) AE80 Siskiyou, CA M KR608308 KR677022 KR608325 KR676633 KR676815 KR676762 KR676709 Ravinia anxia (Walker) AE81 Siskiyou, CA M JQ807069 KJ586403 KR608326 KR676634 KR676816 KR676763 KR676710 Ravinia anxia (Walker) AH52 Jefferson, OR F KJ586344 KJ586405 KR608328 KR676636 KR676817 KR676764 KR676711 Ravinia anxia (Walker) AJ18 Jefferson, OR M KJ586345 KJ586406 KR608329 KR676637 KR676818 KR676765 KR676712 Ravinia anxia (Walker) AK25 Jefferson, OR F KJ586346 KJ586407 KR608330 KR676638 KR676819 KR676766 KR676713 Ravinia anxia (Walker) AW48 Grant, NM F KJ586351 KJ586414 KR608331 KR676639 KR676820 KR676767 KR676714 Ravinia anxia (Walker) AW70 Grant, NM M KR608309 KR677023 KR608332 KR676640 KR676821 KR676768 KR676715 Ravinia floridensis (Aldrich) AA01 Duval, FL M JQ807076 KJ586349 KR608346 KR676650 KR676824 KR676772 KR676719 Ravinia querula (Walker) AF03 Siskiyou, CA M KJ586379 KJ586466 KR608372 KR676676 KR676838 KR676788 KR676735 Ravinia querula (Walker) AF16 Clackamas, OR M KJ586380 KJ586467 KR608373 KR676677 KR676839 KR676789 KR676736 Ravinia querula (Walker) AG28 Jefferson, OR M KJ586381 KJ586468 KR608374 KR676678 KR676840 KR676790 KR676737 Ravinia querula (Walker) AG30 Jefferson, OR M KJ586382 KJ586469 KR608375 KR676679 KR676841 KR676791 KR676738 Ravinia querula (Walker) AG31 Jefferson, OR M KJ586383 KJ586470 KR608376 KR676680 KR676842 KR676792 KR676739 Ravinia querula (Walker) AG32 Jefferson, OR M KJ586384 KJ586471 KR608377 KR676681 KR676843 KR676793 KR676740 Ravinia querula (Walker) AG33 Jefferson, OR M KJ586385 KJ586472 KR608378 KR676682 KR676844 KR676794 KR676741 Ravinia querula (Walker) AH42 Crook, OR M KJ586386 KJ586473 KR608379 KR676683 KR676845 KR676795 KR676742 Ravinia querula (Walker) AH51 Crook, OR F KJ586387 KJ586474 KR608380 KR676684 KR676846 KR676796 KR676743 Ravinia querula (Walker) AJ19 Jefferson, OR M KJ586388 KJ586475 KR608381 KR676685 KR676847 KR676797 KR676744 Ravinia querula (Walker) AJ23 Jefferson, OR F KJ586389 KJ586476 KR608382 KR676686 KR676848 KR676798 KR676745 Ravinia querula (Walker) AK04 Jefferson, OR M KR608316 KR677028 KR608383 KR676687 KR676489 KR676799 KR676746 Ravinia querula (Walker) AK05 Jefferson, OR M KJ586390 KJ586477 KR608384 KR676688 KR676850 KR676800 KR676747 Ravinia querula (Walker) AL32 Rabun, GA M KJ586391 KJ586478 KR608385 KR676689 KR676851 KR676801 KR676748 Ravinia querula (Walker) AS25 Marathon, WI F KJ586392 KJ586479 KR608386 KR676690 KR676852 KR676802 KR676749

104

Ravinia querula (Walker) AW30 Grant, NM M KR608317 KR677029 KR608387 KR676691 KR676853 KR676803 KR676750 Ravinia querula (Walker) AZ61 Bland, VA M JQ807101 KJ586480 KR608388 KR676692 KR676854 KR676804 KR676751 Ravinia querula (Walker) BA16 Grant, NM M JQ807102 KJ586481 KR608389 KR676693 KR676855 KR676805 KR676752 Ravinia querula (Walker) BI26 Utah, UT M KR608318 ------KR608390 KR676694 KR676856 KR676806 KR676753 Ravinia querula (Walker) BI29 Utah, UT M KR608319 KR677030 ------KR676695 ------KR676807 KR676754 Ravinia querula (Walker) BI46 Utah, UT M KR608320 KR677031 KR608391 KR676696 KR676857 KR676808 KR676755 Ravinia querula (Walker) E56 Washington, TN M KR608321 KR677032 KR608392 KR676697 KR676858 KR676809 KR676756

105

Table S3.3. BP&P results with guide tree parameters with adjusted population size (ϴ) and

root ages (τo) using a gamma distribution (α, β).

Population Size (ϴ) Root age (τo) I.D. *Model Final lnL α β α β Probability Morphology Guide Tree Large ancestral populations and 1 10 1 10 BP1 P[3]=1.00 -10558.3 deep divergences Small ancestral populations and 2 1000 2 1000 BP2 P[3]=1.00 -10565.0 shallow divergences Large ancestral populations with 1 10 2 1000 BP3 P[3]=1.00 -10557.3 shallow divergences Small ancestral populations with 2 1000 1 10 BP4 P[3]=1.00 -10563.9 large divergence Mitochondrial Guide Tree Large ancestral populations and 1 10 1 10 BP1 P[4]=1.00 -10557.7 deep divergences Small ancestral populations and 2 1000 2 1000 BP2 P[4]=1.00 -10559.1 shallow divergences Large ancestral populations with 1 10 2 1000 BP3 P[4]=1.00 -10556.9 shallow divergences Small ancestral populations with 2 1000 1 10 BP4 P[4]=1.00 -10560.3 large divergence DNA + Morphology Guide Tree Large ancestral populations and 1 10 1 10 BP1 P[5]=1.00 -10559.1 deep divergences Small ancestral populations and 2 1000 2 1000 BP2 P[5]=1.00 -10560.7 shallow divergences Large ancestral populations with 1 10 2 1000 BP3 P[5]=1.00 -10558.5 shallow divergences Small ancestral populations with 2 1000 1 10 BP4 P[5]=1.00 -10561.1 large divergence Note: *All nodes supported by a posterior probability of 1.00

106

SUPPLEMENTAL FIGURES

Figure S3.1. Inferred COI gene-tree using maximum likelihood and Bayesian inference.

Support values for nodes are ML-bootstrap support (%)/BI-posterior probabilities.

107

Figure S3.2. Inferred COII gene-tree using maximum likelihood and Bayesian inference.

Support values for nodes are ML-bootstrap support (%)/BI-posterior probabilities.

109

Figure S3.3. Inferred EF-1a gene-tree using maximum likelihood and Bayesian inference.

Support values for nodes are ML-bootstrap support (%)/BI-posterior probabilities.

111

Figure S3.4. Inferred White gene-tree using maximum likelihood and Bayesian inference.

Support values for nodes are ML-bootstrap support (%)/BI-posterior probabilities.

113

Figure S3.5. Inferred Period gene-tree using maximum likelihood and Bayesian inference.

Support values for nodes are ML-bootstrap support (%)/BI-posterior probabilities.

115

Figure S3.6. Inferred G6PD gene-tree using maximum likelihood and Bayesian inference.

Support values for nodes are ML-bootstrap support (%)/BI-posterior probabilities.

117

Figure S3.7. Inferred PEPCK gene-tree using maximum likelihood and Bayesian inference.

Support values for nodes are ML-bootstrap support (%)/BI-posterior probabilities.

119

Figure S3.8. bGMYC analysis: A) Heat map of coalescent units, color indicates probability

(red = p0-0.05; dark orange = p0.05-0.5; orange = p0.5-0.9; yellow = p0.9-0.95; light

yellow = p0.95-1.0). B) Placement of individuals on tree.

121

Figure S3.9. *BEAST tree selected as the most supported species delimitation scenario

according to the path sampling/stepping-stone analyses.

123

Chapter Four:

Phylogenetic relationships of the flesh fly genus Ravinia (Diptera:

Sarcophagidae).1

1 This manuscript has been formatted for submission to Molecular Phylogenetics and Evolution

125

4.1 ABSTRACT

The genus Ravinia Robineau-Desvoidy, 1863, is a genus in the flesh fly family

Sarcophagidae and represents an agriculturally important group of insects. Many of the species in this genus are sympatric and broadly distributed throughout North America.

Recent work conducted within this genus has identified cryptic species through the molecular species delimitation of Ravinia anxia and Ravinia querula. In addition, previous work using only mitochondrial loci with standard Bayesian inference and maximum likelihood analyses, inferred Ravinia floridensis and Ravinia lherminieri to be paraphyletic species. To investigate the phylogenetic status of the three cryptic species (R. anxia (sensu stricto), R. querula (sensu stricto), Ravinia n.sp.), and the status of R. floridensis and R. lherminieri, we constructed a molecular phylogeny using 12 molecular markers and expanded sampling to include additional Ravinia species. Bayesian inference, maximum likelihood, and coalescent-based species-tree methods using these markers strongly supported the distinctiveness of each of the Ravinia species, including showing the three cryptic species as monophyletic lineages. However, the paraphyletic species R. floridensis and R. lherminieri were not resolved as monophyletic in the concatenated phylogenetic analyses. This study is the first to employ a coalescent-based species-tree approach within the Sarcophagidae using a large molecular data set.

Keywords: multilocus, species-tree, North America, Chaetoravinia, cryptic species

126

4.2 INTRODUCTION

Members of Ravinia Robineau-Desvoidy, 1863, are a genus of flesh flies (Diptera:

Sarcophagidae) that represent a lineage of agriculturally important insects. Primarily,

Ravinia are directly involved in the recycling of nutrients in the environment through the breakdown of mammalian dung from a variety of livestock animals (i.e., cattle, horses, and pigs) as a result of larval feeding. These flies are broadly distributed worldwide with a total of 34 described species. Species diversity in this genus is greatly enriched in the New World with 33 species found in the North and South Americas; only a single species has an Old

World distribution (Pape 1996; Pape and Dahlem 2010). The currently defined genus,

Ravinia, has a rich taxonomic history where the genus had been separated into multiple genera or merged into a single genus at various times by various authors (Wong et al.

2015). The currently recognized genus, Ravinia (sensu lato) (Pape 1996) is composed of three synonymized genera: Ravinia, Adinoravinia Townsend, 1917, and Chaetoravinia

Townsend, 1917 (Lopes 1969). In addition to revision at the generic level, many species of

Ravinia have proven to be taxonomically difficult groups, with several species providing limited numbers of diagnostically relevant morphological characters. Ravinia was first established as a genus by Robineau-Desvoidy (1863) based on the shape of the abdomen, presence of setae on the primary segments of abdomen (female), and lack of setae on the hind tibia (male). Since the first description of Ravinia, these characteristics are no longer unique among the modern genera in the family Sarcophagidae (Wong et al. 2015). At present, identification of this genus is based on the following characters: 1) postalar wall setulate; 2) tegulae orangish in ground color; 3) frontal vitta of female at midpoint more

127 than two times the width of parafrontal plate; 4) males lacking apical scutellar bristles and with a reddish-orange genital capsule.

The first and primary tools used in the classification of Sarcophagidae species have been morphological characters, overwhelmingly from male reproductive features.

Typically highly trained taxonomists would perform these identifications and classifications using diagnostic character traits unique to the species of interest. However, the ultimate utility of morphological taxonomy in species that are cryptic or have limited numbers of distinctive species-level traits has been noted (Hebert et al. 2004a, 2004b;

Lowenstein et al. 2009; Spillings et al. 2009; Saitoh et al. 2015). The ability to integrate genetic data into the taxonomy of species would provide an objective and reproducible method for evolutionary reconstructions (Díaz-Rodríguez et al. 2015). There is a short list of studies that have used morphology in detailed evolutionary relationships within the

Sarcophaginae (Roback 1954; Lopes 1969, 1982; Pape 1996; Giroux et al. 2010). Roback

(1954) provided the first detailed phylogeny of the Sarcophaginae using morphological characters. The most recent analysis conducted by Giroux et al. (2010) for 76 species composing 19 genera within Sarcophaginae was based on 73 morphological characters.

However, the studies mentioned above focus on relationships above the species level and the morphological characters used at these levels are not likely to be informative when looking at the species level. Using genetic data can be an informative tool, especially in systems potentially lacking morphological differentiation.

Stamper et al. (2013) explored the phylogenetic relationships of flesh flies in the subfamily Sarcophaginae using three mitochondrial gene fragments. This study supported the genus Oxysarcodexia (Townsend, 1917) as the sister-group to Ravinia; a relationship

128 originally proposed based on morphological evidence (Lopes 1982; Pape 1994; Giroux et al. 2010). Wong et al. (2015) presented a phylogeny of the genus Ravinia inferred from two mitochondrial gene fragments and noted that for some species of North American Ravinia, discord was present between morphology used in identifications and the lineages inferred by molecular data. The discordance could have been a result of incomplete lineage sorting

(ILS), introgression, or the presence of undescribed species. This discord was particularly problematic in one group containing Ravinia anxia and Ravinia querula and in another group containing Ravinia floridensis and Ravinia lherminieri. However, Wong et al. (2015) at the time could not distinguish between these hypotheses using solely mitochondrial data. In the most recent study involving the genus Ravinia, Wong et al. (c.f. Chapter 3) delimited the problematic species R. anxia and R. querula using a combination of coalescent-based species-tree analyses, migration analyses, population genetic analyses, and phylogenetic inference. They discovered that R. anxia and R. querula, as currently defined by morphology, are comprised of three cryptic species. Hereafter, these three cryptic species will be referred to as R. anxia (sensu stricto=s.s.), R. querula (s.s.), and

Ravinia n.sp.

For this study we examine the molecular phylogeny of 12 species of the genus

Ravinia. We sequenced portions of five mitochondrial (mtDNA) genes and seven nuclear

(nuDNA) genes and analyzed these loci using concatenated phylogenetic (Bayesian inference and maximum likelihood) and coalescent-based species-tree methods.

129

4.3 MATERIALS AND METHODS

Taxon selection & sampling

Specimens used for molecular analyses were collected using hand nets, Malaise traps, or baited traps (using dog feces or aged chicken). All specimens were frozen and taken to lab until sorted, pinned, and identified by GAD using morphological characters, or in the case of cryptic species identified by species delimitation methods employed in Wong et al. (c.f. Chapter 3). DNA was extracted only from legs on one side of specimen, ensuring that all morphological structures important for species identification would be preserved.

The removed legs were kept, until DNA extraction, at -80˚C in a separate vial of 95% EtOH.

The remainder of the specimens were then pinned or stored in 95% EtOH at -80˚C and are currently held by GAD as vouchers (will be placed in an established natural history collection at the completion of ongoing studies).

A total of four species, from two closely related genera, were used as outgroup taxa in this study. From the genus Oxysarcodexia which is sister to Ravinia, we used two species:

Oxysarcodexia ventricosa (Wulp, 1895); O. cingarus (Aldrich, 1916). We also included in the analyses two species from the genus Blaesoxipha: Blaesoxipha arizona Pape, 1994; B. cessator (Aldrich, 1916). The genus Blaesoxipha has been difficult to place within the

Sarcophaginae (Roback 1954; Lopes 1969, 1982; Pape 1994, 1996; Giroux et al. 2010;

Kutty et al. 2010), with one study placing it as sister to Ravinia (Kutty et al. 2010). As such, we include both genera to solidify the monophyletic status of Ravinia, and the inference of

Oxysarcodexia as sister to Ravinia shown in previous studies (Lopes 1982; Pape 1994;

Giroux et al. 2010; Stamper et al. 2013).

130

Several individuals were sampled from multiple populations for 12 out of the 18 species of Ravinia in North America. The genetic analyses included 121 individuals (117

Ravinia + 4 outgroup taxa): Ravinia acerba (Walker, 1849) (1), R. anxia (s.s.) (7), Ravinia derelicta (Walker, 1853) (25), Ravinia errabunda (Wulp, 1895) (1), Ravinia floridensis

(Aldrich, 1916) (7), Ravinia lherminieri (Robineau-Desvoidy, 1830) (8), Ravinia n.sp. (12),

Ravinia planifrons (Aldrich, 1916) (7), Ravinia pusiola (Wulp, 1895) (11), R. querula (s.s.)

(17), Ravinia stimulans (Walker, 1849) (18), Ravinia vagabunda (Wulp, 1895) (2). The geographic distribution of sampling localities across the United States can be seen in Fig.

4.1. The following species were not included in this study because we were unable to obtain material, even though collection efforts were conducted in regions where they are known to occur: Ravinia anandra (Dodge, 1956), Ravinia coachellensis (Hall, 1931), Ravinia effrenata (Walker, 1861), Ravinia pectinata (Aldrich, 1916), Ravinia sueta (Wulp, 1895),

Ravinia tancituro Roback, 1952. List of specimens included in this study, along with

GenBank accession numbers and locality information are provided in Table 4.1 and Table

S4.1.

DNA extraction, PCR, & sequencing

Qiagen’s DNeasy kits (Qiagen Cat. No. 69506) were used to extract genomic DNA from a whole fly leg that was sliced into fragments. A total of 12 gene fragments were PCR- amplified, including five mitochondrial (mtDNA) and seven nuclear (nuDNA) genes: mtDNA- cytochrome oxidase subunits I and II (COI and COII, respectively), NADH dehydrogenase 4 (ND4), NADH dehydrogenase 6 (ND6), ribosomal RNA 16S (16S); nuDNA- elongation factor 1 alpha (EF-1a), glucose-6-phosphate dehydrogenase (G6PD), internal transcribed spacer 2 (ITS2), phosphoenolpyruvate carboxykinase (PEPCK), period (Per),

131 ribosomal RNA 28S, white (Wt). The genes COI, COII, and ND4 were amplified using primers pairs and specifications given in Stamper et al. (2013). A fragment of 28S spanning the D3-D5 regions was amplified in a single fragment using previously published protocol and primer pairs (Friedrich and Tautz 1997a, 1997b; DeBry et al. 2010). Portions of the nuclear genes EF-1a, G6PD, PEPCK, Per, and Wt were amplified using primer pairs and protocols given in Wong et al. (c.f. Chapter 3). For the ITS2 amplification, we used primer pairs (without tailed-ends) and protocol given in Song et al. (2008b). For the 16S gene, we used primer pairs (without tailed-ends) and protocols given in Kutty et al. (2007). All amplifications were conducted in a solution consisting of: 200 μM dNTPs (Promega), 1μM of each primer, 1.5mM MgCl2, 0.25 units Platinum Pfx DNA polymerase (Invitrogen), buffer

(final concentration: 50mM Tris, pH 8.5, 4 mM MgCl2, 20 mM KCl, 500 μg/ml bovine serum albumin, 5% DMSO) (Idaho Technology), and 3μl template DNA.

Forward and reverse DNA strands were sequenced with their respective primer on an ABI prism 3730x/ with base calling and data compilation by Beckman Coulter Genomics,

Inc. (Danvers, MA, USA).

Data alignment, assembly, & characterization

Sequence chromatograms were visually checked in FinchTV v.1.4.0 (Geospiza, Inc. www.geospiza.com) for heterozygous sites and potential sequencing errors. After general quality control, sequences were assembled and edited in CLC Main Workbench v.7.0.3 (CLC

Bio www.clcbio.com). All assembled sequences were aligned in Mesquite v.2.73 (Maddison and Maddison 2011). The program MARNA v.3.4.2 (Siebert and Backofen 2005) was used for the preliminary alignments of 16S and 28S taking into account secondary structure of the rRNA sequence. For all other loci, the program Muscle v.3.8 (Edgar 2004) was used for

132 the sequence alignments. The resulting alignments for the protein coding loci were re- checked as both the translated amino acid sequence and the nucleotide sequence. No insertions or deletions (indels) were present among sequences for the following genes:

COI, COII, ND4, ND6, 16S, 28S, EF-1a, G6PD, PEPCK, Wt. As such, the final alignments for these genes were the same as preliminary Muscle and MARNA alignments. Indels were found to be present within both ITS2 and Per. For ITS2 there were several indel differences between species, however within species the presence and location of the indels were the same. The Per alignment had two indel locations on the portion of the gene sequenced, however these did not cause any change in amino acid reading-frame and did not produce premature stop codons. Based on the examination of sequence characteristics (i.e., quality score (phred) of chromatograms, amplicon length; Song et al. 2008a), absence of in-frame stop codons, and use of taxon specific primers, we believe that silenced genes and nuclear insertions of mitochondrial DNA (numts) were not erroneously sequenced.

Nuclear gene sequences were tested for recombination with the program TOPALi v.2.5 (Milne et al. 2009) using default settings and the Difference of Sums of Squares (DSS) method. General sequence characterization was performed using the program DNasp v.5

(Librado and Rozas 2009); evaluating the number of parsimony informative sites (PI), segregating sites (S), and nucleotide diversity (π). The gene information characterized was for the data set including all Ravinia sequences (i.e., outgroup sequences were excluded).

Concatenated multi-locus analyses

The gene partition scheme and model of sequence evolution were selected using the program PartitionFinder (Lanfear et al. 2012). The Akaike Information Criterion (AIC;

Akaike 1973, 1974) was used to compare candidate models of sequence evolution and

133 partition scheme. Only those model choice/partition schemes that could be implemented in the phylogenetic programs (RAxML and MrBayes) were considered in PartitionFinder analyses using the functions “models=RAxML” (for maximum likelihood) or

“models=mrbayes” (for Bayesian inference). We analyzed the concatenated mitochondrial and nuclear loci with maximum likelihood (ML) using RAxML v.8.0.20 (Stamatakis 2014), and Bayesian inference (BI) using the program MrBayes v.3.2.1 (Ronquist and Huelsenbeck

2003).

ML analyses of the concatenated data set, conducted using RAxML, used subsets of the GTRGAMMA model (Appendix S4.1) selected by PartitionFinder. The RAxML program defaults were used to stop search and nodal support was assessed by bootstrap resampling

(Felsenstein 1985) with 5000 pseudoreplicates. Maximum likelihood bootstrap probabilities (BPML) were mapped onto the 70% majority rule consensus tree of replicates using the program SumTrees implemented in DendroPy v.3.12.0 (Sukumaran and Holder

2010). BPML ≥ 95% are considered high support in these analyses.

BI analyses of the concatenated data set, conducted with MrBayes, used two independent Metropolis-coupled Markov chain Monte Carlo (MC3) (Geyer 1991; Gilks and

Roberts 1996) analyses to sample from the posterior distribution. Each MC3 analysis had one cold and three heated chains with default settings, sampling 1000 generations for

2x107 generations and discarding the first 50% of samples as burn-in. The program

TRACER v.1.6 (Rambaut et al. 2014) was used to assess run convergence by examining: the harmonic mean, stability of parameter values, similarity of branch lengths, trees, and likelihood values of chains, and the value of average standard deviation of split frequencies.

134

Nodal support was based on Bayesian posterior probabilities (PP). Nodes with PP ≥ 0.95 are considered to be high support.

Species-tree analysis

The program *BEAST (Heled and Drummond 2010) implemented in the program

BEAST (Drummond et al. 2012), was used to estimate the species-tree under the multispecies coalescent model. *BEAST is able to estimate the species-tree directly from the sequence data and calculates the posterior probability for gene-trees, species-tree, and divergence times. To make calculations tractable, a priori information must be used to assign individuals to species groups in the species-tree. Species groups were assigned based on morphological identifications for all species, except for R. anxia (s.s.), R. querula

(s.s.), and Ravinia n.sp., which were identified based on the molecular species delimitation of Wong et al. (c.f. Chapter 3). Two data sets were used in *BEAST analyses to reconstruct the species-tree (nuDNA, nuDNA + mtDNA). The full data set consisted of 12 loci (5 mtDNA

+ 7 nuDNA) and 16 species (including 4 outgroup taxa). Since mtDNA is maternally inherited as a single unit, all mtDNA loci will have the same gene-tree. As such, the 5 mitochondrial loci were analyzed as a linked unit in the tree for *BEAST analyses. *BEAST analyses used an uncorrelated lognormal relaxed clock (Drummond et al. 2006) with a Yule model prior of divergence times. The MCMC algorithm was run for 2x107 generations, sampling every 1000 generations and discarding the first 50% as burn-in. Convergence of runs were assessed by checking for stationary in tree lengths and likelihood scores using

Tracer v.1.6 looking at stationary of branch lengths and likelihood scores. PartitionFinder was used to select the best-fitting model of nucleotide substitution from among those implemented in *BEAST. The program TreeAnnotator v.1.8.0 (Heled and Drummond 2010)

135 was used to summarize the species-trees into a single maximum clade credibility tree, and visualized using FigTree v.1.4.1 (http://tree.bio.ed.ac.uk/software/figtree/). In addition,

DensiTree (Bouckaert 2010) was used to visualize the entire posterior distribution of species-trees. This program takes all 1x104 species-trees, each represented as a single semi-transparent line, and overlays them on top of each other. The areas where there is much agreement in topology and branch lengths results in many lines being drawn, giving a darker color.

4.4 RESULTS

Molecular data

DNA sequence data were obtained for 121 individuals (including 4 outgroup specimens) comprising 12 of the 18 species of Ravinia that are found in North America

(Pape 1996; Pape and Dahlem 2010; Wong et al. 2015; c.f. Chapter 3). A total of 8188 base pairs (bp) of DNA sequence data (mtDNA = 4399 bp; nuDNA= 3789 bp) were included in the final molecular alignment. The list of samples used in molecular analyses, locality information, and GenBank accession numbers augmenting those published in (Stamper et al. 2013; Wong et al. 2015; c.f. Chapter 3) are provided in Table 4.1 and Table S4.1.

Information for sequenced loci, including nucleotide diversity, are given in Table 4.2.

Concatenated multi-locus analysis

In general, the concatenated phylogenetic analyses using both ML and BI inferred the same topology with no significant disagreement between the two analyses. As such, only the BI tree is shown in Fig. 4.2. The only disagreement between ML and BI trees was that the ML analyses could not resolve the R. acerba/R. pusiola clade and the BI analyses did, albeit with low nodal support (Fig. 4.2; PP=0.57). Six out of the twelve Ravinia species

136 included in this study were resolved with high support, allowing differentiation between these closely related lineages. Those species not able to be resolved as monophyletic with high nodal support are R. acerba, R. lherminieri, and R. derelicta.

Both ML and BI analyses found strong support for two clades within the modern

Ravinia (sensu lato) that correspond to previously synonymized genera, Chaetoravinia

Townsend, 1917 and Ravinia (Fig 4.2; BPML=100; PP=1.0). These groups are also supported by a distinct morphological character: absence or presence of setae on the R1 wing vein in

Ravinia and Chaetoravinia, respectively (Fig. 4.3; Lopes 1969). For additional information on the previously synonymized genera, Chaetoravinia and Ravinia, the authors refer readers to Wong et al. (2015). Within the Ravinia group, R. planifrons is found to be sister to the other seven species: R. anxia (s.s.), R. floridensis, R. lherminieri, Ravinia n.sp., R. pusiola,

R. querula (s.s.) (Fig. 4.2; BPML= 100; PP= 1.0). Our seven specimens of R. planifrons are monophyletic and distinct from the other species (Fig. 4.2; BPML= 100; PP= 1.0). Ravinia pusiola specimens sorted onto two distinct haplotype groups: a paraphyletic group comprised of our single specimen of R. acerba plus two R. pusiola specimens, which was sister to a clade of the remaining R. pusiola (Fig. 4.2; BPML= 100; PP=1.0).Together R. pusiola and R. acerba form a group that is sister to a clade consisting of R. anxia (s.s.), R. floridensis, R. lherminieri, Ravinia n.sp., and R. querula (s.s.) (Fig. 4.2; BPML=100; PP= 1.0).

Our eight specimens of R. lherminieri sorted onto two distinct haplotypes: the R. lherminieri specimens from New York form a highly supported paraphyletic group with all the R. floridensis specimens (Fig. 4.2; BPML= 100; PP= 1.0), which was sister to a highly supported clade of R. lherminieri from Florida and Tennessee (Fig. 4.2; BPML= 100; PP= 1.0). In the group comprising R. anxia (s.s.), R. querula (s.s.), and Ravinia n.sp., we found highly

137 supported reciprocal monophyly between these three species. Our eight specimens of R. anxia (s.s.) formed a highly supported clade (Fig. 4.2; BPML= 100; PP= 1.0) that was sister to a clade comprised of R. querula (s.s.) and Ravinia n.sp. (Fig. 4.2; BPML=100; PP=1.0). Both R. querula (s.s.) and Ravinia n.sp. were resolved as monophyletic groups with high nodal support (Fig. 4.2; BPML= 100; PP=1.0).

Within the Chaetoravinia group, R. derelicta, R. stimulans, and R. vagabunda formed a clade with high nodal support, which was sister to our single specimen of R. errabunda (Fig.

4.2; BPML= 100; PP= 1.0). Ravinia stimulans was resolved as a monophyletic group and was found to be sister to a clade comprising R. derelicta + R. vagabunda with high nodal support

(Fig. 4.2; BPML= 100; PP= 1.0). Our two R. vagabunda individuals, collected from a single locality in New Mexico, were monophyletic and strongly supported as nested within a paraphyletic R. derelicta (Fig. 4.2; BPML= 100; PP= 1.0).

Species-tree analysis

The species-trees of the mitochondrial and nuclear data set were estimated using

*BEAST, to evaluate the inferred phylogeny of the concatenated data set. The species-trees were summarized in a maximum clade credibility tree (MCCT; Fig. 4.4A) and a cloudogram

(Fig. 4.4B). The species-trees showed similar results when compared with the concatenated multi-locus analyses. The species-trees strongly supported grouping R. lherminieri and R. floridensis with a posterior probability of 1.0 (Fig. 4.4A). In addition, the species-trees grouped R. acerba and R. pusiola with high nodal support (Fig. 4.4A; PP= 1.0), and grouped R. vagabunda and R. derelicta with high nodal support (Fig. 4.4A; PP= 1.0).

Due to our low sample sizes for R. acerba and R. vagabunda, the disagreement between the

138 concatenated phylogeny and the species-tree are unclear with regard to the paraphyletic relationships.

4.5 DISCUSSION

The present study uses multi-locus analyses on a concatenated data set and coalescent-based species-tree estimations to infer relationships within the genus, Ravinia.

According to previous studies, several species of Ravinia were inferred as paraphyletic using two mitochondrial loci (Wong et al. 2015). Although mitochondrial loci (e.g., COI +

COII) are often recognized as markers displaying high levels of polymorphic divergence

(i.e., barcoding loci), COI + COII sequences in some cases may be unsuitable in differentiating phylogenetic relationships among closely related species (Nelson et al.

2007; Xiao et al. 2010). Within Ravinia there are several clades that contain morphologically distinct species that are for the most part not distinguishable using COI +

COII sequence data (e.g., R. floridensis/R. lherminieri, R. acerba/R. pusiola, and R. derelicta/R. vagabunda; Wong et al. 2015). In this study, we used five mitochondrial loci and seven nuclear loci with increased taxonomic sampling over previous studies (Wong et al. 2015; c.f. Chapter 3).

Based on the phylogenetic tree (Fig. 4.2), the species of Ravinia fall into two distinct clades (sections A and B), with high nodal support (BPML= 100; PP= 1.0). These clades represent the groups Ravinia and Chaetoravinia, previously synonymized into the currently recognized genus Ravinia (sensu lato). The separation of these groups is congruent with a previous study using only mitochondrial data to infer species relationships (Wong et al.

2015). This separation is further supported morphologically by the absence or presence of setae on the R1 wing vein (Fig 4.3) in Ravinia and Chaetoravinia, respectively. This provides

139 further evidence for recognition of these two groups as subgenera of Ravinia (sensu lato) or even as separate genera. Further sampling of individuals thought to be part of the

Adinoravinia lineage would be needed to solidify the status of these groups.

There was a single difference between the topology inferred from the concatenated data set in this study and the mtDNA phylogeny presented in Wong et al. (2015). Wong et al. (2015) showed that R. planifrons formed a highly supported monophyletic clade that is sister to a clade comprised of the species: R. anxia (s.s.), R. floridensis, R. lherminieri, Ravinia n.sp., and R. querula (s.s.). In this study, however, when the species R. acerba is added to analyses along with additional loci, the clade R. acerba + R. pusiola is found to be sister to the clade comprised of: R. anxia (s.s.), R. floridensis, R. lherminieri, Ravinia n.sp., and R. querula (s.s.); with high nodal support (Fig. 4.2; BPML= 100; PP= 1.0). While in both studies the topologies were highly supported, this study presents a higher nodal support for this relationship while including a greater amount of molecular data. However, due to the limited number of R. acerba specimens sampled, the exact placement of this species within

Ravinia is uncertain.

Ravinia anxia (s.s.), Ravinia querula (s.s.), and Ravinia n.sp.

These three cryptic species are inferred as genetically divergent, reciprocally monophyletic lineages with high nodal support (Fig. 4.2; BPML= 100; PP= 1.0). The phylogeny inferred using the full concatenated data set presents a topology similar to previous studies (Wong et al. 2015; c.f. Chapter 3). Within this clade R. anxia (s.s.) is sister to the larger group composed of the species R. querula (s.s.) and Ravinia n.sp. (Fig. 4.2;

BPML= 100; PP= 1.0). The relationship between R. querula (s.s.) and Ravinia n.sp. is largely congruent with geography, where these sister taxa are geographically proximate and

140 overlap in the northwest United States (Fig. 4.1). Although R. querula (s.s.) appears to have a much larger distribution across the United States, our collected species of R. querula (s.s.) and Ravinia n.sp. are genetically divergent in locations where the two species overlap. The species-tree inferred using *BEAST, recovered strong support for the branching of R. anxia

(s.s.) (Fig. 4.4; PP=1.0), with relatively weak support for its sister clade comprised of

R. querula (s.s.) and Ravinia n.sp. (Fig. 4.4A; PP=0.73). The next probable branching, based on the posterior distribution of species-trees in the cloudogram (Fig. 4.4B), has

Ravinia n.sp. and R. anxia (s.s.) as sister taxa.

Ravinia floridensis and Ravinia lherminieri

Our results using the concatenated data set show that R. floridensis and R. lherminieri form a strongly supported clade (Fig. 4.2; BPML= 100; PP= 1.0). Within this clade are two distinct and highly supported clades, but these are not indicative of the two species diagnoses based on morphological identifications. One clade comprises several R. lherminieri specimens collected from Tennessee and New York. The other clade is comprised of all the R. floridensis individuals as well as R. lherminieri collected from New

York. This discrepancy between the morphology and the genetic lineages has also been found in a previous study (Wong et al. 2015). Wong et al. (2015) presented several hypotheses of how this discrepancy could have arisen between the morphology and genetic lineages. In summary, the first possibility is that current species descriptions based on morphology could be incorrect and the species division presented by the genetic data could be the actual species boundaries. Second, the morphological species descriptions could be correct, in which the discrepancy could be from ILS or hybridization. Wong et al. (2015) state that the hybridization hypothesis is unlikely based on the large geographic separation

141 between R. floridensis specimens and those identified as R. lherminieri (Fig 4.1). Finally, the morphological characters used in the identification of R. floridensis exhibit variation across the species range, and as such may be indicative of cryptic species. Wong et al. (2015) concluded that future sampling of specimens and additional molecular markers would be needed to infer which of these scenarios is most likely. In the present study, the phylogeny inferred using a large concatenated data set found a similar topology as Wong et al. (2015) indicating paraphyletic relationships in R. lherminieri. When multilocus coalescent-based analyses are performed, to compare the inferred species-tree to the concatenated RAxML and MrBayes phylogeny, a similar relationship is highly supported. The species-tree recovered strong support for the grouping of R. floridensis and R. lherminieri (Fig. 4.4A;

PP=1.0). Furthermore, the strong support for this grouping was found when using either the nuDNA only or the mtDNA + nuDNA data sets (Fig. 4.4A; Fig. S4.1). To further explore the possible causes of this discordance in R. lherminieri and R. floridensis, it is recommended that a full species delimitation study be conducted using methods that do not require the a priori partitioning of individuals to species categories.

4.6 CONCLUSIONS

The purpose of this study was to explore the phylogenetic relationships within

Ravinia using increased sampling of taxa and molecular loci. In addition, we included the updated species taxonomy that has been previously suggested concerning the R. anxia and

R. querula species complex (c.f. Chapter 3). This study is the first to apply species-tree coalescent-based analyses to species in the family Sarcophagidae. As a result of these analyses, we can identify a case where sole use of mtDNA data would lead to mistaken identity of species, particularly R. floridensis and R. lherminieri. All analyses found strong

142 support for the division of Ravinia into the previously recognized groups Ravinia and

Chaetoravinia. If these divisions are maintained when including specimens believed to be part of the Adinoravinia lineage, then a formal recognition of these lineages would be necessary. This study is only focused on North American Ravinia, but there are also numerous other species found only in the Central and South Americas. For some species our sampling was limited, even though there were focused collection trips throughout the

U.S. in areas where certain species are known to be encountered (Fig. 4.1). This is a problem often encountered in molecular studies because of an inherent rarity of species

(Lim et al. 2012). Regardless of this problem several relationships were able to be resolved, including the discordant relationships seen in previous studies (Wong et al. 2015). The present study, however, cannot resolve those relationships where specimen sampling appears to be a problem (e.g., R. vagabunda, R. acerba). Further specimen collections should focus on increasing geographic sampling of species that exhibit discordance. Future research would benefit from classical taxonomy providing new character descriptions that could be used in morphological analyses and examining the Ravinia phylogeny with the addition of Old World and Neotropical taxa.

143

4.7 TABLES

Table 4.1. List of specimens, specimen identification (I.D.) codes, sequenced genes, and GenBank Accession numbers. Species I.D. COI COII ND4 ND6 16S 28S EF-1a G6PD ITS2 PEPCK Period White Blaesoxipha arizona AW32 JQ806809 JQ806788 JQ806767 KR676978 ------Blaesoxipha cessator AW27 JQ806810 JQ806789 JQ806768 KR676979 ------KR608322 KR676759 ------KR676706 KR676812 KR676630 Oxysarcodexia cingarus AP68 JQ806816 JQ806795 JQ806774 ------Oxysarcodexia ventricosa E08 GQ223321 GQ223321 GQ223292 KR676980 ------KR608323 KR676760 ------KR676707 KR676813 KR676631 Ravinia acerba AU02 KR608306 KR677020 ------KR676944 KR676861 ------KR676990 ------Ravinia n.sp. AE79 KR608307 KR677021 ------KR608324 KR676761 ------KR676708 KR676814 KR676632 Ravinia n.sp. AE80 KR608308 KR677022 ------KR608325 KR676762 ------KR676709 KR676815 KR676633 Ravinia n.sp. AE81 JQ807069 KJ586403 KR676988 KR676981 ------KR676862 KR608326 KR676763 ------KR676710 KR676816 KR676634 Ravinia n.sp. AF03 KJ586379 KJ586466 ------KR676915 KR608372 KR676788 ------KR676735 KR676838 KR676676 Ravinia n.sp. AG31 KJ586383 KJ586470 ------KR676919 KR608376 KR676792 ------KR676739 KR676842 KR676680 Ravinia n.sp. AG51 KJ586343 KJ586404 ------KR676863 KR608327 ------KR676635 Ravinia n.sp. AH52 KJ586344 KJ586405 ------KR676945 KR676864 KR608328 KR676764 KR76991 KR676711 KR676817 KR676636 Ravinia n.sp. AJ18 KJ586345 KJ586406 ------KR676865 KR608329 KR676765 ------KR676712 KR676818 KR676637 Ravinia n.sp. AJ19 KJ586388 KJ586475 ------KR676924 KR608381 KR676797 ------KR676744 KR676847 KR676685 Ravinia n.sp. AK04 KR608316 KR677028 ------KR608383 KR676799 ------KR676746 KR676849 KR676687 Ravinia n.sp. AK25 KJ586346 KJ586407 ------KR676866 KR608330 KR676766 ------KR676713 KR676819 KR676638 Ravinia n.sp. BA16 JQ807102 KJ586481 ------KR676930 KR608389 KR676805 ------KR676752 KR676855 KR676693 Ravinia anxia (s.s.) AT44 KJ586347 KJ586408 ------KR676867 ------R. anxia (s.s.) AT45 JQ807068 KJ586409 KR676989 KR676982 ------R. anxia (s.s.) AT46 KJ586348 KJ586410 ------R. anxia (s.s.) AT47 KJ586349 KJ586411 ------R. anxia (s.s.) AT49 KJ586350 KJ586412 ------R. anxia (s.s.) AT50 JQ807070 KJ586413 ------

144

R. anxia (s.s.) AW48 KJ586351 KJ586414 ------KR608331 KR676767 ------KR676714 KR676820 KR676639 R. anxia (s.s.) AW70 KR608309 KR677023 ------KR608332 KR676768 ------KR676715 KR676821 KR676640 Ravinia derelicta AA10 KJ586352 KJ586415 ------KR676946 ------KR676992 ------KR676641 R. derelicta AA13 KJ586353 KJ586416 ------KR676947 ------KR676993 ------KR676642 R. derelicta AA25 KJ586354 KJ586417 ------KR676948 ------KR608333 ------KR676994 ------KR676643 R. derelicta AA26 KJ586355 KJ586418 ------KR676949 KR676868 ------KR676995 ------R. derelicta AA38 KJ586356 KJ586419 ------KR676950 KR676869 ------KR676996 ------R. derelicta AB05 KJ586357 KJ586420 ------KR676951 KR676870 ------KR676997 ------R. derelicta AB41 KJ586358 KJ486421 ------KR676952 KR676871 ------KR676998 ------KR676644 R. derelicta AC75 KJ586359 KJ586422 ------KR676953 KR676872 ------KR676999 ------KR676645 R. derelicta AC76 KJ586360 KJ586423 ------KR676954 ------KR677000 ------R. derelicta AC81 KJ586361 KJ586424 ------KR676955 KR676873 KR608334 ------KR677001 ------R. derelicta AD01 KJ586362 KJ586425 ------KR676956 KR676874 KR608335 ------KR677002 ------R. derelicta AD02 KJ586363 KJ586426 ------KR676957 KR676875 KR608336 KR676769 KR677003 KR676716 ------KR676646 R. derelicta AD06 KJ586364 KJ586427 ------KR676958 KR676876 KR608337 ------KR677004 ------R. derelicta AD07 KJ586365 KJ586428 ------KR676959 KR676877 KR608338 ------KR677005 ------R. derelicta AD11 KJ586366 KJ586429 ------KR676960 KR676878 ------KR677006 ------R. derelicta AD40 KJ586367 KJ586430 ------KR676961 KR676879 KR608339 ------KR677007 ------R. derelicta AD42 KJ586368 KJ586431 ------KR676962 KR676880 KR608340 ------KR677008 ------R. derelicta AE57 KJ586369 KJ586432 ------KR676881 KR608341 KR676770 ------KR676717 KR676822 KR676647 R. derelicta AN13 KJ586370 KJ586433 ------KR676963 KR676882 KR608342 ------KR677009 ------R. derelicta AN18 KJ586371 KJ586434 ------KR676883 KR608343 ------R. derelicta AT48 KR608310 KR677024 ------KR676884 ------R. derelicta AZ09 JQ807072 KJ586435 ------KR676648 R. derelicta AZ62 JQ807073 KJ586436 ------R. derelicta BA29 JQ807075 KJ586437 ------KR676964 KR676885 KR608344 ------KR677010 ------R. derelicta BA30 JQ806817 JQ806796 JQ806775 KR676983 ------KR676886 ------R. errabunda AE51 KJ586372 KJ586438 ------KR676965 KR676887 KR608345 KR676771 KR67011 KR676718 KR676823 KR676649

145

Ravinia floridensis AA01 JQ807076 KJ586439 ------KR676966 KR676888 KR608346 KR676772 ------KR676719 KR676824 KR676650 R. floridensis AA03 JQ807077 KJ586440 ------KR676967 ------KR608347 KR676773 ------KR676720 KR676825 KR676651 R. floridensis AA04 KJ586373 KJ586441 ------KR676968 KR676889 KR608348 KR676774 KR677012 KR676721 KR676826 KR676652 R. floridensis AA60 JQ807078 KJ586442 ------KR676969 KR676890 KR608349 KR676775 KR677013 KR676722 KR676827 KR676653 R. floridensis AB15 JQ807079 KJ586443 ------KR676970 KR676891 KR608350 KR676776 KR677014 KR676723 KR676828 KR676654 R. floridensis AD31 KR608311 KR677025 ------KR608351 KR676777 ------KR676724 KR676829 KR676655 R. floridensis AD41 KJ586374 KJ586444 ------KR676971 KR676892 KR608352 KR676778 KR677015 KR676725 KR676830 KR676656 Ravinia lherminieri AA36 JQ807080 KJ586445 ------KR676972 KR676893 KR608353 KR676779 KR677016 KR676726 ------KR676657 R. lherminieri AN14 JQ807081 KJ586446 ------KR676894 KR608354 KR676780 ------KR676727 KR676831 KR676658 R. lherminieri AN15 JQ807082 KJ586447 ------KR676895 KR608355 KR676781 ------KR676728 KR676832 KR676659 R. lherminieri AN16 JQ807818 JQ806797 JQ806776 KR676984 ------KR676896 KR608356 KR676782 ------KR676729 KR676833 KR676660 R. lherminieri AN17 JQ807083 KJ586448 ------KR676897 KR608357 KR676783 ------KR676730 KR676834 KR676661 R. lherminieri AQ58 JQ807099 KJ586449 ------KR676898 ------KR676731 ------KR676662 R. lherminieri AZ44 JQ807084 KJ586451 ------KR676899 KR608358 KR676784 ------KR676835 KR676663 R. lherminieri AZ58 JQ807085 KJ586452 ------KR676900 ------KR676785 ------KR676664 Ravinia planifrons AE45 KR608312 ------KR676901 ------KR676732 ------R. planifrons AF05 KR608313 ------KR676902 ------R. planifrons AF06 JQ807088 KJ586453 ------KR676903 KR608359 ------KR676665 R. planifrons AG34 KJ586376 KJ586454 ------KR676904 KR608360 ------KR676666 R. planifrons AG49 KJ586377 KJ586455 ------KR676905 KR608361 KR676786 ------KR676733 KR676836 KR676667 R. planifrons AK08 JQ807089 KJ586456 ------KR676973 KR676906 KR608362 ------KR677017 ------R. planifrons AK09 KR608314 KR677026 ------KR608363 ------KR676668 Ravinia pusiola AW64 JQ807094 KJ586457 ------KR676907 KR608364 ------KR676669 R. pusiola AW73 JQ807095 KJ586458 ------KR676908 KR608365 ------R. pusiola AX61 JQ807093 KJ586459 ------KR676909 KR608366 KR676787 ------KR676734 KR676837 KR676670 R. pusiola AX62 JQ807096 KJ586460 ------KR676910 KR608367 ------KR676671 R. pusiola AX63 JQ806819 JQ806798 JQ806777 KR676985 ------KR676911 KR608368 ------KR676672 R. pusiola AY36 KR608315 KR677027 ------

146

R. pusiola AY38 KJ586378 KJ586461 ------R. pusiola AY39 JQ807090 KJ586462 ------KR676974 KR676912 KR608369 ------KR677018 ------R. pusiola AZ05 JQ807091 KJ586463 ------KR676913 ------KR676673 R. pusiola BA18 JQ807092 KJ586464 ------KR676914 KR608370 ------KR676674 R. pusiola BA19 JQ807098 KJ586465 ------KR608371 ------KR676675 R. querula (s.s.) AF16 KJ586380 KJ586467 ------KR676916 KR608373 KR676789 ------KR676736 KR676839 KR676677 R. querula (s.s.) AG28 KJ586381 KJ586468 ------KR676917 KR608374 KR676790 ------KR676737 KR676840 KR676678 R. querula (s.s.) AG30 KJ586382 KJ586469 ------KR676918 KR608375 KR676791 ------KR676738 KR676841 KR676679 R. querula (s.s.) AG32 KJ586384 KJ586471 ------KR676920 KR608377 KR676793 ------KR676740 KR676843 KR676681 R. querula (s.s.) AG33 KJ586385 KJ586472 ------KR676921 KR608378 KR676794 ------KR676741 KR676844 KR676682 R. querula (s.s.) AH42 KJ586386 KJ586473 ------KR676922 KR608379 KR676795 ------KR676742 KR676845 KR676683 R. querula (s.s.) AH51 KJ586387 KJ586474 ------KR676923 KR608380 KR676796 ------KR676743 KR676846 KR676684 R. querula (s.s.) AJ23 KJ586389 KJ586476 ------KR676925 KR608382 KR676798 ------KR676745 KR676848 KR676686 R. querula (s.s.) AK05 KJ586390 KJ586477 ------KR676926 KR608384 KR676800 ------KR676747 KR676850 KR676688 R. querula (s.s.) AL32 KJ586391 KJ586478 ------KR676927 KR608385 KR676801 ------KR676748 KR676851 KR676689 R. querula (s.s.) AS25 KJ586392 KJ586479 ------KR676975 KR676928 KR608386 KR676802 KR677019 KR676749 KR676852 KR676690 R. querula (s.s.) AW30 KR608317 KR677029 ------KR608387 KR676803 ------KR676750 KR676853 KR676691 R. querula (s.s.) AZ61 JQ807101 KJ586480 ------KR676929 KR608388 KR676804 ------KR676751 KR676854 KR676692 R. querula (s.s.) BI26 KR608318 ------KR608390 KR676806 ------KR676753 KR676856 KR676694 R. querula (s.s.) BI29 KR608319 KR677030 ------KR676807 ------KR676754 ------KR676695 R. querula (s.s.) BI46 KR608320 KR677031 ------KR608391 KR676808 ------KR676755 KR676857 KR676696 R. querula (s.s.) E56 KR608321 KR677032 ------KR676086 ------KR608392 KR676809 ------KR676756 KR676858 KR676697 Ravinia stimulans AD60 KJ586393 KJ586482 ------KR676931 ------R. stimulans AD64 KJ586394 KJ586483 ------KR676932 KR608393 KR676810 ------KR676757 KR676859 KR676698 R. stimulans AD65 KJ586395 KJ586484 ------KR676933 KR608394 ------KR676699 R. stimulans AD66 KJ586396 KJ586485 ------KR676934 ------KR676700 R. stimulans AD67 KJ586397 KJ586486 ------KR676976 KR676935 ------KR676701 R. stimulans AD68 KJ586398 KJ586487 ------KR676936 KR608395 ------KR676702

147

R. stimulans AE11 KJ586399 KJ586488 ------KR676937 KR608396 ------KR676703 R. stimulans AE12 KJ586400 KJ586489 ------KR676938 KR608397 ------KR676704 R. stimulans AR33 JQ807105 KJ586490 ------KR676939 ------R. stimulans AU03 KJ586401 KJ586491 ------R. stimulans AZ10 JQ807109 KJ586492 ------R. stimulans AZ11 JQ807103 KJ586493 ------R. stimulans AZ45 JQ807110 KJ586494 ------R. stimulans AZ46 JQ807111 KJ586495 ------R. stimulans AZ60 JQ807112 KJ586496 ------KR676940 KR608398 ------R. stimulans BA27 JQ807108 KJ586497 ------KR676941 KR608399 ------R. stimulans D45 KJ586402 KJ586498 ------R. stimulans E07 GQ223320 GQ223320 GQ223291 KR676987 ------KR608400 ------Ravinia vagabunda BA15 JQ807106 KJ586499 ------KR676942 KR608401 KR676811 ------KR676758 KR676860 ------R. vagabunda BA17 JQ807107 KJ586500 ------KR676977 KR676943 KR608402 ------KR676705

148

Table 4.2. Gene information in multilocus analyses. Name of locus is given with length

(bp), number of sequences (# Seq), number of segregating sites (S), number of

parsimony informative sites (PI), and nucleotide diversity for Ravinia.

Locus bp # Seq. S PI π Mitochondrial DNA COI 1539 117 69 63 0.06804 COII 700 114 94 83 0.06374 ND4 719 6 101 34 0.06616 ND6 944 7 174 84 0.09938 16S 497 34 24 16 0.01153 Nuclear DNA EF-1a 715 76 8 4 0.01059 G6PD 538 51 84 34 0.02111 ITS2 376 34 12 9 0.02876 PEPCK 401 51 24 16 0.01587 Period 514 47 45 31 0.02235 White 591 74 20 14 0.01428 28s 654 83 2 2 0.00377 Total 8188

149

4.8 FIGURES

Figure 4.1. Sampling map of Ravinia species used for phylogenetic analyses in this study.

Ranges of twelve species are colored according to legend with sampling locations

indicated by stars (see Table S4.1 for specific sampling localities).

150

Figure 4.2. Inferred phylogenetic tree of Ravinia species relationships of North America

using maximum likelihood analysis of 5 mitochondrial genes and 7 nuclear genes.

Support values are given in the following order: Bayesian posterior probabilities

using MrBayes/bootstrap support (%) from ML analysis using RAxML. Shaded

colors represent different Ravinia species. Insert illustrates how the tree topology is

broken up into Sections A and B.

152

Figure 4.3. Wing diagram of a fly indicating the absence or presence of setae on the R1

wing vein (red). Individuals in the Ravinia group lack setae on this wing vein, while

those in the Chaetoravinia group have setae on the R1 wing vein. (Adapted from

McAlpine 1989)

155

Figure 4.4. Species-tree estimation from five mitochondrial (COI, COII, ND4, ND6, 16S) and

seven nuclear loci (EF-1a, G6PD, ITS2, PEPCK, Per, 28S, Wt) using the program

*BEAST. (a) Maximum clade credibility tree generated by TreeAnnotator. Posterior

probabilities are shown above the nodes. (b) Cloudogram of all trees sampled and

visualized by DensiTree.

157

4.9 REFERENCES

Akaike, H. 1973. Information theory and an extension of the maximum likelihood principle.

In: Petrov B.N., Caski, F., editors. Second International Symposium on Information

Theory. Budapest: Akademiai. p. 267-281.

Akaike, H. 1974. A new look at the statistical model identification. IEEE Transactions on

Automatic Control. 19:716-723.

Bouckaert, R.R. 2010. DensiTree: making sense of sets of phylogenetic trees. Bioinformatics

26:1372-1373.

Díaz-Rodríguez, J., Gonçalves, H., Sequeira, F., Sousa-Neves, T., Tejedo, M., Ferrand, N.,

Martínez-Solano, I. 2015. Molecular evidence for cryptic candidate species in Iberian

Pelodytes (Anura, Pelodytidae). Molecular Phylogenetics and Evolution 83:224-241.

DeBry, R.W., Timm, A.E., Dahlem, G.A., Stamper, T. 2010. mtDNA-based identification of

Lucilia cuprina (Wiedemann) and Lucilia sericata (Meigen) (Diptera: Calliphoridae)

in the continental United States. Forensic Science International 202: 102-109.

Drummond, A.J., Ho. S.Y.W., Phillips, M.J., Rambaut, A. 2006. Relaxed phylogenetics and

dating with confidence. PLoS Biology 4:e88.

Drummond, A.J., Suchard, M.A., Xie, D., Rambaut, A. 2012. Bayesian phylogenetics with

BEAUti and BEAST 1.7 Molecular Biology and Evolution 29:1969-1973.

Edgar, R.C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high

throughput. Nucleic Acid Research 32:1792-1797.

Felsenstein, J. 1985. Phylogenies and the comparative method. American Naturalist 125:1-

15.

159

Friedrich, M., Tautz, D. 1997a. An episodic change of rDNA nucleotide substitution rate has

occurred at the time of the emergence of the insect order Diptera. Molecular Biology

and Evolution 14:644-653.

Friedrich, M., Tautz, D. 1997b. Evolution and phylogeny of the Diptera: a molecular

phylogenetic analysis using 28s rDNA sequences. Systematic Biology 46:674-698.

Geyer, C.J. 1991. Markov chain Monte Carlo maximum likelihood. In: Keramides, E.M.,

editor. Computing Science and Statistics: Proceedings of the 23rd Symposium on the

Interface. Fairfax Station, USA, Interface Foundation. p. 156-163.

Gilks, W.R., Roberts, G.O. 1996. Strategies for improving MCMC. In: Gilks, W.R., Richardson,

S., Spiegelhalter, editors. Markov chain Monte Carlo in Practice. London, UK,

Chapman and Hall. p. 89-114.

Giroux, M., Pape, T., Wheeler, T. 2010. Towards a phylogeny of the flesh flies (Diptera:

Sarcophagidae): Morphology and phylogenetic implications of the acrophallus in the

subfamily Sarcophaginae. Zoological Journal of the Linnean Society 158:740-778.

Hebert, P.D.N., Penton, E.H., Burns, J.M., Janzen, D.H., Hallwachs, W. 2004a. Ten species in

one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly

Astraptes fulgerator. Proceedings of the National Academy of Sciences USA

101:14812-14817.

Hebert, P.D.N., Stoeckle, M.Y., Zemlak, T.S., Francis, C.M. 2004b. Identification of birds

through DNA barcodes. PLoS Biology 2:e312.

Heled, J., Drummond, A.J. 2010. Bayesian inference of species trees from multilocus data.

Molecular Biology and Evolution 27:570-580.

160

Kutty, S.N., Bernasconi, M.V., Šifner, F., Meier, R. 2007. Sensitivity analysis, molecular

systematics and natural history evolution of Scathophagidae (Diptera:

Cyclorrhapha: Calyptratae). Cladistics 23:64-83.

Kutty, S.N., Pape, T., Wiegmann, B.M., Meier, R. 2010. Molecular phylogeny of the Calypratae

(Diptera: Cyclorrhapha) with an emphasis on the superfamily Oestroidea and the

position of the Mystacinobiidae and McAlpine’s fly. Systematic Entomology 35:614-

635.

Lanfear, R., Calcott, B., Ho, S.Y.W., Guindon, S. 2012. PartitionFinder: Combined selection of

partitioning schemes and substitution models for phylogenetic analyses. Molecular

Biology and Evolution 29:1695-1701.

Librado, P., Rozas, J. 2009. DnaSP v5: A software for comprehensive analysis of DNA

polymorphism data. Bioinformatics 25:1451-1452.

Lim, G. S., Balke, M., and Meier, R. 2012. Determining species boundaries in a world full of

rarity: singletons, species delimitation methods. Systematic Biology 61:165-169.

Lopes, H.S. 1969. Family Sarcophagidae. In: Papavero, N. A Catalogue of the Diptera of the

Americas South of the United States, Vol. 103. Departamento de Zoologia, Secretaria

da Agricultura, São Paulo. p. 1-88.

Lopes, H.S. 1982. The importance of the mandible and clypeal arch of the first instar larvae

in the classification of the Sarcophagidae (Diptera). Revista Brasileira de Biologia

26:293-326.

Lowenstein, J.H., Amato, G., Kolokotronis, S.-O. 2009. The Real maccoyii: Identifying Tuna

Sushi with DNA Barcodes – Contrasting Characteristic Attributes and Genetic

Distances. PLoS ONE 4: e7866.

161

Maddison, W.P., Maddison, D.R. 2011. ‘Mesquite 2.73: a modular system for evolutionary

analysis.’ Available from [http://mesquiteproject.org].

McAlpine, J.F. (1989) Morphology and Terminology –Adults. In: McAlpine, J.F., Wood, D.M.

editors. Manual of Nearctic Diptera, Vol. 1. Agriculture Canada Research Branch,

Ottawa, CAN. p. 30.

Milne, I., Lindner, D., Bayer, M., Husmeier, D., McGuire, G., Marshall, D.F., Wright, F. 2009.

TOPALi v2: a rich graphical interface for evolutionary analyses of multiple

alignments on HPC clusters and multi-core desktops. Bioinformatics 25:126-127.

Nelson, L.A., Wallman, J.F., Dowton, M. 2007. Using COI barcodes to identify forensically and

medically important blowflies. Medical and Veterinary Entomology 21:44-52.

Pape, T. 1994. The world Blaesoxipha Loew, 1861. (Diptera: Sarcophagidae). Entomologica

Scandinavica Supplement 45:1-247.

Pape, T. 1996. Catalogue of the Sarcophagidae of the World (Insecta: Diptera), Memoirs on

Entomology International, Vol. 8. Associated Publishers, Gainesville, FL.

Pape, T., Dahlem, G.A. 2010. Sarcophagidae (Flesh Flies). In: Brown, B.V., et al., editors.

Manual of Central American Diptera: Volume 2. Ottawa, Ontario, CAN., NRC

Research Press p. 1313-1335.

Rambaut A., Suchard M.A., Xie D., Drummond A.J. 2014. Tracer v1.6, Available

from [http://beast.bio.ed.ac.uk/Tracer].

Roback, S.S. 1954. The evolution and taxonomy of the Sarcophaginae (Diptera,

Sarcophagidae). Illinois Biological Monographs 23:1-181.

Robineau-Desvoidy, J.B. 1863. Histoire naturelle des Dipteres des environs de Paris. Vol. 2.

Paris, France. p. 920.

162

Ronquist, F., Huelsenbeck, J.P. 2003. MrBayes 3: Bayesian phylogenetic inference under

mixed models. Bioinformatics 19:1572-1574.

Saitoh, T., Sugita, N., Someya, S., Iwami, Y., Kobayashi, S., Kamigaichi, H., Higuchi, A., Asai, S.,

Yamamoto, Y., Nishiumi, I. 2015. DNA barcoding reveals 24 distinct lineages as

cryptic bird species candidates in and around the Japanese Archipelago. Molecular

Ecology Resources 15:177-186.

Siebert, S., Backofen, R. 2005. MARNA: multiple alignment and consensus structure

prediction of RNAs based on sequence structure comparisons. Bioinformatics

21:3352-3359.

Song, H., Buhay, J.E., Whiting, M.F., Crandall, K.A. 2008a. Many species in one: DNA

barcoding overestimates the number of species when nuclear mitochondrial

pseudogenes are coamplified. Proceedings of the National Academy of Sciences USA

105:13486-13491.

Song, Z.-K., Wang, X.-Z., Liang, G.Q. 2008b. Species identification of some common

necrophagous flies in Guangdong province, southern China based on the rDNA

internal transcribed spacer 2 (ITS2). Forensic Science International 175: 17-22.

Spillings, B.L., Brooke, B.D., Koekemoer, L.L., Chiphwanya, J., Coetzee, M., Hunt, R.H. 2009. A

new species concealed by Anopheles funestus Giles, a major malaria vector in Africa.

American Journal of Tropical Medicine and Hygiene 81:510-515.

Stamatakis, A. 2014. RAxML Version 8: A tool for Phylogenetic Analysis of Large

Phylogenies. Bioinformatics 30:1312-1313.

163

Stamper, T., Dahlem, G.A., Cookman, C., DeBry, R.W. 2013. Phylogenetic relationships of

flesh flies in the subfamily Sarcophaginae based on three mtDNA fragments

(Diptera: Sarcophagidae). Systematic Entomology 38:35-44.

Sukumaran, J., Holder, M.T. 2010. DendroPy: A Python library for phylogenetic computing.

Bioinformatics 26:1569-1571.

Wiens, J.J., Servedio, M.R. 2000. Species delimitation in systematics: Inferring diagnostic

difference between species. Proceedings of the Royal Society of London B 267:631-

636.

Wong, E.S., Dahlem, G.A., Stamper, T.I., DeBry, R.W. 2015. Discordance between

morphological species identification and mtDNA phylogeny in the flesh fly genus

Ravinia (Diptera: Sarcophagidae). Invertebrate Systematics 29:1-11.

Xiao, J.-H., Wang, N.-X., Li, Y.-W., Murphy, R.W., Wan, D.-G., Niu, L.M., Hu, H.Y., Fu, Y.G.,

Huang, D.W. 2010. Molecular approaches to identify cryptic species and

polymorphic species within a complex community of fig wasps. PLoS ONE 5:e15067.

164

4.10 SUPPLEMENTAL MATERIALS

APPENDIX S4.1. Best Model Selection and Partition Schemes:

Bayesian Inference Analyses (MrBayes)

All Taxa, All Genes: PartitionFinder Input: All genes, all codon position. Greedy algorithm run to determine best-fit model and partitions; resulted in 22 subsets. Scheme lnL = -27455.88508; Scheme AIC = 55823.77016 Subset 1: COI_pos1 = GTR + I + G Subset 2: (COI_pos2, ND4_pos2) = GTR + I + G Subset 3: (COI_pos3, COII_pos3) = GTR + I + G Subset 4: (COII_pos1, ND4_pos1) = GTR + I + G Subset 5: COII_pos2 = HKY + I Subset 6: 28S = GTR + I + G Subset 7: EF1a_pos1 = F81 Subset 8: (EF1a_pos2, PEPCK_pos2, White_pos2) = GTR + I Subset 9: EF1a_pos3 = GTR + G Subset 10: G6PD_pos2 = GTR + I + G Subset 11: G6PD_pos3 = HKY + G Subset 12: (G6PD_pos1) = GTR + I + G Subset 13: ITS2 = HKY + G Subset 14: ND4_pos3 = GTR + G Subset 15: (ND6_pos2, ND6_pos3) = GTR + I + G Subset 16: ND6_pos1 = GTR + I + G Subset 17: PEPCK_pos3 = GTR + G Subset 18: (PEPCK_pos1, White_pos1) = GTR + I + G Subset 19: (Per_pos1, Per_pos2) = GTR + G Subset 20: Per_pos3 = HKY + G Subset 21: White_pos3 = HKY + G Subset 22: 16S = HKY + I

Maximum Likelihood Analyses (RAxML)

All Taxa, All Genes: PartitionFinder Input: All genes, all codon position. Greedy algorithm run to determine best-fit model and partitions; resulted in 19 subsets. Scheme lnL = -27466.96967; Scheme AIC = 55863.93934 Subset 1: (COI_1) = GTR + I + G Subset 2: (COI_pos2, COII_pos2, ND4_pos2) = GTR + I + G Subset 3: (COI_pos3, COII_pos3) = GTR + I + G Subset 4: (COII_pos1, ND4_pos1) = GTR + I + G Subset 5: (28S) = GTR + I + G Subset 6: (EF1a_pos1) = GTR + I + G Subset 7: (EF1a_pos2, PEPCK_pos2, White_pos2) = GTR + I + G 165

Subset 8: (EF1a_pos3) = GTR + I + G Subset 9: (G6PD_pos2) = GTR + I + G Subset 10: (G6PD_pos3) = GTR + I + G Subset 11: (G6PD_pos1) = GTR + I + G Subset 12: (ITS2) = GTR + I + G Subset 13: (ND4_pos3) = GTR + I + G Subset 14: (ND6_pos2, ND6_pos3) = GTR + I + G Subset 15: (16S, ND6_pos1) = GTR + I + G Subset 16: (PEPCK_pos3) = GTR + I + G Subset 17: (PEPCK_pos1, White_pos1) = GTR + I + G Subset 18: (Period_pos1, Period_pos2) = GTR + I + G Subset 19: (Period_pos3, White_pos3) = GTR + I + G

*BEAST Analysis

COICOII: (COI_pos1, COI_pos2, COI_pos3, COII_pos1, COII_pos2, COII_pos3) = GTR + I + G, Scheme lnL = -7857.12668; Scheme AIC = 16184.25336 ND4: (ND4_pos1, ND4_pos2, ND4_pos3) = GTR + G = Scheme lnL -825.56456; 1671.12912 ND6: (ND6_pos1, ND6_pos2, ND6_pos3) = GTR + G, Scheme lnL = -3570.56013; Scheme AIC = 7159.12026 EF1a: (EF1a_pos1, EF1a_pos2, EF1a_pos3) = SYM + I, Scheme lnL = -1436.67395; Scheme AIC = 3335.3479 G6PD: (G6PD_pos1, G6PD_pos2, G6PD_pos3) = TN93 (TrN + I + G), Scheme lnL = - 2167.08558; Scheme AIC = 4798.17116 PEPCK: (PEPCK_pos1, PEPCK_pos2, PEPCK_pos3) = SYM + G, Scheme lnL = - 1289.59556; Scheme AIC = 3041.19112 Period: (Period_pos1, Period_pos2, Period_pos3) = HKY + G, Scheme lnL = - 1902.40222; Scheme AIC = 4264.80444 White: (White_pos1, White_pos2, White_pos3) = GTR + I + G, Scheme lnL = - 2327.89807; Scheme AIC = 5125.79614 ITS2: = GTR + I + G, Scheme lnL = -838.69565; Scheme AIC = 1697.3913 28S: = GTR + I + G, Scheme lnL -1098.16588; Scheme AIC = 2216.33176 16S: = HKY + I, Scheme lnL = -882.14353; Scheme AIC = 1774.28706

166

SUPPLEMENTAL TABLES

Table S4.1. Specimen sex and locality data. Localities are in the USA and reported as

county and state.

Species I.D. Sex Locality Date Collected Blaesoxipha arizona (Pape) AW32 Male Grant County, NM 13-Aug-2007 Blaesoxipha cessator (Aldrich) AW27 Male Grant County, NM 13-Aug-2007 Oxysarcodexia cingarus (Aldrich) AP68 Male Schuyler County, NY 09-June-2007 Oxysarcodexia ventricosa (Wulp) E08 Male Hamilton County, OH 10-July-1999 Ravinia acerba (Walker) AU02 Female Kittson County, MN 05-July-2007 Ravinia n.sp. AE79 Male Siskiyou County, CA 07-June-2006 Ravinia n.sp. AE80 Male Siskiyou County, CA 07-June-2006 Ravinia n.sp. AE81 Male Siskiyou County, CA 07-June-2006 Ravinia n.sp. AF03 Male Siskiyou County, CA 07-June-2006 Ravinia n.sp. AG31 Male Jefferson County, CA 08-June-2006 Ravinia n.sp. AG51 Female Jefferson County, OR 14-June-2006 Ravinia n.sp. AH52 Female Jefferson County, OR 14-June-2006 Ravinia n.sp. AJ18 Male Jefferson County, OR 14-June-2006 Ravinia n.sp. AJ19 Male Jefferson County, OR 14-June-2006 Ravinia n.sp. AK04 Male Jefferson County, OR 14-June-2006 Ravinia n.sp. AK25 Female Jefferson County, OR 14-June-2006 Ravinia n.sp. BA16 Male Grant County, NM 13-Aug-2007 Ravinia anxia (s.s.) AT44 Female Kandiyohi County, MN 06-July-2007 R. anxia (s.s.) AT45 Female Kandiyohi County, MN 06-July-2007 R. anxia (s.s.) AT46 Female Kandiyohi County, MN 06-July-2007 R. anxia (s.s.) AT47 Female Kandiyohi County, MN 06-July-2007 R. anxia (s.s.) AT49 Male Kandiyohi County, MN 06-July-2007 R. anxia (s.s.) AT50 Male Kandiyohi County, MN 06-July-2007 R. anxia (s.s.) AW48 Female Grant County, NM 13-Aug-2007 R. anxia (s.s.) AW70 Male Grant County, NM 13-Aug-2007 Ravinia derelicta (Walker) AA10 Female Columbia County, FL 07-May-2006 R. derelicta (Walker) AA13 Female Columbia County, FL 07-May-2006 R. derelicta (Walker) AA25 Female Liberty County, FL 08-May-2006 R. derelicta (Walker) AA26 Female Liberty County, FL 08-May-2006 R. derelicta (Walker) AA38 Female Madison County, FL 08-May-2006 R. derelicta (Walker) AB05 Female Walton County, FL 09-May-2006 R. derelicta (Walker) AB41 Female Washington County, FL 10-May-2006 R. derelicta (Walker) AC75 Female Highland County, FL 12-May-2006

167

R. derelicta (Walker) AC76 Female Highland County, FL 12-May-2006 R. derelicta (Walker) AC81 Male Highland County, FL 12-May-2006 R. derelicta (Walker) AD01 Male Highland County, FL 12-May-2006 R. derelicta (Walker) AD02 Male Highland County, FL 12-May-2006 R. derelicta (Walker) AD06 Male Highland County, FL 12-May-2006 R. derelicta (Walker) AD07 Female Highland County, FL 12-May-2006 R. derelicta (Walker) AD11 Female Highland County, FL 12-May-2006 R. derelicta (Walker) AD40 Male Martin County, FL 13-May-2006 R. derelicta (Walker) AD42 Male Martin County, FL 13-May-2006 R. derelicta (Walker) AE57 Female Glenn County, CA 07-June-2006 R. derelicta (Walker) AN13 Female Jefferson County, TN 22-May-2006 R. derelicta (Walker) AN18 Male Jefferson County, TN 22-May-2006 R. derelicta (Walker) AT48 Female Kandiyohi County, MN 06-July-2007 R. derelicta (Walker) AZ09 Female Hamilton County, OH 12-June-2008 R. derelicta (Walker) AZ62 Female Bland County, VA 19-June-2008 R. derelicta (Walker) BA29 Male Hamilton County, OH 06-July-2008 R. derelicta (Walker) BA30 Male Hamilton County, OH 06-July-2008 R. errabunda (Wulp) AE51 Male Glenn County, CA 07-June-2006 Ravinia floridensis (Aldrich) AA01 Male Duval County, FL 07-May-2006 R. floridensis (Aldrich) AA03 Male Duval County, FL 07-May-2006 R. floridensis (Aldrich) AA04 Female Duval County, FL 07-May-2006 R. floridensis (Aldrich) AA60 Male Sarasota County, FL 11-May-2006 R. floridensis (Aldrich) AB15 Male Washington County, FL 10-May-2006 R. floridensis (Aldrich) AD31 Male Martin County, FL 13-May-2006 R. floridensis (Aldrich) AD41 Female Martin County, FL 12-May-2006 Ravinia lherminieri (R.-D.) AA36 Male Madison County, FL 08-May-2006 R. lherminieri (R.-D.) AN14 Male Jefferson County, TN 22-May-2006 R. lherminieri (R.-D.) AN15 Male Jefferson County, TN 22-May-2006 R. lherminieri (R.-D.) AN16 Male Jefferson County, TN 22-May-2006 R. lherminieri (R.-D.) AN17 Male Jefferson County, TN 22-May-2006 R. lherminieri (R.-D.) AQ58 Female Saratoga County, NY 13-June-2007 R. lherminieri (R.-D.) AZ44 Female Kanawha County, WV 19-June-2008 R. lherminieri (R.-D.) AZ58 Male Kanawha County, WV 19-June-2008 Ravinia planifrons (Aldrich) AE45 Male Klamath County, OR 07-June-2006 R. planifrons (Aldrich) AF05 Male Siskiyou County, CA 07-June-2006 R. planifrons (Aldrich) AF06 Male Siskiyou County, CA 07-June-2006 R. planifrons (Aldrich) AG34 Male Jefferson County, OR 14-June-2006 R. planifrons (Aldrich) AG49 Female Jefferson County, OR 14-June-2006 R. planifrons (Aldrich) AK08 Male Jefferson County, OR 14-June-2006

168

R. planifrons (Aldrich) AK09 Male Jefferson County, OR 14-June-2006 Ravinia pusiola (Wulp) AW64 Female Grant County, NM 13-Aug-2007 R. pusiola (Wulp) AW73 Male Grant County, NM 14-Aug-2007 R. pusiola (Wulp) AX61 Male Grant County, NM 14-Aug-2007 R. pusiola (Wulp) AX62 Male Grant County, NM 14-Aug-2007 R. pusiola (Wulp) AX63 Male Grant County, NM 14-Aug-2007 R. pusiola (Wulp) AY36 Male Grant County, NM 15-Aug-2007 R. pusiola (Wulp) AY38 Female Grant County, NM 15-Aug-2007 R. pusiola (Wulp) AY39 Male Grant County, NM 15-Aug-2007 R. pusiola (Wulp) AZ05 Male Grant County, NM 15-Aug-2007 R. pusiola (Wulp) BA18 Male Grant County, NM 13-Aug-2007 R. pusiola (Wulp) BA19 Male Grant County, NM 13-Aug-2007 R. querula (s.s.) AF16 Male Clackamas County, OR 09-June-2006 R. querula (s.s.) AG28 Male Jefferson County, OR 08-June-2006 R. querula (s.s.) AG30 Male Jefferson County, OR 08-June-2006 R. querula (s.s.) AG32 Male Jefferson County, OR 08-June-2006 R. querula (s.s.) AG33 Male Jefferson County, OR 08-June-2006 R. querula (s.s.) AH42 Male Crook County, OR 08-June-2006 R. querula (s.s.) AH51 Female Crook County, OR 08-June-2006 R. querula (s.s.) AJ23 Female Jefferson County, OR 14-June-2006 R. querula (s.s.) AK05 Male Jefferson County, OR 14-June-2006 R. querula (s.s.) AL32 Male Rabun County, OR 23-May-2006 R. querula (s.s.) AS25 Female Marathon County, WI 02-July-2007 R. querula (s.s.) AW30 Male Grant County, NM 13-Aug-2007 R. querula (s.s.) AZ61 Male Bland County, VA 19-June-2008 R. querula (s.s.) BI26 Male Utah County, UT 05-June-2011 R. querula (s.s.) BI29 Male Utah County, UT 05-June-2011 R. querula (s.s.) BI46 Male Utah County, UT 05-June-2011 R. querula (s.s.) E56 Male Washington County, TN 22-May-2004 Ravinia stimulans (Walker) AD60 Female Volusia County, FL 14-May-2006 R. stimulans (Walker) AD64 Male Volusia County, FL 14-May-2006 R. stimulans (Walker) AD65 Male Volusia County, FL 14-May-2006 R. stimulans (Walker) AD66 Male Volusia County, FL 14-May-2006 R. stimulans (Walker) AD67 Male Volusia County, FL 14-May-2006 R. stimulans (Walker) AD68 Male Volusia County, FL 14-May-2006 R. stimulans (Walker) AE11 Female Newberry County, SC 23-May-2006 R. stimulans (Walker) AE12 Male Newberry County, SC 23-May-2006 R. stimulans (Walker) AR33 Male Saratoga County, NY 13-June-2007 R. stimulans (Walker) AU03 Female Kittson County, MN 05-July-2007

169

R. stimulans (Walker) AZ10 Female Hamilton County, OH 15-June-2008 R. stimulans (Walker) AZ11 Female Hamilton County, OH 15-June-2008 R. stimulans (Walker) AZ45 Female Kanawha County, WV 19-June-2008 R. stimulans (Walker) AZ46 Female Kanawha County, WV 19-June-2008 R. stimulans (Walker) AZ60 Female Bland County, VA 19-June-2008 R. stimulans (Walker) BA27 Male Hamilton County, OH 06-July-2008 R. stimulans (Walker) D45 Female Hocking County, OH 03-July-2003 R. stimulans (Walker) E07 Male Hamilton County, OH 11-July-2003 Ravinia vagabunda (Wulp) BA15 Male Grant County, NM 13-Aug-2007 R. vagabunda (Wulp) BA17 Male Grant County, NM 13-Aug-2007

170

SUPPLEMENTAL FIGURES

Figure S4.1. Species-tree estimation from seven nuclear loci (EF-1a, G6PD, ITS2, PEPCK,

Per, 28S, Wt) using the program *BEAST. Maximum clade credibility tree generated

by TreeAnnotator. Posterior probabilities are shown above the nodes.

171

Chapter Five:

General Conclusions

173

The research that was conducted in this dissertation was motivated by a desire to explore the evolution of species, the processes involved in speciation, and aspects of species delimitation using molecular methods. The flesh fly genus Ravinia provided a model system for exploration in species delimitation using DNA-based techniques. This study system was chosen because of its broad geographic range, importance in other fields

(i.e., agriculture), presence of longstanding taxonomic disagreement, and ambiguous species limits based on morphology (Wong et al. 2015). Within Ravinia several species exhibit limited amounts of diagnostic characters resulting in potential problems when identifying specimens to species using morphology (Wong et al. 2015). The work presented within this dissertation is unique in several aspects. This dissertation provides the first phylogeny for members of the genus Ravinia (Wong et al. 2015; c.f. Chapter 4). In addition, this research is the first to apply coalescent-based species-tree methods to members of the flesh fly family Sarcophagidae using a multilocus data set (c.f. Chapter 3). This research is unique in terms of its integrative molecular approach to species delimitation using migration, population structure, phylogenetic inference, and coalescent-based analyses within Sarcophagidae (c.f. Chapter 3).

The first study (Chapter 2) presents a mitochondrial phylogeny of the genus Ravinia, using the loci: cytochrome oxidase subunits I and II. This is the first phylogenetic study ever conducted within the genus Ravinia. Previous studies that have included members of this genus have all been focused on relationships above the species level (Lopes 1982; Pape

1994; Giroux et al. 2010). In Chapter 2, the genetic lineages that are recovered from the mitochondrial (mtDNA) phylogeny are compared with the morphologically identified species. The purpose of this comparison was to determine if there was discordance

174

between the mtDNA phylogeny and the morphologically identified species groups. If discordance was found to exist among species relationships this would provide an impetus to pursue DNA-based methods of species delimitation within Ravinia. Several paraphyletic relationships were found to be present among species within this genus. This discordance between the mtDNA phylogeny and the morphological characters could have been a result of incomplete lineage sorting, migration and gene flow, cryptic species, or a combination of the three. However, methods based on a single locus could not differentiate these causes.

As such, a species delimitation study using a multilocus data set would need to be conducted to determine the cause(s) of this discordance. High support was found for the groups Chaetoravinia and Ravinia which had previously been separate genera before their collapse into the currently recognized genus Ravinia (sensu lato). The high nodal support of the clades along with morphological characters (i.e., R1 wing vein setae) indicates that these groups are likely naturally occurring and may deserve recognition as separate genera. However, a formal change in taxonomic status should await further examination of specimens believed to be part of the previously synonymized genus Adinoravinia.

In the second study (Chapter 3) two species within the genus Ravinia were delimited: Ravinia anxia (sensu lato) and R. querula (sensu lato). These two species were selected based on evidence of conflict between the mtDNA phylogeny and the morphologically identified species-group. In addition, previous taxonomic work had indicated high morphological variability within R. anxia (sensu lato) and designated this group as a species complex (Dahlem 1989). A total of 7 molecular markers (2 mtDNA loci and 5 nuclear loci) were analyzed using a combination of maximum likelihood and

Bayesian inference, migration analyses, population structure analyses, and coalescent-

175

based species-tree analyses. Across all methods the species scenario of R. anxia (sensu lato) and R. querula (sensu lato), delimited using the current morphological species definitions, was the least supported. This provides evidence that the current species delimitation is not accurate for these two species-groups. In general, the species delimitation scenarios received varying support across methods, with the most highly supported being the 3- species scenario. Furthermore, the incongruence of species delimitation scenarios between the methods used could be indicative of important methodological assumptions (i.e., migration/gene flow) being violated (Carstens et al. 2013). In systems like Ravinia, with high levels of genetic diversity and low levels of morphological variation, the accuracy of a species delimitation study will be dependent on both the data that is available and the choices that are made by the researchers (Satler et al. 2013). Given the support of the species delimitation analyses we recognize the three lineages recovered as evolutionary independent and, as such, distinct species. However, the distinction was only found to be present within molecular data and additional evidence using other techniques (e.g., morphology) would strengthen the status of these species. To prevent the use of nomina nuda the authors do not revise taxonomic standing; deferring to regulations regarding the naming of species in the International Code of Zoological Nomenclature (Articles 11-20;

1999). Until future research can produce morphological descriptions the authors suggested these conventions be used when referring to these genetically delimited species:

Ravinia anxia (sensu stricto) = individuals comprising Clade 1; Ravinia querula (sensu stricto) = individuals comprising Clade 2; Ravinia n.sp. = individuals comprising Clade 3 (c.f.

Chapter 3).

176

As presented in the third study (Chapter 4), the phylogenetic status of the three cryptic species (R. anxia (sensu stricto), R. querula (sensu stricto), Ravinia n.sp.), and the status of the two species Ravinia floridensis and Ravinia lherminieri was investigated using

12 molecular markers (5 mtDNA + 7 nuDNA loci). These markers strongly supported the distinctiveness of the three cryptic species as monophyletic lineages in the Bayesian inference, maximum likelihood, and species-tree analyses. This study was unique in that it was the first to employ a coalescent-based species-tree approach within the flesh fly family

Sarcophagidae using a large data set. In addition, the previously synonymized groups

Ravinia and Chaetoravinia were found to be monophyletic and could merit recognition as genera. Further studies using members from the Adinoravinia lineage will be needed to determine if a change in taxonomic status is necessary for these groups.

The work presented by this dissertation is not complete. More research will be needed, specifically detailed descriptions of the morphological characters, for the cryptic species to be recognized under the International Coded of Zoological Nomenclature.

However, this research does provide a framework from which to group and compare the cryptic species in order to identify these morphological characters. This dissertation shows the impact that gene flow and incomplete lineage sorting (ILS) can have upon the speciation process. This highlights the importance of taking these events (i.e., gene flow and ILS) into account when delimiting species and why using an independent, multilocus data set is critical.

177

REFERENCES

Carstens B.C., Pelletier T.A., Reid N.M., Satler J.D. 2013. How to fail at species delimitation.

Molecular Ecology. 22:4369-4383.

Dahlem GA. “A Revision of the Genus Ravinia (Diptera: Sarcophagidae). Ph.D. Dissertation,

Michigan State University, 1989.

Giroux, M., Pape, T., Wheeler, T. 2010. Towards a phylogeny of the flesh flies (Diptera:

Sarcophagidae): Morphology and phylogenetic implications of the acrophallus in the

subfamily Sarcophaginae. Zoological Journal of the Linnean Society. 158:740-778.

ICZN (International Commission on Zoological Nomenclature) (1999) International code of

zoological nomenclature, 4th edn. London, UK: International Trust for Zoological

Nomenclature.

Lopes, H.S. 1982. The importance of the mandible and clypeal arch of the first instar larvae

in the classification of the Sarcophagidae (Diptera). Revista Brasileira de Biologia.

26:293-326.

Pape, T. 1994. The world Blaesoxipha Loew, 1861. (Diptera: Sarcophagidae). Entomologica

Scandinavica Supplement 45:1-247.

Satler J.D., Carstens B.C., Hedin M. 2013. Multilocus Species Delimitation in a Complex of

Morphologically Conserved Trapdoor Spiders (Mygalomorphae, Antrodiaetidae,

Aliatypus). Systematic Biology. 62:805-823.

Wong, E.S., Dahlem, G.A., Stamper, T.I., DeBry, R.W. 2015. Discordance between

morphological species identification and mtDNA phylogeny in the flesh fly genus

Ravinia (Diptera: Sarcophagidae). Invertebrate Systematics 29:1-11.

178