
Global and local population genetics of the Mediterranean fruit fly, Ceratitis capitata, an invasive pest of fruit crops

Maria Belen Arias Mella

A thesis submitted for the degree of Doctor of Philosophy

Department of Life Sciences Imperial College London U.K.

September 2018

Abstract

Invasive species are recognised as one of the most important growing threats to food biosecurity, causing significant economic losses in agricultural systems. Despite their damaging effects, they are attractive models for the study of evolution and adaptation in newly colonised environments. Currently, global climate change represents one of the key potential stressors on food biosecurity because of its influence on the distribution and abundance of agricultural pests. The tephritid fruit flies (Diptera: Tephritidae) include some of the most successful invaders and most devastating agricultural pests recognised worldwide. Among them, the Mediterranean fruit fly and the South American fruit flies in the genus Anastrepha are particularly important for crop production. Insecticides have been used extensively for their control. This thesis investigates factors that are related to invasiveness in these species, in order to provide novel information that will ultimately improve management and control methods. First, environmental niche modelling was used to determine the influence of climate change on the potential habitat distribution of C. capitata, predicting both poleward expansion and greater connectivity. Next, historical global dispersal patterns of the medfly over the past two centuries were investigated using molecular and genetic approaches. In Chapter 4, different approaches to detecting the G328A point mutation in the Ccace2 gene, which confers resistance to insecticides, were assessed at local and intercontinental scales, in part by studying museum specimens collected before and after the introduction of pesticides. Additionally, to improve genetic knowledge of these invasive species, the mitogenomes of different species of Anastrepha were sequenced and analysed together with those of other tephritids.
This investigation provides crucial information revealing the evolutionary factors that influence the medfly’s successful invasions and will contribute to the development of evidence-driven pest management protocols, especially in the Americas, including the choice among different control methods as well as the establishment of quarantine procedures to interrupt colonisation routes.


Table of Contents

List of Figures
List of Tables
Declaration of Originality
Copyright Declaration
Acknowledgements

CHAPTER 1 General Introduction

CHAPTER 2 The impact of climate change on the potential distribution of the agricultural pest Ceratitis capitata: implication for pest control management
2.1 Introduction
2.2 Data and Methods
2.3 Results
2.4 Discussion

CHAPTER 3 Population genetics and migration pathways of the Mediterranean fruit fly Ceratitis capitata inferred with coalescent methods
3.1 Introduction
3.2 Material and Methods
3.3 Results
3.4 Discussion

CHAPTER 4 Molecular approach to insecticide resistance in museum specimens and modern natural populations of the medfly: looking for Ccace2 point mutation
4.1 Introduction
4.2 Material and Methods
4.3 Results
4.4 Discussion

CHAPTER 5 Complete mitochondrial genome and molecular phylogeny of three species of Anastrepha
5.1 Introduction
5.2 Material and Methods
5.3 Results
5.4 Discussion

CHAPTER 6 General Discussion

REFERENCES

APPENDICES
APPENDIX I: Chapter 2 Supplementary Materials
APPENDIX II: Chapter 3 Supplementary Materials
APPENDIX III: Chapter 4 Supplementary Materials
APPENDIX IV: Chapter 5 Supplementary Materials


List of Figures

Figure 2.1 Global distribution data of the Mediterranean fruit fly Ceratitis capitata used to build the MaxEnt models.

Figure 2.2 The Receiver Operating Characteristic (ROC) curve for the medfly training data, with the area under the ROC curve (AUC).

Figure 2.3 Potential distribution of the medfly.

Figure 2.4 Potential distribution of the medfly at worldwide and regional scales under two RCP emissions scenarios for future climate conditions in 2050.

Figure 2.5 Potential distribution of the medfly at worldwide and regional scales under two RCP emissions scenarios for future climate conditions in 2070.

Figure 3.1 Median-joining network of the Mediterranean fruit fly.

Figure 3.2 Distribution of COI haplotypes across the study area for Ceratitis capitata.

Figure 3.3 Bayesian skyline plot (BSP) estimate of medfly demographic history for the biogeographic regions.

Figure 3.4 Values of theta and migration rates between the biogeographical regions.

Figure 3.5 Hypothesised historical worldwide migration paths of C. capitata using a molecular rate of 4.2%.

Figure 4.1 Cumulative number of species with reported insecticide resistance through time.

Figure 4.2 Molecular map of the Ache gene of the medfly.

Figure 4.3 Diagram representing the genetic variants described in the Ccace gene.

Figure 4.4 Map of study areas in Colombia.

Figure 4.5 Alignment of Ccace2 gene fragments and their respective genomic mapping to C. capitata.

Figure 4.6 Mutation in the Ccace2 gene.

Figure 4.7 Frequency of the Ccace2 alleles across study sites in the pre-organophosphate (Pre-OP) period.

Figure 4.8 Frequency of the Ccace2 alleles across study sites in the Post-OP period.

Figure 4.9 Frequency of the Ccace2 alleles across study sites in the Modern period.

Figure 5.1 Mitochondrial genome map of Anastrepha fraterculus.

Figure 5.2 Mitochondrial genome map of Anastrepha striata.

Figure 5.3 Mitochondrial genome map of Anastrepha distincta.

Figure 5.4 Atypical tRNA cloverleaf structure of trnS1(gct) found in the three species studied.

Figure 5.5 Phylogenetic tree of the family Tephritidae based on mitochondrial genomes.


List of Tables

Table 1.1 Chronological records of the worldwide colonisation of C. capitata.

Table 3.1 Collection and sample size for Ceratitis capitata included in this study.

Table 3.2 Population genetic diversity indices and neutrality test statistics for C. capitata.

Table 3.3 Pairwise Fst values between 13 populations of C. capitata.

Table 4.1 Total number of clusters, species identification and number of reads obtained.

Table 4.2 Allele frequency and Hardy-Weinberg equilibrium at the Ccace2 locus in C. capitata population groups based on their insecticide treatment history.

Table 4.3 Allele frequency and Hardy-Weinberg equilibrium at the Ccace2 locus in Ceratitis capitata populations collected in Colombia and Kenya.

Table 5.1 Nucleotide composition of the three Anastrepha mitogenomes.

Table 5.2 Characteristics of the 13 protein-coding genes in the three species of Anastrepha.


Declaration of Originality

I confirm that this thesis comprises my research design, analysis and discussions. Any assistance received is acknowledged here and, where appropriate, throughout the thesis.

In Chapter 2, the digitised collection of C. capitata at the Natural History Museum of London (BMNH) was used to create the occurrence baseline. The digitisation project was directed by Sandy Knapp. Georeferencing was carried out by Malcolm Penn, who also gave me access to an ArcGIS licence to produce the maps. The design and planning of this chapter were discussed with Paula Arribas. In Chapter 3, Samia Elfekih donated samples of C. capitata for the molecular analysis; she also provided sequences to edit in this study. I am also indebted to all of the individuals and institutions who provided samples for this chapter, particularly the following: Norma Nolazco (SENASA, Peru), Jose Luis Encina (Universidad de Murcia, Spain), Beatriz Sabater-Muñoz (IVIA, Spain) and Nancy Carrejo (Universidad del Valle, Colombia). The appropriate analyses for this chapter were discussed with Ana Riesgo and Sergi Taboaba. In Chapter 4, Daniel Whitmore (BMNH, Curator) assisted with the abdomen dissection of museum specimens. In Chapter 5, the Colombia fieldwork planning, data collection and morphological identification of specimens were conducted in collaboration with Nancy Carrejo and the undergraduate student Monica Hernandez, who based her dissertation on the results of this fieldwork.

All molecular processing (except Illumina library preparation and sequencing), analysis, creation of figures and writing of the chapters were carried out by me. In Chapter 5, Darren Yeo aided with the assembly of the Anastrepha mitochondrial genomes.

I would like to acknowledge Kirsten Miller at the Swedish University of Agricultural Sciences, Darren Yeo at the National University of Singapore, Paula Arribas at the IPNA-CSIC, Spain and Ginez Gonzalez at the University of Cambridge for their comments and suggestions on Chapters 2-6.

Copyright Declaration

‘The copyright of this thesis rests with the author and is made available under a Creative Commons Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or transmit the thesis on the condition that they attribute it, that they do not use it for commercial purposes and that they do not alter, transform or build upon it. For any reuse or redistribution, researchers must make clear to others the licence terms of this work’


Acknowledgements

I would like to dedicate my thesis to Ginez Gonzalez for his unconditional and tremendous support; I am eternally grateful.

My gratitude goes to all the current and past members of The Beetle Lab, particularly Kirsten Miller, Paula Arribas, Carmelo Andujar, Benjamin Linar, Alex Crampton-Platt, Martjin Timmermans, Samia Elfekih, Thomas Creedy, Hannah Norman, Angellina Ceballos, Darren Yeo, Mizan, Beulah... and of course to Alfried Vogler for his assistance and encouragement during all these years. Many thanks for your support, motivation and constructive criticism. I would also like to thank those who have provided massive support in the molecular lab at the museum: Steve, Elena, Andie, Ranbir, Jackie and Katty, thanks for your great support.

This thesis would not have been possible without the friendship of Susy Echeverria and Nadia Santodomingo and the great support in the lab and over many lunches and after-work drinks, including the fantastic Riesgo's Lab: Ana, Sergy, Vassia, Carlos, Nathan and also Dani and Katia; many thanks for your friendship and enormous technical support. My amazing collaborators in Colombia, Monica and Nancy, deserve massive thanks as well.

Finally, I would like to express my profound gratitude to my whole family, especially my mom, dad, sister and brother-in-law, because they are always with me despite the long distance.

CHAPTER 1

General Introduction

Biological invasions and Invasive species

Biological invasions have always been part of natural processes. However, over the last century globalisation and international trade have increased the transportation of species outside their natural ranges; humans have therefore become active global dispersal agents as well as efficient facilitators of the establishment of invasive species (Blackburn et al. 2011; Karsten et al. 2013). Non-native species are introduced into areas where they did not evolve, and so face ecosystem conditions that may differ strongly from those of their native ranges. Despite that, these species can overcome these barriers and invade successfully (Lee 2002). Invasive species are among the most important growing threats to global biodiversity, ecosystem function, the economy and human health (Mack et al. 2000). Nevertheless, they offer a particular opportunity to study evolution and adaptation in environments entirely different from their ancestral habitats (Diamantidis et al. 2008; Vogel et al. 2010). Invasions are usually founded by few individuals, in other words by a small fraction of the genetic variation of the source population (Dlugosch & Parker 2008). These individuals are then exposed to founder effects, which are associated with bottlenecks and a subsequent reduction in genetic diversity (Villablanca et al. 1998). How, then, do invasive species, particularly those whose genetic variation has been depleted, persist and spread under new environmental conditions?
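The loss of diversity that follows a founder event can be illustrated with the standard Wright-Fisher expectation for neutral loci, in which expected heterozygosity decays by a factor of (1 - 1/(2N)) per generation in a population of constant effective size N. The sketch below is illustrative only: the function name, the starting heterozygosity and the population sizes are assumptions chosen for the example and are not values taken from this thesis.

```python
def expected_heterozygosity(h0, effective_size, generations):
    """Expected heterozygosity after drift in an idealised population.

    Uses the textbook Wright-Fisher result H_t = H_0 * (1 - 1/(2N))**t,
    assuming a neutral locus, constant effective size N, no mutation
    and no migration. All inputs are hypothetical illustration values.
    """
    if effective_size <= 0:
        raise ValueError("effective_size must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return h0 * (1.0 - 1.0 / (2.0 * effective_size)) ** generations


if __name__ == "__main__":
    # Compare a severe bottleneck with progressively larger founder groups.
    for n in (10, 100, 1000):
        h = expected_heterozygosity(0.5, n, 20)
        print(f"Ne = {n:>4}: expected H after 20 generations = {h:.3f}")
```

Under these assumptions a small founder group loses diversity far faster than a large one, which is why the persistence of invaders with depleted variation is a genuine puzzle rather than an expected outcome.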

Some authors suggest that the genetic composition of founder populations determines their adaptive capacity in new regions. High genetic variability has been positively correlated with invasive success. However, some invaders are able to adapt swiftly to new conditions despite their low genetic diversity. In other cases, non-native species may show high genetic diversity even when they suffer frequent bottlenecks. For this reason, the study of these species is relevant in order to expose their adaptive genetic potential (Dlugosch & Parker 2008). The initial genetic structure of a successful invasive population depends on factors such as the effective population size of the introduction event and the genetic diversity of the source population. Given this, how does variation in the genome enable species to adapt to diverse environments? Unfortunately, studies of the molecular genetic structure of invasive species are too few to explain how alien species succeed in establishing and spreading. To fill this gap, studies of the origin and genetic composition of invasive species are necessary. Such studies would investigate phenomena such as invasion pathways, migration patterns, bottlenecks following founder events, and expansion processes once the species settles in a place.

Arthropods, especially the Insecta, are considered an understudied group in invasion ecology, with a strong geographical bias affecting Africa and Asia (Pysek et al. 2008). Some of the most successful invaders at a worldwide scale are species that can cause significant losses to agriculture and can adversely affect food security (Paini et al. 2016). As a consequence, there is increasing concern to understand how these invasive species affect newly colonised regions.

Tephritidae family members

Some of the most successful agricultural pests worldwide are members of the Tephritidae or “true fruit flies”. The family comprises around 4900 species in 500 different genera (Norrbom 2010; White & Elson-Harris 1992); approximately 1500 species are associated with fleshy fruits, more than 250 of which are of global economic importance. The most notable genera are Anastrepha (Schiner), Ceratitis (MacLeay), Dacus (Fabricius), Rhagoletis (Loew) and Toxotrypana (Gerstaecker), which are globally recognised as among the most harmful pests for agriculture (Aluja 1999; White & Elson-Harris 1992). They inflict direct damage through the destruction of the fruit pulp during larval feeding, and consequently the infested fruit falls prematurely. The introduction of bacteria and fungi during oviposition on the fruit causes additional economic damage. Several species are considered quarantine pests, and therefore countries enforce strict measures for their containment (Qin et al. 2015).

Ceratitis capitata

Possibly the most notorious species is Ceratitis capitata (Wiedemann) (De Meyer et al. 2008; Malacrida et al. 2007) because of its worldwide distribution. This species, known as the Mediterranean fruit fly or medfly, is polyphagous, with more than 260 different hosts. The colonisation process of the medfly is well documented (Malacrida et al. 2007). Global genetic studies and historically dated colonisation events support a division of the colonisation process into three stages. First, the ancestral medfly population was located in sub-Saharan Africa. Second, the species invaded the Mediterranean basin, where it was reported in the second half of the nineteenth century. Finally, a third movement into Latin America and the Pacific is recognised as a recent expansion, dated to the beginning of the twentieth century (Table 1.1) (Gasperi et al. 2002).


Table 1.1 Chronological records of the worldwide colonisation of C. capitata (Malacrida et al. 1998)

Geographic region            Country            Earliest record       Reference
Africa                       South East Africa  putative source area  Fletcher 1989
                             South Africa       1889                  Back and Pemberton 1918
Extra-Mediterranean islands  Canary             early 1800            Fimiani 1989
                             Madeira            1829                  Fimiani 1989
Mediterranean area           Spain              1842                  Fimiani 1989
                             Algeria            1850                  Fimiani 1989
                             Tunisia            1855                  Fimiani 1989
                             South Italy        1863                  Fimiani 1989
                             France             1885                  Fimiani 1989
                             Portugal           1898                  Fimiani 1989
                             Israel             end 1800              Fimiani 1989
                             Turkey             1904                  Fimiani 1989
                             Greece             1915                  Fimiani 1989
                             ex-Yugoslavia      1947                  Fimiani 1989
Latin America                Brazil             1905                  Enkerlin et al. 1989
                             Costa Rica         1955                  Enkerlin et al. 1989
                             Nicaragua          1960                  Enkerlin et al. 1989
                             Panama             1963                  Enkerlin et al. 1989
                             Guatemala          1975                  Enkerlin et al. 1989
                             Mexico             1977                  Enkerlin et al. 1989
Pacific                      Hawaii             1910                  Harris 1989
Australia                    Australia          1897                  Hooper and Drew 1989


Overall this process took around 250 years, and therefore medfly colonisation is recognised as a rapid worldwide spread (Gasperi et al. 1991; Malacrida et al. 2007). In Africa, a population from Kenya has shown the highest level of genetic variability compared with several medfly populations around the world (Baruffi et al. 1995; Karsten et al. 2015). As a consequence, the Kenyan population is regarded as ancestral, which is also supported by historical records (Table 1.1).

Population genetic approaches can be used to study medfly invasion biology. At the same time, information on demography and dispersal can be inferred from genetic data in populations that are already established. However, simple interpretations of medfly population genetic structure are complicated by severe departures from a balance between gene flow and genetic drift. In the last 250 years, the range of this species has extended globally, followed by recurrent invasions, which has affected the distribution of genetic diversity both within and among populations. Nevertheless, few studies have attempted to test hypotheses of global medfly colonisation. Moreover, connectivity between recognised established populations has not yet been examined.

Anastrepha sp

Anastrepha Schiner and Toxotrypana Gerstaecker form the largest fruit fly clade within the Neotropical area (Norrbom 2010; Norrbom 2012). Anastrepha includes approximately 300 described species, of which the most important in the Americas are Anastrepha obliqua and Anastrepha fraterculus due to their impact on agricultural crops such as citrus, guava, mango and many others (Aluja & Liedo 1993; Orono et al. 2006). Anastrepha fraterculus has been recognised as a complex of cryptic species (Hernández-Ortiz et al. 2012). Detecting and differentiating morphologically similar species in this group is therefore a challenge. Additionally, genetic information on the different species of Anastrepha is scarce (Lanzavecchia et al. 2014). Ludeña et al. (2010) reported unsuccessful amplification in some Anastrepha specimens when using the common barcode cytochrome c oxidase I (COI). Therefore, increasing the genetic information currently available for this group might help to develop suitable molecular analysis methods.

Pest control methods

Insecticides

The high demand to improve agricultural productivity has triggered the use of chemical insecticides to control pests. These chemicals frequently offer immediate efficacy against the target species, although they are less effective at controlling all life cycle stages. For instance, the main medfly control procedure is based on bait sprays, which can only target the adult stage because eggs and larvae are within fruits. As a result, these stages are protected from pesticide exposure and can survive (Jackson et al. 2013; Suckling et al. 2014). The most frequently used pesticides have been organophosphates (OP) and carbamates (Carbs), which inhibit acetylcholinesterase (AChE), a crucial enzyme in the insect nervous system (Magaña et al. 2008). However, pesticide resistance, as well as growing global concern about the effects of chemical substances on the environment and human health, has triggered the development of new, more environmentally friendly pesticides such as spinosad. In practice, the control of medfly is usually based on continuous and repeated use of the same pesticide (Miller et al. 2010). This method applies constant selective pressure on the target pest population until it develops resistance to the chemical; consequently, the pesticide efficiency decreases. Resistance occurs when an insect population reduces its susceptibility to one or several pesticides, causing control failure (Hojland et al. 2014). One mechanism of resistance in medfly is related to a single point mutation causing a glycine to alanine amino acid change at position 328 of Ccace2, which renders the target site insensitive to pesticides (Elfekih et al. 2014; Magaña et al. 2007; Magaña et al. 2008).

Studies of pesticide resistance are usually based on insect strains selected in the laboratory against a specific chemical. This is an advantageous method for characterising the genes or proteins that confer resistance to pesticides. However, such strains cannot precisely represent insect resistance within natural populations. Additionally, designing experiments to identify pest resistance in agricultural land at a worldwide level is difficult due to the extensive and constant use of several pesticides, which may differ from country to country depending on agricultural policies. Furthermore, the situation can be more complex in tropical areas, where the medfly uses large areas of forest or jungle vegetation near agricultural fields to survive human management (Alaoui et al. 2010).

Resistance to insecticides is a serious economic problem for agriculture; therefore, it is necessary to study the phenomenon at different levels. For instance, it remains unclear how adaptation to survive on toxic plants may aid tolerance of insecticides. Also, the association in natural medfly populations between phenotype (i.e. pesticide resistance) and specific genotype regions is poorly studied. Both represent essential knowledge on the resistance mechanisms of fruit flies that would help to implement or improve existing control management programs (Aïzoun et al. 2013).

Protein Food Baits

Protein food baits are a mixture of an attractive substance and an insecticide, in which the amount of active agent needed to control fruit flies is lower than in insecticide spray methods (Vargas et al. 2015). In wild conditions, protein-deprived flies are attracted by the protein available in these food baits and thereby come into contact with the toxin that kills them (Peck & McQuate 2000; Prokopy et al. 1992). Since the method was introduced, several protein sources have been used, for example hydrolysed corn protein, hydrolysed soybean protein or sugarcane molasses, combined initially with organophosphates and more recently with spinosad or lufenuron (Sciarretta et al. 2018; Stark et al. 2004). This method has been applied successfully in several regions around the world and is recognised as an economical and environmentally friendly alternative to traditional insecticide treatments (Aluja & Liedo 1993; Vargas et al. 2015).

Biological control

The main motivation for establishing biocontrol strategies for fruit flies lies in the worldwide rejection of agrochemicals used in fruit orchards, along with improvements in the mass rearing techniques of parasitoids (Ovruski et al. 2000). Biological control measures against fruit fly pests are based on the use of microorganisms or microbial toxins, parasites and predator species (Stibick 2004).

Natural enemies employed in the biological control of fruit fly larvae and pupae include parasites (Hymenoptera) and predators (Staphylinidae) (Baranowski et al. 1993; Tormos et al. 2018). The performance of parasites and predators as biocontrol agents on fruit fly populations might be affected by their biology (e.g. low fecundity, diapause or impacts on non-target species) and also by the environmental conditions in the region treated. Several surveys have evaluated the performance of fruit fly parasitoids and predators in specific regions such as Argentina, Mexico, Spain and Israel (Argov et al. 2011; Guillén et al. 2002; Ovruski & Schliserman 2012; Tormos et al. 2018). Most of them conclude that biological control is a feasible strategy for the suppression and management of tephritid species. Nonetheless, parasitoids and predators used as the sole method in a specific area are not able to achieve the economic control required to ensure marketable fruit. For this reason, these biological agents should be combined with other methods, ideally in area-wide integrated pest management strategies, to achieve sustainable agricultural production (Flores et al. 2013; Purcell 1998).

Sterile Insect Technique (SIT)

The use of manipulated insects to control pest populations has developed as a result of problems created by the use of insecticides. Rearing sterile insect pests on a large scale and subsequently releasing them into natural populations controls only those insects which pose a threat. Thus, harmless or beneficial species of the same ecosystem are not adversely affected (Dyck et al. 2005). This technique consists of mass rearing, sterilising and releasing sufficient numbers of competitive sterile medflies to overflood the wild population over a significant geographic area. In this technique, only the male contributes to induced sterility, and therefore genetic sexing strains were developed that allow for the selection of males early in the rearing process. The technique requires the production of a sufficient number of healthy and competitive insects that, when released, will mate successfully with wild females.

Economic impact

Several species of tephritid are considered quarantine pests, and therefore countries enforce strict measures for their containment, such as extensive pre- and post-harvest monitoring, significant trade barriers or even importation bans (Aluja & Mangan 2008; Szyniszewska & Tatem 2014). The annual economic losses produced by fruit flies are estimated to be more than US$1 billion worldwide (STDF 2010).

In South America, these flies cause major financial losses. For example, in Brazil, C. capitata causes around US$242 million/year in damages, while in Argentina C. capitata and Anastrepha fraterculus cause losses close to US$90 million/year (Ovruski & Schliserman 2012). Finally, in Peru, the losses are estimated to be around US$25 million/year, representing approximately 20% of the total losses in agricultural production (Ortíz 1999). The economic damage produced by fruit flies in Colombia has not yet been determined, but the fruit-growing sector is known to lose 30 to 40% of national production, rising to 70% in regions where pest management protocols are insufficiently applied (Conpes 2008).


Outline of Thesis.

Genetic reconstruction of invasion history has been a central topic of many research programmes seeking to understand the causes and consequences of invasiveness. In this thesis, I investigate factors that are related to invasiveness in species of the family Tephritidae using genetics and novel statistical approaches. The agricultural and economic significance of these species makes them suitable candidates for these approaches.

Chapter 2: The impact of climate change on the potential distribution of the agricultural pest Ceratitis capitata: implication for pest control management

Chapter 2 aims to assess the feasibility of estimating current habitat suitability for C. capitata at a global scale and to compare it with predictions of habitat suitability under future climate scenarios. The results are discussed in terms of potential changes to the invasion range of the species at a global scale, which can be incorporated into prevention and control methods.

Chapter 3: Population genetics and migration pathways of the Mediterranean fruit fly Ceratitis capitata inferred with coalescent methods

This chapter reports the analysis of different populations distributed worldwide. The current macrogeographic population structure of the medfly is determined and, using Bayesian approaches, the potential migration routes of the species are reconstructed worldwide and at particular local levels, providing new insights for management protocols.

Chapter 4: Molecular approach to insecticide resistance in museum specimens and modern natural populations of the medfly: looking for Ccace2 point mutation

Chapter 4 aims to elucidate the evolutionary genetic mechanism by which ace-based resistance evolved in Ceratitis capitata, using DNA from museum specimens collected before and after the start of the pesticide era. Furthermore, current specimens collected from natural populations in Colombia were incorporated in the analysis.

Chapter 5: Complete mitochondrial genome and molecular phylogeny of three species of Anastrepha

The shortage of genomic information for tephritid species in South America motivated the sequencing of the complete mitochondrial genomes of Anastrepha species of agricultural importance in the region. This chapter aims to provide full mitochondrial genomes which will be useful for detection and species differentiation.

Chapter 6: General Discussion, Future experiments and Concluding remarks


CHAPTER 2

The impact of climate change on the potential distribution of the agricultural pest Ceratitis capitata: implication for pest control management

2.1 Introduction

Global climate change, together with changes in land use, human movement, transport, biotic exchange and agricultural production, represents one of the main factors threatening ecosystem function, human health and food security (Brook et al. 2008; Thuiller 2007). Additionally, climatic constraints and mechanisms of introduction are changing, facilitating the spread and establishment of organisms outside their natural ranges (Hellmann et al. 2008; Parmesan 2006). For invasive insects, limited attention has been given to what they may teach us about evolutionary adaptation to climate change (Moran & Alexander 2014). In the case of agriculturally important pest insects, however, previous research has established that expanded geographic distributions and increased population densities are expected (Bebber et al. 2013; Berzitis et al. 2014; Walther et al. 2009). Nevertheless, species responses are known to be largely idiosyncratic and for this reason should be understood individually (Hill et al. 2016).

Two key factors associated with the impact of climate change need to be emphasised for their direct influence on insect physiology: temperature and precipitation. Temperature affects insect metabolism and development, specifically the onset of diapause, and can strongly influence larval and adult behaviour (e.g. flight) (Chown & Nicolson 2004). Precipitation, meanwhile, has a direct influence on the abundance, growth and susceptibility of the hosts of herbivorous and frugivorous insects (Lantschner et al. 2014). Therefore, there is an increasing need to elucidate how global warming can influence the abundance and distribution of insects on a global scale.

The Tephritidae (Diptera: Tephritidae), or “true fruit flies”, include some of the most destructive agricultural pests in the world; the annual economic losses caused by fruit flies are estimated to exceed US$1 billion worldwide (STDF 2010). Members of this family typically cause major damage to fruits, especially via larval feeding (Papadopoulos et al. 2013). The Mediterranean fruit fly is considered one of the most successful invaders and a major agricultural pest worldwide. It is a widely studied fruit fly because of the tremendous economic damage it causes to the fruit market and its expensive eradication costs (Karsten et al. 2015; Szyniszewska & Tatem 2014).

Global warming is expected to improve conditions for the flies to establish in temperate regions through longer growing seasons, fewer frost days, more heat waves and a greater frequency of warm nights (Papadopoulos et al. 2013). Additionally, it has recently been shown for invasive species such as the bean leaf beetle (Cerotoma trifurcata) and the Colorado potato beetle (Leptinotarsa decemlineata) that climate change and host availability can affect their distribution and invasions on a global scale (Berzitis et al. 2014; Wang et al. 2017).

For C. capitata, several previous attempts have been made to predict the potential geographical distribution using different modelling techniques. De Meyer et al. (2008) predicted the potential geographic distribution of the medfly at a worldwide scale using the Genetic Algorithm for Rule-set Prediction (GARP) and Principal Component Analysis (PCA). Their results suggest that areas of both high and low suitability for the medfly occur broadly throughout the world, which is why the authors consider it an invasive species. The CLIMEX program was used to infer the climatic requirements of the medfly, focusing on seasonal and year-to-year variation in climatic suitability, with an emphasis on Argentina and Australia (Vera et al. 2002). Another study, which aimed to assess the potential distribution of three fruit fly species in China, used two modelling methods, GARP and Maximum Entropy species distribution modelling (MaxEnt). It predicted that C. capitata is the main threat to agricultural food security in China, although to date there are no records of the species in the country. In that study, MaxEnt also outperformed GARP at each test threshold (Li et al. 2009). Additional information expanding the known current distribution of the medfly, and regarding seasonal effects on its potential spread, including through air passenger traffic, has also been incorporated (Szyniszewska & Tatem 2014; Szyniszewska et al. 2016). Despite these efforts to model the current distribution of the medfly, the potential changes in its distributional range in the context of climate change remain unclear.

In this study, new efforts were made to improve previous occurrence databases. These efforts focused on compiling data mainly from the native area of the species, because the non-native areas are already well recorded. The Maximum Entropy (MaxEnt) model was then used to estimate the potential distribution of C. capitata based on its ecological niche requirements. We also evaluated the model’s consistency and the variation in the regions that could be potential habitat for the medfly on a global scale under future climate scenarios.

2.2 Data and Methods

Species Occurrence data

The occurrence database published by Szyniszewska and Tatem (2014) was used as the primary source. This database contained 2328 geolocated medfly references, including a total of 529 unique georeferences distributed across 40 countries. It was updated using open access platforms such as the Global Biodiversity Information Facility (GBIF, http://data.gbif.org), the CABI Invasive Species Compendium (CABI, http://www.cabi.org/isc) and GenBank repositories (details in Material and Methods, Chapter 3). The medfly collection of the Natural History Museum of London (BMNH), which has recently been digitised, was also included. Finally, the data collected in surveys during this study were added to construct the new medfly database, which thus contained current and historical presence-only records of the medfly. Each entry contained the type of source, administrative locality or country, georeferenced location and source of coordinates. For occurrences without exact coordinates, a georeference was created following GBIF protocols or using Google Earth, based on other information included in the source material, such as the name of the location.


Environmental variables

Climate data were obtained from the WorldClim database (v.1.4; http://www.worldclim.org/) (Hijmans et al. 2005). All bioclimatic variables were at a spatial resolution of 5 arc-minutes, which corresponds to approximately 10 x 10 km at the equator. Current bioclimatic conditions were represented by monthly average data from 1950 to 2000. For the global climate projections, which are based on average estimates for 2050 (average of 2041-2060) and 2070 (average of 2061-2080), three global climate models (GCMs) were used: CCSM4 (CC), HadGEM2-ES (HD) and GFDL-ESM2G (GD). These models were selected to give a proper range of precipitation and air temperature changes, rather than to represent the likelihood of future climate change (Warszawski et al. 2014). Each model was run using two representative concentration pathways (RCPs), RCP4.5 and RCP8.5, except the GD model, which was only available for RCP4.5. These two RCPs reflect intermediate and high scenarios of future greenhouse gas concentrations (Pachauri 2015).

In total, 19 bioclimatic variables were considered for the analysis. These variables are relevant in species distribution modelling and are derived from monthly temperature and rainfall values to generate more biologically meaningful variables. However, some of these variables are highly correlated, and for this reason the corrplot package of R v.3.2.0 statistical software (R Core Team 2012) was used to assess collinearity among them. When two or more variables were highly correlated (|r| ≥ 0.8), the variables with lower biological relevance to C. capitata were dropped (Dormann et al. 2012; Elith et al. 2006). Highly correlated variables add little information to the model, so their exclusion helps to obtain a more parsimonious model.
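The collinearity screening described above can be illustrated with a minimal Python sketch. It is not the procedure run in R with corrplot for this thesis: the greedy selection order, the function name `drop_collinear`, the simulated values and the relevance ranking are all assumptions made for illustration. Only the 0.8 cut-off on the absolute Pearson coefficient comes from the text.

```python
import numpy as np

def drop_collinear(values, names, priority, threshold=0.8):
    """Keep variables in order of assumed biological relevance, skipping any
    variable whose |Pearson r| with an already kept variable reaches the threshold."""
    corr = np.corrcoef(values, rowvar=False)  # pairwise Pearson r between columns
    order = sorted(range(len(names)), key=lambda i: priority[names[i]])
    kept = []
    for i in order:
        if all(abs(corr[i, j]) < threshold for j in kept):
            kept.append(i)
    return [names[i] for i in sorted(kept)]

# Simulated data (hypothetical): bio1 is built to be almost collinear with bio5.
rng = np.random.default_rng(0)
temp = rng.normal(size=200)
rain = rng.normal(size=200)
values = np.column_stack([temp, 2.0 * temp + rng.normal(scale=0.1, size=200), rain])
names = ["bio5", "bio1", "bio12"]
priority = {"bio5": 0, "bio12": 1, "bio1": 2}  # hypothetical ranking, lower is more relevant

selected = drop_collinear(values, names, priority)
print(selected)
```

In this toy case the variable ranked as least relevant is the one removed from the correlated pair, which mirrors the rule stated in the text.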


Maximum entropy modelling (MaxEnt)

MaxEnt was employed to model the climatic conditions suitable for the medfly and to predict changes in its distribution in the context of climate change. The model estimates suitable habitat by integrating bioclimatic variables with species locations in a maximum entropy distribution; that is, it seeks the distribution closest to uniform that remains consistent with the constraints imposed by the occurrence data (Elith & Leathwick 2009; Phillips et al. 2006).

MaxEnt is a stable and reliable program, even with incomplete data and small sample sizes. It was developed to use presence-only data, and both continuous and categorical environmental data can be used as input variables (Merow et al. 2013; Phillips et al. 2006). Additionally, it is a well-documented method and is free to download (http://biodiversityinformatics.amnh.org/open_source/maxent/).

Projections of the potential geographic distribution of the medfly were inferred with MaxEnt (v. 3.3.3k). Models were calibrated using 75% of the occurrence data as training data, and the remaining 25% were used as testing data for model validation. The convergence threshold was left at the default (10⁻⁵), the maximum number of iterations was increased to 5000 to allow convergence, and all other parameters were kept at their defaults. Ten thousand global background points were used. Thirty replicates were run to assess the consistency of the modelling, with subsample validation maintained in all runs. The outputs were saved in logistic format to obtain the probability of presence (range 0 to 1). The threshold value used to classify potential habitat areas for the medfly was the maximum training sensitivity plus specificity (Merow et al. 2013). Model results were converted into raster files and visualised in ArcGIS 10.2.2 (ESRI, Redlands, CA, USA, http://www.esri.com/).
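The threshold rule named above can be sketched as follows. This is an illustration of the general idea rather than MaxEnt's internal code: background points are treated as pseudo-absences, every observed score is tried as a candidate threshold, and the function name and the example scores are hypothetical.

```python
import numpy as np

def max_sens_plus_spec_threshold(presence_scores, background_scores):
    """Return the threshold maximising sensitivity + specificity, and that maximum.
    Ties are resolved in favour of the lowest threshold."""
    presence = np.asarray(presence_scores, dtype=float)
    background = np.asarray(background_scores, dtype=float)
    candidates = np.unique(np.concatenate([presence, background]))  # sorted ascending
    best_t, best_sum = None, -1.0
    for t in candidates:
        sensitivity = np.mean(presence >= t)   # presences predicted suitable
        specificity = np.mean(background < t)  # background predicted unsuitable
        if sensitivity + specificity > best_sum:
            best_t, best_sum = float(t), float(sensitivity + specificity)
    return best_t, best_sum

# Hypothetical logistic outputs at training presences and background points.
presence = [0.9, 0.8, 0.7, 0.3]
background = [0.1, 0.2, 0.25, 0.6]
threshold, score = max_sens_plus_spec_threshold(presence, background)
print(threshold, score)
```

Cells with a logistic value at or above the returned threshold would then be treated as potential habitat.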


Model testing

As described above, the accuracy of each model was evaluated by partitioning the occurrence data (75% training and 25% testing). Model performance and the fit of the model to the data were measured using the Receiver Operating Characteristic (ROC) curve. The area under the curve (AUC) is one of the most common statistics used to summarise overall accuracy in modelling. Although AUC can produce misleading measures of model fit, it is considered most appropriate when sampling intensity is high (Lobo et al. 2008; Merow et al. 2013). AUC ranges from 0 to 1, where 0.5 indicates that the prediction is no better than random assignment and 1 indicates a perfect presence-absence prediction. An AUC value higher than 0.75 is considered acceptable (Pearce & Ferrier 2000).
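As an illustration of what the AUC measures, the short sketch below computes it as the probability that a randomly chosen presence point receives a higher score than a randomly chosen background point, with ties counted as one half. The function name and the scores are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def auc(presence_scores, background_scores):
    """AUC as the fraction of (presence, background) pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical scores: 15 of the 16 pairs are ranked correctly.
auc_value = auc([0.9, 0.8, 0.7, 0.3], [0.1, 0.2, 0.25, 0.6])
print(auc_value)
```

A model that scores presences and background points identically yields 0.5 under this definition, which corresponds to random assignment.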

2.3 Results

Occurrence data

The search for historical medfly occurrences resulted in a total of 4360 records from 87 countries. All records included the year of collection; the oldest dates from 1895 and the most recent from 2016, which gives a long temporal range for further analysis. For modelling purposes, all duplicate points were removed, giving a total of 1206 unique occurrence points distributed worldwide (Figure 2.1).
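The removal of duplicate points can be sketched as below. The rounding precision used to decide that two records share a location is an assumption, since the text does not state the rule that was applied, and the coordinates are invented for illustration.

```python
def unique_points(records, decimals=4):
    """Return unique (latitude, longitude) pairs after rounding.
    The rounding precision is an assumed tolerance, not a value from the study."""
    seen = set()
    unique = []
    for lat, lon in records:
        key = (round(lat, decimals), round(lon, decimals))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

# Hypothetical records: the first two share a location.
records = [(-33.9321, 18.8602), (-33.9321, 18.8602), (39.4699, -0.3763)]
points = unique_points(records)
print(len(points))
```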


Figure 2.1 Global distribution data of the Mediterranean fruit fly Ceratitis capitata used to build the MaxEnt models.

The blue dots correspond to the primary database published by Szyniszewska and Tatem (2014), and the red dots represent the newly incorporated occurrence data.


Variable correlation

Of the 19 bioclimatic variables, eight were retained as not highly correlated according to the Pearson correlation coefficient (|r| < 0.80) (Supplementary Figure 2.1). Half of them were related to temperature and the other half to precipitation. The temperature variables used in the model were: Mean diurnal range, Maximum temperature of warmest month, Minimum temperature of coldest month and Temperature annual range. The precipitation variables used in the model were: Annual precipitation, Precipitation of driest month, Precipitation seasonality and Precipitation of coldest quarter.

Model performance

In support of the model performance, the AUC scores for the training and testing datasets for C. capitata were 0.909 and 0.900 respectively, indicating strong algorithm performance (Figure 2.2). The MaxEnt models accurately discriminated between suitable and unsuitable areas for the medfly, with a fractional predicted area of 0.3404. MaxEnt’s default analysis of variable contributions showed the percent predictive contribution of each climate variable used; the higher the contribution, the more impact that variable has on predicting the occurrence of the species. In the medfly models, Precipitation seasonality (coefficient of variation) had the highest contribution to the model (39.7%), followed by Maximum temperature of warmest month (21.5%).


Figure 2.2 The Receiver Operating Characteristics curve (ROC) for training data of medfly with the area under the ROC curve (AUC). The blue line represents mean ± standard deviation. The red line represents AUC, and the black line represents random prediction.

Current and future distributional potential

Habitat suitability is shown in different colours on the model maps (Figure 2.3). The average value of the maximum training sensitivity plus specificity threshold was 0.21. Based on that threshold value, four habitat suitability categories were created: no risk of invasion (0.00-0.21), low risk (0.21-0.46), medium risk (0.46-0.71) and high risk (0.71-1.00).
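The four categories can be applied to logistic output values as in the sketch below. The class boundaries are those given in the text; assigning a value that falls exactly on a boundary to the higher class is an assumption, as the text does not specify it, and the names used are hypothetical.

```python
import numpy as np

RISK_EDGES = [0.21, 0.46, 0.71]  # class boundaries taken from the text
RISK_LABELS = ["no risk", "low risk", "medium risk", "high risk"]

def classify_risk(suitability):
    """Map logistic suitability values (0 to 1) to the four risk categories.
    Values exactly on a boundary go to the higher class (assumed convention)."""
    idx = np.digitize(np.asarray(suitability, dtype=float), RISK_EDGES)
    return [RISK_LABELS[i] for i in idx.tolist()]

labels = classify_risk([0.05, 0.30, 0.50, 0.95])
print(labels)
```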

The model prediction for the current global distribution of the medfly indicated that highly suitable areas included parts of Africa, as expected given that it is the native range of the medfly: the central-eastern area comprising Kenya, Uganda and the central part of Ethiopia, in addition to the southern distribution area in South Africa. The Atlantic island of Saint Helena also includes suitable areas. In the Mediterranean basin, high values were predicted for Portugal, Spain and Italy. In the Americas, high values were predicted for areas in California and Mexico, the south of Ecuador and the north of Peru, Chile, Bolivia and Brazil, in addition to warm coastal areas elsewhere, such as the Persian Gulf and Australia (Figure 2.3). The model also indicated that regions in South-East Asia and Central America are climatically suitable for the medfly.


Figure 2.3 Potential distribution of the medfly. Habitat suitability is represented as follows: white, no risk; yellow, low risk; green, medium risk; red, high-risk regions where medfly presence was predicted. A) Worldwide scale; regional scale: B) North America, C) South America and D) Mediterranean basin.


Model results were developed for two RCP emissions scenarios for future climate conditions at a worldwide level. Four distribution maps of suitable habitat under a range of possible future climate scenarios for 2050 and 2070 are summarised in Figure 2.4 and Figure 2.5 respectively. The MaxEnt predictions were very similar across the different scenarios. Overall, most areas currently suitable for the medfly will remain so in 2050 and 2070 under these climate change scenarios. Closer inspection of the prediction maps for 2050 under RCP 4.5 showed that the highly suitable area increased in Africa and in North and South America. The medium-risk area increased in Colombia, which acted as a corridor for the medfly between Ecuador and Venezuela, whereas under RCP 8.5 the model showed an increased risk for some European areas such as northern France and the southern UK (Figure 2.4).


Figure 2.4 Potential distribution for Medfly at worldwide scale (right side) and regional scale (left side) under two RCP emissions scenarios for the future climate conditions for 2050.


Figure 2.4 (continued). Maps show the mean predicted result for three GCMs, CCSM4 (CC), HadGEM2-ES (HD) and GFDL-ESM2G (GD), modelled under 2050-RCP4.5, and the mean predicted result for two GCMs, CC and HD, modelled under 2050-RCP8.5. The regional-scale maps show North America, South America and the Mediterranean basin in detail, from top to bottom respectively. White represents no-risk regions, yellow low-risk regions, green medium-risk regions and red high-risk regions for invasion by the medfly.
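The averaging of predictions across GCMs mentioned in the figure description can be sketched as a cell-by-cell mean of suitability grids. The 2 x 2 grids and their values below are invented for illustration, and the sketch assumes that all grids share the same extent and resolution.

```python
import numpy as np

def ensemble_mean(grids):
    """Cell-by-cell mean of suitability grids predicted under different GCMs."""
    stack = np.stack([np.asarray(g, dtype=float) for g in grids])
    return stack.mean(axis=0)

# Hypothetical logistic suitability grids for three GCMs under one RCP.
cc = [[0.2, 0.8], [0.4, 0.6]]
hd = [[0.4, 0.6], [0.5, 0.9]]
gd = [[0.6, 0.7], [0.3, 0.3]]
mean_grid = ensemble_mean([cc, hd, gd])
print(mean_grid)
```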

In the prediction map for 2070 under RCP 4.5, results were very similar to those for both RCPs in 2050, with only a slight reduction of medium-risk areas in the Mediterranean basin (Figure 2.5). The prediction map for 2070 under RCP 8.5 showed that the risk of medfly invasion in northern France and the southern UK returned to low, while there was a notable increase in high-risk areas in the northern and central zones of Chile, southern Peru and Bolivia (Figure 2.5).


Figure 2.5 Potential distribution for Medfly at worldwide scale (right side) and regional scale (left side) under two RCP emissions scenarios for the future climate conditions for 2070.


Figure 2.5 (continued). Maps show the mean predicted result for three GCMs, CCSM4 (CC), HadGEM2-ES (HD) and GFDL-ESM2G (GD), modelled under 2070-RCP4.5, and the mean predicted result for two GCMs, CC and HD, modelled under 2070-RCP8.5. The regional-scale maps show North America, South America and the Mediterranean basin in detail, from top to bottom respectively. White represents no-risk regions, yellow low-risk regions, green medium-risk regions and red high-risk regions for invasion by the medfly.

2.4 Discussion

Occurrence and model variables

In this study, the MaxEnt models indicate that the global distribution of suitable habitat for medfly will increase with climate change, although in some areas it is predicted to remain stable or to increase or decrease only slightly. Temperature and precipitation were the most important climatic variables to the models, as they have a profound effect on the C. capitata lifespan, but they also affect the fruit crops that the medfly infests, leaving questions as to how exactly climate change will affect the medfly’s ecosystem.

The occurrence dataset available in this study was expanded more than twofold, from 529 unique georeferenced records in 40 countries to 1206 unique georeferenced records in 87 countries. However, this considerable improvement involves a trade-off in the quality of the records: the current analysis retains only the collection year and sampling site, and loses the possibility of studying month-of-collection data, which was the aim of Szyniszewska and Tatem (2014).


Current model predictions

The map of current potential distribution identified some regions as high-risk that do not have established medfly populations at present, such as California and Chile, where the medfly was eradicated in 1990 and 1995 respectively (Aluja and Liedo 1993, Joint FAO/IAEA 1999). In Eastern Australia, the dominance of the Queensland fruit fly (Bactrocera tryoni) has apparently led to the displacement of the medfly (Dominiak and Daniels 2012), while the lack of medfly occurrence in Southern India and Asia might be due to limited introduction. Despite these unoccupied areas, this map was more conservative and presented a narrower medfly suitability range than that described using GARP models (De Meyer et al. 2008). The differences are surprising given that De Meyer et al. used seven of the eight variables modelled in our maps; nevertheless, few agreements on high-risk habitat for the medfly were found, mostly in Europe. Disagreement between the two modelling methods has been described before in sand flies. In that case, the MaxEnt model was considered to perform best, based on the lower variability of its True Skill Statistic (TSS) outputs (Carvalho et al. 2015). This is supported by various comparisons of modelling algorithms, in which MaxEnt was the best method among those that do not use absence data (Elith et al. 2006). Unexpectedly, this updated potential distribution of the medfly remains very similar to that previously described using MaxEnt by Szyniszewska and Tatem (2014), although the database was greatly enlarged. Discrepancies were found in Australia, Southern Asia and the northern part of America.


Future model predictions

Future projections from our models indicate that by 2050 the range shifts poleward into the northern part of Europe, where suitable medfly habitat increases in France and the UK. Such shifts are expected to occur in insects in the context of global warming (Fu et al. 2014). Nevertheless, these results should be interpreted with caution, because medfly occurrence in this part of Europe is most likely occasional. Thus, the increase in potentially suitable habitat is probably an artefact of the MaxEnt modelling rather than a region of potential expansion for the medfly. Additionally, in the intermountain basin of Colombia, an area of highly concentrated fruit production, the risk of medfly invasion increases to ‘medium’. It is therefore likely that this region will become a corridor between the high-risk areas in South America (Ecuador and Peru) and the Central American and Caribbean regions, which will increase medfly dispersal regionally.

In contrast, the 2070 model under RCP 4.5 presents a local reduction in habitat suitability in some European countries. Such a decrease in climate suitability was described at a global scale in a previous study using the CLIMEX model (Hill et al. 2016). That study modelled 12 tephritid species on a grid of 50 km² cells. A grid of this size usually represents a long-term average across a large area, and much of the variation in a particular variable is not incorporated; even so, the results of Hill et al. (2016) were similar to those found in this MaxEnt prediction.

Also, the 2070 model under RCP 8.5 shows the tendency described above, but the poleward shift was again apparent, especially in the southern part of Peru and in northern Chile. A note of caution is due here, since the medfly remains eradicated in Chile; it is thus crucial to maintain strict pest control protocols and, in particular, to continue the Sterile Insect Technique (SIT) releases in Arica, the northernmost city, which borders Peru and is a quarantine point.

Limitations and future research

Despite a comprehensive dataset of medfly presence points, uncertainties remain in the various output maps. In particular, for northern European countries, including the UK, some inconsistencies were identified in the sample records, which may cause errors that are difficult to track. Additionally, many factors for which global spatial data do not exist can influence medfly presence and abundance. For this reason, potential distribution cannot be predicted exclusively on the basis of climate. It is imperative to consider the distribution of competitor species and the distribution and availability of host plants. Moreover, the limiting factors considered to affect geographical distributions should include geographical features, natural barriers, human activities and the coverage of control methods. The MaxEnt models are limited because they could not incorporate these additional factors, which might have to be considered in further analyses to obtain accurate predictions of species invasion. However, it is also evident that much of the available range is already occupied by this highly invasive species, indicating rapid movement around the world in just over a century since it was first discovered outside its native range in sub-Saharan Africa. Such rapid invasion can be expected for any area at the periphery of its current distribution that newly attains suitable conditions for the survival of this species.


Conclusions

Fruit fly invasions under climate change will challenge global, regional and local food security. Regions such as South America, where fruit production is economically significant, will likely be forced to maintain strict quarantine protocols, particularly when exporting to Northern Hemisphere regions such as Europe, which is itself suitable for invasion. Such a complex response to climate change suggests that predicting and managing future medfly invasions will be a difficult challenge.


CHAPTER 3

Population genetics and migration pathways inference of the Mediterranean fruit fly Ceratitis capitata inferred with coalescent methods

This chapter has been published as Maria Belen Arias, Samia Elfekih & Alfried P. Vogler, in PeerJ.

3.1 Introduction

Globalisation and international trade have increased the transportation of species outside their natural ranges. These human activities assist the spread of exotic species, and pest arrival rates have therefore increased in many countries (Blackburn et al. 2011; Karsten et al. 2013). Invasive species are directly associated with biodiversity losses, changes in ecosystem function and negative impacts on the economy, agriculture and human health (Mack et al. 2000). Despite their many adverse effects, invasive species offer a unique opportunity to study evolution and adaptation in environments entirely different from their ancestral habitats (Diamantidis et al. 2008; Vogel et al. 2010).

The Mediterranean fruit fly, Ceratitis capitata, also known as the medfly, is considered to pose a severe economic threat to agriculture, especially fruit production, because of its broad host range of more than 260 plant species and its worldwide distribution (Malacrida et al. 2007). Chronological records and global studies based on genetic markers suggest that C. capitata populations are subdivided into three groups: an ancestral population in Sub-Saharan Africa, a younger population in the Mediterranean basin, and various recently derived populations in tropical and subtropical America, Australia and Oceania (Gasperi et al. 1991, Malacrida et al. 1992, Malacrida et al. 2007).

Current medfly management approaches vary between countries, although insecticides (baits and full cover sprays) are the predominant method. However, because of the harmful effects of insecticides, the Sterile Insect Technique (SIT), which relies on the release of males subjected to sublethal X-ray irradiation, is becoming increasingly common (Dyck et al. 2005). The successful implementation of pest control strategies using SIT relies on information about possible movements and effective population sizes in the regions under management.

Population genetic studies can be used to understand medfly invasion biology by focusing on the degree of subdivision within and among local regions. Additionally, information on demography and dispersal can be inferred from genetic data. However, the interpretation of these data in the medfly is challenging, as repeated range expansions and invasions, as well as several cases of regional eradication, have affected its distribution and genetic diversity. The medfly population genetic structure and invasion routes have previously been studied using various molecular approaches at local (Bonizzoni et al. 2004; Bonizzoni et al. 2001; Elfekih et al. 2010; Karsten et al. 2013) as well as global level (Bonizzoni et al. 2000; Gasperi et al. 2002; Malacrida et al. 2007). Most of the proposed colonisation routes have been inferred using traditional methods such as genetic distance or Slatkin’s private-allele method (Bonizzoni et al. 2000; Gasperi et al. 2002; Malacrida et al. 1998). Despite these past efforts, the implementation of coalescent methods to investigate medfly invasion has been limited to one study revealing the origin of the medfly in Australia (Bonizzoni et al. 2004), and to another study by Karsten et al. (2015), which used the Approximate Bayesian Computation (ABC) method to show a decrease in genetic diversity outside Africa, the presumed origin of the introduced range described above. Although this study provides invaluable genetic information on medfly colonisation, it provides detailed information on African populations only; the incorporation of new populations, especially from the Palearctic and Neotropical regions, is therefore required to improve current knowledge of medfly dispersal.

In this study, a large-scale phylogeographic analysis was conducted using the cytochrome c oxidase subunit I (COI) gene for a pathway analysis of medfly populations across their distribution range. We aim to determine the current macrogeographic population structure of C. capitata collected from different populations around the globe, and to reconstruct plausible migration routes using Bayesian coalescent approaches.


3.2 Material and Methods

Sample collection

Specimens of C. capitata were collected between 2009 and 2014 from 11 sites across all biogeographic regions where the species occurs (Afrotropical, Palaearctic, Australasian and Neotropical) (Table 3.1). Whole specimens were collected via traps in orchards and reared from infested fruit. All flies were preserved in 80% ethanol at -20°C until tissue was used for DNA extraction.

Sequences representing populations at further sites (Kenya, Ghana, and Iran) were obtained from GenBank using the R package rentrez. These sequences were also used to increase the sample size at collection sites (see Supplementary Table 3.1 for details).

Below, specimens from the same sampling site will be referred to as a population.


Table 3.1 Collection and sample size for Ceratitis capitata included in this study.

Details of sample sites location, name of host plant where the individuals were collected and number of individuals collected at each sample site.

Biogeographic region  Country       Location           Host    Sample size
Afrotropical          South Africa  Stellenbosch       Guava   37
Palearctic            Egypt         -                  -       25
Palearctic            Israel        Gedera             Fig     19
Palearctic            Israel        Lachish            Orange  19
Palearctic            Israel        Ness Ziona         Orange  19
Palearctic            Israel        Neta'im            Guava   19
Palearctic            Israel        Shefayim           Orange  18
Palearctic            Israel        Yad Mordechai      Lemon   15
Palearctic            Tunisia       Bizerte            Orange  1
Palearctic            Greece        Thessaloniki       Apple   20
Palearctic            Greece        Aetolia-Acarnania  Orange  3
Palearctic            Spain         Valencia           Fig     21
Palearctic            Spain         Malaga             Peach   15
Palearctic            Spain         Murcia             Orange  5
Australasian          Australia     Perth              Orange  24
Neotropical           Guatemala     Santa Barbara      Coffee  51
Neotropical           Colombia      Cundinamarca       Peach   18
Neotropical           Colombia      Nariño             Coffee  4
Neotropical           Brazil        Salvador           Guava   11
Neotropical           Peru          Ica                Orange  8


DNA extraction, sequencing, and alignment

After morphological identification of the collected specimens, genomic DNA was extracted from each specimen using the DNeasy Blood & Tissue Spin Column Kit (Qiagen). A fragment of the mitochondrial gene cytochrome c oxidase subunit I (COI) was amplified using the primers LCO1490 (5’-GGTCAACAAATCATAAAGATATTGG-3’) and HCO2198 (5’-TAAACTTCAGGGTGACCAAAAAATCA-3’) (Folmer et al. 1994). PCRs were conducted in a 20 μl reaction volume, with 0.5 μl of genomic DNA, 0.1 mM dNTPs, 0.5 U/µM BIOTAQ DNA Polymerase, 3 mM MgCl2 and 0.3 μM of forward and reverse primers. The PCR program included an initial denaturation step at 94°C for 5 min, followed by 35 cycles of denaturation at 94°C for 30 s, annealing at 51°C for 54 s and extension at 72°C for 54 s, and a final extension at 72°C for 7 min. To confirm that amplification was successful, all PCR products were visualised using GelRed™ (Biotium) on a 1% agarose gel. Clear amplicons were purified and sequenced in both directions on an ABI 3730xl automated sequencer (Applied Biosystems) at the Wolfson Biomolecular Sequencing Facility of the Natural History Museum, London.

Individual sequences were aligned in Geneious software v.7.1.7 (Kearse et al. 2012). Sequences obtained from GenBank were included in this alignment to generate the final COI database.

Genetic diversity and population structure

Levels of genetic diversity were determined using the following parameters: the number of haplotypes (k), number of polymorphic sites (S), haplotype diversity (h) and nucleotide diversity (π), calculated in DnaSP version 5 and ARLEQUIN version 3.5.2.1 (Librado and Rozas 2009, Excoffier and Lischer 2010).
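To make the four diversity parameters concrete, the sketch below computes them for a small set of aligned sequences. It is an illustration and not the DnaSP or ARLEQUIN implementation: it assumes equal-length sequences without gaps or missing data, uses the standard sample-size corrected haplotype diversity and the mean pairwise difference per site for π, and the toy sequences are invented.

```python
from collections import Counter
from itertools import combinations

def diversity_stats(seqs):
    """Return (k, S, h, pi) for aligned, gap-free sequences of equal length."""
    n = len(seqs)
    length = len(seqs[0])
    counts = Counter(seqs)
    k = len(counts)  # number of distinct haplotypes
    # Polymorphic (segregating) sites: columns with more than one state.
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    # Haplotype diversity with the n / (n - 1) sample-size correction.
    h = (n / (n - 1)) * (1.0 - sum((c / n) ** 2 for c in counts.values()))
    # Nucleotide diversity: mean pairwise differences divided by alignment length.
    diffs = [sum(a != b for a, b in zip(x, y)) for x, y in combinations(seqs, 2)]
    pi = sum(diffs) / len(diffs) / length
    return k, S, h, pi

# Toy alignment (hypothetical): four sequences, three haplotypes.
seqs = ["ACGTAC", "ACGTAC", "ACGTTC", "ATGTTC"]
k, S, h, pi = diversity_stats(seqs)
print(k, S, round(h, 4), round(pi, 4))
```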


A median-joining (MJ) network (Bandelt et al. 1999), computed in POPART version 1.7 (Leigh and Bryant 2015), was used to estimate the genealogical relationships among C. capitata haplotypes. Population genetic structure was estimated using pairwise population Fst. Statistical significance was tested using 1,000 permutations in ARLEQUIN version 3.5.2.1.

Demographic inferences

Tajima’s D (Tajima, 1989) and Fu’s Fs (Fu, 1997) tests were performed in ARLEQUIN version 3.5.2.1 to identify deviations from neutral models. Past population dynamics through time for the various C. capitata haplogroups were inferred using the Bayesian skyline plot (BSP) method. Two independent simulations were run using the Hasegawa-Kishino-Yano (HKY) substitution model and an uncorrelated lognormal relaxed molecular clock. Each independent run was performed for 5 x 10⁷ Markov chain Monte Carlo (MCMC) iterations (sampled every 1000 iterations), discarding 10% of the trees as burn-in, as implemented in BEAST v.2.4 (Drummond & Rambaut 2007).
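Of the two neutrality tests, Tajima’s D is simple enough to sketch directly. The function below follows the standard formulation of the statistic (Tajima 1989), comparing the mean number of pairwise differences with the number of segregating sites scaled by a1. It is an illustration, not the ARLEQUIN implementation: it assumes aligned, gap-free sequences, reports no significance test, and uses invented toy alignments.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for aligned, gap-free sequences; NaN when no site is polymorphic."""
    n = len(seqs)
    length = len(seqs[0])
    S = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if S == 0:
        return float("nan")  # the statistic is undefined without segregating sites
    pairs = list(combinations(seqs, 2))
    # Mean number of pairwise differences (not divided by sequence length).
    k_hat = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k_hat - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Toy alignments (hypothetical). The second contains only singleton mutations,
# the pattern expected after a recent expansion, and gives a negative D.
d_balanced = tajimas_d(["ACGTAC", "ACGTAC", "ACGTTC", "ATGTTC"])
d_singletons = tajimas_d(["AAAA", "TAAA", "ATAA", "AATA", "AAAT"])
print(d_balanced, d_singletons)
```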

In addition to the considerable variation in mutation and substitution rates between genes and taxa, there is also a substantial disparity between mutation rates estimated directly from population studies and those inferred by phylogenetic (species level) studies (Ho et al. 2005). To avoid potential bias caused by the transition between the short-term mutation rate and the long-term substitution rate, we compared two molecular rates: the standard invertebrate mitochondrial divergence rate, µ = 1.15 × 10⁻⁸ per year (Brower 1994; Papadopoulou et al. 2010), and the mutation rate based on Drosophila melanogaster laboratory strain estimations, µ = 6.2 × 10⁻⁸ per generation (Haag-Liautard et al. 2008). The latter was used to extrapolate a molecular rate of 4.29 × 10⁻⁷ per year for C. capitata, which has an average of 6.92 generations per year (Diamantidis et al. 2011). Each run was validated in TRACER, ensuring an effective sample size (ESS) of at least 200 for each statistic. The results of the two runs were combined using LogCombiner v. 2.4.5 (Drummond and Rambaut 2007). Finally, the results were visualised as skyline plots of the median using TRACER 1.6.
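The extrapolation of the per-generation rate to a per-year rate described above is a simple multiplication, which can be checked with the short sketch below. The variable and function names are hypothetical; the numerical values are those quoted in the text.

```python
# Values quoted in the text.
MU_PER_GENERATION = 6.2e-8     # D. melanogaster mutation rate per generation
GENERATIONS_PER_YEAR = 6.92    # average number of C. capitata generations per year
BROWER_RATE_PER_YEAR = 1.15e-8 # standard invertebrate mitochondrial rate per year


def per_year_rate(mu_per_generation, generations_per_year):
    """Convert a per-generation rate into a per-year rate."""
    return mu_per_generation * generations_per_year


rate = per_year_rate(MU_PER_GENERATION, GENERATIONS_PER_YEAR)
fold_difference = rate / BROWER_RATE_PER_YEAR
print(f"{rate:.2e}")  # 4.29e-07
```

The two rates compared in the analysis therefore differ by more than an order of magnitude, which is why the choice of rate strongly affects the inferred dates.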

Migration rate estimates

Connectivity was explored with the software LAMARC v.2.1.10, which estimates demographic parameters such as theta (θ), population growth (g) and migration rates (M) (Kuhner 2006). Theta values were estimated as θ = 2µNe, where Ne is the effective population size and µ is the mutation rate per nucleotide per generation (see below for details). Migration rate was estimated as M = m/µ, where m is the probability of immigration per generation and µ is the mutation rate per site per generation. The migration rate was multiplied by the θ value of the corresponding recipient population to obtain the number of migrants per generation (Nm) (Kuhner 2006). The search strategy consisted of five initial and four final chains; Bayesian estimation was used with ten initial chains, a sampling interval of 20 and a burn-in of 1,000 samples per chain. The results were checked for convergence and effective sample size (ESS ≥ 200) in TRACER.
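The conversion from the scaled migration rate to migrants per generation can be sketched as follows. Under the definitions given above, the mutation rate cancels in the product, M × θ = (m/µ)(2µNe) = 2·Ne·m, so the result is on the scale of migrants per generation (the quantity the text labels Nm). The function name and all numerical inputs below are hypothetical and serve only to show the arithmetic; they are not LAMARC output.

```python
import math


def migrants_per_generation(m_scaled, theta_recipient):
    """Multiply the scaled migration rate M by theta of the recipient population."""
    return m_scaled * theta_recipient


# Hypothetical inputs chosen only to illustrate that the mutation rate cancels.
mu = 6.2e-8   # mutation rate per site per generation
ne = 1.0e6    # effective population size of the recipient population
m = 1.0e-5    # immigration probability per generation

theta = 2 * mu * ne   # theta as defined in the text
m_scaled = m / mu     # M as defined in the text
nm = migrants_per_generation(m_scaled, theta)
print(math.isclose(nm, 2 * ne * m))
```

Because µ cancels, the migrants-per-generation estimates do not depend on which of the two molecular rates is assumed.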

To inspect the spread of haplotypes among regions through time, a Bayesian discrete phylogeographic approach was implemented using BEAST version 2.4 (Drummond and Rambaut 2007). A Bayesian stochastic search variable selection (BSSVS) method was used, including an HKY substitution model, a relaxed lognormal molecular clock and a GMRF Bayesian Skyride demographic model. As described above, two independent runs of 5 × 10⁷ generations were performed using the different molecular rates mentioned. Each run was validated in TRACER, ensuring an ESS of at least 200 to support the inferred ages of migration events. The output files were combined in LogCombiner version 2.4.5 and annotated in TreeAnnotator version 1.8.4 (Drummond and Rambaut 2007). The annotated tree was visualised using SPREAD v1.0.6 (Bielejec et al. 2011), which converted the discrete phylogeographic output obtained from BEAST into a KML file for visualisation in Google Earth.

3.3 Results

Genetic diversity and population structure in Ceratitis capitata

A total of 403 sequences of C. capitata collected at 14 sites distributed worldwide were included in the analysis. The final truncated alignment was 538 bp in length, corresponding to 179 amino acids of the mtDNA COI gene. No insertions/deletions or stop codons were detected in the whole data set. Most of the nucleotide substitutions were synonymous, but six non-synonymous mutations were identified. The changes corresponded to Methionine to Leucine (Iran), Alanine to Threonine (Peru), Isoleucine to Threonine (Kenya), Proline to Serine (Spain), Valine to Isoleucine (Israel and Australia) and Methionine to Leucine (Iran). All these changes were placed at the tips of the haplotype network (Figure 3.1).

The number of polymorphic sites (S) ranged from 23 in Kenya to 1 in Greece (Table 3.2). The number of haplotypes (k) ranged from 18 in Kenya to 2 in Greece and 1 in Tunisia. Tunisia was set aside from further analyses because of its lack of haplotype variation.

The haplotype diversity (h) and the nucleotide diversity (π) were much lower in the localities outside the Afrotropical region compared to KE and SA (Table 3.2). Also, the number of unique haplotypes was far higher in the Afrotropical region than in any other of the analysed localities, despite the fact that the total number of flies analysed in SP, GU or IS was roughly two to five times that in KE (Table 3.2).


Table 3.2 Population genetic diversity indices and neutrality test statistics for C. capitata. The indices are shown as n: number of samples; k: number of haplotypes; S: number of segregating sites; h: haplotype diversity (with standard deviation SD); π: nucleotide diversity (with standard deviation SD). Tajima’s D and Fu’s Fs tests were considered statistically significant when * P-value<0.05, **P-value<0.01 and ***P-value<0.001.

Biogeographic region  Population    Code  n    k   S   h ± SD         π ± SD           Tajima’s D  Fu’s Fs
Afrotropical          Kenya         KE    22   18  23  0.969 ± 0.027  0.0057 ± 0.0034  -1.94**     -15.89***
Afrotropical          South Africa  SA    37   17  16  0.941 ± 0.020  0.0053 ± 0.0031  -0.85       -8.56***
Afrotropical          Ghana         GH    5    4   4   0.900 ± 0.161  0.0033 ± 0.0009  -0.41       -1.19
Palearctic            Egypt         EG    25   3   3   0.353 ± 0.112  0.0016 ± 0.0005  0.28        1.15
Palearctic            Israel        IS    109  3   4   0.072 ± 0.034  0.0002 ± 0.0001  -1.58**     -1.62
Palearctic            Tunisia       TU    1    1   -   -              -                -           -
Palearctic            Iran          IR    12   3   2   0.318 ± 0.164  0.0006 ± 0.0003  -1.14       -1.18*
Palearctic            Greece        GR    29   2   1   0.069 ± 0.063  0.0001 ± 0.0001  -1.02       -2.38
Palearctic            Spain         SP    42   6   5   0.592 ± 0.068  0.0012 ± 0.0012  -1.69*      -3.34***
Australasian          Australia     AU    24   5   4   0.377 ± 0.122  0.0007 ± 0.0002  0.03        -3.63*
Neotropical           Guatemala     GU    55   10  6   0.766 ± 0.039  0.0024 ± 0.0003  0.25        0.24
Neotropical           Colombia      CO    22   3   2   0.567 ± 0.051  0.0011 ± 0.0001  0.15        -0.01
Neotropical           Brazil        BR    12   3   2   0.621 ± 0.087  0.0013 ± 0.0002  0.06        -0.22
Neotropical           Peru          PE    8    3   2   0.679 ± 0.122  0.0014 ± 0.0003  0.06        -0.22


The median-joining haplotype network contained a total of 58 distinct haplotypes and a low number of unsampled haplotypes (Figure 3.1). One end of the network shows an expanded genealogy, whereas the other shows a star-like topology (Figure 3.1). In one cluster, the haplotypes are segregated according to biogeographic region, with a large number of unique haplotypes originating in KE, SA and GH. Overall, the Afrotropical haplotypes were found to be more diverse than those of other regions (35 haplotypes from 64 sequences); indeed, this cluster forms a separate block that is connected to the other cluster by the haplotype Cc_21. Cc_21 is the most frequent haplotype (62.28%), is distributed across almost all localities and occupies a central position with a starburst-shaped radiation, from which the haplotypes of the Palearctic, Australasian and Neotropical regions were somewhat more derived. Additionally, the presence of two specimens from SA in haplotype Cc_21 might be considered a link between the Afrotropical and the other sampled biogeographic regions.


Figure 3.1 Median-joining network based on 403 individuals of the Mediterranean fruit fly generated using 538 bp of the mtDNA COI gene, showing location and frequency of haplotypes. Each circle represents an observed haplotype; the colours reflect sampling location and small black circles indicate unsampled haplotypes inferred from the data. The reticulated network segregated haplotypes according to biogeographic region. The most common haplotype in the Afrotropical cluster is Cc_13, from which singletons extend outwards. On the other side, the most common haplotype overall, Cc_21, occupies a central position with a starburst-shaped radiation from which the other haplotypes of the Palearctic, Australasian and Neotropical regions are derived. Cc corresponds to C. capitata. The * in a haplotype label refers to a non-synonymous mutation.


Eight haplotypes were shared between at least two localities (Figure 3.2; see the identification codes in Table 3.2), of which Cc_42 and Cc_49 were the most dominant besides the ubiquitous Cc_21. In the Afrotropical cluster, only the haplotypes Cc_04, Cc_06, Cc_13 and Cc_14 were shared between South Africa and Kenya, but none of them was the centre of an expanded genealogy in the way that the common haplotype Cc_21 is in the rest of the world.


Figure 3.2 Distribution of COI haplotypes across the study area for Ceratitis capitata. The map shows the study locations (country names are abbreviated as in Table 3.2), and the pie charts indicate the haplotype composition of the population from each location. Each colour represents a shared haplotype found across the study area, and the unique haplotypes (those found in the samples from one particular population and absent from the samples of all other populations) are uniformly represented in white within the pie charts. Native and non-native areas are represented according to Malacrida et al. (1998).


The pairwise Fst analysis performed on the 13 localities (Tunisia excluded) showed that the majority of the populations were significantly different (Table 3.3). Exceptions were found for neighbouring sites, including South Africa and Kenya in the Afrotropical group; Iran, Egypt, Israel and Greece in the Palearctic; and Brazil and Colombia in the Neotropical region. However, some remote sites also showed non-significant differences, such as Iran and Australia (Table 3.3).


Table 3.3 Pairwise Fst values between 13 populations of C. capitata, calculated from mtDNA data. Significance tests were performed using 1,000 permutations. Bold values are statistically significant. Population code: KE, Kenya; SA, South Africa; GH, Ghana; EG, Egypt; IS, Israel; IR, Iran; GR, Greece; SP, Spain; AU, Australia; GU, Guatemala; CO, Colombia; BR, Brazil; PE, Peru.

     KE      SA      GH      EG      IS      IR      GR      SP      AU      GU      CO      BR      PE
KE
SA   -0.002
GH   0.058   0.074
EG   0.346   0.302   0.492
IS   0.694   0.608   0.854   0.146
IR   0.315   0.278   0.474   -0.009  0.107
GR   0.516   0.439   0.772   0.079   -0.007  0.054
SP   0.236   0.210   0.312   0.088   0.313   0.070   0.189
AU   0.332   0.289   0.471   0.004   0.125   -0.024  0.062   0.074
GU   0.140   0.131   0.187   0.169   0.416   0.153   0.280   0.112   0.149
CO   0.231   0.211   0.320   0.236   0.576   0.220   0.415   0.168   0.168   0.118
BR   0.188   0.177   0.269   0.297   0.689   0.280   0.542   0.196   0.228   0.115   -0.051
PE   0.154   0.142   0.224   0.187   0.621   0.168   0.454   0.055   0.166   0.075   0.161   0.181
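Because the pairwise Fst values are reported as a lower triangle, it can be convenient to expand them into a symmetric lookup before comparing populations. The sketch below does this for the first four populations only (KE, SA, GH, EG), using the values from Table 3.3; the function and variable names are hypothetical, and the example does not reproduce the permutation-based significance tests.

```python
LABELS = ["KE", "SA", "GH", "EG"]
# Lower triangle of Table 3.3 for the first four populations (row i has i entries).
LOWER = [
    [],
    [-0.002],
    [0.058, 0.074],
    [0.346, 0.302, 0.492],
]


def expand_lower_triangle(labels, lower):
    """Return a dict mapping (pop_a, pop_b) to Fst in both orientations."""
    fst = {}
    for i, row in enumerate(lower):
        if len(row) != i:
            raise ValueError(f"row {i} should contain {i} values")
        for j, value in enumerate(row):
            fst[(labels[i], labels[j])] = value
            fst[(labels[j], labels[i])] = value
    return fst


fst = expand_lower_triangle(LABELS, LOWER)
most_differentiated = max(fst, key=fst.get)
print(fst[("KE", "SA")], sorted(most_differentiated))
```

Within this subset, the lowest value separates the two neighbouring Afrotropical sites, in line with the exceptions noted in the text.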


Demographic history

Across the entire dataset, only six of thirteen sites (Tunisia excluded) were significant for Tajima’s D or Fu’s Fs (Table 3.2). In the Afrotropical cluster, South Africa and Kenya showed negative and highly significant values for Fu’s Fs, and Kenya also for Tajima’s D. Significant negative values were also found in the Palearctic (Israel, Iran, Spain) and Australasian regions. These findings may indicate either purifying selection acting on protein-coding regions or recent population expansion, both of which produce an excess of rare haplotypes.

The Bayesian skyline plots exhibited differences in the effective population size calculated among the biogeographical regions (Figure 3.3; only the results obtained from the simulations using the corrected mutation rate of D. melanogaster are shown).

The time to the most recent common ancestor (tmrca) was estimated at around 11,600 years ago in the Afrotropical group (Figure 3.3A). This group also showed a substantial increase (one order of magnitude) in effective population size after the onset of the Holocene (~10,000 years ago), suggesting a signature of recent expansion which became significant around 3,500 years ago (i.e. when the 95% highest posterior density (HPD) limits no longer include older estimates), after which the population size was largely stable until the present time (Figure 3.3A). In contrast, the Palearctic group (Figure 3.3B) had a lower effective population size and more recent date estimates than the Afrotropical group, and exhibited significant population expansion only from about 500 years ago. The Australasian and Neotropical groups (Figure 3.3C and D) remained at a stable population size from about 1,000 years ago and then showed a slight but non-significant increase.


Figure 3.3 Bayesian skyline plot (BSP) estimates of medfly demographic history for the biogeographic regions (A) Afrotropical; (B) Palearctic; (C) Australasian and (D) Neotropical. The X-axis is in units of time before present (BP), and the Y-axis shows NeT (the product of effective population size and generation time in years) on a log scale. Each BSP describes the demographic history of one biogeographic region, represented by a median (solid line) with the 95% highest posterior density (HPD; equivalent to margins of error) in grey. The dashed line t1 is the time of population expansion per biogeographic group. The time to the most recent common ancestor (tmrca) is shown for the Afrotropical population at around 11,600 BP.

The coalescent analysis using LAMARC estimated θ values ranging from 0.1550 (Afrotropical) to 0.0031 (Australasian). The LAMARC results also indicated asymmetric migration between the Afrotropical and the other populations, whereby the Palearctic received the lowest number of migrants per generation (Nm = 4.35) and the Australasian population the highest (Nm = 5.47), while migrant flow in the opposite direction was insignificant (Figure 3.4A). Migration analyses conducted exclusively on populations within each biogeographic region showed high levels of unidirectional exchange within the Afrotropical and Neotropical regions (Figure 3.4B), such as the remarkably high Nm from SA to KE (133.4) and from CO to GU (4.5), while the number of migrants was notably lower in the reverse direction (Nm from KE to SA = 0.429 and from GU to CO = 0.032, respectively).


Figure 3.4 Values of theta and migration rates between the biogeographic regions. A) Theta and migration values among the four biogeographic regions. Yellow circles represent the theta value (θ) per biogeographic group and the values in brackets are the 95% HPD intervals. The arrows indicate the direction of migration and their thickness is proportional to the number of migrants per generation. B) Migration rates and migrants per generation for specific medfly populations within each biogeographic region; country names are abbreviated as in Table 3.2.


Patterns of historical migration among sites supported the inference that KE was the root location, with an estimated root height of 10,500 BP. SA then appeared at 6,200 BP as another root location for the medfly in the Afrotropical region. The first migration event inferred was between the Afrotropical region (SA-KE) and SP (Palearctic) at 5,900 BP. From around 3,500 BP onwards, several cycles of immigration and emigration have apparently occurred between the Palearctic, Neotropical and Australasian regions (Figure 3.5).

Figure 3.5 Hypothesised historical migration paths worldwide of C. capitata using a molecular rate of 4.2%


3.4 Discussion

The aim of this study was to describe the current genetic structure and recent demography of C. capitata and to provide insights into potential invasion routes leading to its worldwide distribution. Our extensive phylogeographic analysis has revealed a rapid colonisation process over the last 500 years, and a complex genetic structure of C. capitata with clear variation between biogeographic regions. The colonisation process of the medfly is well documented by both historical records (Malacrida et al. 1998; Myers et al. 2000) and molecular studies (Barr 2009; Bonizzoni et al. 2004; Gasperi et al. 2002; Karsten et al. 2015; Karsten et al. 2013). In fact, the reconstruction of C. capitata invasion routes in this study fits broadly with that proposed in the literature, i.e. the medfly populations first migrated from the ancestral Afrotropical region to the Palearctic and then to the Australasian and Neotropical regions (Malacrida et al. 2007).

The highest genetic diversities were found in Kenya and South Africa, both belonging to the Afrotropical cluster. This was expected because the south-eastern African countries have been identified as the medfly’s ancestral, native range (De Meyer et al. 2002). Western African areas have been proposed to be part of this large native population source distributed across all of Sub-Saharan Africa (Gasparich et al. 1997), but this interpretation conflicts with our finding that the sample from Ghana shows lower genetic variation than Kenya and South Africa, and that the Fst values between Ghana and the other two populations are statistically significant. In the network, the Ghana haplotypes are part of the Afrotropical cluster, but all of them are unique. These findings support the existence of native but genetically differentiated populations in West Africa, although the number of individuals (a total of five) remains too low to resolve the contradictory literature on population subdivision in Sub-Saharan Africa (De Meyer et al. 2002; Gasperi et al. 2002; Malacrida et al. 1992).

Biological invasions are often associated with a decrease in the genetic diversity of the invasive species due to a small number of founder events in their introduced ranges (Lockwood et al. 2005; Lockwood et al. 2009). It is therefore unsurprising that we found evidence of a gradual loss of genetic variability from the ancestral Afrotropical region to the Palearctic and all other populations. Low genetic diversity was particularly obvious in the population from Israel, which was represented by the largest number of individuals of all sampled localities and yet exhibited very low levels of genetic diversity. Similar results were reported for another mitochondrial gene, ND4 (NADH subunit 4), in two different populations collected in Israel (Elfekih et al. 2010). Iran and Greece also had low genetic variation compared to other populations, possibly because of the limited hosts and climatic ranges suitable for the medfly in these regions. In addition, the constant eradication efforts in these countries, and in particular the use of SIT, might have resulted in occasional population bottlenecks and reduced genetic diversity in these populations. In contrast, the populations from Spain were the most genetically diverse in the Palearctic. This finding may in part reflect the origin of the Spanish specimens, which came from multiple sites and thus may contain local variation that is not captured in the samples from most other countries. In addition, populations in Spain might have a longer phylogeographic history and thus greater diversification, as the likely entry point to the Mediterranean basin.

The curious presence of a shared haplotype (Cc_21) among extremely distant populations suggests a recent connection among all non-native populations. This common haplotype is also present in South Africa, but only at a very low frequency, and it is peripheral to the network of African haplotypes. While the distribution of Cc_21 indicates a shared history of all non-native populations, the derivation of this haplotype from the South African population is not strongly supported. An individual carrying the Cc_21 haplotype could be the ancestor of all non-native populations, even if this haplotype was rare in the ancestral population, but conceivably the source population could be elsewhere in Africa, where the Cc_21 haplotype is more prevalent. Only more detailed surveys of native African populations will resolve this question.

The interpretation of the demographic history results differs among the biogeographic regions. The Bayesian skyline plot for the Afrotropical region showed it to be the most ancient population, dated to about 11,600 years ago. Nevertheless, this time frame is far younger than ages usually associated with the time since speciation, or even the age of a closely similar tephritid fossil found in the Dominican Republic dated to the mid-Miocene to early Eocene (Norrbom 1994). However, the signature of population expansion coincides with the African Humid Period, which was characterised by major climatic changes that influenced ancient human settlements (Manning & Timpson 2014). In this context, new strategies for plant use were developing in Africa about 17,000 years ago, although plant domestication was recognised only later, at around 4,000 BP (Marshall & Hildebrand 2002). In the Afrotropical region, the significant expansion signature detected about 3,500 years ago by the Bayesian skyline plot coincides with the plant domestication period in the region. Currently, this region is stable, as can be seen in the plot; this is also supported by the negative Tajima’s D and Fu’s Fs values for Kenya and South Africa, which are best interpreted as the result of purifying selection, as expected in mitochondrial protein-coding gene evolution (Meiklejohn et al. 2007).

On the other hand, in Spain, as one of the first points of colonisation and presumed early origin of populations, the patterns of COI variation may be explained by purifying selection. In contrast, the Palearctic BSP showed a population expansion, as also described before in this region (Reyes & Ochando 2004). Non-synonymous substitutions, which are generally rare in mitochondrial genes, especially in the cytochrome oxidase genes (Pentinsaari et al. 2016), were found in all regions, but predominantly in introduced populations characterised by non-significant neutrality tests as expected for star-like topologies. Their position near the tips of the haplotype network suggests that they correspond to neutral variation or slightly deleterious mutations that are maintained in fast-expanding populations, rather than adaptive changes affecting, for example, the metabolic rate in response to the new environmental conditions encountered (Castoe et al. 2008); these changes are thus consistent with the inference of fast population expansion.

Control methods implications and outcomes

The migration patterns within geographic regions might be affected by different factors in various parts of the world (Figure 3.4). For example, pest management control methods differ notably among countries. Guatemala joined efforts with Mexico in 1975 to establish the Mediterranean Fruit Fly Eradication Program (Moscamed Program), which received the collaboration of the USA in 1977. The aim was the prevention, control and eradication of this pest from Guatemala. After ten years, owing to the lack of economic support, the Guatemala Moscamed program became a Barrier Program, preventing the movement of flies to Mexico and the USA (Aluja & Liedo 1993). This program continues to the present day, effectively containing the fly within Central America (Enkerlin et al. 2017). A notable success of this program was the development of the Sterile Insect Technique (SIT) with outstanding results, for example, the eradication of medfly in some Mexican regions and the reduction of medfly outbreaks in California and Florida to a minimum (Szyniszewska & Tatem 2014). On the other hand, Colombia has also recognised the presence of the medfly (ICA 2010) and launched the National Fruit Fly program, focusing on detection, control and eradication of the medfly based on mass trapping and chemical application (Conpes 2008; Lasprilla 2011). Despite this, these phytosanitary efforts are not enough to reduce the potential pest risk of Colombian commodities (APHIS 2018; Szyniszewska et al. 2016), and the COI study produced clear evidence for unidirectional migration from Colombia to Guatemala. Given the differences in their pest control management methods, it is hardly surprising that the high migration rate from Colombia, where pest control methods are deficient, now interferes with the successful Guatemalan program established some 40 years ago.


Predictions of medfly movements

Finally, the hypothesised historical migration patterns provide substantial evidence that the medfly originated in Kenya; however, its origin was dated to 10,500 BP, which is older than the dates reported from historical records in previous studies (Karsten et al. 2015; Malacrida et al. 1998). Nevertheless, this date is biologically meaningful when compared with the development of agriculture and the domestication of animals that sustained human civilisations (Marshall & Hildebrand 2002). The medfly subsequently dispersed further southwards into South Africa (SA) around 6,000 BP, a period when the temperature in the region started to increase (Karsten et al. 2015; Manning & Timpson 2014). The movement of flies from the Afrotropical region to the Palearctic occurred around 5,900 BP, but after that the COI data suggest that the flies dispersed in multiple directions, hindering the identification of dispersal directions among the studied biogeographic regions.

Conclusions

The colonisation process of the medfly appears to be associated with a relatively stable demographic structure separating the Afrotropical region and the introduced range (Palearctic, Neotropical and Australasian), but characterised by residual levels of connectivity at regional scales despite the considerable distances separating populations, such as Egypt and Iran or Brazil and Peru. However, the COI marker used in this study has limitations due to its comparatively low variation, which may be insufficient to resolve events on the time scale of the medfly dispersal. Yet, using an appropriate mutation rate, the demographic analysis produced plausible scenarios associated with the Holocene, which is closely linked to the development of agriculture and domestication by humans. The inferred migration patterns among populations provide crucial information for understanding successful medfly invasions and thus for pinpointing where countermeasures are required, in particular in a world connected via trade in agricultural commodities. The case of successful containment in Guatemala and the dangers of fruit fly migration from elsewhere in the South and Central American regions illustrate these problems clearly. We used the most basic of molecular markers to study these phenomena, based on a short fragment of a single locus, and studied the pattern and process of medfly history at the global level based on just 14 local sites. The results are highly plausible and consistent with other studies using diverse approaches. However, the conclusions have to remain tentative, given the limited detail of sampling. Genomic approaches and much denser sampling at regional and global scales will be required to confirm the conclusions drawn here.


CHAPTER 4

Molecular approach to insecticide resistance in museum specimens and modern natural populations of the medfly: looking for Ccace2 point mutation

4.1 Introduction

Insecticides are widely used in most agricultural sectors to prevent or reduce losses caused by arthropod pests. Since 1940, the introduction of synthetic organic insecticides such as DDT and organophosphates has led to a significant improvement in the efficacy of insect control, resulting in their broad-scale and expanded use at a global level (Sparks & Nauen 2015). Not surprisingly, the drawback of the excessive long-term use of insecticides over the past 65 years was a swift rise in the number of cases of insecticide resistance in natural insect populations (Figure 4.1) (Feyereisen et al. 2015; Ffrench-Constant 2013). Resistant pest populations can decrease crop yields and consequently their profitability, and for this reason they are considered one of the major threats to agricultural production and commercial crops (Hardstone & Scott 2010; Pimentel 2005). As a consequence, the study of insecticide resistance has been strongly focused on preventing monetary losses caused by the targeted pest, rather than on investigating it as an evolutionary phenomenon (Alyokhin & Chen 2017; Feyereisen et al. 2015).

Despite the concern about resistance, insecticides remain a key tool for insect control. Among the synthetic insecticides used, organophosphates (OP) and carbamates (CB) have been widely applied for more than four decades (Figure 4.1) (Kakani & Mathiopoulos 2008). Both chemical groups owe their toxicity to the irreversible inhibition of the enzyme acetylcholinesterase (ACHE). This is a key enzyme in the central nervous system: it regulates the neurotransmitter acetylcholine at cholinergic synapses and terminates impulse transmission by catalysing the hydrolytic degradation of acetylcholine (Menozzi et al. 2004). OP and CB are analogues of the acetylcholine substrate and compete with it to enter the active site of the enzyme (Zimmerman & Soreq 2006). This inhibition leads to the accumulation of acetylcholine in the synaptic region, which leaves the acetylcholine receptors permanently open, resulting in the death of the insect (Aldridge 1950).

Traditionally, resistance has been characterised as the result of either biochemical-metabolic or target-site modifications, such as increased metabolism, decreased penetration or alteration of a target protein (Feyereisen et al. 2015; Georghiou 1972). Whatever the mechanism, in broad terms, insecticide resistance is considered an adaptive trait in which a set of genes is favourably selected to keep the insect alive and able to reproduce when exposed to insecticides (Denholm & Devine 2001). To date, around 580 species have been reported to be resistant to at least one type of insecticide (Figure 4.1) (Sparks & Nauen 2015). For this reason, understanding the mechanisms underlying insecticide resistance is a high priority, as this knowledge can improve current resistance management programs.

Figure 4.1 Cumulative number of arthropod species reported to show insecticide resistance through time: (a) total number of species resistant to one or more insecticides; (b-d) resistance in response to the three most widely used classes of insecticides until 1990. Data adapted from Denholm and Devine (2001) and Sparks and Nauen (2015).

Target-site resistance is characterised by a mutation that affects the coding sequence of a gene and thereby structurally alters the gene product. A considerable amount of literature has been published on the identification of point mutations that decrease the sensitivity of the main target sites, such as the voltage-gated sodium channel (vssc), GABA receptor and acetylcholinesterase (ace) genes, in several insect species. However, the molecular basis of the insecticides’ mode of action is itself crucial to understanding this resistance mechanism.

In insects such as mosquitoes, leafhoppers and aphids, two acetylcholinesterase genes (ace1 and ace2) have been described, whereas in Tephritidae there is only a single one (ace2). Each gene locus consists of discrete exon regions that will become part of a transcript. In the case of the medfly, the Ccace2 (Ceratitis capitata ace2 gene) coding sequence covers exons 2 to 9, separated by large introns (Figure 4.2, a). It has an open reading frame of ~2000 bp encoding a protein of 669 amino acids that is similar in structure to the well-studied orthologous sequences of Bactrocera dorsalis (99%) and Bactrocera oleae (97%) (Kakani & Mathiopoulos 2008; Vontas et al. 2011). Comparison of the ace gene between susceptible and resistant strains of the medfly revealed one point mutation at position 328 (Torpedo numbering). The mutation changes a GGC codon to GCC, leading to the replacement of glycine by alanine in resistant flies (Figure 4.2, c and Figure 4.3) (Magaña et al. 2008; Sussman et al. 1991).
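The effect of the GGC to GCC change can be verified with a small translation sketch. It assumes the standard genetic code, which applies to a nuclear gene such as Ccace2; the function name is hypothetical and the example covers single codons only, not the full coding sequence.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Translate both codons and report whether the change alters the amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


# Susceptible codon GGC versus resistant codon GCC, as described in the text.
result = classify_substitution("GGC", "GCC")
print(result)  # ('G', 'A', 'non-synonymous')
```

A single nucleotide change at the second codon position is thus sufficient to replace glycine (G) with alanine (A).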


Figure 4.2 Molecular map of the Ache gene of the medfly. a) Organisation of the regions of the medfly Ache gene coding for the mature protein. Coloured boxes and black lines represent the exons and introns respectively (sizes and lengths are drawn to scale). The locus position is based on the medfly genomic DNA (Scaffold 242); the total length is ~115 kb owing to the long introns. b) Genomic location, at the end of exon 5 and beginning of exon 6, of the putative point mutation that confers insecticide resistance in the medfly. Introns and the alternative splicing sites of each exon are identified in light red. c) Codon composition of the putative insecticide resistance mutation in the medfly; red letters highlight the genetic variant. Data adapted from Elfekih et al. (2014).


Figure 4.3 Diagram representing the genetic variants described in the Ccace gene. The upper part depicts the Ccace transcript and the point mutation reported by Magaña et al. (2008). The lower part is the protein sequence with the position of the putative point mutation in the medfly. Gly328Ala (G328A) refers to the corresponding amino acid replacement; however, the name does not reflect the actual position (420) in the protein (Supplementary figure 4.2).

Resistance to insecticides is considered a relatively fast acquisition of a new phenotypic trait based on simple genetic changes in the organism (Hartley et al. 2006). Mutations of acetylcholinesterase are well documented in insects such as Musca domestica (Walsh et al. 2001) and Drosophila melanogaster (Menozzi et al. 2004), as well as in the Tephritidae members described above (Hsu et al. 2006; Kakani et al. 2013). However, an essential but often overlooked aspect is how arthropods become resistant to insecticides at the microevolutionary level, which remains unclear as an evolutionary phenomenon. For this reason, it is essential to identify whether the microevolutionary process of acquiring newly beneficial traits proceeds by the selection of existing rare variants (pre-adaptation) or whether it depends on new mutations arising once the selective pressure is imposed.

Without knowledge of populations prior to the use of insecticides, an experimental design may not be conclusive in explaining how historical and current processes contribute to the evolution of insecticide resistance in natural populations. In this context, museum collections hold records that are vital to science and society and also act as libraries of biological diversity, but they have been underutilised as a genetic resource (Suarez & Tsutsui 2004; Tin et al. 2014). This is in part because DNA extractions can be destructive and curators are rightfully protective, but also because museum specimens contain degraded DNA, which reduces the chance of success with the most commonly used gene sequencing methods.

Although collection specimens have been exploited in a limited number of cases, most of them were focused on mitochondrial DNA recovery and phylogeographic reconstructions (Andersen & Mills 2012; Hunter et al. 2008; van Houdt et al. 2010).

There is only one study which described resistance to the organophosphate insecticide malathion in pinned specimens of Australian sheep blowflies that were collected before the introduction of the insecticide (Hartley et al. 2006). By contrast, in C. capitata, there is one study that described the point mutation G328A in the coding sequence as a resistance mechanism to malathion, based on a laboratory strain (Magaña et al. 2008) (Supplementary figure 4.2). The presence/absence of this mutation was examined in contemporary populations of medfly by Elfekih et al. (2014), who detected only the resistant allele in medflies collected in Brazil and Spain, probably linked to the insecticide treatment history of these sites.


Even though today there are alternative strategies like Sterile Insect Technique (SIT) to be implemented as pest control, they are either not efficient enough or considerably costlier. Thus, the use of chemical insecticides, particularly OP and pyrethroids, remains the most efficient and economical approach, especially in developing countries due to their low price and broad spectrum of target (Saunders et al. 2012).

Consequently, their intensive and imprudent use has resulted in the progressive development and spread of insecticide resistance in natural insect populations.

Based on the intense use of insecticides for C. capitata control and its economic impact in agriculture, the first aim was to elucidate the evolutionary genetic mechanism by which ace-based resistance had evolved in medfly, using DNA from museum specimens collected prior to the pesticide era. The second aim was to examine the distribution of the resistance Ccace2 alleles associated with the G328A mutation in field-collected (modern) specimens from a site of heavy pesticide use in Colombia, to test for the correlation with the assumed resistance alleles.

4.2 Material and Methods

Sample collection

Museum dry-pinned specimens

The recently digitised C. capitata collection at the Natural History Museum, London (BMNH) has a total of 706 dry-pinned specimens, of which only 272 have the year and locality of collection recorded, covering a historical period spanning from 1901 until 2011 and distributed in countries across four continents. The museum samples were divided into two groups according to the history of OP use: the first group, called Pre-OP, was designated from 1901 to 1939 and the second group, called Post-OP, from 1940 to 1967. In the BMNH collection, a total of 131 specimens were collected in the Pre-OP period, while 101 specimens were collected just at the beginning of the Post-OP period. To protect the valuable information that the fly collection provides, only 10% of the total specimens in each group were used for molecular purposes. In addition, a group of 16 medflies from the BMNH collection, collected in Kenya in 2001, was included in this study.

Modern specimens

Modern medfly samples were collected in the Department of Nariño, in the south-west of Colombia, in 2015 (Figure 4.4). Infested and ripe fruits were collected in commercial orchards, in gardens and backyards on their fringes, and from non-commercial trees and bushes. The sites were chosen based on host plants, type of treatment and the type of chemical insecticide employed in the locality. The infested fruits were moved to laboratory conditions, classified according to size and placed in rearing chambers. The fruits were kept for at least seven weeks and examined daily for the appearance of puparia (Rwomushana et al. 2008). Once the incubation period was over, the emerging fruit flies were transferred to new plastic containers (12 cm diameter) and sexed. Adult flies were fed for three days with a mixture of sugared water (70%) and honey drops, while water was supplied through soaked cotton wool.

Once the flies were fully developed and attained their specific colouration, they were counted and preserved in 97% ethanol.


Compared with other popular methods of catching medflies, this method provides higher confidence that the specimens came from a specific orchard and were or were not exposed to insecticides (depending on the information recorded for a site).

Additionally, medflies collected with Jackson and McPhail traps in the Department of Cundinamarca, central Colombia, were included in this study in order to compare geographically distant populations within Colombia. Due to the sampling method, this population was classified as having an unknown insecticide treatment.


Figure 4.4 Map of study areas in Colombia. A) Dark green in the centre of Colombia corresponds to Cundinamarca; the black dot represents the sampling location. The dark green area in the south corresponds to the Nariño department. B) Zoom in of Nariño to show the sampling sites in detail. The dots represent each sampling site, and the colour corresponds to the type of treatment used. The white lines in the map represent the administrative divisions among the townships.

Specimen preparation and DNA extraction

Museum dry-pinned specimens

To ensure that the collection specimens suffered the least possible damage from DNA extraction, a trial using modern medflies was performed. First, ethanol-preserved flies were dried in a laboratory incubator (Shellab, USA) for at least 72 hours at 32 °C. Second, one specimen was kept intact (whole fly), and the others were dissected to perform a semi-destructive extraction from their abdomen or legs. Third, DNA was extracted from the specimens described above, and also from one leg of a museum specimen collected in 1998, using the protocol described by Thomsen et al. (2012). After this process, each specimen was evaluated by Daniel Whitmore (NHM Curator of Tephritidae) to identify possible damage to critical taxonomic characters of medfly (i.e. antenna, bristles, wing) (De Meyer 2000). His evaluation and the molecular findings suggested that a semi-destructive DNA extraction method, removing the abdomen, might give enough yield to perform the molecular analysis. Thus, the abdomen of each museum specimen was carefully removed by Daniel Whitmore before continuing with the DNA extraction.

Full precautions were taken to prevent contamination of the museum samples with modern DNA previously amplified in the laboratory. The DNA extractions were conducted in an isolated room in the laboratory that had never been exposed to medfly DNA. Furthermore, the extraction was carried out in a biosafety cabinet (Labculture Class II, ESCO) with UV-sterilised equipment. A dedicated new set of pipettes and reagents, separate from those used in other experiments in the lab, was employed. Filter tips were used for all experimental procedures.

The abdomens were placed in 2 ml Eppendorf tubes, fully immersed in the extraction buffer (Thomsen et al. 2009) (Supplementary Buffer List 4.1) and incubated overnight (at least 12 hours) at 55 °C with gentle agitation. After the incubation, the abdomens were removed from the buffer and placed in ascending EtOH concentrations from 70% to 100%, in periods of 2 hours, to stop further digestion, and were preserved in 100% EtOH until they were remounted and returned to the collection. Then, the nucleic acids were purified from the extraction buffer using a DNA Clean & Concentrator-5 kit (Zymo Research).

Negative controls were used during every extraction to monitor for contamination.

For some specimens, the DNA fragment size was evaluated with an Agilent Genomic DNA screen tape (Agilent Technologies), and DNA was quantified using a Qubit 2.0 Fluorometer (Invitrogen) with the Qubit dsDNA HS Assay kit (Invitrogen).

Modern specimens

The modern specimens (all of them preserved in 97% EtOH at -20 °C) were morphologically identified, and the genomic DNA was extracted individually using the DNeasy Blood & Tissue Spin Column Kit (Qiagen) (details in Chapter 3).


Primer selection, tag design and PCR conditions

The primers for amplifying the SNP in the acetylcholinesterase gene responsible for insecticide resistance in medfly were described by Elfekih et al. (2014). The primers were modified for Illumina sequencing by including identifier tag sequences at the 5’ end of both forward and reverse primers, in order to distinguish the DNA sequences from each specimen in the same sequencing run and to check for tag coherence after sequencing (Supplementary Table 4.1).

When designing tags, it is essential to consider their structure and similarity as errors introduced during tag synthesis, PCR, library preparation or sequencing can severely impact downstream read processing (Chen et al. 1999). Furthermore, the experimental protocol used in this study based on amplification by PCR followed by Illumina sequencing-by-synthesis can introduce errors in the results. Amplification during PCR or library preparation introduces substitution, gap errors and forms chimeras (Dohm et al. 2008; Kozarewa et al. 2009). Meanwhile, sequencing errors generated by Illumina platforms are also largely substitutions and are more common at the ends of sequence reads, where tags are located (Shendure & Ji 2008).

The tags were generated with “Barcode Generator v2.8” (available from http://comailab.genomecenter.ucdavis.edu/index.php/Barcode_generator). Each tag was 6 bp long and differed from all other tags by 3 bp (Supplementary Table 4.1). All primers were tested using the Thermo Scientific Multiple Primer Analyzer (available from https://www.thermofisher.com/uk/en/home/brands/thermo-scientific/molecular-biology/molecular-biology-learning-center/molecular-biology-resource-library/thermo-scientific-web-tools/multiple-primer-analyzer.html) considering their physical and structural properties.
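The tag requirement described above (fixed length, with a minimum number of differences between any two tags) can be checked with a short Python sketch. This is not the Barcode Generator program itself; the function names and the example tags are hypothetical and serve only to illustrate the pairwise Hamming distance criterion.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance found between any two tags in the set."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def tags_are_valid(tags, length=6, min_distance=3):
    """True if all tags have the given length and differ by >= min_distance."""
    return all(len(t) == length for t in tags) and min_pairwise_distance(tags) >= min_distance


# Hypothetical tags, not those listed in Supplementary Table 4.1.
example_tags = ["AAAAAA", "AAATTT", "TTTAAA"]
print(min_pairwise_distance(example_tags), tags_are_valid(example_tags))
```

With a minimum distance of 3 bp, a single substitution error in a tag still leaves the read closer to its true tag than to any other, which is what makes one-mismatch demultiplexing unambiguous.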

The PCR for modern samples was performed in a 20 µl reaction volume containing 1 µl DNA extract, 3 mM MgCl2, 2 µl 10x reaction buffer, 0.1 mM each dNTP, 0.3 µM each primer and 1 U BIOTAQ DNA polymerase. The PCR for museum samples was performed in a 25 µl volume with the same concentrations as for modern samples; the only exceptions were 2 µl DNA template and 1 U BIOTAQ DNA polymerase. The PCR profile for DNA amplification in modern samples was as follows: 2 min at 94 °C for initial denaturation; 30 cycles of 30 s at 94 °C, 1 min at 60 °C and 30 s at 72 °C; and a final extension of 10 min at 72 °C. For museum samples, the number of cycles was increased to 40. Extraction and PCR blanks were amplified and checked for contamination during each PCR of museum material.

Only the PCR products of museum samples were purified using Agencourt AMPure XP (Beckman Coulter). A total of 5 µl of the resulting purified PCR product was used as a template for a second (nested) PCR, using the same conditions described above for museum samples, to increase the final target product. To confirm that amplification was successful, all PCR products were visualised using GelRed (Biotium) on a 2% agarose gel.

Once all samples were amplified, three concentration classes based on gel band intensity were established. The three classes were cleaned up using Agencourt AMPure XP SPRI beads and quantified using a Qubit 2.0 Fluorometer with the Qubit dsDNA HS Assay kit. Thereafter, the PCR products were pooled into 96 separate libraries in equimolar proportions to ensure that potentially weak products were not overwhelmed by stronger ones during sequencing, thereby maximising read recovery for each sample.
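The arithmetic behind equimolar pooling can be sketched in Python as follows. This is a minimal illustration, not the procedure used in this study (which pooled by concentration class): it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the sample names, concentrations, amplicon length and target amount are all hypothetical.

```python
def molarity_nM(conc_ng_per_ul, length_bp, mass_per_bp=660.0):
    """Convert a dsDNA concentration (ng/µl) to nM (equivalent to fmol/µl).

    mass_per_bp is the assumed average molar mass of one base pair (g/mol).
    """
    return conc_ng_per_ul * 1e6 / (mass_per_bp * length_bp)


def equimolar_volumes(concentrations, length_bp, target_fmol):
    """Volume (µl) of each product needed to contribute target_fmol of amplicon."""
    return {
        name: target_fmol / molarity_nM(conc, length_bp)
        for name, conc in concentrations.items()
    }


# Hypothetical Qubit readings (ng/µl) for two products of the same length.
readings = {"sample_A": 6.6, "sample_B": 13.2}
for name, volume in equimolar_volumes(readings, length_bp=100, target_fmol=200).items():
    print(name, round(volume, 2), "µl")
```

A product that is twice as concentrated contributes half the volume, so weak and strong amplicons end up represented by the same number of molecules in the pool.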

Library preparation was conducted with the Nextera XT kit (300 bp paired-end) and sequenced on a HiSeq 2500 lane at the Earlham Institute, Norwich, UK.

Bioinformatic processing

Reads from each library were assessed with FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to ensure that the read length distribution and quality scores were within the expected range. Reads were demultiplexed from the raw paired sequence files, based on a table of the specific tag sequences, using Cutadapt v.1.12 (Martin 2011) and allowing for one-base mismatches. The primers were trimmed using “fastx_trimmer” (http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastx_trimmer_usage). PEAR was then used to merge forward and reverse reads (Zhang et al. 2014b). This output was quality filtered using “fastq_filter” in USEARCH v.7.0 with a maximum total expected error (E) of 0.5 (Edgar 2010).
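The expected-error criterion used in the quality filter can be illustrated with a few lines of Python. This sketch is not the USEARCH implementation; it only applies the underlying definition, in which the expected number of errors in a read is the sum of the per-base error probabilities derived from the Phred scores. The function names and the Phred+33 encoding are assumptions made for illustration.

```python
def expected_errors(quality_string, phred_offset=33):
    """Sum of per-base error probabilities, P = 10^(-Q/10), for one read."""
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string)


def passes_filter(quality_string, max_expected_errors=0.5):
    """True if the read's total expected error does not exceed the threshold."""
    return expected_errors(quality_string) <= max_expected_errors


# Hypothetical quality strings: 'I' encodes Q40 and '+' encodes Q10 in Phred+33.
high_quality = "I" * 100
low_quality = "+" * 6
print(passes_filter(high_quality), passes_filter(low_quality))
```

A threshold of 0.5 means that the most probable number of errors in any retained read is zero, which suits a short amplicon in which single substitutions define the alleles.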

To identify unique authentic sequences, including intra-specific genetic variants, the UNOISE algorithm was implemented in order to remove probable read errors (Edgar 2016). Subsequently, the sequences were clustered using USEARCH v.8.0 with 0% dissimilarity between sequences (zero-radius OTUs) and denoised to correct point errors. The program outputs a fasta file containing a single representative for each cluster or OTU, which was checked with the nucleotide Basic Local Alignment Search Tool (BLASTn) for molecular identification (Altschul et al. 1990). A semi-automated bioinformatic pipeline was created using Perl to process the above-described steps (NAPtime script; Creedy, unpublished).
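The idea of clustering at 0% dissimilarity can be illustrated with a simplified Python sketch that groups identical sequences and discards low-abundance ones. This is explicitly not the UNOISE algorithm, which additionally models point errors relative to more abundant sequences; the function name, the abundance threshold and the toy reads are assumptions made for illustration.

```python
from collections import Counter


def dereplicate(reads, min_abundance=2):
    """Group identical reads and keep those seen at least min_abundance times.

    Returns (sequence, count) pairs sorted from most to least abundant.
    """
    counts = Counter(reads)
    return [(seq, n) for seq, n in counts.most_common() if n >= min_abundance]


# Toy reads: one frequent sequence and one singleton that is dropped.
toy_reads = ["ACGT", "ACGT", "ACGA", "ACGT"]
print(dereplicate(toy_reads))
```

Because no mismatches are tolerated within a cluster, two alleles that differ by a single nucleotide remain separate clusters, which is the property that allelic variants depend on here.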

Data preparation and analysis

A clustering table (total number of reads in each OTU x medfly specimen) was obtained from the bioinformatic steps. This table was used to determine the heterozygosity of the samples and provided details on the total number of reads that each sample contained. Samples with a high number of reads in only one cluster were recorded as homozygous, whereas samples with a similar number of reads in both clusters were considered heterozygous. The Hardy-Weinberg equation was used to calculate the expected genotype frequencies in the different populations, which were tested against the observed frequencies using a chi-square test.
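The genotype assignment described above can be sketched in Python as follows. The thresholds are hypothetical: the text does not state a numerical cut-off, so the minimum read depth and the minimum minor-allele fraction below are assumptions chosen only to make the rule explicit. Cluster 1 is labelled T and cluster 2 is labelled C, following the genotype labels used in the results tables.

```python
def call_genotype(reads_cluster1, reads_cluster2, min_reads=10, het_min_fraction=0.3):
    """Assign a genotype from the read counts of the two clusters.

    min_reads and het_min_fraction are illustrative assumptions, not values
    taken from this study. Returns None when read depth is too low to call.
    """
    total = reads_cluster1 + reads_cluster2
    if total < min_reads:
        return None
    minor_fraction = min(reads_cluster1, reads_cluster2) / total
    if minor_fraction >= het_min_fraction:
        return "T/C"  # similar read numbers in both clusters
    return "T/T" if reads_cluster1 > reads_cluster2 else "C/C"


# Hypothetical read counts for four specimens.
for counts in [(1000, 5), (480, 520), (3, 900), (2, 1)]:
    print(counts, call_genotype(*counts))
```

Requiring a minimum fraction for the minor allele prevents a handful of erroneous or cross-contaminating reads from turning a homozygote into an apparent heterozygote.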

4.3 Results

Read processing and clustering delimitation

After demultiplexing, over 6 million paired-end reads were obtained. A total of 3.5 million (52.8%) were removed in the subsequent stages of bioinformatic processing, leaving a considerable proportion of reads for further analysis. The UNOISE software produced a total of three clusters (Table 4.1). Cluster 3 corresponds to bacterial contamination; it was found only in one museum negative extraction control, had the lowest number of total reads and also differed in length from the target fragment. For this reason, it was removed from further analyses.


Table 4.1 Total number of clusters, species identification and number of reads obtained.

Cluster ID  Species identification  Fragment size (bp)  Total number of reads
1           Ceratitis capitata      103                 2,177,987
2           Ceratitis capitata      103                 1,011,980
3           Ralstonia solanacearum  105                 71

The two other clusters were each represented by high read numbers, in an approximate 2:1 ratio across the whole data set. These two clusters were of the same sequence length (Table 4.1) and matched precisely with scaffold 242 (NW_004523739.1) of the C. capitata draft genome, which contains the Ccace2 gene (Papanicolaou et al. 2016). These two clusters, hereafter alleles 1 and 2 of the ace gene, differ at three nucleotide positions (Figure 4.5).


Figure 4.5 Alignment of the Ccace2 gene fragments and their respective genomic mapping to C. capitata. Dashed lines represent the nucleotide substitutions found in the amplified fragment.

Within each sampling group, the total number of reads obtained in the museum samples ranged from 0 to 10,235 (mean 2,699) and from 14 to 18,378 (mean 3,061) in the Pre-OP and Post-OP groups respectively, while the Kenyan samples ranged from 1 to 26,636 (mean 6,236). The modern samples collected in Colombia ranged from 0 to 18,907 reads per sample (mean 4,433).

The primers used in this study amplified the last part of exon 5 and the beginning of the downstream intron (Figure 4.2), a region which does not contain the non-synonymous resistance mutation (G328A). Further examination of the amplified sequence revealed putative splicing sites for the Ccace gene near the G328A site, in agreement with the splice-site predictions obtained using the Drosophila melanogaster and Caenorhabditis elegans models in the Berkeley Drosophila Genome Project and NetGene2 tools respectively (Brunak et al. 1991; Reese et al. 1997). These models confirmed the previously described location of exons 5 and 6 in Figures 4.2 and 4.6. However, these findings were inconsistent with Elfekih et al. (2014), which prompted me to contact the authors, who explained that there was a mistake in the published primer sequences and provided the published erratum (Elfekih et al. 2015).

Despite that, the analyses were continued because the amplicon was located in the enzyme’s active site. Thus, any nucleotide substitution in it could result in a modification of the acetylcholinesterase protein.

Once the exon-intron borders were defined, only 58 bp of the amplified fragment corresponded to the coding part of the acetylcholinesterase enzyme, including the active site. This coding part contained only one nucleotide substitution, which was synonymous (Figure 4.6 d). The other two polymorphisms were located in the downstream intron region. Nevertheless, further analyses were performed using the entire allele sequences, because of the stability of their transmission from one generation to the next.


Figure 4.6 Mutation in the Ccace2 gene. a) Genomic location of exons in scaffold 242; the bracket indicates the location where the primers aligned. b) The total length of the amplicon obtained and the specific location of the SNPs found. c) The alleles detected in all the sample types studied (Pre-OP, Post-OP and Modern); codons in light blue correspond to the exon, while codons in light red correspond to the intron. d) Predicted protein of the amplified fragment; the red letter indicates the silent nucleotide substitution.


Museum samples

A total of 41 sequences were recovered from the 42 pinned flies from which DNA was extracted. In the Pre-OP period (1904-1939) a total of 13 sequences were recovered, corresponding to specimens collected in Saint Helena (n = 1), Zimbabwe (n = 1), Ghana (n = 6), Libya (n = 3), Portugal (n = 1) and Spain (n = 1). In the Post-OP period (1957-1967), only 11 sequences were obtained, from flies collected in Saint Helena (n = 5), Zimbabwe (n = 1) and Portugal (n = 5).

In the Pre-OP period, Allele 2 was found in four of the six study sites; the highest frequencies were found exclusively in African sites, whereas Allele 1 dominated in the European countries (Figure 4.7). The heterozygous specimens were found in the centre of the sampling range.

Figure 4.7 Frequency of the Ccace2 alleles across study sites in the pre-organophosphate (Pre-OP) period.


The highest frequency in the Post-OP period was that of Allele 2 in Zimbabwe, but it was based on only one specimen. In contrast to the Pre-OP allele distribution, Saint Helena (Africa) had all three genotypes described, among which heterozygous individuals were the most frequent (Figure 4.8).

Figure 4.8 Frequency of the Ccace2 alleles across study sites in the Post-OP period.

Using samples pooled across the study groups, no heterozygote excess or deficiency was observed in the two OP groups (Table 4.2). These parameters suggested that both medfly groups were at Hardy-Weinberg equilibrium (HWE) for the nucleotide substitution found at position 829,280 in medfly scaffold 242.


Table 4.2 Allele frequency and Hardy-Weinberg equilibrium at the Ccace2 locus in C. capitata population groups based on their insecticide treatment history. n is the number of individuals analysed. Alleles represent the polymorphism identified.

         Genotypes (Ccace2)
Group    n   Cluster 1 (T/T)  Heterozygous (T/C)  Cluster 2 (C/C)  X² (df = 1)  P value
Pre-OP   14  5                5                   4                1.1137       0.2912
Post-OP  10  4                5                   1                0.0978       0.7544
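The Hardy-Weinberg test reported in Table 4.2 can be reproduced from the genotype counts with a short Python sketch. The function name is an assumption made for illustration; the calculation itself is the standard chi-square goodness-of-fit test with one degree of freedom, for which the P value can be obtained from the complementary error function without external libraries.

```python
import math


def hwe_chi_square(n_tt, n_tc, n_cc):
    """Chi-square test (df = 1) of Hardy-Weinberg proportions at a biallelic locus.

    Returns (chi2, p_value), or None when the sample is monomorphic.
    """
    n = n_tt + n_tc + n_cc
    p = (2 * n_tt + n_tc) / (2 * n)  # frequency of allele T
    q = 1 - p                        # frequency of allele C
    if p == 0 or q == 0:
        return None
    observed = (n_tt, n_tc, n_cc)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Genotype counts taken from Table 4.2 (Pre-OP and Post-OP groups).
for group, counts in {"Pre-OP": (5, 5, 4), "Post-OP": (4, 5, 1)}.items():
    chi2, p_value = hwe_chi_square(*counts)
    print(group, round(chi2, 4), round(p_value, 3))
```

Monomorphic samples return None because the expected counts of two genotype classes are zero, which corresponds to the "nd" entries reported for such populations.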

In addition, a total of 17 sequences were obtained from the pinned samples collected in Kenya (2001). Although they were treated as museum specimens for DNA extraction, their allele frequencies were incorporated in the analysis below as modern samples.

Modern samples

From 302 flies collected in 12 localities in Colombia, 299 sequences were obtained and divided into three groups: treated with organophosphate insecticide (OP), not treated with organophosphate insecticide (No-OP) and unknown insecticide treatment (Unk), which had 156, 122 and 21 specimens respectively. Additionally, the specimens from Kenya (2001) were incorporated in the last group for comparison purposes.


Figure 4.9 Frequency of the Ccace2 alleles across study sites in the Modern period. The graph is based on populations with ≥ 5 individuals in total. Site codes on the x axis refer to Table 4.3.

Allele 2 was found in six of the 12 study sites in Colombia, overall at low frequency across the three groups. In contrast, Allele 1 was present in nine of the 12 Colombian populations and reached its highest frequency in AB (83.3%). The same tendency was found in the African samples (KE), at 82.3% (Figure 4.9).

Most of the samples across the study sites in the departments of Nariño and Cundinamarca in Colombia showed no deviation from Hardy-Weinberg equilibrium (Table 4.3), regardless of the group to which they belonged. The only deviation from HWE was found at the AG site, from the non-insecticide group, although this result may be caused by the small sample size or by the exchange of migrants with nearby populations exposed to insecticides. A significant departure from Hardy-Weinberg equilibrium was observed in Kenya, which was associated with an excess of homozygotes (Table 4.3).


Table 4.3 Allele frequency and Hardy-Weinberg equilibrium at the Ccace2 locus in Ceratitis capitata populations collected in Colombia and Kenya. Location: sampling sites (reference Figure 4.4). Pesticide treatment refers to the specific chemical insecticide used or not as a treatment in the sample site. N: number of individuals studied. nd: not determined because the population is monomorphic. Significant deviations from HWE are marked with an asterisk (P value column).

                                                                               Genotypes (Ccace2)
Continent  Country   County        Location  Host plant  Pesticide treatment  N   Cluster 1 (T/T)  Heterozygous (T/C)  Cluster 2 (C/C)  X² (df = 1)  P value
America    Colombia  Imues         AS        Coffee      ORG                  70  22               34                  14               0.0172       0.8953
America    Colombia  Iles          TC        Coffee      ORG                  77  20               36                  21               0.323        0.5697
America    Colombia  Iles          AB        Coffee      ORG                  6   5                1                   0                0.0495       0.8237
America    Colombia  Imues         AJ        Coffee      ORG                  1   1                0                   0                nd           nd
America    Colombia  Imues         AX        Tangerine   ORG                  1   0                1                   0                nd           nd
America    Colombia  Iles          IC        Coffee      ORG                  1   1                0                   0                nd           nd
America    Colombia  Imues         PC        Tangerine   NO                   75  27               33                  15               0.7039       0.4014
America    Colombia  Funes         AM        Loquat      NO                   33  4                15                  15               0.0071       0.9326
America    Colombia  Imues         AG        Loquat      NO                   5   1                3                   0                5            0.0253*
America    Colombia  Iles          MC        Tangerine   NO                   9   0                4                   5                0.7346       0.3913
Africa     Kenya     -             -         -           UNK                  17  14               2                   1                9.9622       0.0015*
America    Colombia  Los Duraznos  CN        Peach       UNK                  20  9                11                  0                2.8775       0.0898
America    Colombia  Funes         AM        Coffee      UNK                  1   0                0                   1                nd           nd


4.4 Discussion

The results presented and analysed are based on data generated with a set of primers that amplified a fragment of the exon containing the target mutation but, unexpectedly, did not reach the mutation itself. Due to time limitations, a new sequencing run of the whole dataset using primers that amplify the G328A mutation could not be performed. Despite that, it is somewhat surprising that mutations upstream of exon 6 were found in the different groups analysed.

Museum samples

The success in amplifying a nuclear gene from museum specimens up to 113 years old suggests that pinned material permits temporal surveys of polymorphisms for a wide variety of traits in medfly. It is important to bear in mind that only ~100 bp were sequenced, but it is remarkable that the amplicon was produced in 97.6% of the specimens tested. These findings are also supported by the low number of reads found in negative extraction and PCR controls, which helped to rule out cross-contamination between samples.

A concern in using museum samples was the cumulative damage to the DNA, which can cause the insertion of incorrect bases during enzymatic amplification. Some authors have described miscoding lesions, usually causing C to T transitions (Sefc et al. 2007; Shapiro & Hofreiter 2012). This transition was part of allele 2 found in the museum samples of this study, but a substantial part of the modern natural samples also presented the same mutation. Hence, allele 2 is unlikely to be a DNA damage artefact. In this case, the modern samples functioned as a control and ensured that the genetic data obtained from the museum samples were reliable.

To date, a single study has shown that resistance to malathion (organophosphate insecticide) could be detected in pinned specimens of Lucilia cuprina and Lucilia sericata collected up to 1930. This result has many implications at the evolutionary level, one of the most relevant being that the pre-existing resistance alleles in these pinned specimens might not carry the fitness cost that is associated with new mutations (Ffrench-Constant 2007; Hartley et al. 2006).

The results of this study do not explain the findings described in Hartley et al. (2006); however, the presence of a synonymous mutation in the active site of acetylcholinesterase may have implications for the development of resistance to insecticides.

In the Pre-OP period, allele 2 (the rare allele) was present exclusively in the native populations of C. capitata (African countries), but in the Post-OP period this tendency changed, because the allele was found in all three populations belonging to this group. This finding, while preliminary, suggests that allele 2 (which contains the SNP) may have a potential link to a specific phenotype, such as resistance to insecticides, although a genetic association with the G328A mutation must first be validated.

Modern samples

The two alleles found among the 299 medflies collected in Colombia during this study had no causal link with resistance to OP. Although three different field treatments were compared in the present study, the results did not demonstrate significant differences among them based on HWE. In fact, the population that showed a significant deviation (AG) had not been treated with organophosphates. A possible explanation for this might be that it was one of the smallest populations in the HWE analysis (Hedrick 2011).

Another explanation could be that flies may have dispersed from areas with high insecticide pressure to untreated areas such as AG. It is known that C. capitata is a species in which the majority of adults do not move large distances, its average range being ~10 km (Gavriel et al. 2012). The AG sample site is located in Imues, where one of the most extensively treated sites (AS) was also sampled, within a radius of ~2 km of AG. In this context, a reduced availability of one host might trigger the dispersal of flies to another site. However, this is unlikely to have affected the results, because the sampling method implemented (picking ripe fruit and rearing the flies under laboratory conditions) considerably reduced the chance of sampling flies from other sites.

The Kenyan population showed a highly significant departure from HWE. Again, this might be an effect of sample size but, in contrast to the Colombian populations, these samples presented a high frequency of allele 1. A note of caution is due here, since the samples came from three different regions of Kenya (Central, Coast and West), which is considered the native region of the medfly (Chapter 3), and this probably led to a Wahlund effect (Garnier-Géré & Chikhi 2013). As a result, an excess of homozygotes was observed in this structured population, leading to a significant departure from HWE at the total population level.
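The Wahlund effect invoked above can be illustrated numerically with a short Python sketch. The subpopulation sizes and allele frequencies below are hypothetical and are not estimates for the Kenyan regions; the sketch only shows that pooling subpopulations which are each in Hardy-Weinberg equilibrium, but differ in allele frequency, produces fewer heterozygotes than expected from the pooled allele frequency.

```python
def pooled_heterozygosity(subpopulations):
    """Observed and expected heterozygosity after pooling subpopulations.

    subpopulations: list of (n_individuals, allele_1_frequency) pairs, each
    assumed to be in Hardy-Weinberg equilibrium internally.
    """
    total = sum(n for n, _ in subpopulations)
    mean_p = sum(n * p for n, p in subpopulations) / total
    observed = sum(n * 2 * p * (1 - p) for n, p in subpopulations) / total
    expected = 2 * mean_p * (1 - mean_p)
    return observed, expected


# Two hypothetical subpopulations with contrasting allele frequencies.
observed, expected = pooled_heterozygosity([(100, 0.9), (100, 0.1)])
print(round(observed, 3), round(expected, 3))
```

The deficit of heterozygotes grows with the variance in allele frequency among subpopulations, so an apparent excess of homozygotes in a pooled sample does not by itself indicate selection or inbreeding.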


Integration of historical and modern data in the study of resistance to insecticides in Medfly.

In the current study, the number of reads in the museum and the modern natural samples did not show relevant differences. However, the museum samples were amplified using a higher number of cycles and a nested PCR, compared with the single PCR performed on the modern natural samples. This might be the reason why the total number of reads was quite similar in the different groups.

On the other hand, this study found only a synonymous mutation at position 312 (Torpedo numbering), which defined two alleles across the whole dataset analysed. This phenomenon has been previously described in the medfly ace gene (Ccace2), with four synonymous mutations (Magaña et al. 2008), all of them different from the one found in this study.

A similar situation was also described in the Bactrocera dorsalis ace gene: the cDNA sequence derived from specimens resistant to fenitrothion (organophosphate insecticide) presented several silent nucleotide substitutions, but only three mutations resulted in codon changes (Hsu et al. 2006). The authors stated that the linked combination of the three non-synonymous mutations might affect the structure and the activity of the acetylcholinesterase enzyme.

Neither study explored the association between synonymous and non-synonymous changes, and how this allelic association may affect the activity of the AChE enzyme remains unknown. In this context, it is also essential to develop a better understanding of how these mutations may affect specific genes and their products (Hsu et al. 2008).


Among members of Diptera, the coding region of the ace gene is conserved, whereas extensive divergence is found at the intron level (Batterham et al. 1996). Evidence to establish a relationship between resistance and fitness cost in insects is notably lacking; however, it is believed that single mutations could entail a small fitness cost (Mutero et al. 1994). In this context, the combination of these single point mutations can be crucial in the generation of resistance to enzyme inhibitors (e.g. insecticides) (Walsh et al. 2001). For this reason, based on the findings of this study, which included individuals from three different continents, a further study of the genetic association between the synonymous mutation found here and the non-synonymous mutation downstream (G328A) is recommended.

A novel technique was implemented here that allows polymorphism to be surveyed rapidly in hundreds of medfly specimens at low cost. For museum collections, these findings suggest that pinned medfly specimens are a suitable source for insecticide resistance and population genetic studies, and that DNA extraction from the abdomen is an adequate method. These findings are promising for the wider use of this methodology in future large-scale applications.

Conclusions and Future directions

It is well understood that target-site resistance to insecticides is associated with point mutations that produce substitutions in the predicted amino acids (Ffrench-Constant 2013). However, it was not possible to assess the G328A mutation in this study; therefore, it is unclear whether the findings observed are directly associated with insecticide resistance.

This study has demonstrated a rapid and low-cost method based on high-throughput Illumina sequencing that is able to recover genetic variation across different sources of medfly specimens. Additionally, this study has confirmed that the unique tagging of both primers in highly degraded samples (museum collections) is possible and even convenient with regard to the output obtained. These findings suggest that this method may produce promising results in future studies validating other target-site mutations, such as those in GABA receptor or voltage-sensitive sodium channel (VSSC) genes.

On the other hand, the main weakness of this study was that only one fragment of the Ccace2 gene was amplified. Notwithstanding this limitation, the methodology described makes it possible to explore other mutations in the gene among the multiple samples collected. If other mutations appear in the gene, a linkage disequilibrium analysis in association with the described point mutation G328A will be needed.

The mutation is located in the active site of the acetylcholinesterase enzyme, near the G328A mutation. Nonetheless, the two sites are separated by an intron that is larger than the mean intron length described for the whole medfly genome (Papanicolaou et al. 2016). If the mutational event in medfly occurred recently, this long distance might be helpful, because a large region of linkage disequilibrium would then be expected.

Thus, the museum sample set will be informative about pre-adaptation, that is, the presence of this mutation before insecticide pressure started. The Kenyan and Colombian samples will also help to understand potential changes in these populations, specifically in the linkage disequilibrium analysis. The non-native population of Colombia is known to have arisen relatively recently (Chapter 3); this might increase the strength of the linkage disequilibrium expected there compared to the native population (Kenya), as an effect of migration rate.
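
As an illustration of the kind of linkage disequilibrium analysis proposed here, the following minimal Python sketch computes the coefficient D and the squared correlation r² between two biallelic sites from phased haplotype counts. The function name, the allele labels and the example counts are hypothetical and are not data from this study; the sketch also assumes that haplotype phase is known.

```python
def ld_from_haplotype_counts(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) for two biallelic sites from phased haplotype counts.

    A/a are the alleles at the first site and B/b those at the second site.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total      # frequency of allele A
    p_B = (n_AB + n_aB) / total      # frequency of allele B
    d = p_AB - p_A * p_B             # deviation from independence
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d, d * d / denom


# Hypothetical counts: the two sites are always inherited together.
d, r2 = ld_from_haplotype_counts(30, 0, 0, 70)
print(round(d, 4), round(r2, 4))
```

An r² close to 1 would indicate that one site predicts the other almost perfectly, which is the situation that would allow the synonymous mutation to act as a proxy for G328A.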

Currently, the correct primers for the G328A mutation (Elfekih et al. 2014) have been preliminarily tested in a small subsample of this study using Sanger sequencing. Because the sequences obtained with this method were of low quality, and because sequence quality is crucial for determining the point mutation, I decided to process the samples on an Illumina platform; they are expected to be sent for sequencing in February/March 2018 for further analysis.

CHAPTER 5

Complete mitochondrial genome and molecular phylogeny of three species of Anastrepha

5.1 Introduction

Anastrepha (Diptera: Tephritidae) is the most economically important genus of fruit flies in the Americas, distributed from the southern United States (Florida) to South America, with no occurrence in Chile (Hernández-Ortiz & Aluja 1993; Norrbom 2012).

Currently, at least seven species of Anastrepha are considered major economic pests because of the severe damage they can cause to a wide variety of fleshy fruits. These species are Anastrepha fraterculus, Anastrepha grandis, Anastrepha ludens, Anastrepha obliqua, Anastrepha serpentina, Anastrepha striata and Anastrepha suspensa (Norrbom 2010). A. fraterculus is still recognised taxonomically as a species complex. Around 29 species currently make up the fraterculus group, some of which are recognised as cryptic species complexes (Caceres et al. 2009; Hernández-Ortiz et al. 2012).

In addition, Anastrepha distincta has been considered a pest of secondary importance because it has only been associated with wild species of the genus Inga (Oropeza et al. 2008). However, in recent years there has been an increase in reports of sporadic infestations of economically important fruits such as oranges and mangoes by this fly; for this reason, it is currently considered a pest of several commercial fruits (Norrbom 2012).

Despite their relevance in agriculture and the great diversity of these fruit flies, evolutionary and genetic studies of Anastrepha spp. have been limited, and the relationships among these flies remain poorly understood (Mengual et al. 2017). Variation in the COI gene is known not to be a reliable means of distinguishing between all flies of the fraterculus group (Frey 2013); as a consequence, other molecular diagnostic methods are required. One option is to use the complete mitochondrial genome (mitogenome), which has become established as one of the essential resources for marker design and has been used for comparative studies in phylogeny, phylogeography, diagnostics and molecular evolution. To date, the only complete Anastrepha mitogenome available is that of A. fraterculus (Isaza et al. 2017).

Given the limited genetic data available for these species, I used next-generation sequencing (NGS) to obtain the complete mitochondrial genomes of three species of fruit flies in the genus Anastrepha (A. fraterculus, A. striata and A. distincta). The mitogenome data were then combined with those of other Tephritidae species to infer the phylogenetic relationships within the family.

5.2 Material and Methods

Sample collection and DNA extraction

This study used specimens of Anastrepha sp. collected in the Department of Nariño, Colombia as described in Chapter 5. The specimens were morphologically identified using taxonomic keys (Norrbom et al. 2012).

Genetic Analyses

Genomic DNA extraction was carried out as in Chapter 3 for each fly specimen. To corroborate the identity of each fly species, the cytochrome c oxidase I (COI) barcode region was amplified. This approach was selected so that the COI sequences could be used as a “bait” to link each assembled mitogenome to a specific fly species. A 420 bp fragment of the COI gene was amplified using the primers III_B_F and Fol_degen_rev (Supplementary Table 5.1), which had been modified to include Nextera tags and Illumina tails. PCR was conducted in a 25 µl reaction volume, with 1 µl genomic DNA, 0.1 mM dNTPs, 0.5 U/µM BIOTAQ DNA Polymerase, 3 mM MgCl2, and 0.3 µM forward and reverse primers. The PCR program included an initial denaturation step of 94°C for 4 min, followed by 40 cycles of denaturation at 94°C for 30 s, annealing at 48°C for 30 s and extension at 72°C for 45 s, and a final extension at 72°C for 7 min. To confirm that amplification was successful, all PCR products were visualised using GelRed™.

As described in Chapter 4, the amplicons, sorted into three classes of intensity, were cleaned up using Agencourt AMPure XP SPRI beads and quantified using a Qubit 2.0 Fluorometer with the Qubit dsDNA HS Assay kit. Thereafter, the PCR products were pooled into 96 separate libraries in equimolar proportions to ensure that potentially weak products were not overwhelmed by stronger ones during sequencing. Library preparation was conducted with the Nextera XT kit and the libraries were sequenced on a HiSeq 2500 lane (300 bp paired-end) at the Earlham Institute, Norwich, UK.
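
The equimolar pooling step can be sketched as follows. This is a minimal Python illustration, not the procedure actually used in the laboratory: it assumes the standard approximation of 660 g/mol per base pair of double-stranded DNA, and the function names, sample names, concentrations, amplicon length and target amount are all hypothetical.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA


def molarity_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA concentration (ng/µl) to nanomolar for a given length."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def equimolar_volumes(concs_ng_per_ul, length_bp, fmol_per_sample):
    """Volume (µl) of each product that contributes the same number of molecules."""
    volumes = {}
    for name, conc in concs_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"{name}: concentration must be positive")
        # 1 nM equals 1 fmol/µl, so volume = amount / molarity
        volumes[name] = fmol_per_sample / molarity_nM(conc, length_bp)
    return volumes


# Hypothetical Qubit readings for a strong and a weak PCR product.
readings = {"strong_band": 6.6, "weak_band": 0.66}
for sample, vol in equimolar_volumes(readings, 1000, 20).items():
    print(sample, round(vol, 2), "µl")
```

A weak product therefore contributes a proportionally larger volume to the pool, which is what keeps it from being swamped by stronger products during sequencing.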

Because the specimens belong to closely related species, three separate libraries were prepared, each as an equimolar pool with other species, to reduce sequencing cost. Library preparation and sequencing were conducted on a HiSeq 2500 lane (300 bp paired-end) at the Earlham Institute, Norwich, UK.

Bioinformatics Pipeline

COI Barcode processing

The COI sequences were demultiplexed by the sequencing facility into 96 libraries; these were then further demultiplexed to individual samples using Cutadapt v.1.12 (Martin 2011), allowing for one-base mismatches. The primers were trimmed using “fastx_trimmer” (http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastx_trimmer_usage). PEAR was then used to merge forward and reverse reads (Zhang et al. 2014b). The merged reads were quality filtered using “fastq_filter” in USEARCH v.7.0 with a maximum total expected error of 0.5 (Edgar 2010). Finally, to select a single read as the putative barcode sequence for each sample, each read was BLASTed against the Tephritidae nucleotide database (Altschul et al. 1990) in order to remove possible contaminant reads (e.g. bacterial or fungal) and to identify the sequences of the three Anastrepha species. The remaining reads were then clustered at 0.9 identity. A semi-automated bioinformatic pipeline was created in Perl to run the above-described steps (NAPtime script, Creedy, unpublished). The resulting barcodes were loaded into Geneious v7.0.1 for visualisation and manual editing where required.
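
The expected-error criterion used in the quality filtering step can be illustrated with a short Python sketch. It is not the USEARCH implementation or part of the NAPtime script; it only reproduces the underlying calculation, in which each Phred score Q contributes an error probability of 10^(-Q/10) and a read is kept if the sum does not exceed the threshold. The function names are hypothetical, and Phred+33 quality encoding is assumed.

```python
def expected_errors(quality_string, offset=33):
    """Sum of per-base error probabilities for a FASTQ quality string."""
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality_string)


def passes_filter(quality_string, max_ee=0.5, offset=33):
    """True if the read's total expected error does not exceed max_ee."""
    return expected_errors(quality_string, offset) <= max_ee


high_quality = "I" * 100   # 100 bases at Q40 (assuming Phred+33)
low_quality = "+" * 6      # 6 bases at Q10 (assuming Phred+33)
print(passes_filter(high_quality), passes_filter(low_quality))
```

Because the criterion sums over the whole read, a long read of moderate quality can fail even if no individual base is poor.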

Mitogenome sequencing and analysis

The quality of the FASTQ files for each library was assessed using FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/), and any remaining adapter sequences were trimmed using Trimmomatic v0.33 (ILLUMINACLIP:2:30:10) (Bolger 2014). Putative mitochondrial reads were extracted by running a BLASTn search (Altschul et al. 1990) against a reference library of Tephritidae mitogenomes downloaded from NCBI GenBank (E ≤ 1e-5; maximum target sequences 1; DUST filtering disabled). Reads matching Tephritidae mitochondrial sequences were retained for de novo assembly using SPAdes v3.1 with the meta option (k values 21, 33, 55, 77, 99 and 127) (Bankevich et al. 2012). The resulting contigs were loaded into Geneious and baited with the COI sequences to identify the corresponding species. The contigs were annotated in MITOS (Bernt et al. 2013), followed by manual validation of the protein-coding genes (PCGs) in Geneious. The tRNAs were detected using ARWEN (Laslett & Canback 2008).
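
The read-extraction step can be sketched in Python as below. This is an illustration only: it assumes that the BLASTn results were written in the default 12-column tabular format (query identifier in the first column, E-value in the eleventh), which is not stated in the text, and the function name and example lines are hypothetical.

```python
def mitochondrial_read_ids(blast_lines, max_evalue=1e-5):
    """Collect query IDs whose hit has an E-value at or below max_evalue.

    Assumes 12-column tabular BLAST output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    keep = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:
            keep.add(fields[0])
    return keep


# Hypothetical hits: one strong match and one weak match.
example = [
    "read1\tref1\t98.0\t150\t3\t0\t1\t150\t200\t349\t1e-60\t250",
    "read2\tref1\t80.0\t40\t8\t0\t1\t40\t10\t49\t0.002\t35",
]
print(sorted(mitochondrial_read_ids(example)))
```

The resulting set of identifiers could then be used to pull the matching read pairs out of the FASTQ files before assembly.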

Phylogenetic analyses

The sequences of the 13 protein-coding genes (PCGs) were used in the phylogenetic analysis. Each gene was aligned in MAFFT using the FFT-NS-2 method and default settings (Katoh et al. 2017). The alignment of each gene was imported into Geneious for evaluation and curation where necessary. The 13 gene alignments were concatenated using seqCat.pl (https://www.uni-oldenburg.de/ibu/systematik_evolutionsbiologie/programme/#Sequences). The maximum likelihood (ML) analysis was conducted with RAxML 8.2.10 (Stamatakis 2014) with 1,000 bootstrap replicates, using the rapid bootstrap feature (random seed value 12345).
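
The concatenation of per-gene alignments into a single supermatrix can be illustrated with the Python sketch below. It is not a reimplementation of seqCat.pl; in particular, padding a taxon that is missing from one gene with gap characters is an assumption made here for illustration, and the function name and toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments):
    """Join per-gene alignments (dicts of taxon -> aligned sequence) end to end."""
    taxa = sorted(set().union(*gene_alignments))
    parts = {taxon: [] for taxon in taxa}
    for alignment in gene_alignments:
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene alignment differ in length")
        width = lengths.pop()
        for taxon in taxa:
            # Assumption: a taxon absent from this gene is filled with gaps.
            parts[taxon].append(alignment.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy example with two genes and two taxa.
genes = [{"sp1": "ATG", "sp2": "ATA"}, {"sp1": "TTAA"}]
print(concatenate_alignments(genes))
```

Keeping the taxon order and the gene order fixed matters, because the positions in the concatenated matrix must correspond across all taxa for the ML analysis.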

5.3 Results

Sequencing and assembly

Three libraries were sequenced. For A. fraterculus, a total of 8,330,980 paired-end reads were obtained, of which 273,305 pairs showed high similarity to Tephritidae mitochondrial sequences. Assembly produced 12 contigs of 10,000 bp or more, the longest of which represented the complete mitogenome. For A. striata, a total of 7,413,896 paired-end reads were obtained, of which 105,599 pairs showed high similarity to Tephritidae mitochondrial genomes. Only seven contigs longer than 10,000 bp were obtained, the longest of which represented the complete mitogenome. Finally, the A. distincta library had a total of 16,938,511 paired-end reads, of which 1,180,762 pairs showed high similarity to Tephritidae mitochondrial sequences, giving 14 contigs longer than 10,000 bp, the longest of which represented the complete mitogenome. The nucleotide composition of the three species is summarised in Table 5.1.

Table 5.1 Nucleotide composition of the three Anastrepha mitogenomes.

Species                  A %    T %    G %    C %    G+C %
Anastrepha fraterculus   41.3   36.1   9.0    13.6   22.6
Anastrepha striata       40.5   35.5   9.1    14.8   24.0
Anastrepha distincta     41.0   35.9   9.1    14.0   23.1
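
The percentages in Table 5.1 follow from simple base counts. The Python sketch below shows one way such a composition could be computed from a mitogenome sequence; it is a generic illustration rather than the tool used in this study, the function name is hypothetical, and ambiguous bases are simply ignored, which is an assumption.

```python
from collections import Counter


def base_composition(sequence):
    """Percentage of A, T, G and C, plus G+C, ignoring ambiguous bases."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    composition = {base: 100 * counts[base] / total for base in "ATGC"}
    composition["G+C"] = composition["G"] + composition["C"]
    return composition


# Toy sequence, not a real mitogenome.
print(base_composition("AAAATTTGCC"))
```

The four base percentages should sum to 100 and G+C should equal the sum of the G and C columns, which gives a quick consistency check on each row of the table.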

Mitogenome features

The three complete mitogenomes obtained from the Anastrepha species were similar in length. The longest was that of A. fraterculus (16,936 bp), followed by A. striata (16,613 bp) and A. distincta (16,418 bp). The three mitogenomes presented the gene composition and arrangement typical of metazoans. Each mitochondrial genome contained a set of 37 genes, comprising 13 protein-coding genes (PCGs), two ribosomal RNA (rRNA) genes and 22 transfer RNA (tRNA) genes, plus a control region (Figure 5.1). The major strand carried nine PCGs (NAD2, COI, COII, ATP8, ATP6, COIII, NAD3, NAD6, CYTB), 14 tRNAs (trnI(gat), trnM(cat), trnW(tca), trnL(taa), trnK(ctt), trnD(gtc), trnG(tcc), trnA(tgc), trnR(tcg), trnN(gtt), trnS1(gct), trnE(ttc), trnS(tga)) and the control region. The minor strand carried four PCGs (NAD5, NAD4, NAD4L, NAD1), eight tRNAs (trnQ(ttg), trnC(gca), trnY(gta), trnF(gaa), trnH(gtg), trnP(tgg), trnL(tag), trnV(tac)) and two rRNAs (rrnS and rrnL). Mitochondrial genome maps for each species are shown in Figures 5.1 to 5.3.

Figure 5.1 Mitochondrial genome map of Anastrepha fraterculus. Arrows indicate the orientation of gene transcription. The PCGs are shown in yellow, the tRNAs as small pink arrows, the rRNAs in red and the control region (non-coding region) in blue.

Figure 5.2 Mitochondrial genome map of Anastrepha striata. Arrows indicate the orientation of gene transcription. The PCGs are shown in yellow, the tRNAs as small pink arrows, the rRNAs in red and the control region (non-coding region) in blue.

Figure 5.3 Mitochondrial genome map of Anastrepha distincta. Arrows indicate the orientation of gene transcription. The PCGs are shown in yellow, the tRNAs as small pink arrows, the rRNAs in red and the control region (non-coding region) in blue.

Among the three species, most of the PCGs started with an ATN codon. The most common start codon among the three flies was ATG (COII, COIII, NAD4L, CYTB and NAD1), followed by ATA (NAD3, NAD4, NAD6). ATP8 started with ATC in A. striata and A. distincta, whereas in A. fraterculus it started with ATT. The species with the most differences in start codon was A. striata, in which NAD5, NAD4 and NAD1 started with ATA, ATG and ATT respectively (Table 5.2). In all three flies the COI gene was an exception, starting with a TCG codon.

Table 5.2 Characteristics of the 13 protein-coding genes in the three species of Anastrepha. Start codons that differ across the samples are highlighted in light red. Stop codons that are exceptions among the samples are highlighted in light blue.

       A. fraterculus                        A. striata                            A. distincta
Gene   Location      Size (bp)  Start  Stop  Location      Size (bp)  Start  Stop  Location      Size (bp)  Start  Stop
NAD2   265-1293      1029       ATT    TAG   222-1250      1029       ATT    TAA   265-1293      1029       ATT    TAA
COI    1713-3251     1539       TCG    TAA   1646-3184     1539       TCG    TAA   1691-3229     1539       TCG    TAA
COII   3397-4119     723        ATG    TAA   3278-4000     723        ATG    TAA   3394-4116     723        ATG    TAA
ATP8   4248-4409     162        ATT    TAA   4139-4300     162        ATC    TAA   4250-4411     162        ATC    TAA
ATP6   4445-5080     636        ATT    TAA   4336-4971     636        ATT    TAA   4447-5082     636        ATT    TAA
COIII  5090-5878     789        ATG    TAA   4981-5769     789        ATG    TAA   5092-5880     789        ATG    TAA
NAD3   5971-6327     357        ATA    TAA   5855-6211     357        ATA    TAA   5971-6327     357        ATA    TAA
NAD5   6978-8726     1749       ATT    TAA   6841-8562     1722       ATA    TAA   6979-8727     1749       ATT    TAA
NAD4   8816-10138    1323       ATA    TAA   8678-10000    1323       ATG    TAA   8817-10139    1323       ATA    TAA
NAD4L  10150-10446   297        ATG    TAA   10012-10308   297        ATG    TAA   10151-10447   297        ATG    TAA
NAD6   10585-11106   522        ATA    TAA   10447-10968   522        ATA    TAA   10583-11107   525        ATA    TAA
CYTB   11111-12247   1137       ATG    TAG   10973-12109   1137       ATG    TAA   11112-12248   1137       ATG    TAG
NAD1   12330-13268   939        ATG    T     12110-13129   1020       ATT    TAG   12331-13269   939        ATG    T

The commonest stop codon was TAA, found in 10 PCGs (COI, COII, ATP8, ATP6, COIII, NAD3, NAD5, NAD4, NAD4L, NAD6). The remaining genes stopped with TAG, except for NAD1 in A. fraterculus and A. distincta, which had an incomplete stop codon T (Table 5.2; details in Supplementary Table 5.1, 5.2 and 5.3).
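
The gene sizes and start codons reported in Table 5.2 can be cross-checked with a few lines of Python. The sketch below is only an illustration of that check: the function names are hypothetical, and the rows are a subset of the A. fraterculus values copied from the table.

```python
def gene_size(start, end):
    """Length in bp of a feature given inclusive 1-based coordinates."""
    if end < start:
        raise ValueError("end coordinate precedes start coordinate")
    return end - start + 1


def is_atn_start(codon):
    """True for ATA, ATC, ATG or ATT."""
    codon = codon.upper()
    return len(codon) == 3 and codon[:2] == "AT" and codon[2] in "ACGT"


# (gene, start, end, reported size, start codon) for A. fraterculus, Table 5.2
rows = [
    ("NAD2", 265, 1293, 1029, "ATT"),
    ("COI", 1713, 3251, 1539, "TCG"),
    ("ATP8", 4248, 4409, 162, "ATT"),
    ("NAD1", 12330, 13268, 939, "ATG"),
]
for gene, start, end, reported, codon in rows:
    print(gene, gene_size(start, end) == reported, is_atn_start(codon))
```

In this subset every reported size matches its coordinates, and COI is the only gene whose start codon falls outside the ATN pattern.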

The 22 tRNAs typical of insect mitogenomes were found in the three species. The shortest tRNA was trnR(tcg) in A. striata, at 64 bp, and the longest was trnV(tac), at 72 bp in all three species (details in Supplementary Table 5.1, 5.2 and 5.3). Only one atypical cloverleaf structure was found among the three flies: trnS1(gct) (serine) lacked the dihydrouridine (DHU) arm (Figure 5.4).

Figure 5.4 Atypical tRNA cloverleaf structure of trnS1(gct) found in the three species studied.

Phylogenetic relationship

The dataset used to build the phylogenetic tree was based on the 13 protein-coding genes (all three codon positions included) and comprised 11,667 nucleotides. Figure 5.5 shows the phylogenetic tree for the species in the present work, with Drosophila melanogaster and Drosophila suzukii as outgroups. The topology obtained from the ML analysis presented two major clades. The first includes the Procecidochares and Ceratitis species, and the second is formed by Anastrepha and Bactrocera, which are placed as sister groups with moderate support (ML: 79). The two A. fraterculus mitogenomes, one from Isaza et al. (2017) and the other described in this study, form a clade with A. distincta as the sister group. A. striata is placed as the sister species of this fraterculus group; all of the branches in this clade are highly supported (ML: 100).

Figure 5.5 Phylogenetic tree of the family Tephritidae based on mitochondrial genomes. Maximum likelihood analysis (RAxML) based on the 13 protein-coding genes of tephritid fruit flies, with two Drosophila species used as outgroups. Bootstrap support values are shown at the nodes (left).

5.4 Discussion

This study is the first report in which the complete mitochondrial genomes of three Anastrepha species are presented. The mitochondrial genome of insects provides valuable information for phylogenetic and evolutionary studies (Cameron 2014). Additionally, it can be used to establish species-specific markers, which are currently needed for the Anastrepha group (Hendrichs et al. 2015).

The three mitogenomes share a similar gene composition, organisation and codon usage. The mitochondrial genomes of A. fraterculus, A. striata and A. distincta are closed circular molecules of 16,936 bp, 16,613 bp and 16,418 bp respectively, all of them longer than the other 20 tephritid mitogenomes currently available (Supplementary Table 5.5).

The mitogenome previously described by Isaza et al. (2017) for A. fraterculus was 16,739 bp, shorter than the one found in this study. When the two mitogenomes were compared, the lengths of the 13 PCGs, tRNAs and rRNAs were similar. The increase in length could therefore be attributed to the control region, which increased from 1182 bp to 1382 bp in our result.

The first in-frame codon of the COI gene, TCG, serves as the start codon in the three mitogenomes. The same start codon has been found in all the mitogenomes obtained in other tephritid studies (Yong et al. 2015; Yong et al. 2016), and this atypical start codon has also been identified in the mitochondrial sequences of other insects (Beckenbach & Stewart 2009). The ATC start codon of the atp8 gene, described here in A. striata and A. distincta, has previously been reported only in D. longicornis and B. oleae, whereas the ATT start codon found in A. fraterculus is widely distributed among the described tephritid mitogenomes (Jiang et al. 2016). The most common stop codons were TAA and TAG, in agreement with those found in other fruit flies such as Bactrocera zonata, Bactrocera minax and D. longicornis (Choudhary et al. 2015; Jiang et al. 2016; Zhang et al. 2014a). The incomplete stop codon T was found only at the NAD1 gene, in A. fraterculus and A. distincta; it can be completed to TAA by post-transcriptional polyadenylation (Ojala et al. 1981) and has been described in 18 of the 20 available tephritid mitogenomes.

Three main clusters of tRNAs were found in the Anastrepha sp. mitogenomes (Figures 5.1 to 5.3): 1) I-Q-M; 2) W-C-Y; and 3) A-R-N-S1-E-F. These clusters are also observed in other insect mitogenomes. The only unusual cloverleaf structure was the lack of the DHU arm in the trnS1(gct) gene; however, this exception is also known from other metazoan mitogenomes (Cameron 2014; Juhling et al. 2012).

In the present study, the genus Anastrepha is placed as the sister group of the genus Bactrocera with moderate support (ML: 79), whereas Ceratitis grouped with Procecidochares, at a distance from the studied group. Similar results were obtained using the 13 PCGs in a Bayesian approach but, in contrast to this study, with high posterior probabilities (ML:100) (Su et al. 2017). A similar pattern was also described using a COII fragment and the neighbour-joining method in six genera of the family Tephritidae, where Anastrepha, Toxotrypana and Rhagoletis were included in the subfamily Trypetinae, and Ceratitis was located in a different subfamily (Fernández et al. 2004). Unfortunately, this study was unable to recover the Trypetinae cluster because of the absence of samples from the genera Toxotrypana and Rhagoletis. The monophyly of the fraterculus group (A. fraterculus and A. distincta), with A. striata as its sister group, has been described using only a COII gene fragment in samples collected in Ecuador, although with lower bootstrap support (Ludeña et al. 2010). A similar tendency was described in the phylogeny published for the tribe Toxotrypanini (Mengual et al. 2017), which used six DNA regions (three mitochondrial and three nuclear genes) in a total of 146 species. The authors described a monophyletic A. striata group with high support; however, the support values decrease where it is placed as the sister branch of the Anastrepha group. The bootstrap support throughout the Anastrepha sp. cluster was 100 in this study, far above the values observed by Ludeña et al. (2010) and Mengual et al. (2017). The greater number of genes used in the phylogenetic analysis might explain the high support values found in this study. Mengual et al. (2017) obtained poorly resolved relationships among the species of the Toxotrypanini tribe, although some exceptions were found in the Anastrepha group. They suggested that filling in missing sequences and adding new DNA regions could further resolve the relationships among the species, as shown in this study.

Conclusions and Future directions

The purpose of the current study was to sequence the whole mitogenomes of A. fraterculus, A. striata and A. distincta using a next-generation sequencing platform. The results show that the three species have mitogenome features similar to those of other tephritid fruit flies. They also support the previously described phylogenetic relationship between Anastrepha and Bactrocera; moreover, the bootstrap values for the relationship between the Anastrepha group (A. fraterculus and A. distincta) and its sister group A. striata are the highest. These flies are pests with a detrimental effect on the economies of countries in the Western world. For this reason, and based on the findings of my study, I suggest increasing efforts to identify new molecular markers based on mitochondrial genes, which would allow fundamental differences between species in the genus Anastrepha to be identified and thus bring new insights into the development and implementation of more efficient pest control protocols.

CHAPTER 6

General Discussion

Thesis Overview

The primary aim of this study was to determine the current dispersal patterns of Ceratitis capitata and to examine the characteristics that confer invasiveness on this species, using molecular and novel statistical approaches. The investigation contributes to the development of evidence-driven pest management protocols, especially in the Americas, and addresses questions regarding the connectedness of pest-occupied sites and the evaluation of genes associated with insecticide resistance. The strategy involved compiling occurrence data from several open-access sources to identify the potential change in habitat distribution under climate change. Once critical regions were identified, the global dispersal patterns of medfly were investigated using molecular genetic approaches. Insecticide resistance, another economic problem associated with invasive species, was evaluated using genetic analysis. Finally, the shortage of genomic information for tephritid species in South America, a region that is intensively affected by the medfly and related tephritids, motivated the sequencing of the complete mitogenomes of Anastrepha spp. Here I provide a summary and discussion of my findings.

The modelling simulations for medfly performed in Chapter 2 predicted that climate change would alter its areas of suitable habitat, with some areas predicted to remain stable while others show a range expansion. As a consequence, the increase in suitable habitat will likely make it necessary to maintain and even expand strict border surveillance and quarantine protocols, because more suitable habitat could support connectivity among distant populations. In this study, the predictions suggest an increase in connectivity between Central and South America through the intermountain basins of Colombia (these zones are the main fruit producers in the country). Commercial trade between these regions usually involves transportation by truck, which facilitates the free movement of the medfly across countries and increases the risk of invasion from countries such as Ecuador and Peru into regions further north.

The potential habitat modelling in this study supports previous predictions of the effects of climate change, showing a poleward shift of habitat that even reaches countries with internationally recognised, successful medfly eradication programmes, such as Chile. This is the first time that the impact of climate change on medfly has been modelled at a global scale. These results can contribute to the development of new trade protocols among countries in order to prevent medfly expansion.

The history of the colonisation of the medfly is well documented, and the results obtained in Chapter 3 support it. High levels of genetic diversity were found in the native Afrotropical region, with lower diversity in the introduced ranges. This study supports an East or South African origin for C. capitata, in accordance with previous studies. The demographic history analysis of the Afrotropical region dated the most recent common ancestor of medfly to 11,600 BP, at the onset of the Holocene. The migration route analysis placed the root for medfly in Kenya at around 10,500 BP. Both analyses thus estimated similar dates for the origin of medfly, coinciding with major climatic change in Africa as well as with the first recorded traces of human movement and plant domestication in the region. Regarding the migration analysis, I identified a lack of connectivity between some adjacent regions, e.g. Spain and Greece, which may indicate that quarantine measures are broadly successful. However, in other regions I found migration among populations, which reflects a lack of quarantine protocols and had previously been described only in the native region of medfly (Karsten et al. 2015).

My results suggest a unidirectional connection between South America and Central America, in which flies from Colombia are able to settle in Guatemala. This situation reflects the pest control efforts that each country has implemented. Guatemala has a well-known system for intercepting and reducing medfly populations (the Moscamed programme), whereas Colombia controls the fly only with insecticides. This evidence of movement does not bode well for the fruit industry in Central and South American countries where various attempts to establish the Sterile Insect Technique (SIT) have failed because of political or economic disturbances.

Insecticide resistance was another economic problem associated with the medfly, which I addressed in Chapter 4 using genetic approaches. I developed a low-cost genotyping methodology that works with DNA from museum samples (usually degraded) as well as from modern specimens. Despite the low quality of the DNA obtained from the museum specimens, good sequences of the target amplicon were obtained. Additionally, the bioinformatic analysis makes it possible to identify homozygous and heterozygous specimens in these sets of samples.
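
One simple way to separate homozygous from heterozygous specimens in amplicon data is to look at the fraction of reads supporting each allele at the variable site. The Python sketch below illustrates that idea only; it is not the analysis used in Chapter 4, and the function name, the minimum depth and the allele-fraction threshold are hypothetical values chosen for the example.

```python
def call_genotype(ref_reads, alt_reads, min_depth=10, het_min_fraction=0.2):
    """Classify a biallelic site from the read counts of its two alleles."""
    depth = ref_reads + alt_reads
    if depth < min_depth:
        return "no_call"            # too few reads to decide
    alt_fraction = alt_reads / depth
    if alt_fraction < het_min_fraction:
        return "hom_ref"            # alternative allele absent or at noise level
    if alt_fraction > 1 - het_min_fraction:
        return "hom_alt"            # reference allele absent or at noise level
    return "het"                    # both alleles well represented


# Hypothetical read counts for four specimens.
for counts in [(50, 0), (0, 50), (26, 24), (3, 2)]:
    print(counts, call_genotype(*counts))
```

For degraded museum DNA, thresholds of this kind would need to be tuned, since low template amounts can skew allele fractions and mimic homozygosity.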

Regarding the sequence, a known SNP associated with insecticide resistance in medfly corresponds to the G328A amino acid change in the Ccace2 protein. The primers used in this study were previously described by Elfekih et al. (2014) (Vogler’s lab group). However, after sequencing the expected fragment, I realised that the amplicon did not contain the mutation site (exon 6). Instead, a fragment of similar size was obtained, located at the end of exon 5. In this sequence, I found a synonymous mutation at position 312 (alanine) of the Ccace2 protein, in addition to two SNPs in the adjacent intron that are in perfect linkage disequilibrium with it. Further studies are required to evaluate a potential link between this synonymous mutation and insecticide resistance, and to test whether the three mutations are also in linkage disequilibrium with the critical G328A mutation, which is separated from them by >10 kb in the genome map.

In Chapter 5, the complete mitochondrial genomes of A. fraterculus, A. striata and A. distincta were sequenced. This study has identified the longest mitochondrial genomes among the 20 complete mitogenomes available for Tephritidae. The three mitogenomes share a similar gene composition, organisation and codon usage. In the Americas, C. capitata shares habitat with the South American fruit fly (Anastrepha sp.). Both produce economic losses in crop production and, as a consequence, their presence in the region reduces the possibilities for access to global markets because of quarantine measures. Full mitochondrial genomes will be useful for detection and species differentiation, and could provide insights into the phylogeny and population structure of these species, which could help to implement control strategies in the region.

Future experiments

This study has raised further questions related to predictions of the potential distribution of Tephritidae in the context of global warming and to population genetic analysis. To address these questions, I outline the following experiments.

In the field of modelling, several other methods are available to model the potential distribution of medfly. The medfly occurrence dataset was expanded here, providing an invaluable resource for exploring other modelling methods. Because the physiology of medfly is well described across the world, it is plausible to implement Ecological Niche Factor Analysis (ENFA) to identify the minimum set of climatic variables related to the medfly’s occurrence. As a consequence, the potential distribution model could be more accurate (i.e. the relationship between species and environment is preserved during projections) than the maximum entropy model, which tended to overestimate its predictions. In addition, given the agricultural implications of Tephritidae species at a global level, it is necessary to incorporate new indices that would improve the projections: for instance, fruit production or exports at a global level and, at a local level, an index based on means of transport and their potential to carry fruit from one region to another. The potential distribution under future global warming scenarios would thus be more realistic and more useful to incorporate into management and control programmes.

Here, the most basic molecular marker provided novel information regarding demographic inferences for medfly. For this reason, the approach used in this study to calculate the corrected mutation rate of medfly could be included in future molecular analyses that use more molecular markers. However, the study of medfly populations still lacks detailed surveillance sampling at a worldwide level based on multi-locus genome sequencing (e.g. Restriction site Associated DNA Sequencing, RAD-Seq), which would provide valuable information to infer the complete picture of the invasion pathways, essential to incorporate into management strategies.

Additionally, these genetic analyses will provide novel insights about genes under selection, which can be useful to answer the question presented in Chapter 3. Regarding the difficulty in identifying the point mutation related to insecticide resistance, further genetic analysis will be required to confirm the hypothesised scenarios for the architecture of the ace gene in medfly. The precise nature of these associations will be informative for establishing management strategies for pest species, specifically in the border region of Nariño, Colombia.

Concluding remarks

In the field of agricultural pests, part of the complexity associated with Tephritidae species remains unresolved. This study used relatively inexpensive methodologies to investigate factors that are related to invasiveness in these species. The influence of climate change on the potential distribution of C. capitata was determined by environmental niche modelling. The results predicted poleward expansion associated with connectivity, especially in Neotropical countries. The colonisation and expansion process of C. capitata was partially resolved using an appropriate mutation rate. The origin of this process was associated with the Holocene, which is related to the domestication of plants by humans. In addition, the migration rates identified the Neotropics as the region where countermeasures are urgently required. Also, I established a low-cost methodology to identify insecticide resistance in museum and modern specimens of C. capitata. Finally, I sequenced the mitochondrial genomes of different Anastrepha species to provide new insights into their relationship with other tephritids. The investigations conducted in this study have revealed trends that influence tephritid invasiveness, and especially the medfly's successful invasions, that need to be taken into account in pest management protocols, especially in the Americas.


REFERENCES

Aïzoun N, Ossè R, Azondekon R, Alia R, Oussou O, Gnanguenon V, Aikpon R, Padonou G, and Akogbéto M. 2013. Comparison of the standard WHO susceptibility tests and the CDC bottle bioassay for the determination of insecticide susceptibility in malaria vectors and their correlation with biochemical and molecular biology assays in Benin, West Africa. Parasites & Vectors 6:147. Alaoui A, Imoulan A, El Alaoui-Talibi Z, and El Meziane A. 2010. Genetic Structure of Mediterranean Fruit Fly (Ceratitis capitata) Populations from Moroccan Endemic Forest of Argania spinosa. International Journal of Agriculture and Biology 12:291-298. Aldridge WN. 1950. Some properties of specific cholinesterase with particular reference to the mechanism of inhibition by diethyl p-nitrophenyl thiophosphate (E 605) and analogues. Biochemical Journal 46:451-460. Altschul SF, Gish W, Miller W, Myers EW, and Lipman DJ. 1990. Basic local alignment search tool. J Mol Biol 215:403-410. Aluja M. 1999. Fruit fly (Diptera: Tephritidae) research in Latin America: myths, realities and dreams Anais da Sociedade Entomológica do Brasil 28:565–594. Aluja M, and Liedo P. 1993. Fruit Flies: Biology and Management. New York: Springer- VerJag. Aluja M, and Mangan RL. 2008. Fruit fly (Diptera: Tephritidae) host status determination: critical conceptual, methodological, and regulatory considerations. Annu Rev Entomol 53:473-502. DOI 10.1146/annurev.ento.53.103106.093350 Alyokhin A, and Chen YH. 2017. Adaptation to toxic hosts as a factor in the evolution of insecticide resistance. Current Opinion in Insect Science 21:33-38. 10.1016/j.cois.2017.04.006 Andersen JC, and Mills NJ. 2012. DNA Extraction from Museum Specimens of Parasitic Hymenoptera. PLoS One 7:e45549.

136

APHIS. 2018. Supplemental Requirements for Importation of Fresh Citrus From Colombia Into the United States In: Animal and Plant Health Inspection Service U, editor. APHIS–2017–0074. Washington, DC: Federal Register. Argov Y, Blanchet A, and Gazit Y. 2011. Biological control of the Mediterranean fruit fly in Israel: Biological parameters of imported parasitoid wasps. Biological Control 59:209-214. DOI 10.1016/j.biocontrol.2011.07.009 Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, and Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. Journal of Computational Biology 19:455-477. DOI 10.1089/cmb.2012.0021 Baranowski R, Glenn H, and Sivinski J. 1993. Biological Control of the Caribbean Fruit Fly (Diptera: Tephritidae). The Florida Entomologist 76:245-251. Barr NB. 2009. Pathway Analysis of Ceratitis capitata (Diptera: Tephritidae) Using Mitochondrial DNA. Journal of Economic Entomology 102:401-411. DOI 10.1603/029.102.0153 Baruffi L, Damiani G, Guglielmino C, Bandi C, Malacrida A, and Gasperi G. 1995. Polymorphism within and between populations of Ceratitis capitata: comparison between RAPD and multilocus enzyme electrophoresis data. Heredity 74 425- 437. Bebber DP, Ramotowski MAT, and Gurr SJ. 2013. Crop pests and pathogens move polewards in a warming world. Nature Climate Change 3:985-988. DOI 10.1038/nclimate1990 Beckenbach AT, and Stewart JB. 2009. Insect mitochondrial genomics 3: the complete mitochondrial genome sequences of representatives from two neuropteroid orders: a dobsonfly (order Megaloptera) and a giant lacewing and an owlfly (order Neuroptera). Genome 52:31-38. DOI 10.1139/G08-098 Bernt M, Donath A, Juhling F, Externbrink F, Florentz C, Fritzsch G, Putz J, Middendorf M, and Stadler PF. 2013. MITOS: improved de novo metazoan mitochondrial genome annotation. 
Molecular Phylogenetics and Evolution 69:313-319. DOI 10.1016/j.ympev.2012.08.023

137

Berzitis EA, Minigan JN, Hallett RH, and Newman JA. 2014. Climate and host plant availability impact the future distribution of the bean leaf beetle (Cerotoma trifurcata). Global Change Biology 20:2778-2792. DOI 10.1111/gcb.12557 Blackburn TM, Pysek P, Bacher S, Carlton JT, Duncan RP, Jarosik V, Wilson JR, and Richardson DM. 2011. A proposed unified framework for biological invasions. Trends in Ecology and Evolution 26:333-339. DOI 10.1016/j.tree.2011.03.023 Bolger AM, Lohse, M., Usadel, B. 2014. Trimmomatic: A flexible trimmer for Illumina Sequence Data. Bioinformatics:btu170. Bonizzoni M, Guglielmino CR, Smallridge CJ, Gomulski M, Malacrida AR, and Gasperi G. 2004. On the origins of medfly invasion and expansion in Australia. Molecular Ecology 13:3845-3855. DOI 10.1111/j.1365-294X.2004.02371.x Bonizzoni M, Malacrida AR, Guglielmino CR, Gomulski LM, Gasperi G, and Zheng L. 2000. Microsatellite polymorphism in the Mediterranean fruit fly, Ceratitis capitata. Insect Molecular Biology 9:251-261. Bonizzoni M, Zheng L, Guglielmino C, Haymer D, Gasperi G, Gomulski L, and Malacrida A. 2001. Microsatellite analysis of medfly bioinfestations in California. Molecular Ecology 10:2515-2524. Brook BW, Sodhi NS, and Bradshaw CJ. 2008. Synergies among extinction drivers under global change. Trends in Ecology & Evolution 23:453-460. DOI 10.1016/j.tree.2008.03.011 Brower AV. 1994. Rapid morphological radiation and convergence among races of the butterfly Heliconius erato inferred from patterns of mitochondrial DNA evolution. PNAS 91:6491-6495. Brunak S, Engelbrecht J, and Knudsen S. 1991. Prediction of Human mRNA Donor and Acceptor Sites from the DNA Sequence. J Mol Biol 220:49-65. Caceres C, Segura D, Vera T, Wornoayporn V, Cladera J, Teal P, Sapountzis P, Bourtzis K, and Zacharopoulou A. 2009. Incipient speciation revealed in Anastrepha fraterculus (Diptera; Tephritidae) by studies on mating compatibility, sex pheromones, hybridization, and cytology. 
Biological Journal of the Linnean Society 97:1095-8312. DOI 10.1111/j.1095-8312.2008.01193.x

138

Cameron SL. 2014. Insect mitochondrial genomics: implications for evolution and phylogeny. Annual Review of Entomology 59:95-117. DOI 10.1146/annurev- ento-011613-162007 Castoe TA, Jiang ZJ, Gu W, Wang ZO, and Pollock DD. 2008. Adaptive evolution and functional redesign of core metabolic proteins in snakes. PLoS One 3:e2201. DOI 10.1371/journal.pone.0002201 Chen D, Yan Z, Cole DL, and Srivatsa GS. 1999. Analysis of internal (n-1)mer deletion sequences in synthetic oligodeoxyribonucleotides by hybridization to an immobilized probe array. Nucleic Acids Research 27:389-395. Choudhary JS, Naaz N, Prabhakar CS, Rao MS, and Das B. 2015. The mitochondrial genome of the peach fruit fly, Bactrocera zonata (Saunders) (Diptera: Tephritidae): Complete DNA sequence, genome organization, and phylogenetic analysis with other tephritids using next generation DNA sequencing. Gene 569:191-202. DOI 10.1016/j.gene.2015.05.066 Chown SL, and Nicolson WN. 2004. Insect Physiological Ecology: Mechanism and Patterns. Oxford: Oxford University Press. Conpes. 2008. Política Nacional fitosanitaria y de inocuidad para las cadenas de frutas y otros vegetales. Consejo Nacional de Política Económica y Social República de Colombia. p 45. De Meyer M. 2000. Systematic revision of the subgenus Ceratitis MacLeay s.s. (Diptera, Tephritidae). Zoological Journal of the Linnean Society 128:439-467. DOI 10.1006/zils De Meyer M, Copeland RS, Wharton RA, and McPheron BA. 2002. On the geographic origin of the Medfly Ceratitis capitata (Wiedemann) (Diptera: Tephritidae). In: Barnes BN, editor. Proceedings of the 6th International Symposium on fruit flies of economic importance. Stellenbosch, South Africa. p 45-53. De Meyer M, Robertson MP, Peterson AT, and Mansell MW. 2008. Ecological niches and potential geographical distributions of Mediterranean fruit fly (Ceratitis capitata) and Natal fruit fly (). Journal of Biogeography 0:270–281. DOI 10.1111/j.1365-2699.2007.01769.x

139

Denholm I, and Devine G. 2001. Insecticide Resistance. In: SA L, ed. Encyclopedia of Biodiversity: Elsevier, 465-477. Diamantidis AD, Carey JR, Nakas CT, and Papadopoulos NT. 2011. Population- specific demography and invasion potential in medfly. Ecology and Evolution 1:479-488. DOI 10.1002/ece3.33 Diamantidis AD, Carey JR, and Papadopoulos NT. 2008. Life-history evolution of an invasive tephritid. Journal of Applied Entomology 132:695-705. DOI 10.1111/j.1439-0418.2008.01325.x Dlugosch KM, and Parker IM. 2008. Founding events in species invasions: genetic variation, adaptive evolution, and the role of multiple introductions. Molecular Ecology 17:431-449. DOI 10.1111/j.1365-294X.2007.03538.x Dohm JC, Lottaz C, Borodina T, and Himmelbauer H. 2008. Substantial biases in ultra- short read data sets from high-throughput DNA sequencing. Nucleic Acids Research 36:e105. DOI 10.1093/nar/gkn425 Dormann CF, Elith J, Bacher S, Buchmann C, Carl G, Carré G, Marquéz JRG, Gruber B, Lafourcade B, Leitão PJ, Münkemüller T, McClean C, Osborne PE, Reineking B, Schröder B, Skidmore AK, Zurell D, and Lautenbach S. 2012. Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography 36:27-46. DOI 10.1111/j.1600-0587.2012.07348.x Drummond AJ, and Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evolutionary Biology 7:214. DOI 10.1186/1471-2148-7- 214 Dyck VA, Hendrichs J, and Robinson A. 2005. Sterile Insect Technique. Principles and Practice in Area-Wide Integrated Pest Management. Netherlands: Springer. Edgar RC. 2010. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26:2460-2461. DOI 10.1093/bioinformatics/btq461 Edgar RC. 2016. UNOISE2: Improved error-correction for Illumina 16S and ITS amplicon sequencing. bioRxiv. DOI 10.1101/081257 Elfekih S, Makni M, and Haymer DS. 2010. 
Detection of novel mitochondrial haplotype variants in populations of the Mediterranean fruit fly, Ceratitis capitata, from

140

Tunisia, Israel and Morocco. Journal of Applied Entomology. DOI 10.1111/j.1439-0418.2009.01500.x Elfekih S, Shannon M, Haran J, and Vogler A. 2014. Detection of the Acetylcholinesterase Insecticide Resistance Mutation (G328A) in Natural Populations of Ceratitis capitata. Journal of Economic Entomology 107:1965- 1968. 10.1603/EC14166 Elith J, H. Graham C, P. Anderson R, Dudík M, Ferrier S, Guisan A, J. Hijmans R, Huettmann F, R. Leathwick J, Lehmann A, Li J, G. Lohmann L, A. Loiselle B, Manion G, Moritz C, Nakamura M, Nakazawa Y, McC. M. Overton J, Townsend Peterson A, J. Phillips S, Richardson K, Scachetti-Pereira R, E. Schapire, R. S, J.,, Williams S, S. Wisz M, and E. Zimmermann N. 2006. Novel methods improve prediction of species’ distributions from occurrence data. Ecography 29:129– 151. Elith J, and Leathwick JR. 2009. Species Distribution Models: Ecological Explanation and Prediction Across Space and Time. Annual Review of Ecology, Evolution, and Systematics 40:677-697. 10.1146/annurev.ecolsys.110308.120159 Enkerlin WR, Gutiérrez Ruelas JM, Pantaleon R, Soto Litera C, Villaseñor Cortés A, Zavala López JL, Orozco Dávila D, Montoya Gerardo P, Silva Villarreal L, Cotoc Roldán E, Hernández López F, Arenas Castillo A, Castellanos Dominguez D, Valle Mora A, Rendón Arana P, Cáceres Barrios C, Midgarden D, Villatoro Villatoro C, Lira Prera E, Zelaya Estradé O, Castañeda Aldana R, López Culajay J, Ramírez y Ramírez F, Liedo Fernández P, Ortíz Moreno G, Reyes Flores J, and Hendrichs J. 2017. The Moscamed Regional Programme: review of a success story of area-wide sterile insect technique application. Entomologia Experimentalis et Applicata 164:188-203. DOI 10.1111/eea.12611 Fernández P, Segura D, Callejas C, and Ochando MD. 2004. A phylogenetic study of the family Tephritidae (Insecta: Diptera) using a mitochondrial DNA sequence. Proceedings of the 6th International Symposium on fruit flies of economic importance. 
Stellenbosch, South Africa: Isteg Scientific Publications. p 439–443

141

Feyereisen R, Dermauw W, and Van Leeuwen T. 2015. Genotype to phenotype, the molecular and physiological dimensions of resistance in arthropods. Pesticide Biochemistry and Physiology 121:61-77. DOI 10.1016/j.pestbp.2015.01.004 Ffrench-Constant RH. 2007. Which came first: insecticides or resistance? Trends in Genetics 23:1-4. DOI 10.1016/j.tig.2006.11.006 Ffrench-Constant RH. 2013. The molecular genetics of insecticide resistance. Genetics 194:807-815. DOI 10.1534/genetics.112.141895 Flores S, Campos S, Villaseñor A, Valle Á, Enkerlin W, Toledo J, Liedo P, and Montoya P. 2013. Sterile males of Ceratitis capitata (Diptera: Tephritidae) as disseminators of Beauveria bassiana conidia for IPM strategies. Biocontrol Science and Technology 23:1186-1198. DOI 10.1080/09583157.2013.822473 Frey J, Guillén, L., Frey, B., Samietz, J., Rull, J., Aluja, M. . 2013. Developing diagnostic SNP panels for the identification of true fruit flies (Diptera: Tephritidae) within the limits of COI-based species delimitation. BMC Evolutionary Biology 13:106. Fu L, Li ZH, Huang GS, Wu XX, Ni WL, and Qu WW. 2014. The current and future potential geographic range of West Indian fruit fly, Anastrepha obliqua (Diptera: Tephritidae). Insect Science 21:234-244. DOI 10.1111/1744-7917.12018 Garnier-Géré P, and Chikhi L. 2013. Population Subdivision, Hardy-Weinberg Equilibrium and the Wahlund Effect. DOI 10.1002/9780470015902.a0005446.pub3 Gasparich GE, Silva JG, Han HY, McPheron BA, Steck GJ, and Sheppard WS. 1997. Population genetic structure of the Mediterranean fruit fly (Diptera: Tephritidae) and implications for worldwide colonization patterns. Annals of the Entomological Society of America 90:790-797. Gasperi G, Bonizzoni M, Gomulski LM, Murelli V, Torti C, Malacrida AR, and Guglielmino CR. 2002. Genetic Differentiation, Gene Flow and the Origin of Infestations of the Medfly, Ceratitis Capitata. Genetica 116:1573-6857. Gasperi G, Guglielminq C, Malacrida A, and Milani R. 1991. 
Genetic variability and gene flow in geographical populations of Ceratitis capitata (Wied.) (medfly). Heredily 67:347-356.

142

Gavriel S, Gazit Y, Leach A, Mumford J, and Yuval B. 2012. Spatial patterns of sterile Mediterranean fruit fly dispersal. Entomologia Experimentalis et Applicata 142:17-26. DOI 10.1111/j.1570-7458.2011.01197.x Georghiou G. 1972. The evolution of resitance to pesticides. Annual Review of Ecology and Systematics 3 133-168. Guillén L, Aluja Mn, Equihua M, and Sivinski J. 2002. Performance of Two Fruit Fly (Diptera: Tephritidae) Pupal Parasitoids (Coptera haywardi [Hymenoptera: Diapriidae] and Pachycrepoideus vindemiae [Hymenoptera: Pteromalidae]) under Different Environmental Soil Conditions. Biological Control 23:219-227. DOI 10.1006/bcon.2001.1011 Haag-Liautard C, Coffey N, Houle D, Lynch M, Charlesworth B, and Keightley PD. 2008. Direct estimation of the mitochondrial DNA mutation rate in Drosophila melanogaster. PLoS Biology 6:e204. DOI 10.1371/journal.pbio.0060204 Hardstone MC, and Scott JG. 2010. A review of the interactions between multiple insecticide resistance loci. Pesticide Biochemistry and Physiology 97:123-128. DOI 10.1016/j.pestbp.2009.07.010 Hartley CJ, Newcomb RD, Russell RJ, Yong CG, Stevens JR, Yeates DK, La Salle J, and Oakeshott JG. 2006. Amplification of DNA from preserved specimens shows blowflies were preadapted for the rapid evolution of insecticide resistance. Proceedings of the National Academy of Sciences of the United States of America 103:8757-8762. DOI 10.1073/pnas.0509590103 Hedrick PW. 2011. Genetics of Populations. USA: Jones and Bartlett. Hellmann J, Byers J, Bierwagen B, and Dukes J. 2008. Five Potential Consequences of Climate Change for Invasive Species. Conservation Biology 22:534-543. DOI 10.HH/j.1523-1739.2008.00951.x Hendrichs J, Vera MT, De Meyer M, and Clarke AR. 2015. Resolving cryptic species complexes of major tephritid pests. Zookeys:5-39. DOI 10.3897/zookeys.540.9656 Hernández-Ortiz V, and Aluja M. 1993. 
Listado de especies del género neotopical Anastrepha (Diptera: Tephritidae) con notas sobre su distribución y plantas hospederas. Folia Entomológica Mexicana 88:89-105.

143

Hernández-Ortiz V, Bartolucci AF, Morales-Valles P, Frías D, and Selivon D. 2012. Cryptic Species of the Anastrepha fraterculus Complex (Diptera: Tephritidae): A Multivariate Approach for the Recognition of South American Morphotypes. Annals of the Entomological Society of America 105:305-318. DOI 10.1603/an11123 Hijmans RJ, Cameron SE, Parra JL, Jones PG, and Jarvis A. 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25:1965-1978. DOI 10.1002/joc.1276 Hill MP, Bertelsmeier C, Clusella-Trullas S, Garnas J, Robertson MP, and Terblanche JS. 2016. Predicted decrease in global climate suitability masks regional complexity of invasive fruit fly species response to climate change. Biological Invasions 18:1105-1119. DOI 10.1007/s10530-016-1078-5 Ho SY, Phillips MJ, Cooper A, and Drummond AJ. 2005. Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Molecular Biology Evolution 22:1561-1568. DOI 10.1093/molbev/msi145 Hojland DH, Jensen KM, and Kristensen M. 2014. Adaptation of Musca domestica L. field population to laboratory breeding causes transcriptional alterations. PLoS One 9:e85965. DOI 10.1371/journal.pone.0085965 Hsu JC, Haymer DS, Wu WJ, and Feng HT. 2006. Mutations in the acetylcholinesterase gene of Bactrocera dorsalis associated with resistance to organophosphorus insecticides. Insect Biochemistry and Molecular Biology 36:396-402. DOI 10.1016/j.ibmb.2006.02.002 Hsu JC, Wu WJ, Haymer DS, Liao HY, and Feng HT. 2008. Alterations of the acetylcholinesterase enzyme in the oriental fruit fly Bactrocera dorsalis are correlated with resistance to the organophosphate insecticide fenitrothion. Insect Biochemistry and Molecular Biology 38:146-154. DOI 10.1016/j.ibmb.2007.10.002 Hunter SJ, Goodall TI, Walsh KA, Owen R, and Day JC. 2008. 
Nondestructive DNA extraction from blackflies (Diptera: Simuliidae): retaining voucher specimens for DNA barcoding projects. Molecular Ecology Resources 8:56-61. DOI 10.1111/j.1471-8286.2007.01879.x

144

ICA. 2010. Boletin epidemiologico seccional Narino. Plan Nacional Moscas de la Fruta. Colombia: Instituto Colombiano Agropecuario. Isaza JP, Alzate JF, and Canal NA. 2017. Complete mitochondrial genome of the Andean morphotype of Anastrepha fraterculus (Wiedemann) (Diptera: Tephritidae). Mitochondrial DNA Part B 2:210-211. DOI 10.1080/23802359.2017.1307706 Jackson CJ, Liu JW, Carr PD, Younus F, Coppin C, Meirelles T, Lethier M, Pandey G, Ollis DL, Russell RJ, Weik M, and Oakeshott JG. 2013. Structure and function of an insect alpha-carboxylesterase (alphaEsterase7) associated with insecticide resistance. Proceedings of the National Academy of Sciences of the United States of America 110:10177-10182. DOI 10.1073/pnas.1304097110 Jiang F, Pan X, Li X, Yu Y, Zhang J, Jiang H, Dou L, and Zhu S. 2016. The first complete mitochondrial genome of Dacus longicornis (Diptera: Tephritidae) using next-generation sequencing and mitochondrial genome phylogeny of Dacini tribe. Scientific Reports 6:36426. DOI 10.1038/srep36426 Juhling F, Putz J, Bernt M, Donath A, Middendorf M, Florentz C, and Stadler PF. 2012. Improved systematic tRNA gene annotation allows new insights into the evolution of mitochondrial tRNA structures and into the mechanisms of mitochondrial genome rearrangements. Nucleic Acids Research 40:2833-2845. DOI 10.1093/nar/gkr1131 Kakani EG, and Mathiopoulos KD. 2008. Organophosphosphate resistance-related mutations in the acetylcholinesterase gene of Tephritidae. Journal of Applied Entomology 132:762-771. DOI 10.1111/j.1439-0418.2008.01373.x Kakani EG, Trakala M, Drosopoulou E, Mavragani-Tsipidou P, and Mathiopoulos KD. 2013. Genomic structure, organization and localization of the acetylcholinesterase locus of the , Bactrocera oleae. Bulletin of Entomological Research 103:36-47. DOI 10.1017/S0007485312000478 Karsten M, van Vuuren BJ, Addison P, Terblanche JS, and Leung B. 2015. 
Deconstructing intercontinental invasion pathway hypotheses of the Mediterranean fruit fly (Ceratitis capitata) using a Bayesian inference approach:

145

are port interceptions and quarantine protocols successfully preventing new invasions?. Diversity and Distributions 21:813-825. DOI 10.1111/ddi.12333 Karsten M, van Vuuren BJ, Barnaud A, and Terblanche JS. 2013. Population genetics of Ceratitis capitata in South Africa: implications for dispersal and pest management. PLoS One 8:e54281. DOI 10.1371/journal.pone.0054281 Katoh K, Rozewicki J, and Yamada KD. 2017. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Briefings in Bioinformatics. DOI 10.1093/bib/bbx108 Kozarewa I, Ning Z, Quail MA, Sanders MJ, Berriman M, and Turner DJ. 2009. Amplification-free Illumina sequencing-library preparation facilitates improved mapping and assembly of (G+C)-biased genomes. Nature Methods 6:291-295. DOI 10.1038/nmeth.1311 Kuhner MK. 2006. LAMARC 2.0: maximum likelihood and Bayesian estimation of population parameters. Bioinformatics 22:768-770. DOI 10.1093/bioinformatics/btk051 Lantschner MV, Villacide JM, Garnas JR, Croft P, Carnegie AJ, Liebhold AM, and Corley JC. 2014. Temperature explains variable spread rates of the invasive woodwasp Sirex noctilio in the Southern Hemisphere. Biological Invasions 16:329-339. DOI 10.1007/s10530-013-0521-0 Lanzavecchia SB, Juri M, Bonomi A, Gomulski L, Scannapieco AC, Segura DF, Malacrida A, Cladera JL, and Gasperi G. 2014. Microsatellite markers from the 'South American fruit fly' Anastrepha fraterculus: a valuable tool for population genetic analysis and SIT applications. BMC Genetics 15 Suppl 2:S13. DOI 10.1186/1471-2156-15-S2-S13 Laslett D, and Canback B. 2008. ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics 24:172-175. DOI 10.1093/bioinformatics/btm573 Lasprilla D. 2011. Estado actual de fruticultura colombiana y perspectivas para su desarrollo. Revista Brasileira de Fruticultura 33:199-205. Lee C. 2002. Evolutionary genetics of invasive species. Trends in Ecololy and Evolution 17:386-391.

146

Li B, Ma J, Hu X, Liu H, and Zhang R. 2009. Potential Geographical Distributions of the Fruit Flies Ceratitis capitata, Ceratitis cosyra, and Ceratitis rosa in China. Journal of Economic Entomology 102:1781-1790. DOI 10.1603/029.102.0508 Lobo JM, Jiménez-Valverde A, and Real R. 2008. AUC: a misleading measure of the performance of predictive distribution models. Global Ecology and Biogeography 17:145–151. DOI 10.1111/j.1466-8238.2007.00358.x Lockwood JL, Cassey P, and Blackburn T. 2005. The role of propagule pressure in explaining species invasions. Trends in Ecology and Evolution 20:223-228. DOI 10.1016/j.tree.2005.02.004 Lockwood JL, Cassey P, and Blackburn TM. 2009. The more you introduce the more you get: the role of colonization pressure and propagule pressure in invasion ecology. Diversity and Distributions 15:904-910. DOI 10.1111/j.1472- 4642.2009.00594.x Ludeña B, Bayas-Rea R, and Pintaud J. 2010. Phylogenetic relationships of Andean- Ecuadorian populations of Anastrepha fraterculus (Wiedemann 1830) (Diptera: Tephritidae) inferred from COI and COII gene sequences. Annales de la Société Entomologique de France 46:344-350. Mack RN, Simberloff D, Mark Lonsdale W, Evans H, Clout M, and Bazzaz FA. 2000. Biotic invasions: causes, epidemiology, global consequences, and control. Ecological Applications 10:689–710. Magaña C, Hernandez-Crespo P, Ortego F, and P. C. 2007. Resistance to Malathion in Field Populations of Ceratitis capitata. Insecticide Resistance and Resistance Management 100: 1836-1843.

Magaña C, Hernandez-Crespo P, Brun-Barale A, Couso-Ferrer F, Bride JM, Castanera P, Feyereisen R, and Ortego F. 2008. Mechanisms of resistance to malathion in the medfly Ceratitis capitata. Insect Biochemistry and Molecular Biology 38:756-762. DOI 10.1016/j.ibmb.2008.05.001

Malacrida AR, Gomulski LM, Bonizzoni M, Bertin S, Gasperi G, and Guglielmino CR. 2007. Globalization and fruitfly invasion and expansion: the medfly paradigm. Genetica 131:1-9. DOI 10.1007/s10709-006-9117-2

147

Malacrida AR, Guglielmino CR, Gasperi G, Baruffi L, and R. M. 1992. Spatial and Temporal Differentiation in Colonizing Populations of Ceratitis capitata. Heredity 69:101-111. Malacrida AR, Marinoni F, Torti C, Gomulski LM, Sebastiani F, Bonvicini C, Gasperi G, and Guglielmino CR. 1998. Genetic Aspects of the Worldwide Colonization Process of Ceratitis capitata. The American Genetic Association 89:501–507. Manning K, and Timpson A. 2014. The demographic response to Holocene climate change in the Sahara. Quaternary Science Reviews 101:28-35. DOI 10.1016/j.quascirev.2014.07.003 Marshall F, and Hildebrand E. 2002. Cattle Before Crops: The Beginnings of Food Production in Africa. Journal of World Prehistory 16:99-143. Martin M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnetJournal 17:10-12. DOI dx.doi.org/10.14806/ej.17.1.200 Meiklejohn CD, Montooth KL, and Rand DM. 2007. Positive and negative selection on the mitochondrial genome. Trends in Genetics 23:259-263. Mengual X, Kerr P, Norrbom AL, Barr NB, Lewis ML, Stapelfeldt AM, Scheffer SJ, Woods P, Islam MS, Korytkowski CA, Uramoto K, Rodriguez EJ, Sutton BD, Nolazco N, Steck GJ, and Gaimari S. 2017. Phylogenetic relationships of the tribe Toxotrypanini (Diptera: Tephritidae) based on molecular characters. Molecular Phylogenetics and Evolution 113:84-112. DOI 10.1016/j.ympev.2017.05.011 Menozzi P, Shi M, Lougarre A, Tang Z, and Fournier D. 2004. Mutations of acetylcholinesterase which confer insecticide resistance in Drosophila melanogaster populations. BMC Evolutionary Biology 4. DOI 10.1186/1471- 2148-4-4. Merow C, Smith MJ, and Silander JA. 2013. A practical guide to MaxEnt for modeling species’ distributions: what it does, and why inputs and settings matter. Ecography 36:1058-1069. DOI 10.1111/j.1600-0587.2013.07872.x Miller AL, Tindall K, and Leonard BR. 2010. Bioassays for monitoring insecticide resistance. Journal of Visualized Experiments. 10.3791/2129

148

Moran EV, and Alexander JM. 2014. Evolutionary responses to global change: lessons from invasive species. Ecology Letters 17:637-649. DOI 10.1111/ele.12262 Mutero A, Pralavorio M, Bride JM, and Fournier D. 1994. Resistance-associated point mutations in insecticide insensitive acetylcholinesterase. Proceedings of the National Academy of Sciences of the United States of America 91:5922-5926. Myers JH, Simberloff D, Kuris A, and Carey JR. 2000. Eradication revisited: dealing with exotic species. Trends in Ecology & Evolution 15:316-320. Norrbom AL. 1994. New genera of Tephritidae (Diptera) from Brazil and Dominican Amber, with phylogenetic analysis of the tribe Ortalotrypetini. Insecta Mundi 8. Norrbom AL. 2010. Tephritidae (fruit flies, moscas de frutas). . In: Brown BV, and Borkent A, Cumming, J.M., Wood, D.M., Woodley, N.E., Zumbado, M.A., eds. Manual of Central American Diptera. Ottawa: NRC Research Press, 909–954. Norrbom AL, Korytkowski, C.A., Zucchi, R.A., Uramoto, K., Venable, G.L., McCormick, J., Dallwitz, M.J. 2012. Anastrepha and Toxotrypana: Descriptions, illustrations, and interactive keys. (accessed 28 January 2017). Ojala D, Montoya J, and Attardi G. 1981. tRNA punctuation model of RNA processing in human mitochondria. Nature 290:470-474. Orono LE, Albornoz-Medina P, Nunez-Campero S, Nieuwenhove GAv, Bezdjian LP, Martin CB, Schliserman P, and Ovruski SM. 2006. Update of host plant list of Anastrepha fraterculus and Ceratitis capitata in Argentina. 7 International symposium on fruit flies of economic importance: from basic to applied knowledge. Brazil. Oropeza A, Ruiz LC, and Toledo J. 2008. Larval Parasitoids Associated to Anastrepha distincta (Diptera: Tephritidae) in Two Host Fruits at the Soconusco Region, Chiapas, Mexico. Florida Entomologist 91:498-500. Ovruski S, Aluja M, Sivinski J, and Wharton R. 2000. 
Hymenopteran parasitoids on fruit-infesting Tephritidae (Diptera) in Latin America and the southern United States: Diversity, distribution, taxonomic status and their use in fruit fly biological control. Integrated Pest Management Reviews 5:81-107. DOI 10.1023/A:1009652431251

149

Ovruski SM, and Schliserman P. 2012. Biological Control of Tephritid Fruit Flies in Argentina: Historical Review, Current Status, and Future Trends for Developing a Parasitoid Mass-Release Program. Insects 3:870-888. DOI 10.3390/insects3030870
Pachauri RK, Meyer L, Plattner GK, and Stocker T. 2015. IPCC, 2014: Climate Change 2014: Synthesis Report. Contribution of Working Groups I, II and III to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change. In: IPCC, editor. Geneva, Switzerland. p 151.
Paini DR, Sheppard AW, Cook DC, De Barro PJ, Worner SP, and Thomas MB. 2016. Global threat to agriculture from invasive species. Proceedings of the National Academy of Sciences of the United States of America 113:7575-7579. DOI 10.1073/pnas.1602205113
Papadopoulos NT, Plant RE, and Carey JR. 2013. From trickle to flood: the large-scale, cryptic invasion of California by tropical fruit flies. Proceedings of the Royal Society B: Biological Sciences 280:20131466. DOI 10.1098/rspb.2013.1466
Papadopoulou A, Anastasiou I, and Vogler AP. 2010. Revisiting the insect mitochondrial molecular clock: the mid-Aegean trench calibration. Molecular Biology and Evolution 27:1659-1672. DOI 10.1093/molbev/msq051
Papanicolaou A, Schetelig MF, Arensburger P, Atkinson PW, Benoit JB, Bourtzis K, Castanera P, Cavanaugh JP, Chao H, Childers C, Curril I, Dinh H, Doddapaneni H, Dolan A, Dugan S, Friedrich M, Gasperi G, Geib S, Georgakilas G, Gibbs RA, Giers SD, Gomulski LM, Gonzalez-Guzman M, Guillem-Amat A, Han Y, Hatzigeorgiou AG, Hernandez-Crespo P, Hughes DS, Jones JW, Karagkouni D, Koskinioti P, Lee SL, Malacrida AR, Manni M, Mathiopoulos K, Meccariello A, Murali SC, Murphy TD, Muzny DM, Oberhofer G, Ortego F, Paraskevopoulou MD, Poelchau M, Qu J, Reczko M, Robertson HM, Rosendale AJ, Rosselot AE, Saccone G, Salvemini M, Savini G, Schreiner P, Scolari F, Siciliano P, Sim SB, Tsiamis G, Urena E, Vlachos IS, Werren JH, Wimmer EA, Worley KC, Zacharopoulou A, Richards S, and Handler AM. 2016. The whole genome sequence of the Mediterranean fruit fly, Ceratitis capitata (Wiedemann), reveals insights into the biology and adaptive evolution of a highly invasive pest species. Genome Biology 17:192. DOI 10.1186/s13059-016-1049-2
Parmesan C. 2006. Ecological and Evolutionary Responses to Recent Climate Change. Annual Review of Ecology, Evolution, and Systematics 37:637-669. DOI 10.1146/annurev.ecolsys.37.091305.110100
Pearce J, and Ferrier S. 2000. An evaluation of alternative algorithms for fitting species distribution models using logistic regression. Ecological Modelling 128:127–147.
Peck S, and McQuate G. 2000. Field tests of environmentally friendly malathion replacements to suppress wild Mediterranean fruit fly (Diptera: Tephritidae) populations. Journal of Economic Entomology 93:280-289. DOI 10.1603/0022-0493-93.2.280
Pentinsaari M, Salmela H, Mutanen M, and Roslin T. 2016. Molecular evolution of a widely-adopted taxonomic marker (COI) across the animal tree of life. Scientific Reports 6:35275. DOI 10.1038/srep35275
Phillips SJ, Anderson RP, and Schapire RE. 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190:231-259. DOI 10.1016/j.ecolmodel.2005.03.026
Pimentel D. 2005. Environmental and Economic Costs of the Application of Pesticides Primarily in the United States. Environment, Development and Sustainability 7:229-252. DOI 10.1007/s10668-005-7314-2
Prokopy RJ, Papaj DR, Hendrichs J, and Wong TTY. 1992. Behavioral responses of Ceratitis capitata flies to bait spray droplets and natural food. Entomologia Experimentalis et Applicata 64:247-257. DOI 10.1111/j.1570-7458.1992.tb01615.x
Purcell M. 1998. Contribution of Biological Control to Integrated Pest Management of Tephritid Fruit Flies in the Tropics and Subtropics. Integrated Pest Management Reviews 3:63–83.
Pysek P, Richardson DM, Pergl J, Jarosik V, Sixtova Z, and Weber E. 2008. Geographical and taxonomic biases in invasion ecology. Trends in Ecology & Evolution 23:237-244. DOI 10.1016/j.tree.2008.02.002

Qin Y, Paini DR, Wang C, Fang Y, and Li Z. 2015. Global establishment risk of economically important fruit fly species (Tephritidae). PLoS One 10:e0116424. DOI 10.1371/journal.pone.0116424
Reese MG, Eeckman FH, Kulp D, and Haussler D. 1997. Improved splice site detection in Genie. Journal of Computational Biology 4:311-323.
Reyes A, and Ochando MD. 2004. Mitochondrial DNA variation in Spanish populations of Ceratitis capitata (Wiedemann) (Tephritidae) and the colonization process. Journal of Applied Entomology 128:358–364. DOI 10.1111/j.1439-0418.2004.00858.358-364
Rwomushana I, Ekesi S, Gordon I, and Ogol CKPO. 2008. Host Plants and Host Plant Preference Studies for Bactrocera invadens (Diptera: Tephritidae) in Kenya, a New Invasive Fruit Fly Species in Africa. Annals of the Entomological Society of America 101:331-340.
Saunders M, Magnanti BL, Correia Carreira S, Yang A, Alamo-Hernández U, Riojas-Rodriguez H, Calamandrei G, Koppe JG, Krayer von Krauss M, Keune H, and Bartonova A. 2012. Chlorpyrifos and neurodevelopmental effects: a literature review and expert elicitation on research and policy. Environmental Health 11:S5. DOI 10.1186/1476-069X-11-S1-S5
Sciarretta A, Tabilio MR, Lampazzi E, Ceccaroli C, Colacci M, and Trematerra P. 2018. Analysis of the Mediterranean fruit fly [Ceratitis capitata (Wiedemann)] spatio-temporal distribution in relation to sex and female mating status for precision IPM. PLoS One 13:e0195097. DOI 10.1371/journal.pone.0195097
Sefc KM, Payne RB, and Sorenson MD. 2007. Single base errors in PCR products from avian museum specimens and their effect on estimates of historical genetic diversity. Conservation Genetics 8:879-884. DOI 10.1007/s10592-006-9240-8
Shapiro B, and Hofreiter M. 2012. Ancient DNA Methods and Protocols. USA: Humana Press.
Shendure J, and Ji H. 2008. Next-generation DNA sequencing. Nature Biotechnology 26:1135-1145. DOI 10.1038/nbt1486

Sparks TC, and Nauen R. 2015. IRAC: Mode of action classification and insecticide resistance management. Pesticide Biochemistry and Physiology 121:122-128. DOI 10.1016/j.pestbp.2014.11.014
Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312-1313. DOI 10.1093/bioinformatics/btu033
Stark JD, Vargas R, and Miller N. 2004. Toxicity of Spinosad in Protein Bait to Three Economically Important Tephritid Fruit Fly Species (Diptera: Tephritidae) and Their Parasitoids (Hymenoptera: Braconidae). Journal of Economic Entomology 97:911-915. DOI 10.1093/jee/97.3.911
STDF SaTDF. 2010. A coordinated multi-stakeholder approach to control fruit fly in West Africa. Food and Agriculture Organization. p 2.
Stibick J. 2004. Natural Enemies of True Fruit Flies (Tephritidae). In: Agriculture USDo, editor. Riverdale: USDA. p 1-86.
Su Y, Zhang Y, Feng S, He J, Zhao Z, Bai Z, Liu L, Zhang R, and Li Z. 2017. The mitochondrial genome of the wolfberry fruit fly, Neoceratitis asiatica (Becker) (Diptera: Tephritidae) and the phylogeny of Neoceratitis Hendel genus. Scientific Reports 7:16612. DOI 10.1038/s41598-017-16929-7
Suarez AV, and Tsutsui ND. 2004. The Value of Museum Collections for Research and Society. BioScience 54:66-74.
Suckling DM, Stringer LD, Stephens AE, Woods B, Williams DG, Baker G, and El-Sayed AM. 2014. From integrated pest management to integrated pest eradication: technologies and future needs. Pest Management Science 70:179-189. DOI 10.1002/ps.3670
Sussman J, Harel M, Frolow F, Oefner O, Goldman A, Toker L, and Silman I. 1991. Atomic structure of acetylcholinesterase from Torpedo californica: a prototypic acetylcholine-binding protein. Science 253:872-879.
Szyniszewska A, and Tatem A. 2014. Global assessment of seasonal potential distribution of Mediterranean fruit fly, Ceratitis capitata. PLoS One 9:e111582.
Szyniszewska AM, Leppla NC, Huang Z, and Tatem AJ. 2016. Analysis of Seasonal Risk for Importation of the Mediterranean Fruit Fly, Ceratitis capitata (Diptera: Tephritidae), via Air Passenger Traffic Arriving in Florida and California. Journal of Economic Entomology 109:2317-2328. DOI 10.1093/jee/tow196
Thomsen PF, Elias S, Gilbert M, Haile J, Munch K, Kuzmina S, Froese DG, Sher A, Holdaway RN, and Willerslev E. 2009. Non-Destructive Sampling of Ancient Insect DNA. PLoS One 4:e5048. DOI 10.1371/journal.pone.0005048
Thuiller W. 2007. Climate change and the ecologist. Nature 448:550-552.
Tin MM, Economo EP, and Mikheyev AS. 2014. Sequencing degraded DNA from non-destructively sampled museum specimens for RAD-tagging and low-coverage shotgun phylogenetics. PLoS One 9:e96793. DOI 10.1371/journal.pone.0096793
Tormos J, Beitia F, Asís JD, and de Pedro L. 2018. Intraguild interactions between two biological control agents in citrus fruit: implications for biological control of medfly. Annals of Applied Biology 172:321-331. DOI 10.1111/aab.12422
van Houdt JK, Breman FC, Virgilio M, and De Meyer M. 2010. Recovering full DNA barcodes from natural history collections of Tephritid fruitflies (Tephritidae, Diptera) using mini barcodes. Molecular Ecology Resources 10:459-465. DOI 10.1111/j.1755-0998.2009.02800.x
Vargas RI, Pinero JC, and Leblanc L. 2015. An Overview of Pest Species of Bactrocera Fruit Flies (Diptera: Tephritidae) and the Integration of Biopesticides with Other Biological Approaches for Their Management with a Focus on the Pacific Region. Insects 6:297-318. DOI 10.3390/insects6020297
Vera MT, Rodriguez R, Segura DF, Cladera JL, and Sutherst RW. 2002. Potential Geographical Distribution of the Mediterranean Fruit Fly, Ceratitis capitata (Diptera: Tephritidae), with Emphasis on Argentina and Australia. Environmental Entomology 31:1009-1022. DOI 10.1603/0046-225x-31.6.1009
Villablanca F, Roderick G, and Palumbi S. 1998. Invasion genetics of the Mediterranean fruit fly: variation in multiple nuclear introns. Molecular Ecology 7:547–560.
Vogel V, Pedersen JS, Giraud T, Krieger MJB, and Keller L. 2010. The worldwide expansion of the Argentine ant. Diversity and Distributions 16:170-186. DOI 10.1111/j.1472-4642.2009.00630.x

Vontas J, Hernández-Crespo P, Margaritopoulos JT, Ortego F, Feng H-T, Mathiopoulos KD, and Hsu J-C. 2011. Insecticide resistance in Tephritid flies. Pesticide Biochemistry and Physiology 100:199-205. DOI 10.1016/j.pestbp.2011.04.004
Walsh SB, Dolden TA, Moores GD, Kristensen M, Lewis T, Devonshire AL, and Williamson MS. 2001. Identification and characterization of mutations in housefly (Musca domestica) acetylcholinesterase involved in insecticide resistance. Biochemical Journal 359:175-181.
Walther GR, Roques A, Hulme PE, Sykes MT, Pysek P, Kuhn I, Zobel M, Bacher S, Botta-Dukat Z, Bugmann H, Czucz B, Dauber J, Hickler T, Jarosik V, Kenis M, Klotz S, Minchin D, Moora M, Nentwig W, Ott J, Panov VE, Reineking B, Robinet C, Semenchenko V, Solarz W, Thuiller W, Vila M, Vohland K, and Settele J. 2009. Alien species in a warmer world: risks and opportunities. Trends in Ecology & Evolution 24:686-693. DOI 10.1016/j.tree.2009.06.008
Wang C, Hawthorne D, Qin Y, Pan X, Li Z, and Zhu S. 2017. Impact of climate and host availability on future distribution of Colorado potato beetle. Scientific Reports 7:4489. DOI 10.1038/s41598-017-04607-7
Warszawski L, Frieler K, Huber V, Piontek F, Serdeczny O, and Schewe J. 2014. The Inter-Sectoral Impact Model Intercomparison Project (ISI-MIP): project framework. Proceedings of the National Academy of Sciences of the United States of America 111:3228-3232. DOI 10.1073/pnas.1312330110
White IM, and Elson-Harris M. 1992. Fruit flies of economic significance: their identification and bionomics. Wallingford: CAB International.
Yong HS, Song SL, Lim PE, Chan KG, Chow WL, and Eamsobhana P. 2015. Complete mitochondrial genome of Bactrocera arecae (Insecta: Tephritidae) by next-generation sequencing and molecular phylogeny of Dacini tribe. Scientific Reports 5:15155. DOI 10.1038/srep15155
Yong HS, Song SL, Lim PE, Eamsobhana P, and Suana IW. 2016. Complete Mitochondrial Genome of Three Bactrocera Fruit Flies of Subgenus Bactrocera (Diptera: Tephritidae) and Their Phylogenetic Implications. PLoS One 11:e0148201. DOI 10.1371/journal.pone.0148201

Zhang B, Nardi F, Hull-Sanders H, Wan X, and Liu Y. 2014a. The complete nucleotide sequence of the mitochondrial genome of Bactrocera minax (Diptera: Tephritidae). PLoS One 9:e100558. DOI 10.1371/journal.pone.0100558
Zhang J, Kobert K, Flouri T, and Stamatakis A. 2014b. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30:614-620. DOI 10.1093/bioinformatics/btt593
Zimmerman G, and Soreq H. 2006. Termination and beyond: acetylcholinesterase as a modulator of synaptic transmission. Cell and Tissue Research 326:655-669. DOI 10.1007/s00441-006-0239-8

APPENDICES

APPENDIX I:

Chapter 2 Supplementary Materials

Supplementary Figure 2.1 Correlation matrix among the 19 Bioclimatic variables obtained from WorldClim.

APPENDIX II:

Chapter 3 Supplementary Materials

Supplementary Table 3.1 Unique haplotypes based on 51 sequences of the cytochrome oxidase gene I (COI) included in this study. N, number of individuals per haplotype; Haplotype code, corresponding to each unique haplotype nomenclature.

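| Biogeographic region | Samples site | State/Province/Locality | N | Haplotype code | GenBank accession number |
|---|---|---|---|---|---|
| Afrotropical | Kenya | ND | 1 | Cc01 | GQ154189 |
| Afrotropical | Kenya | Ruiru | 1 | Cc02 | JN705010 |
| Afrotropical | Kenya | Ruiru | 1 | Cc03 | JN705011 |
| Afrotropical | Kenya | Ruiru | 2 | Cc04 | JN705012, AY788415 |
| Afrotropical | Kenya | Kibwesi | 1 | Cc05 | JN705013 |
| Afrotropical | Kenya | Western Highlands | 1 | Cc06 | JN705014 |
| Afrotropical | Kenya | Watamu | 1 | Cc07 | JN705015 |
| Afrotropical | Kenya | Ruiru | 1 | Cc08 | JN705016 |
| Afrotropical | Kenya | Nairobi | 1 | Cc09 | JN705017 |
| Afrotropical | Kenya | Western Highlands | 1 | Cc10 | JN705018 |
| Afrotropical | Kenya | Ololua forest | 1 | Cc11 | JN705019 |
| Afrotropical | Kenya | Watamu | 1 | Cc12 | JN705020 |
| Afrotropical | Kenya | Kakamega, Ngong Road Forest, Msambweni, Ololua forest | 4 | Cc13 | JN705022, JN705025, JN705028, JN705041 |
| Afrotropical | Kenya | Arabuko Sokoke Forest | 1 | Cc14 | JN705023 |
| Afrotropical | Kenya | Nairobi | 1 | Cc15 | JN705024 |
| Afrotropical | Kenya | Sabaki | 1 | Cc16 | JN705026 |
| Afrotropical | Kenya | Voi | 1 | Cc17 | JN705027 |
| Afrotropical | Kenya | Ololua forest | 1 | Cc18 | JN705042 |
| Afrotropical | Ghana | Kade | 1 | Cc32 | JN705034 |
| Afrotropical | Ghana | Kade | 2 | Cc33 | JN705035, JN705037 |
| Afrotropical | Ghana | Kade | 1 | Cc34 | JN705036 |
| Afrotropical | Ghana | Kade | 1 | Cc35 | JN705038 |
| Palearctic | Iran | ND, Amol, Behshahr, Jouybar, Neka, Nur, Sari, Tonekabon, Qaemshahr | 10 | Cc21 | JQ668128, KM660641-KM660643, KM660646, KM660648-KM660652 |
| Palearctic | Iran | Neka | 1 | Cc43 | KM660645 |
| Palearctic | Iran | Neka | 1 | Cc44 | KM660647 |
| Palearctic | Greece | Thessaloniki, Aetolia-Acarnania | 5 | Cc21 | HQ677179-HQ677182, HQ677184 |
| Palearctic | Greece | Thessaloniki | 1 | Cc45 | HQ677183 |
| Palearctic | Spain | Canary Islands | 1 | Cc21 | GQ154188 |
| Neotropical | Guatemala | Antigua, Mazatenango | 4 | Cc21 | HQ677174-HQ677177 |
| Neotropical | Brazil | ND | 1 | Cc66 | DQ116363 |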
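
The accession column of Supplementary Table 3.1 mixes single accessions with ranges such as KM660641-KM660643. As a minimal sketch (not part of the thesis workflow), the following Python snippet expands such a field into individual accession numbers so that the count can be compared with N. The function name `expand_accessions` is hypothetical, and the sketch assumes that both ends of a range share the same letter prefix and digit width.

```python
# Illustrative only: expand GenBank accession ranges from Supplementary Table 3.1.
# The helper name is hypothetical; ranges are assumed to share prefix and digit width.
import re
from typing import List

ACCESSION = re.compile(r"([A-Z]+)(\d+)")


def expand_accessions(field: str) -> List[str]:
    """Expand 'KM660641-KM660643, KM660646' into individual accessions."""
    accessions: List[str] = []
    for item in (part.strip() for part in field.split(",")):
        if not item:
            continue
        if "-" in item:
            first, last = (end.strip() for end in item.split("-", 1))
            start, stop = ACCESSION.fullmatch(first), ACCESSION.fullmatch(last)
            if not (start and stop) or start.group(1) != stop.group(1):
                raise ValueError(f"cannot expand range: {item!r}")
            width = len(start.group(2))
            for number in range(int(start.group(2)), int(stop.group(2)) + 1):
                accessions.append(f"{start.group(1)}{number:0{width}d}")
        else:
            accessions.append(item)
    return accessions


# Haplotype Cc21 from Iran is listed with N = 10.
iran_cc21 = expand_accessions(
    "JQ668128, KM660641-KM660643, KM660646, KM660648-KM660652"
)
print(len(iran_cc21), iran_cc21)
```

Applied to the Iranian Cc21 row, the expanded list contains ten accessions, which agrees with the N reported for that haplotype.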

APPENDIX III:

Chapter 4 Supplementary Materials

Supplementary Table 4.1 Tagged primer sequences showing Illumina tail, tag and original primer sequence for each.

| Primer Name | Illumina Tail | Tag | Original Primer | Final Primer (3' → 5') |
|---|---|---|---|---|
| Ccace2_F00 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | GCAGTG | CTGACGACCCCATGAAAATG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGCAGTGCTGACGACCCCATGAAAATG |
| Ccace2_F01 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | GATTCC | CTGACGACCCCATGAAAATG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGATTCCCTGACGACCCCATGAAAATG |
| Ccace2_F02 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | ACAATC | CTGACGACCCCATGAAAATG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGATTCCCTGACGACCCCATGAAAATG |
| Ccace2_F03 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | GTTCTC | CTGACGACCCCATGAAAATG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGATTCCCTGACGACCCCATGAAAATG |
| Ccace2_R00 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | GCAGTG | CAAATTTTGTCTCATTCCTTAACTTTG | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCAGTGCAAATTTTGTCTCATTCCTTAACTTTG |
| Ccace2_R01 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | GATTCC | CAAATTTTGTCTCATTCCTTAACTTTG | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCAGTGCAAATTTTGTCTCATTCCTTAACTTTG |
| Ccace2_R02 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | ACAATC | CAAATTTTGTCTCATTCCTTAACTTTG | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCAGTGCAAATTTTGTCTCATTCCTTAACTTTG |
| Ccace2_R03 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | GTTCTC | CAAATTTTGTCTCATTCCTTAACTTTG | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGCAGTGCAAATTTTGTCTCATTCCTTAACTTTG |
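
As a minimal sketch of how a tagged primer in Supplementary Table 4.1 is composed, the following Python snippet concatenates the Illumina tail, the tag and the original primer, in that order. The function name `build_tagged_primer` is hypothetical and is not taken from the thesis; the sequences are copied from the Ccace2_F00 and Ccace2_R00 rows.

```python
# Illustrative only: assemble a tagged primer as tail + tag + original primer.
# Sequences are copied from Supplementary Table 4.1; the function name is hypothetical.

FORWARD_TAIL = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REVERSE_TAIL = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
FORWARD_PRIMER = "CTGACGACCCCATGAAAATG"
REVERSE_PRIMER = "CAAATTTTGTCTCATTCCTTAACTTTG"


def build_tagged_primer(tail: str, tag: str, primer: str) -> str:
    """Return the full oligo in the order tail, tag, original primer."""
    for part in (tail, tag, primer):
        # This table contains no degenerate bases, so only A, C, G and T are accepted.
        if not part or set(part.upper()) - set("ACGT"):
            raise ValueError(f"unexpected characters in sequence: {part!r}")
    return (tail + tag + primer).upper()


ccace2_f00 = build_tagged_primer(FORWARD_TAIL, "GCAGTG", FORWARD_PRIMER)
ccace2_r00 = build_tagged_primer(REVERSE_TAIL, "GCAGTG", REVERSE_PRIMER)
print(ccace2_f00)
print(ccace2_r00)
```

For these two rows the assembled strings match the final primers listed in the table. The same call with a different tag gives the oligo for another sample index.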

Supplementary Buffer List 4.2 Buffer solution for DNA extraction from pinned specimens (Thomsen et al. 2009).

Stock solution (1x), 200 ml

CaCl2 0.0882 g

SDS (2%) 4 g

Tris pH 8 (1 M) 20 ml

NaCl 1.168 g

Working solution (1x)

Stock solution 10 ml

DTT 0.06 g

Before dissolving the DTT, allow the stock solution to warm to room temperature. The specimen and proteinase K are then added to the extraction buffer and left to incubate.
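
The weighed amounts above can be converted into approximate concentrations as a quick sanity check. The following Python sketch is illustrative only: the molar masses are assumptions that are not stated in the buffer list (in particular, CaCl2 is assumed to be the dihydrate), and the helper name `millimolar` is hypothetical.

```python
# Illustrative only: convert the amounts in Supplementary Buffer List 4.2 into
# approximate concentrations. Molar masses are assumptions, not thesis values;
# CaCl2 is assumed to be weighed as the dihydrate.

MOLAR_MASS_G_PER_MOL = {
    "CaCl2 dihydrate (assumed)": 147.01,
    "NaCl": 58.44,
    "DTT": 154.25,
}


def millimolar(mass_g: float, molar_mass: float, volume_ml: float) -> float:
    """Concentration in mM for mass_g of solute dissolved in volume_ml."""
    moles = mass_g / molar_mass
    litres = volume_ml / 1000.0
    return moles / litres * 1000.0


cacl2_mm = millimolar(0.0882, MOLAR_MASS_G_PER_MOL["CaCl2 dihydrate (assumed)"], 200)
nacl_mm = millimolar(1.168, MOLAR_MASS_G_PER_MOL["NaCl"], 200)
dtt_mm = millimolar(0.06, MOLAR_MASS_G_PER_MOL["DTT"], 10)  # DTT goes into 10 ml
tris_mm = 1000.0 * 20 / 200  # 20 ml of 1 M Tris diluted into 200 ml of stock
sds_percent = 4 / 200 * 100  # grams per 100 ml (w/v)

print(f"CaCl2 {cacl2_mm:.1f} mM, NaCl {nacl_mm:.1f} mM, Tris {tris_mm:.0f} mM")
print(f"SDS {sds_percent:.0f}% (w/v), DTT {dtt_mm:.1f} mM")
```

Under these assumptions the stock works out to roughly 3 mM CaCl2, 100 mM NaCl, 100 mM Tris and 2% SDS, with a little under 40 mM DTT in the working solution.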

Supplementary Figure 4.1 Nucleotide sequence and translated amino acid sequence of the acetylcholinesterase gene of C. capitata. The figure is based on the alignment of five nucleotide sequences obtained from NCBI (XM_012301166.2, XM_012301167.2, NM_001279434.1, FJ480223.1, FJ480224.1). The substrate-binding site of the acetylcholinesterase enzyme is underlined and the active site is highlighted in red. The blue square marks the G328A mutation, which is present only in the FJ480224.1 sequence and was described by Magaña et al. (2008) as the allele conferring resistance to malathion.

APPENDIX IV:

Chapter 5 Supplementary Materials

Supplementary Table 5.1 Tagged primer sequences showing Illumina tail, tag and original primer sequence for each.

| Primer Name | Illumina Tail | Tag | Original Primer | Final Primer (3' → 5') |
|---|---|---|---|---|
| III_B_F_00 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | TATGCG | CCNGAYATRGCNTTYCCNCG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTATGCGCCNGAYATRGCNTTYCCNCG |
| III_B_F_01 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | AACACC | CCNGAYATRGCNTTYCCNCG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGAACACCCCNGAYATRGCNTTYCCNCG |
| III_B_F_13 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | ACTCTG | CCNGAYATRGCNTTYCCNCG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGACTCTGCCNGAYATRGCNTTYCCNCG |
| III_B_F_14 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | AGAAGC | CCNGAYATRGCNTTYCCNCG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGAGAAGCCCNGAYATRGCNTTYCCNCG |
| III_B_F_17 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | AGGACA | CCNGAYATRGCNTTYCCNCG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGAGGACACCNGAYATRGCNTTYCCNCG |
| III_B_F_46 | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG | GTACAG | CCNGAYATRGCNTTYCCNCG | TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGGTACAGCCNGAYATRGCNTTYCCNCG |
| Fol_degen_rev_00 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | TATGCG | TANACYTCNGGRTGNCCRAARAAYCA | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTATGCGTANACYTCNGGRTGNCCRAARAAYCA |
| Fol_degen_rev_01 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | AACACC | TANACYTCNGGRTGNCCRAARAAYCA | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAACACCTANACYTCNGGRTGNCCRAARAAYCA |
| Fol_degen_rev_13 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | ACTCTG | TANACYTCNGGRTGNCCRAARAAYCA | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGACTCTGTANACYTCNGGRTGNCCRAARAAYCA |
| Fol_degen_rev_14 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | AGAAGC | TANACYTCNGGRTGNCCRAARAAYCA | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAGAAGCTANACYTCNGGRTGNCCRAARAAYCA |
| Fol_degen_rev_17 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | AGGACA | TANACYTCNGGRTGNCCRAARAAYCA | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGAGGACATANACYTCNGGRTGNCCRAARAAYCA |
| Fol_degen_rev_46 | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG | GTACAG | TANACYTCNGGRTGNCCRAARAAYCA | GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGGTACAGTANACYTCNGGRTGNCCRAARAAYCA |
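
The original primers in Supplementary Table 5.1 contain IUPAC degenerate bases (N, Y, R), so each listed oligo stands for a pool of sequences. As a minimal sketch, the following Python snippet counts how many distinct sequences each degenerate primer encodes. The helper name `degeneracy` is hypothetical; the code table is the standard IUPAC nucleotide alphabet and the primer sequences are copied from the table.

```python
# Illustrative only: count the distinct sequences encoded by a degenerate primer.
# Primer sequences are copied from Supplementary Table 5.1; the helper name is hypothetical.
from math import prod

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Multiply the number of bases allowed at every position of the primer."""
    return prod(len(IUPAC[base]) for base in primer.upper())


forward_degeneracy = degeneracy("CCNGAYATRGCNTTYCCNCG")
reverse_degeneracy = degeneracy("TANACYTCNGGRTGNCCRAARAAYCA")
print(forward_degeneracy, reverse_degeneracy)
```

The forward primer has three N, two Y and one R positions and the reverse primer has three N, three R and two Y positions, so the sketch reports 512 and 2048 sequences respectively.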

Supplementary Table 5.2 Characteristics of the mitochondrial genome of Anastrepha fraterculus.

| Gene | Location | Size (bp) | Start codon | Stop codon |
|---|---|---|---|---|
| trnI(gat) | 1 - 66 | 66 | | |
| trnQ(ttg) | 64 - 132 | 69 | | |
| trnM(cat) | 196 - 264 | 69 | | |
| nad2 | 265 - 1293 | 1029 | ATT | TAG |
| trnW(tca) | 1356 - 1423 | 68 | | |
| trnC(gca) | 1416 - 1484 | 69 | | |
| trnY(gta) | 1602 - 1668 | 67 | | |
| cox1 | 1713 - 3251 | 1539 | TCG | TAA |
| trnL2(taa) | 3247 - 3312 | 66 | | |
| cox2 | 3397 - 4119 | 723 | ATG | TAA |
| trnK(ctt) | 4085 - 4155 | 71 | | |
| trnD(gtc) | 4180 - 4247 | 68 | | |
| atp8 | 4248 - 4409 | 162 | ATT | TAA |
| atp6 | 4445 - 5080 | 636 | ATG | TAA |
| cox3 | 5090 - 5878 | 789 | ATG | TAA |
| trnG(tcc) | 5908 - 5973 | 66 | | |
| nad3 | 5971 - 6327 | 357 | ATA | TAA |
| trnA(tgc) | 6354 - 6418 | 65 | | |
| trnR(tcg) | 6470 - 6534 | 65 | | |
| trnN(gtt) | 6588 - 6653 | 66 | | |
| trnS1(gct) | 6654 - 6721 | 68 | | |
| trnE(ttc) | 6722 - 6787 | 66 | | |
| trnF(gaa) | 6940 - 7006 | 67 | | |
| nad5 | 6978 - 8726 | 1749 | AAT | TAA |
| trnH(gtg) | 8745 - 8811 | 67 | | |
| nad4 | 8816 - 10138 | 1323 | ATA | TAA |
| nad4l | 10150 - 10446 | 297 | ATG | TAA |
| trnT(tgt) | 10449 - 10513 | 65 | | |
| trnP(tgg) | 10514 - 10579 | 66 | | |
| nad6 | 10585 - 11106 | 522 | ATA | TAA |
| cob | 11111 - 12247 | 1137 | ATG | TAG |
| trnS2(tga) | 12246 - 12313 | 68 | | |
| nad1 | 12330 - 13268 | 939 | ATG | T |
| trnL1(tag) | 13279 - 13343 | 65 | | |
| rrnL | 13344 - 14668 | 1325 | | |
| trnV(tac) | 14694 - 14765 | 72 | | |
| rrnS | 14765 - 15554 | 790 | | |
| Control region | 15555 - 16936 | 1382 | | |
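
In the mitogenome tables the Size column should equal end minus start plus one, because both coordinates are inclusive. As a minimal sketch of that consistency check, the following Python snippet tests a handful of rows copied from Supplementary Table 5.2. The `Feature` type and the function name `size_mismatches` are hypothetical and not part of the annotation workflow used in the thesis.

```python
# Illustrative only: check that reported feature sizes equal end - start + 1.
# Rows are a small subset copied from Supplementary Table 5.2; names are hypothetical.
from typing import List, NamedTuple


class Feature(NamedTuple):
    gene: str
    start: int  # first base of the feature (inclusive)
    end: int    # last base of the feature (inclusive)
    size_bp: int


FEATURES = [
    Feature("trnI(gat)", 1, 66, 66),
    Feature("nad2", 265, 1293, 1029),
    Feature("cox1", 1713, 3251, 1539),
    Feature("nad5", 6978, 8726, 1749),
    Feature("Control region", 15555, 16936, 1382),
]


def size_mismatches(features: List[Feature]) -> List[str]:
    """Return the names of features whose size disagrees with their coordinates."""
    return [f.gene for f in features if f.end - f.start + 1 != f.size_bp]


mismatches = size_mismatches(FEATURES)
print(mismatches)
```

An empty list means that every checked row is internally consistent. The same check can be run on the full tables for the other two species.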

Supplementary Table 5.3 Characteristics of the mitochondrial genome of Anastrepha striata.

| Gene | Location | Size (bp) | Start codon | Stop codon |
|---|---|---|---|---|
| trnI(gat) | 1 - 66 | 66 | | |
| trnQ(ttg) | 64 - 132 | 69 | | |
| trnM(cat) | 152 - 221 | 70 | | |
| nad2 | 222 - 1250 | 1029 | ATT | TAA |
| trnW(tca) | 1380 - 1447 | 68 | | |
| trnC(gca) | 1440 - 1504 | 65 | | |
| trnY(gta) | 1543 - 1610 | 68 | | |
| cox1 | 1646 - 3184 | 1539 | TCG | TAA |
| trnL2(taa) | 3180 - 3245 | 66 | | |
| cox2 | 3278 - 4000 | 723 | ATG | TAA |
| trnK(ctt) | 3966 - 4036 | 71 | | |
| trnD(gtc) | 4070 - 4138 | 69 | | |
| atp8 | 4139 - 4300 | 162 | ATC | TAA |
| atp6 | 4336 - 4971 | 636 | ATG | TAA |
| cox3 | 4981 - 5769 | 789 | ATG | TAA |
| trnG(tcc) | 5791 - 5857 | 67 | | |
| nad3 | 5855 - 6211 | 357 | ATA | TAA |
| trnA(tgc) | 6254 - 6318 | 65 | | |
| trnR(tcg) | 6379 - 6442 | 64 | | |
| trnN(gtt) | 6502 - 6567 | 66 | | |
| trnS1(gct) | 6568 - 6635 | 68 | | |
| trnE(ttc) | 6636 - 6701 | 66 | | |
| trnF(gaa) | 6803 - 6869 | 67 | | |
| nad5 | 6841 - 8562 | 1722 | TAT | TAA |
| trnH(gtg) | 8608 - 8673 | 66 | | |
| nad4 | 8678 - 10000 | 1323 | ATG | TAA |
| nad4l | 10012 - 10308 | 297 | ATG | TAA |
| trnT(tgt) | 10311 - 10375 | 65 | | |
| trnP(tgg) | 10376 - 10441 | 66 | | |
| nad6 | 10447 - 10968 | 522 | ATA | TAA |
| cob | 10973 - 12109 | 1137 | ATG | TAA |
| trnS2(tga) | 12116 - 12183 | 68 | | |
| nad1 | 12110 - 13129 | 1020 | ATT | TAG |
| trnL1(tag) | 13149 - 13213 | 65 | | |
| rrnL | 13214 - 14534 | 1321 | | |
| trnV(tac) | 14559 - 14630 | 72 | | |
| rrnS | 14630 - 15419 | 790 | | |
| Control region | 15420 - 16612 | 1193 | | |

Supplementary Table 5.4 Characteristics of the mitochondrial genome of Anastrepha distincta.

| Gene | Location | Size (bp) | Start codon | Stop codon |
|---|---|---|---|---|
| trnI(gat) | 1 - 67 | 67 | | |
| trnQ(ttg) | 65 - 133 | 69 | | |
| trnM(cat) | 196 - 264 | 69 | | |
| nad2 | 265 - 1293 | 1029 | ATT | TAA |
| trnW(tca) | 1360 - 1427 | 68 | | |
| trnC(gca) | 1420 - 1488 | 69 | | |
| trnY(gta) | 1578 - 1644 | 67 | | |
| cox1 | 1691 - 3229 | 1539 | TCG | TAA |
| trnL2(taa) | 3225 - 3290 | 66 | | |
| cox2 | 3394 - 4116 | 723 | ATG | TAA |
| trnK(ctt) | 4082 - 4152 | 71 | | |
| trnD(gtc) | 4181 - 4249 | 69 | | |
| atp8 | 4250 - 4411 | 162 | ATC | TAA |
| atp6 | 4447 - 5082 | 636 | ATG | TAA |
| cox3 | 5092 - 5880 | 789 | ATG | TAA |
| trnG(tcc) | 5908 - 5973 | 66 | | |
| nad3 | 5971 - 6327 | 357 | ATA | TAA |
| trnA(tgc) | 6352 - 6416 | 65 | | |
| trnR(tcg) | 6469 - 6533 | 65 | | |
| trnN(gtt) | 6586 - 6651 | 66 | | |
| trnS1(gct) | 6652 - 6719 | 68 | | |
| trnE(ttc) | 6720 - 6785 | 66 | | |
| trnF(gaa) | 6941 - 7007 | 67 | | |
| nad5 | 6979 - 8727 | 1749 | ATT | TAA |
| trnH(gtg) | 8746 - 8812 | 67 | | |
| nad4 | 8817 - 10157 | 1341 | ATA | TAA |
| nad4l | 10151 - 10447 | 297 | ATG | TAA |
| trnT(tgt) | 10450 - 10514 | 65 | | |
| trnP(tgg) | 10515 - 10580 | 66 | | |
| nad6 | 10583 - 11107 | 525 | ATA | TAA |
| cob | 11112 - 12248 | 1137 | ATG | TAG |
| trnS2(tga) | 12247 - 12314 | 68 | | |
| nad1 | 12331 - 13269 | 939 | ATG | T |
| trnL1(tag) | 13280 - 13344 | 65 | | |
| rrnL | 13345 - 14668 | 1324 | | |
| trnV(tac) | 14694 - 14765 | 72 | | |
| rrnS | 14765 - 15555 | 791 | | |
| Control region | 15556 - 16418 | 863 | | |

Supplementary Table 5.5 Characteristics and GenBank accession for mitochondrial genomes of Tephritidae used in the phylogenetic analysis in this study.

| Genus | Subgenus | Species | Accession Number | Length (bp) | Population |
|---|---|---|---|---|---|
| Bactrocera | Bactrocera | B. arecae | KR233259 | 15900 | Kuala Lumpur, Malaysia |
| | | B. carambolae | EF014414 | 15915 | Japan |
| | | B. correcta | JX456552 | 15936 | Yunnan, China |
| | | B. dorsalis | DQ845759 | 15915 | Guangdong, China |
| | | B. latifrons | KT881556 | 15977 | Malaysia |
| | | B. melastomatos | KT881557 | 15954 | Malaysia |
| | | B. tryoni | HQ130030 | 15925 | - |
| | | B. umbrosa | KT881558 | 15898 | Malaysia |
| | | B. zonata | KP296150 | 15935 | Ranchi, India |
| | | B. oleae | AY210702 | 15815 | Mirandela, Portugal |
| | Tetradacus | B. minax | HM776033 | 16043 | Chongqing, China |
| | | B. caudata | KT625491 | 15866 | Malaysia |
| | | B. cucurbitae | JN635562 | 15825 | Yunnan, China |
| | | B. diaphora | KT159730 | 15890 | Chongqing, China |
| | | B. scutellata | KP722192 | 15915 | Guangdong, China |
| | | B. tau | KP711431 | 15687 | Guangdong, China |
| Ceratitis | Ceratitis | C. capitata | AJ242872 | 15980 | Laboratory Strain |
| Dacus | Callantra | D. longicornis | KX345846 | 16253 | Yunnan, China |
| Procecidochares | | P. utilis | KC355248 | 15922 | Yunnan, China |
| Toxotrypanini | Anastrepha | A. fraterculus | KX926433 | 16739 | Tolima, Colombia |
