Systematic Entomology Page 2 of 55
1 Out of the Orient: Post-Tethyan transoceanic and trans-Arabian routes
2 fostered the spread of Baorini skippers in the Afrotropics
3
4 Running title: Historical biogeography of Baorini skippers
5
6 Authors: Emmanuel F.A. Toussaint1,2*, Roger Vila3, Masaya Yago4, Hideyuki Chiba5, Andrew
7 D. Warren2, Kwaku Aduse-Poku6,7, Caroline Storer2, Kelly M. Dexter2, Kiyoshi Maruyama8,
8 David J. Lohman6,9,10, Akito Y. Kawahara2
9
10 Affiliations:
11 1 Natural History Museum of Geneva, CP 6434, CH 1211 Geneva 6, Switzerland
12 2 Florida Museum of Natural History, University of Florida, Gainesville, Florida, 32611, U.S.A.
13 3 Institut de Biologia Evolutiva (CSIC-UPF), Passeig Marítim de la Barceloneta, 37, 08003
14 Barcelona, Spain
15 4 The University Museum, The University of Tokyo, Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
16 5 B. P. Bishop Museum, 1525 Bernice Street, Honolulu, Hawaii, 96817-0916 U.S.A.
17 6 Biology Department, City College of New York, City University of New York, 160 Convent
18 Avenue, NY 10031, U.S.A.
19 7 Biology Department, University of Richmond, Richmond, Virginia, 23173, USA
20 8 9-7-106 Minami-Ôsawa 5 chome, Hachiôji-shi, Tokyo 192-0364, Japan
21 9 Ph.D. Program in Biology, Graduate Center, City University of New York, 365 Fifth Ave., New
22 York, NY 10016, U.S.A.
23 10 Entomology Section, National Museum of the Philippines, Manila 1000, Philippines
24
25 *To whom correspondence should be addressed: E-mail: [email protected]
26
27 ABSTRACT
28 The origin of taxa presenting a disjunct distribution between Africa and Asia has puzzled
29 biogeographers for centuries. This biogeographic pattern has been hypothesized to be the result
30 of transoceanic long-distance dispersal, Oligocene dispersal through forested corridors, Miocene
31 dispersal through the Arabian Peninsula or passive dispersal on the rifting Indian plate. However,
32 it has often proven difficult to pinpoint the mechanisms at play. We investigate the biotic
33 exchange between the Afrotropics and the Oriental region during the Cenozoic, a period in
34 which geological changes altered landmass connectivity. As a model, we use Baorini skippers
35 (Lepidoptera, Hesperiidae), a clade of butterflies widespread in the Old-World tropics that has a
36 disjunct distribution between the Afrotropics and the Oriental region. We use anchored
37 phylogenomics to infer a robust evolutionary tree for Baorini skippers and estimate divergence
38 times and ancestral ranges to test biogeographic hypotheses. Our phylogenomic tree recovers
39 strongly supported relationships for Baorini skippers and clarifies the systematics of the tribe.
40 Dating analyses suggest that these butterflies originated in the Oriental region, Greater Sunda
41 Islands, and Philippines in the early Miocene approximately 23 million years ago. Baorini
42 skippers dispersed from the Oriental region toward Africa at least five times in the past 20
43 million years. These butterflies colonized the Afrotropics primarily through trans-Arabian
44 geodispersal after the closure of the Tethyan seaway in the mid-Miocene. Range expansion from
45 the Oriental region toward the African continent likely occurred via the Gomphotherium land
46 bridge through the Arabian Peninsula. Alternative scenarios invoking long-distance dispersal and
47 vicariance are not supported. The Miocene climate change and biome shift from forested areas to
48 grasslands possibly facilitated geodispersal in this clade of skippers.
49
50
51
52
53 KEYWORDS
54 Anchored phylogenomics; Arabotropical forest dispersal; butterfly evolution; Gomphotherium
55 land bridge; Hesperiidae; Lepidoptera phylogenomics; Neotethys Seaway closure; Old World
56 biogeography; transoceanic long-distance dispersal
57 INTRODUCTION
58 The remarkable geological features of the Old-World tropics, such as the Indo-Australian
59 Archipelago (IAA), the Himalayan mountain range, and the Deccan traps are testaments to the
60 dynamic geological evolution that took place throughout the Cenozoic (Seton et al. 2012). The
61 collision of the Asian, Australian, and Pacific plates resulted in the assemblage of Wallacea (Hall
62 2013) and orogeny of New Guinea (Toussaint et al. 2014), allowing the Asian and Australian
63 biotas to connect through shallow marine straits and ephemeral land bridges between myriad
64 tropical islands (Voris 2000). The collision of the rifting Indian plate with the Asian plate in the
65 Paleocene (Hu et al. 2016) triggered the orogeny of the Himalayas (Zhang et al. 2012) and
66 connected the relic Gondwanan stock with the Laurasian biota. The closure of the Tethyan
67 Ocean in the Oligocene (Pirouz et al. 2017) triggered the orogeny of the Zagros mountain chain
68 along the Alpine-Himalayan orogenic belt, thereby connecting African and Asian biotas by a
69 land bridge (i.e., the Gomphotherium land bridge; Rögl 1998) through the Arabian Peninsula
70 (Harzhauser et al. 2007; Berra & Angiolini 2014). Disentangling the impact of these major
71 geological rearrangements on the evolution of the regional biota is necessary to understand the
72 mechanisms of lineage diversification in the Old-World tropics.
73 Of particular interest are clades of organisms disjunctly distributed in Asia and Africa,
74 because these allow testing hypotheses related to dispersal, vicariance, and extinction with
75 respect to the geology of the African, Arabian, Asian, and Indian plates. The distribution of such
76 clades has been explained by four biogeographic hypotheses. The first hypothesis, transoceanic
77 long-distance dispersal (LDD) between Asia and Africa, has been invoked to explain disjunct
78 distributions. Under this hypothesis (H1), the colonization of distant areas from a source
79 geographic range is accomplished directly and is not the result of range expansion followed by
80 widespread extinction/range contraction. A second hypothesis posits range expansion with trans-
81 Arabian dispersal (H2, also known as the “Gomphotherium land bridge hypothesis”, Harzhauser
82 et al. 2007) followed by regional extinction in the Arabian Peninsula. The Gomphotherium land
83 bridge hypothesis is based on geological evidence that the eastern part of the Tethys Ocean (i.e.,
84 Neotethys) that once separated Africa and Asia closed as a result of the Arabian plate moving
85 northward and colliding with the Asian plate in the Oligocene ca. 27 million years ago (Ma),
86 engendering the orogeny of the Zagros mountain chain in the process (Pirouz et al. 2017). A
87 third hypothesis is derived from the “boreotropical dispersal hypothesis”. This latter postulate
88 suggests that lineages expanded their geographic ranges in higher latitudes through boreotropical
89 forests that existed in the Eocene and Oligocene, before climate cooling triggered the wane of
90 boreotropical forests, in turn promoting ecological vicariance (Wolfe 1975). Recent studies (e.g.,
91 Couvreur et al. 2011; Sánchez-Ramírez et al. 2015) suggest that a similar process could be
92 invoked for Africa-Asia disjunctions, where range expansion could have happened after the
93 closure of the Tethys Ocean via now-vanished tropical forests in the Arabian Peninsula
94 (Ghazanfar & Fisher 1998; Griffin 2002). This hypothesis, which we coin “Arabotropical forest
95 dispersal” (H3), implies that range expansion would have had to happen between the
96 emergence of these habitats and their disappearance, triggered by the onset of a cooler climate in
97 the late Eocene and Oligocene (Zachos et al. 2001). As a result, this hypothesis predicts range
98 expansions resulting in ancestral widespread distributions, followed by vicariance triggered by
99 the aridification of the Arabian Peninsula and associated floristic reconfiguration (Ghazanfar &
100 Fisher 1998; Griffin 2002). The last hypothesis proposes that organisms rafted on the
101 Gondwanan Indian plate (“biotic ferry hypothesis” or “out-of-India hypothesis”; Datta-Roy &
102 Karanth 2009) and subsequently diversified in Asia. Under this hypothesis (H4), lineages
103 occurring in Gondwana during the Cretaceous and therefore in the landmass comprising Africa,
104 India, and Madagascar at the time, would have been transported via the rafting Indian plate
105 across the Indian Ocean, before docking in the Oriental region in the Paleocene ca. 60 Ma (Hu et al.
106 2016). Although this hypothesis was initially supported by phylogenetic patterns, it has since
107 been largely rejected by numerous studies of unrelated plants and animals that use molecular
108 dating approaches (Toussaint et al. 2016). A related mechanism is the use of the Indian rafting
109 plate as a “stepping stone” whereby taxa dispersed from the Oriental region to insular India
110 before its final docking and then dispersed toward other Gondwanan landmasses (Toussaint et al.
111 2018a). Both processes imply ancient African and Oriental divergences. Overall, there are few
112 empirical studies focusing on animal taxa that investigate the evolution of African-Asian
113 disjunctions in a phylogenetic framework with the aim of testing these competing hypotheses.
114 Baorini grass skippers (Lepidoptera, Hesperiidae, Hesperiinae) are widespread butterflies
115 distributed from Madagascar, the Mascarenes, Africa and southern Europe, through the Arabian
116 Peninsula, and to the Oriental region and the IAA (Warren et al. 2009). The tribe currently
117 includes 17 genera (Fan et al. 2016; Zhu et al. 2016; de Jong & Coutsis 2017) and approximately
118 100 species, which are mostly distributed in the Oriental region and the IAA, with a few African
119 species in geographically widely distributed genera or species-rich clades (Fan et al. 2016).
120 Parnara Moore, [1881] comprises several Oriental/Indo-Australian species and two species in
121 Madagascar, the Mascarenes, and southeastern coastal Africa. Zenonia Evans, 1935 is endemic
122 to Africa and comprises three widespread species, and Gegenes Hübner, [1819] has three African
123 endemic species and another more widely distributed species in the Oriental and Palearctic
124 regions. The mostly Oriental and Indo-Australian genus, Pelopidas Walker, 1870, comprises two
125 species with geographic ranges extending from the IAA to continental Africa and Madagascar.
126 The four genera Afrogegenes de Jong & Coutsis, 2017, Borbo Evans, 1949, Brusa Evans, 1937
127 and Larsenia Chiba, Fan & Sáfián, 2016, are wholly African, with multiple endemic species,
128 some restricted to Madagascar. The other genera Baoris Moore, 1881, Caltoris Swinhoe, 1893,
129 Iton de Nicéville, 1895, Polytremis Mabille, 1904, Prusiana Evans, 1937, Pseudoborbo Lee,
130 1966, Tsukiyamaia Zhu, Chiba & Wu, 2016, Zinaida Evans, 1937, and Zenonoida Fan & Chiba,
131 2016, are distributed from the Oriental region including India to the Australian region and some
132 Pacific islands (e.g., Pelopidas lyelli (Rothschild, 1915) in the Solomon Islands and Vanuatu;
133 Braby 2000). A few recent molecular systematic studies have clarified some phylogenetic
134 relationships of Baorini (Jiang et al. 2013; Fan et al. 2016; Zhu et al. 2016; Li et al. 2017; Tang
135 et al. 2017), but these studies included a relatively small number of loci (≤ 5) and species (≤ 50),
136 and have been unable to confidently place genera, making it difficult to assess their
137 biogeographic history. In the present study, we use a phylogenomic approach to sequence 13
138 gene fragments including those most commonly used in butterfly phylogenetics (Wahlberg and
139 Wheat 2008). We then combine this newly generated dataset with other available data to infer a
140 robust time-tree of nearly 70 Baorini species and test whether the distribution of Afrotropical
141 taxa can be linked to one of the four biogeographic hypotheses described above. We use a
142 hypothetico-deductive framework to test whether the phylogenomic pattern, divergence time
143 estimates, and ancestral range estimation are consistent with one or more of the hypotheses. We
144 do not test the Indian biotic ferry hypothesis because divergence time evidence suggests that
145 Baorini are much younger than the invoked geological events (e.g., Espeland et al. 2018).
146
147 MATERIALS AND METHODS
148 Taxon sampling and molecular biology
149 Samples of 32 Baorini species were collected in the field (Appendix 1) and placed in glassine
150 envelopes or directly in >95% ethanol. In the lab, DNA was extracted from abdomens or legs
151 using an OmniPrep™ DNA extraction kit (G-Biosciences) following Toussaint et al. (2018b).
152 Quantified DNA extracts were submitted to RAPiD Genomics (Gainesville, FL, USA) for library
153 preparation, hybridization enrichment and sequencing. Random mechanical shearing of DNA
154 was conducted with an average size of 300 bp followed by an end-repair reaction and ligation of
155 an adenine residue to the 3'-end of the blunt-end fragments to allow ligation of barcode adapters
156 and PCR-amplification of the library. Following library construction, solution-based target
157 enrichment of Agilent SureSelect probes was conducted in a pool containing 16 libraries. These
158 libraries were enriched with the SureSelect Target Enrichment System for Illumina Paired-End
159 Multiplexed Sequencing Library protocol. Paired-end sequencing of 150-bp reads was
160 accomplished with an Illumina HiSeq.
161 We used the BUTTERFLY2.0 probe set (Kawahara et al. 2018) to capture the following
162 13 loci: acetyl-CoA (ACOA, 1020 bp), carbamoyl-phosphate synthetase 2, aspartate
163 transcarbamylase, and dihydroorotase (CAD, 1854 bp), catalase (CAT, 1290 bp), cytochrome
164 oxidase c subunit 1 (CO1, 1341 bp), dopa decarboxylase (DDC, 702 bp), elongation factor 1
165 alpha (EF1A, 1059 bp), glyceraldehyde-3-phosphate dehydrogenase (GAPDH, 606 bp), hairy cell
166 leukemia protein 1 (HCL, 633 bp), isocitrate dehydrogenase (IDH, 708 bp), malate
167 dehydrogenase (MDH, 681 bp), ribosomal protein S2 (RPS2, 471 bp), ribosomal protein S5
168 (RPS5, 555 bp) and wingless (WGL, 240 bp).
169 We sampled sequences of 47 additional Baorini species that were available on GenBank
170 and BOLD, which were included in recent systematic studies of the tribe (Fan et al. 2016; Zhu et
171 al. 2016; Tang et al. 2017). We included sequence data for the mitochondrial ribosomal 16S (533
172 bp), and nuclear ribosomal 18S (378 bp) and 28S (857 bp). When possible, we built chimeras
173 between newly sequenced specimens and other specimens of the same species for which other
174 sequence data were available to improve sequence coverage for terminals. We included 19
175 skipper outgroups from all subfamilies and major clades (Toussaint et al. 2018b) to allow the use
176 of secondary calibrations (see below). All sequence data were imported in Geneious R8.1.8
177 (Biomatters, USA), cleaned and aligned separately for each locus using MUSCLE (Edgar 2004).
178 The alignments of ribosomal loci were straightforward and did not require fine-tuning, likely
179 because of the evolutionary scale this taxon sampling represents. Gene trees were inferred using
180 FastTree 2.1.5 (Price et al. 2010) and inspected for contamination and sequence quality. Trees
181 were rooted with Macrosoma hyacinthina (Warren, 1905) (Hedylidae), since Hedylidae are the
182 sister group to Hesperiidae (Espeland et al. 2018; Toussaint et al. 2018b). The final matrix
183 included 89 species including 70 Baorini, and 16 concatenated loci totaling 12,928 aligned
184 nucleotides.
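The reported matrix size can be checked against the per-locus lengths listed above. The following Python sketch is only an illustration of that arithmetic; the dictionary and function names are ours and are not part of the authors' pipeline. It sums the 13 protein-coding fragments and the three ribosomal fragments.

```python
# Aligned lengths (bp) as reported in the text; variable names are illustrative.
PROTEIN_CODING = {
    "ACOA": 1020, "CAD": 1854, "CAT": 1290, "CO1": 1341, "DDC": 702,
    "EF1A": 1059, "GAPDH": 606, "HCL": 633, "IDH": 708, "MDH": 681,
    "RPS2": 471, "RPS5": 555, "WGL": 240,
}
RIBOSOMAL = {"16S": 533, "18S": 378, "28S": 857}


def matrix_length(*locus_sets):
    """Total number of aligned nucleotides across the given locus dictionaries."""
    return sum(length for loci in locus_sets for length in loci.values())


if __name__ == "__main__":
    print(len(PROTEIN_CODING) + len(RIBOSOMAL), "loci")
    print(matrix_length(PROTEIN_CODING, RIBOSOMAL), "aligned nucleotides")
```

With the lengths given in the text, the total agrees with the 16 loci and 12,928 aligned nucleotides stated for the final matrix.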
185
186 AHE (anchored hybrid enrichment) data assembly and cleanup
187 We used the pipeline for anchored phylogenomics of Breinholt et al. (2018) to create a matrix
188 from raw Illumina reads. Paired-end Illumina data were cleaned with Trim Galore! ver. 0.4.0
189 (www.bioinformatics.babraham.ac.uk), allowing a minimum read size of 30 bp, and we removed
190 bases with a Phred score below 20. The 13 loci that were sequenced using the BUTTERFLY2.0
191 kit (Kawahara et al. 2018) were assembled with iterative baited assembly (IBA, Breinholt et al.
192 2018), in which only read pairs whose forward and reverse reads both passed filtering were
193 included. Assembled reads from the probe region were blasted against the Danaus plexippus
194 (Linnaeus, 1758) (Nymphalidae) reference genome and BLAST results were used for single hit
195 and orthology filtering. Following Breinholt et al. (2018), the loci were screened for orthology
196 with a single hit threshold of 0.9 and genome mapping. The identified orthologous sequences
197 were screened for contamination by identifying and removing sequences that were nearly
198 identical at the family and genus level (Breinholt et al. 2018). Sequences from each locus were
199 separately aligned with MAFFT v. 7.245 (Katoh and Standley 2013) and concatenated with
200 FASconCAT-G 1.0.4 (Kück and Meusemann 2010). Aliscore 2.2 (Kück et al. 2010) was used to
201 check for saturation and sites that appeared to evolve randomly. The individual locus alignments
202 and files used for phylogenetic inference in this study (see below) are available on Dryad
203 (XXXX).
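To make the quality threshold concrete: a Phred score Q corresponds to a base-calling error probability of 10^(-Q/10), so Q = 20 means a 1% error rate. The Python sketch below is a deliberately simplified stand-in and is not the algorithm implemented in Trim Galore!; it trims bases from the 3' end while their quality is below the threshold and then applies the 30-bp minimum read length. All function names are hypothetical.

```python
MIN_QUAL = 20  # Phred threshold stated in the text
MIN_LEN = 30   # minimum read length (bp) stated in the text


def phred_to_error(q):
    """Base-calling error probability implied by a Phred quality score."""
    return 10 ** (-q / 10)


def trim_3prime(seq, quals, min_qual=MIN_QUAL):
    """Drop trailing bases whose quality is below min_qual (naive 3' trimming)."""
    end = len(seq)
    while end > 0 and quals[end - 1] < min_qual:
        end -= 1
    return seq[:end], quals[:end]


def keep_read(seq, quals, min_len=MIN_LEN):
    """Return the trimmed read, or None if it is shorter than min_len."""
    seq, quals = trim_3prime(seq, quals)
    return (seq, quals) if len(seq) >= min_len else None
```

Under this simplification, a 40-bp read whose last five bases have Q = 10 is kept as a 35-bp read, whereas a read with only 20 high-quality bases is discarded.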
204
205 Datasets, model partitioning and phylogenetic analysis
206 We simultaneously estimated the best partitioning scheme and corresponding models of
207 nucleotide substitution in IQ-TREE 1.6.9 (Nguyen et al. 2015) using ModelFinder
208 (Kalyaanamoorthy et al. 2017) with the greedy algorithm and based on the Bayesian information
209 criterion (BIC). The optimal models of nucleotide substitution were determined across
210 all available models in IQ-TREE, including the FreeRate model (+R, Soubrier et al. 2012), which
211 relaxes the assumption of gamma distributed rates. The 13 protein-coding loci were divided by
212 codon positions (39 partitions), and the three non-protein-coding loci were left as independent
213 partitions (3 partitions). To find the most likely tree, 100 maximum likelihood (ML) searches
214 were conducted in IQ-TREE, with two calculations of nodal support: ultrafast bootstrap
215 (UFBoot) and SH-aLRT tests. We generated 1000 replicates for UFBoot (Hoang et al. 2018) and
216 SH-aLRT (Guindon et al. 2010). To reduce the risk of overestimating branch supports with
217 UFBoot due to severe model violations, we used hill-climbing nearest neighbor interchange
218 (NNI) to optimize each bootstrap tree (Hoang et al. 2018). All phylogenomic analyses were
219 conducted on the University of Florida HiPerGator High Performance Computing Cluster
220 (www.hpc.ufl.edu). When discussing branch support, we refer to 'strong' support as SH-aLRT ≥
221 80 and UFBoot ≥ 95, and 'moderate' support as SH-aLRT ≥ 80 or UFBoot ≥ 95.
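The support categories defined above can be written as a small decision rule. The Python sketch below is an illustration only; the function name is ours, and the label "weak" for branches meeting neither threshold is our assumption, since the text defines only the 'strong' and 'moderate' categories.

```python
def support_category(sh_alrt, ufboot):
    """Classify ML branch support using the thresholds defined in the text.

    'strong'   : SH-aLRT >= 80 and UFBoot >= 95
    'moderate' : only one of the two thresholds is met
    'weak'     : neither threshold is met (label assumed here)
    """
    sh_ok = sh_alrt >= 80
    uf_ok = ufboot >= 95
    if sh_ok and uf_ok:
        return "strong"
    if sh_ok or uf_ok:
        return "moderate"
    return "weak"


if __name__ == "__main__":
    for sh, uf in [(100, 100), (100, 90), (50, 37)]:
        print(sh, uf, support_category(sh, uf))
```

For example, a branch with SH-aLRT = 100 and UFBoot = 90 would count as moderately supported under this rule, because only the SH-aLRT threshold is met.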
222 We also performed an additional set of analyses based on AHE data only, excluding all
223 information from GenBank to ensure that chimeric constructions and molecular matrix
224 incompleteness were not biasing the phylogenetic inference. This reduced dataset comprised 50
225 taxa including 31 Baorini representatives. The methods were identical to the ones detailed above
226 for the full dataset. The best partitioning scheme and corresponding models of nucleotide
227 substitution were simultaneously estimated in IQ-TREE using ModelFinder based on the BIC,
228 and 100 ML searches were conducted to avoid local optima with computation of 1000 UFBoot
229 replicates and 1000 SH-aLRT tests.
230
231 Divergence Time Estimation
232 We estimated divergence times in a Bayesian framework with BEAST 1.10.1 (Suchard et al.
233 2018). The best partitioning scheme and models of substitution were selected in PartitionFinder2
234 (Lanfear et al. 2017) using the greedy algorithm and the BIC across all models included in
235 BEAST (option models=beast). The dataset was partitioned a priori, similar to the ModelFinder
236 analysis (see above). To take into account the importance of clock partitioning, we implemented
237 (i) a single clock for all partitions, (ii) two clocks, one for the mitochondrial partition and one for
238 the nuclear partitions, and (iii) one clock for first and second codon positions of protein-coding
239 genes, one clock for third codon positions of protein-coding genes, and one clock for each
240 ribosomal gene for a total of five clocks. We assigned a Bayesian lognormal relaxed clock model
241 to the different clock partitions. We also tested different tree models by using a Yule or a birth-
242 death model in different analyses. Clock rates were set with an approximate continuous time
243 Markov chain rate reference prior (Ferreira & Suchard 2008). All analyses consisted of 100
244 million generations, with parameters and trees sampled every 5000 generations. We obtained
245 marginal likelihood estimates (MLE) for each analysis using path-sampling and stepping-stone
246 sampling (Baele et al. 2012, 2013), with 1000 path steps, and chains running for one million
247 generations, with the log likelihood sampled every 1000 cycles.
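The three clock-partitioning schemes can be sketched as a mapping from a priori data partitions to clock models. The Python code below is a minimal illustration, not the BEAST configuration itself. It assumes that the mitochondrial partition consists of CO1 and 16S and that partitions are the 13 protein-coding loci split by codon position plus the three ribosomal loci; the clock labels and function names are hypothetical.

```python
CODING = ["ACOA", "CAD", "CAT", "CO1", "DDC", "EF1A", "GAPDH",
          "HCL", "IDH", "MDH", "RPS2", "RPS5", "WGL"]
RIBOSOMAL = ["16S", "18S", "28S"]
MITOCHONDRIAL = {"CO1", "16S"}  # assumption: the mitochondrial loci in the matrix

# Number of posterior samples implied by the chain length and sampling interval.
N_SAMPLES = 100_000_000 // 5_000


def partitions():
    """A priori partitions: codon positions for coding loci, whole ribosomal loci."""
    parts = [(locus, pos) for locus in CODING for pos in (1, 2, 3)]
    parts += [(locus, None) for locus in RIBOSOMAL]
    return parts


def clock_for(partition, scheme):
    """Clock label for a partition under scheme 1 (i), 2 (ii) or 3 (iii)."""
    locus, pos = partition
    if scheme == 1:
        return "single"
    if scheme == 2:
        return "mitochondrial" if locus in MITOCHONDRIAL else "nuclear"
    if scheme == 3:
        if pos is None:
            return "ribosomal_" + locus
        return "codon_pos3" if pos == 3 else "codon_pos1+2"
    raise ValueError("scheme must be 1, 2 or 3")


def n_clocks(scheme):
    """Number of distinct clocks used under a scheme."""
    return len({clock_for(p, scheme) for p in partitions()})
```

Under these assumptions the 42 a priori partitions map to one, two and five clocks for schemes (i) to (iii), and each analysis retains 20,000 samples before any burn-in is discarded.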
248 Two hesperiid fossils, Pamphilites abdita Scudder, 1875 and Protocoeliades kristenseni
249 de Jong, 2016 were candidates to be included as fossil calibrations for this study (de Jong 2017).
250 However, preliminary testing in BEAST revealed that the age estimates obtained with these two
251 fossils were largely underestimated compared to the ages from Espeland et al. (2018) or Chazot
252 et al. (2019). This discrepancy is partly due to the difficulty of placing the oldest described
253 skipper fossil into an extant taxon: Protocoeliades kristenseni from Denmark, dated at ca. 55 Ma
254 (de Jong 2016). This fossil has been described as belonging to the Coeliadinae based on
255 morphology (de Jong 2016). However, because relationships among Coeliadinae genera and
256 their monophyly are uncertain, any placement of the fossil elsewhere than on the stem of
257 Coeliadinae remains contentious (de Jong 2016, 2017). Therefore, because Coeliadinae is sister
258 to the remainder of Hesperiidae (Espeland et al. 2018; Toussaint et al. 2018b), the fossil would
259 only calibrate a minimum age of 55 million years for crown Hesperiidae, while Espeland et al.
260 (2018) estimated this node to be ca. 76 Ma and the split between Hesperiidae and Hedylidae to
261 be ca. 106 Ma, and Chazot et al. (2019) estimated these nodes at ca. 65 Ma and ca. 99 Ma, respectively.
262 Because Chazot et al. (2019) has a more extensive taxon sampling including Baorini genera, we
263 chose this study for secondary calibrations over the more robust phylogenomic tree of Espeland
264 et al. (2018). We constrained the node corresponding to the split between Hesperiidae and
265 Hedylidae with a uniform prior encompassing the 95% credibility interval (lower=80.73 /
266 upper=119.22) estimated for this node in Chazot et al. (2019). We also constrained the crown of
267 Hesperiidae (lower=55.8 / upper=78.1) and the node corresponding to all Baorini excluding
268 Parnara (lower=10.3 / upper=17.2).
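A uniform calibration prior assigns equal density to every age between the lower and upper bounds and zero density outside them. The Python sketch below illustrates this with the bounds given above; it is not BEAST code, and the dictionary keys and function name are ours.

```python
import math

# Bounds (Ma) of the three secondary calibrations reported in the text.
CALIBRATIONS = {
    "Hesperiidae + Hedylidae split": (80.73, 119.22),
    "crown Hesperiidae": (55.8, 78.1),
    "Baorini excluding Parnara": (10.3, 17.2),
}


def uniform_log_density(age, lower, upper):
    """Log prior density of a node age under a Uniform(lower, upper) prior."""
    if lower <= age <= upper:
        return -math.log(upper - lower)
    return float("-inf")


if __name__ == "__main__":
    # Point estimates of Chazot et al. (2019) quoted in the text.
    for node, age in [("Hesperiidae + Hedylidae split", 99.0),
                      ("crown Hesperiidae", 65.0)]:
        print(node, age, uniform_log_density(age, *CALIBRATIONS[node]))
```

Because the density is flat within the bounds, such a prior constrains a node age to the interval without favouring any particular value inside it.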
269
270 Ancestral Range Estimation
271 We used the R-package BioGeoBEARS 1.1.1 (Matzke 2018) to estimate ancestral ranges in
272 Baorini. Because the validity of model comparison in BioGeoBEARS is debated (Ree &
273 Sanmartín 2018), the analyses were only performed under the dispersal-extinction-cladogenesis
274 (DEC) model (Ree 2005; Ree & Smith 2008). The DEC model specifies instantaneous transition
275 rates between geographical ranges along the branches of a given phylogeny. The likelihoods of
276 ancestral states at cladogenesis events in the phylogeny are then estimated in a ML framework,
277 with probabilities of range transitions as a function of time. We used the BEAST Maximum
278 Clade Credibility (MCC) tree from the best analysis (see Results) with outgroups pruned. The
279 geographic distribution of Baorini was extrapolated from the literature (Chiba & Eliot 1991; de
280 Jong & Treadaway 1993; Braby 2000; Zhang et al. 2010; Huang 2011). In order to test the three
281 hypotheses related to the colonization of Africa, we used the following areas in the
282 BioGeoBEARS analyses: Western Palearctic (P) from southern Europe to Turkey; Africa,
283 Madagascar and Mascarenes (F); Arabian Peninsula (A); Oriental (O) from Iran to Japan and
284 from China to the Malaysian Peninsula; Greater Sunda Islands and Philippines (G) including
285 Borneo, Java, Palawan, Sumatra and satellite islands; Wallacea (W) including the Lesser Sunda
286 Islands, Moluccas and Sulawesi including satellite islands; and Australia and New Guinea (N)
287 including satellite islands.
288 Considering the relative geological stability of the African and Asian regions in the late
289 Cenozoic, we only took into account the dynamic geological history of the IAA by designing
290 two time slices with differential dispersal rate scalers. TS1 (25 Ma – 15 Ma) corresponds to the
291 period predating the acceleration of orogenies in the Philippines archipelago, Wallacea and New
292 Guinea (Yumul et al. 2008; Hall 2013; Toussaint et al. 2014), and TS2 (15 Ma – present)
293 corresponds to the above-mentioned events, more active formation of the IAA and intensifying
294 sea-level fluctuations in the Plio-Pleistocene (Voris 2000). The dispersal rate scaler values were
295 selected according to terrain and water body positions throughout the timeframe of the age of the
296 tribe (Appendix 2). The maximum number of areas per ancestral state was fixed to five. To
297 reduce the computational burden, ancestral states corresponding to unrealistic disjunct areas
298 were removed manually from the list of possible states.
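The size of the state space explains why states were pruned. With seven areas and at most five areas per range, the number of candidate ranges can be enumerated directly. The Python sketch below is an illustration and not BioGeoBEARS code; it does not reproduce the manual removal of disjunct ranges, because the list of excluded states is not given here. Assigning an age of exactly 15 Ma to TS2 is our assumption.

```python
from itertools import combinations

AREAS = "PFAOGWN"  # single-letter area codes defined in the text
MAX_AREAS = 5      # maximum number of areas per ancestral range


def candidate_ranges(areas=AREAS, max_areas=MAX_AREAS):
    """All non-empty area combinations containing at most max_areas areas."""
    return ["".join(combo)
            for k in range(1, max_areas + 1)
            for combo in combinations(areas, k)]


def time_slice(age_ma):
    """Time slice for an age in Ma: TS1 is 25 to 15 Ma, TS2 is 15 Ma to present."""
    if 15 < age_ma <= 25:
        return "TS1"
    if 0 <= age_ma <= 15:
        return "TS2"
    raise ValueError("age outside the 25 Ma to present window")


if __name__ == "__main__":
    print(len(candidate_ranges()), "candidate ranges before manual pruning")
    print(time_slice(23), time_slice(10))
```

Before any pruning there are 119 non-empty ranges (7 + 21 + 35 + 35 + 21), plus a null range if the software includes one, so removing implausible disjunct combinations reduces the transition matrix noticeably.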
299
300 RESULTS
301 Phylogenetic Relationships
302 The phylogenetic tree based on the reduced dataset (AHE data only) with the highest likelihood
303 (lnL = -113733.176) from the 100 searches performed in IQ-TREE is presented in Figure 1 (see
304 Appendix 3 for the full tree including outgroups), and the one based on the full dataset (lnL =
305 -126096.182) is presented in Figure 2. Since the phylogenies inferred in ML with both datasets
306 gave similar branching patterns, we only discuss the ones recovered with the greater taxon
307 sampling and presented in Figure 2. The inferred tree is well-resolved with more than 85% of
308 nodes receiving moderate or strong support. The phylogeny inferred in BEAST under
309 Bayesian inference (BI) gave a very similar topology (Figure 2, Appendix 3) and a greater
310 number of nodes with strong support values (PP ≥ 0.95). Hesperia Fabricius, 1793 and Saliana
311 Evans, 1955 are recovered as sister to Baorini with strong support in both ML and BI (UFBoot =
312 100 / SH-aLRT = 100 / PP = 1.0). Baorini is recovered as monophyletic with strong support
313 (UFBoot = 100 / SH-aLRT = 100 / PP = 1.0) and is split into five clades. Parnara is recovered in
314 Clade I and as sister to the remainder of the tribe with strong support (UFBoot = 100 / SH-aLRT
315 = 100 / PP = 1.0). Clade II comprises the genera Iton, Tsukiyamaia, Zenonia, Zenonoida
316 and Zinaida; it is recovered with weak support in ML (UFBoot = 37 / SH-aLRT = 50) but with
317 strong support in BI (PP = 1.0). Clade III is recovered with moderate to strong support (UFBoot
318 = 90 / SH-aLRT = 100 / PP = 0.99) and comprises Prusiana as sister to Caltoris. In Clade IV,
319 Baoris is recovered as sister to Pelopidas with moderate to strong support (UFBoot = 91 / SH-
320 aLRT = 100 / PP = 0.99). Finally, in Clade V, Polytremis lubricans (Herrich-Schäffer, 1869) is
321 sister to a clade comprising the monotypic Pseudoborbo, the polyphyletic Borbo, Gegenes and
322 the sister genera Afrogegenes and Larsenia. Species of Baorini distributed in Africa,
323 Madagascar, or the Mascarenes are recovered in different clades across the tree.
324
325 Divergence Times and Biogeography
326 Dating analyses using different numbers of clocks and tree models estimated similar ages with
327 broadly overlapping credibility intervals (Table 1). Based on the MLE comparison (Table 1), the
328 analysis with two clocks and a Yule model was selected for further analyses. Results of this
329 analysis are presented in Figure 3 (see Appendix 4 for the full chronogram), along with results of
330 the DEC model in BioGeoBEARS (DEC, lnL = -176.24). The BioGeoBEARS analysis
331 suggested an origin in the Oriental region and Greater Sunda Islands/Philippines, with multiple
332 independent colonization events of Africa, Madagascar and the Mascarenes, but also of regions
333 east of Wallace’s Line in the Miocene. We recover an origin of Baorini in the early Miocene ca.
334 23 Ma in the Oriental region and Greater Sunda Islands/Philippines. Based on this result, we
335 estimate range expansion across several areas in Parnara (Clade I) toward the IAA and
336 Australian region. This initial range expansion in Parnara is followed by a second range
337 expansion toward Africa, Madagascar and the Mascarenes in the late Miocene to early Pliocene.
338 We infer a range contraction in the remainder of Baorini, a lineage that shows an Oriental origin
339 in the mid-Miocene (Figure 3). Within Clade II, we estimate range expansion toward the Western
340 Palearctic, Africa, Madagascar and the Mascarenes from the Oriental region in the Miocene with
341 a potential vicariance between Zenonia and Zinaida (Figure 3). We estimate an origin of Clade
342 III in the Greater Sunda Islands/Philippines, followed by range contraction in the Oriental region
343 in Caltoris. The ancestor of Prusiana colonized Wallacea sometime in the Miocene or Plio-
344 Pleistocene. We also infer range expansion toward the Greater Sunda Islands/Philippines in
345 Caltoris, followed by range expansion toward Wallacea and Australian region in the Miocene or
346 Pliocene. In Clade IV, the ancestor of Pelopidas is inferred to have expanded its range from the
347 Oriental region to Wallacea in the late Miocene, allowing a subsequent range expansion toward
348 Australia and New Guinea in P. lyelli. A similar range expansion is estimated in the sister pair P.
349 agna (Moore, [1866])/P. mathias (Fabricius, 1798). The colonization of Africa, Madagascar and
350 the Mascarenes by the ancestors of P. mathias and P. thrax (Hübner, [1821]) is estimated to have
351 occurred in the late Miocene to Plio-Pleistocene through independent range expansions. We
352 estimate a mid-Miocene range expansion from the Oriental region toward a vast region
353 encompassing the Arabian Peninsula, Western Palearctic region, Africa, Madagascar and the
354 Mascarenes, Oriental region and Greater Sunda Islands/Philippines in Clade V. The Arabian
355 Peninsula was also colonized from Africa in the past 5 million years in Afrogegenes (Figure 3).
356
357 DISCUSSION
358 Phylogenetics and taxonomy of Baorini
359 Our phylogenomic analysis recovers a well-resolved phylogenetic hypothesis for Baorini
360 skippers, especially in the BEAST analysis. Zhu et al. (2016) recovered Parnara as sister to the
361 remainder of Baorini, a relationship consistent with our results (Figures 1 to 3) but at odds with
362 the conclusions of Fan et al. (2016), who recovered Larsenia as sister to the remainder of
363 Baorini including Parnara. Unlike the results of Fan et al. (2016), Larsenia is nested within
364 Clade V and these relationships are strongly supported in our analyses, although the exact
365 placement of Larsenia within this clade is uncertain (Figures 1 to 3). The placement of other
366 genera in the studies of Zhu et al. (2016) and Fan et al. (2016) was only partially consistent with
367 our results, albeit with weak support. These prior studies were based on few genetic markers
368 (three and five loci, respectively), resulting in inconsistencies among studies. Our inferences
369 consistently recover five clades with strong support that are somewhat consistent with the current
370 tribal classification. The only taxonomic discrepancy in our inferences is the polyphyly of Borbo.
371 Fan et al. (2016) described Larsenia to comprise a number of exclusively African species of
372 Borbo because their analysis recovered this clade as sister to all Baorini. Based on our
373 reconstruction, Larsenia is one of the most derived clades within Baorini (Figures 1 to 3). It is
374 unclear what should be done regarding Borbo detecta (Trimen, 1893) since it is recovered as
375 sister to Afrogegenes and Larsenia. Additional taxon sampling would be needed to propose
376 taxonomic changes in this clade.
377
378 Origins of Baorini skippers
379 The placement of Baorini within Hesperiinae remains unclear due to the staggering diversity of
380 the subfamily (> 2000 described species) and the lack of a comprehensive phylogeny for this
381 group. However, recent phylogenomic evidence (Toussaint et al. 2018b) suggests that Baorini
382 might be part of a large phylogenetic grade, which makes it difficult to include
383 outgroups that might inform the estimation of ancestral ranges (Toussaint et al. 2018a). Our
384 BioGeoBEARS analysis recovers an origin of Baorini in the Oriental region plus Greater Sunda
385 Islands/Philippines. We purposefully chose a broader definition of the Oriental region to avoid
386 losing biogeographic signal while retaining the power to test hypotheses relative to the
387 mechanisms for the colonization of Africa. Ancestors of Baorini likely originated in an area
388 climatically and biologically similar to the modern Oriental region, but a finer-scale
389 biogeographical and ecological study of specific clades would be needed to test this hypothesis.
390 Colonization of Wallacea and Sahul (i.e., the Australian region) would have been difficult before
391 ca. 15–20 Ma, except perhaps through LDD (Hall 2013); the inferred ancestral ranges of early
392 diverging Baorini in the Oriental region are therefore unsurprising.
393
394 Hypothesis testing: African colonization of Baorini skippers
395 Our divergence time estimation supports H1 (Gomphotherium land bridge), as the African
396 continent was mostly colonized by range expansion (e.g., ancestors of Pelopidas mathias and P.
397 thrax in clade IV, Figure 3) likely taking advantage of the Gomphotherium land bridge (Rögl
398 1998). Consequently, the BioGeoBEARS analysis rejects H2 (transoceanic long-distance
399 dispersal), although some dispersal events were likely required to explain, for instance, the
400 transition from the Oriental region to a larger range also comprising Western Palearctic and the
401 African continent in Clade II (Figure 3). Both our dating and BioGeoBEARS analysis reject H3
402 (Arabotropical forest dispersal). For the ancestors of Baorini to take advantage of tropical forests
403 as found in plants and fungi (Couvreur et al. 2011; Sánchez-Ramírez et al. 2015), they would
404 have had to colonize the Arabian Peninsula region in the Oligocene when these forests still
405 existed. Instead, our BioGeoBEARS estimate suggests that the ancestors of Baorini originated in
406 the Oriental region in the Miocene. By the time Baorini skippers began to disperse to other
407 regions west of the Oriental region, the boreotropical forests had disappeared, due to a decrease
408 in global temperature (Zachos et al. 2001). It is therefore unlikely that these butterflies would
409 have taken advantage of these ecosystems to colonize the African continent.
410
411 Biogeographic history of Baorini skippers
412 The first colonization event toward Africa is found in Parnara, where a range expansion
413 event allowed the ancestor of Parnara except P. amalia (Semper, [1879]) to disperse from a
414 widespread range in the Palearctic, Oriental region, and IAA toward eastern Africa. Parnara
415 naso (Fabricius, 1798) is one of only two Parnara species to live in Africa, and its distribution is
416 limited to Madagascar and the Mascarenes (Chiba & Eliot 1991); the other African Parnara, P.
417 monasi (Trimen, 1889), is restricted to eastern coastal Africa. The current distribution of these
418 two species is estimated to be the result of range expansion in the Western (H1). We argue that
419 long-distance dispersal (H2) is a possible alternative scenario because of the peculiar distribution
420 of Parnara skippers in southeastern coastal Africa. When considering LDD as an alternative
421 hypothesis to the range expansion pattern recovered by the DEC model (Figure 3) for the
422 colonization of Africa in Parnara, it is possible that now-sunken islands between India and the
423 Mascarenes facilitated such dispersal, acting as stepping-stones during periods of low sea-level
424 (e.g., Shapiro et al. 2002; Bradler et al. 2015; Warren et al. 2010). Bathymetric fluctuations have
425 occurred throughout the Cenozoic, with lowstands of up to 50 m below present sea level in the
426 last 5 million years (Miller et al. 2005). Such fluctuations coincide with colonization of Africa by
427 the ancestor of P. naso (and likely P. monasi), possibly by using islands that were subaerial at
428 the time as stepping-stones, including, for instance, Saya de Malha (Warren et al. 2010). African
429 Brusa (two extant species), which were not sampled in our study, may represent an additional
430 colonization event of Africa. Including P. monasi and generating more data for these two
431 African species will be critical in order to understand the underlying mechanism of colonization
432 in Parnara.
433 We infer a Miocene range expansion event from the Oriental region to Western Palearctic
434 and Africa in Clade II. Although a colonization route through the Palearctic region and into
435 northern Africa is possible, it is equally likely that Baorini skippers dispersed through the
436 Arabian Peninsula (see below). Our analysis suggests range expansion through the Arabian
437 Peninsula and Africa in Clade V. This pattern is consistent with the colonization of Africa by
438 Pelopidas and is largely compatible with H1. All of these colonization events seem to postdate
439 the closure of the Tethys Ocean, a geological suture that was until recently believed to be of
440 early Miocene age (e.g., Rögl 1998; Bosworth et al. 2005; Harzhauser et al. 2007). New
441 geological evidence based on magnetostratigraphy and strontium isotope analyses of
442 sedimentation in the High Zagros points to a slightly older origin of this land bridge, likely in the
443 mid-Oligocene, ca. 27 Ma (Pirouz et al. 2017). According to H1, the closure of the eastern part of
444 the Tethys Ocean (i.e., Neotethys) in the mid-Oligocene permitted the formation of the
445 Gomphotherium land bridge, a land corridor between Africa and the Oriental region (Rögl 1998).
446 Studies of unrelated taxa have suggested that this passage was used as a means of dispersal. For
447 instance, Bourguignon et al. (2017) hypothesized that multiple clades of termites (Isoptera) used
448 this bridge to disperse between the two continents. They argued that because these termite clades
449 were fungus-growers, geodispersal was more likely than LDD. Tamar et al. (2018) also
450 recognized the role played by the Gomphotherium land bridge in allowing dispersal to and from
451 Africa in a study on the biogeography of Uromastyx Merrem, 1820 lizards (Squamata,
452 Agamidae). In particular, they argued that the aridification of North Africa and the Arabian
453 Peninsula in the early Miocene might have favored dispersal in this clade. Miletinae butterflies
454 are also suggested to have dispersed into Africa via this corridor in the early Miocene
455 (Kaliszewska et al. 2015). Other butterfly studies have shed light on patterns consistent with the
456 Gomphotherium land bridge hypothesis (Aduse-Poku et al. 2009, 2015; Sahoo et al. 2018). In the
457 case of Baorini skippers, the potential passage of the Gomphotherium land bridge is dated
458 between ca. 5 and 15 Ma in three different clades. When considering the credibility intervals
459 associated with these estimates, the crossing of the Gomphotherium land bridge could be
460 combined with more humid conditions in the Arabian Peninsula (Griffin 2002), as well as the
461 emergence of C4-grassland ecosystems in Eastern Mediterranean and African regions as early as
462 the mid-Miocene, followed by their dominance in the late Miocene (Edwards et al. 2010). All
463 Baorini skipper larvae are Poaceae-feeders and we hypothesize that the synchronous emergence
464 of large grasslands and savannas in this key region might have facilitated dispersal toward
465 Africa, but also fostered diversification in Africa as suggested for other Lepidoptera (Kergoat et
466 al. 2012; Toussaint et al. 2012). Based on the co-occurrence of a land-bridge, appropriate host
467 plants, and a humid climate on the Arabian Peninsula and North Africa in the Miocene, the
468 disjunct distribution of Baorini skippers between Africa and Asia seems to stem principally from
469 range expansion from the Oriental region rather than transoceanic dispersal toward Africa.
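The timing argument in this section can be restated as a comparison of time intervals. The sketch below is an illustration under stated assumptions, not part of the original analysis: the 5 to 15 Ma dispersal window and the ca. 27 Ma land bridge date are taken from the text, the Oligocene bounds (33.9 to 23.03 Ma) follow the international chronostratigraphic chart, and treating the land bridge as continuously available from 27 Ma to the present is a simplification made here. The `Interval` class is a hypothetical helper.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A time interval in millions of years ago (Ma); older >= younger."""

    older: float
    younger: float

    def overlaps(self, other: "Interval") -> bool:
        """True if the two intervals share at least one point in time."""
        return self.older >= other.younger and other.older >= self.younger


# Oligocene bounds from the international chronostratigraphic chart.
OLIGOCENE = Interval(33.9, 23.03)

# Dispersal window quoted in the text for three Baorini clades.
dispersal_window = Interval(15.0, 5.0)

# Assumption for this sketch: the land bridge is treated as available
# from its proposed formation (ca. 27 Ma) to the present.
land_bridge = Interval(27.0, 0.0)

# H3 requires colonization while Oligocene forests still existed.
h3_compatible = dispersal_window.overlaps(OLIGOCENE)

# H1 requires colonization after the land bridge formed.
h1_compatible = dispersal_window.overlaps(land_bridge)

print("H1 compatible:", h1_compatible)
print("H3 compatible:", h3_compatible)
```

Under these assumptions the dispersal window falls entirely after the Oligocene and within the period when the land bridge existed, which matches the reasoning given for rejecting H3 and favoring H1. A fuller treatment would compare the credibility intervals of individual nodes rather than a single window.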
470
471 ACKNOWLEDGEMENTS
472 We thank Niklas Wahlberg, an anonymous reviewer and Thomas Simonsen for insightful
473 comments and suggestions that helped strengthen this manuscript. We thank Antonio
474 Giudici, Geoffrey Prosser, Jee & Rani Nature Photography, Laitche, Hsu Hong Lin and Charles
475 James Sharp, who provided photographs used as illustrations in this study. The University of
476 Florida's HiPerGator provided computational resources and technical support. EFAT
477 acknowledges UF Health Shands Hospital for assistance, and Marius Toussaint and Claire
478 Toussaint for inspiring this project on the 19th of May 2018. DJL acknowledges support from
479 DEB-1541557 and 1120380. AYK acknowledges funding from DEB-1541500 and 1541560. RV
480 acknowledges funding for sample collecting from grant CGL2016-76322-P (AEI/FEDER, UE).
481
482 CONFLICT OF INTEREST STATEMENT
483 The authors declare no conflict of interest.
484
485 REFERENCES
486 Aduse-Poku, K., Brattström, O., Kodandaramaiah, U., Lees, D.C., Brakefield, P.M. and
487 Wahlberg, N. (2015) Systematics and historical biogeography of the old-world butterfly
488 subtribe Mycalesina (Lepidoptera: Nymphalidae: Satyrinae). BMC Evolutionary Biology,
489 15(1), 167.
490 Aduse-Poku, K., Vingerhoedt, E. and Wahlberg, N. (2009) Out-of-Africa again: a phylogenetic
491 hypothesis of the genus Charaxes (Lepidoptera: Nymphalidae) based on five gene regions.
492 Molecular Phylogenetics and Evolution, 53(2), 463–478.
493 Baele, G., Lemey, P., Bedford, T., Rambaut, A., Suchard, M.A. & Alekseyenko, A.V. (2012).
494 Improving the accuracy of demographic and molecular clock model comparison while
495 accommodating phylogenetic uncertainty. Molecular Biology and Evolution, 29(9), 2157–
496 2167.
497 Baele, G., Li, W.L.S., Drummond, A.J., Suchard, M.A. and Lemey, P. (2012) Accurate model
498 selection of relaxed molecular clocks in Bayesian phylogenetics. Molecular Biology and
499 Evolution, 30(2), 239–243.
500 Berra, F. & Angiolini, L. (2014). The evolution of the Tethys region throughout the Phanerozoic:
501 a brief tectonic reconstruction. Petroleum systems of the Tethyan region: AAPG Memoir,
502 106, 1–27.
503 Bosworth, W., Huchon, P. & McClay, K. (2005). The Red Sea and Gulf of Aden basins. Journal
504 of African Earth Sciences, 43(1–3), 334–378.
505 Bourguignon, T., Lo, N., Šobotník, J., Ho, S.Y., Iqbal, N., Coissac, E., Lee, M., Jendryka, M.M.,
506 Sillam-Dussès, D., Křížková, B. & Roisin, Y. (2017). Mitochondrial phylogenomics
507 resolves the global spread of higher termites, ecosystem engineers of the
508 tropics. Molecular Biology and Evolution, 34(3), 589–597.
509 Braby, M.F. (2000). Butterflies of Australia: their identification, biology and distribution.
510 CSIRO publishing.
511 Bradler, S., Cliquennois, N. & Buckley, T.R. (2015). Single origin of the Mascarene stick
512 insects: ancient radiation on sunken islands? BMC Evolutionary Biology, 15(1), 196.
513 Breinholt, J.W., Earl, C., Lemmon, A.R., Lemmon, E.M., Xiao, L. & Kawahara, A.Y. (2018).
514 Resolving relationships among the megadiverse butterflies and moths with a novel pipeline
515 for anchored phylogenomics. Systematic Biology, 67(1), 78–93.
516 Chazot, N., Wahlberg, N., Freitas, A.V.L., Mitter, C., Labandeira, C., Sohn, J.C., Sahoo, R.K.,
517 Seraphim, N., de Jong, R. and Heikkilä, M. (2019) Priors and posteriors in Bayesian timing
518 of divergence analyses: the age of butterflies revisited. Systematic Biology, in press.
519 Chiba, H. & Eliot, J.N. (1991). A revision of the genus Parnara Moore (Lepidoptera,
520 Hesperiidae), with special reference to the Asian species. Lepidoptera Science, 42(3), 179–
521 194.
522 Couvreur, T.L., Pirie, M.D., Chatrou, L.W., Saunders, R.M., Su, Y.C., Richardson, J.E. &
523 Erkens, R.H. (2011). Early evolutionary history of the flowering plant family Annonaceae:
524 steady diversification and boreotropical geodispersal. Journal of Biogeography, 38(4),
525 664–680.
526 Datta-Roy, A. & Karanth, K.P. (2009). The Out-of-India hypothesis: What do molecules
527 suggest? Journal of Biosciences, 34(5), 687–697.
528 de Jong, R. & Coutsis, J.G. (2017). A re-appraisal of Gegenes Hübner, 1819 (Lepidoptera,
529 Hesperiidae) based on male and female genitalia, with the description of a new genus,
530 Afrogegenes. Tijdschrift voor Entomologie, 160(1), 41–60.
531 de Jong, R. & Treadaway, C.G. (1993). The Hesperiidae (Lepidoptera) of the Philippines.
532 Zoologische Verhandelingen, 288, 1–115.
533 de Jong, R. (2016). Reconstructing a 55-million-year-old butterfly (Lepidoptera: Hesperiidae).
534 European Journal of Entomology, 113, 423–428.
535 de Jong, R. (2017). Fossil butterflies, calibration points and the molecular clock (Lepidoptera:
536 Papilionoidea). Zootaxa, 4270(1), 1–63.
537 Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high
538 throughput. Nucleic Acids Research, 32(5), 1792–1797.
539 Edwards, E.J., Osborne, C.P., Strömberg, C.A., Smith, S.A. & C4 Grasses Consortium (2010).
540 The origins of C4 grasslands: integrating evolutionary and ecosystem science. Science,
541 328(5978), 587–591.
542 Espeland, M., Breinholt, J., Willmott, K.R., Warren, A.D., Vila, R., Toussaint, E.F.A., Maunsell,
543 S.C., Aduse-Poku, K., Talavera, G., Eastwood, R., Jarzyna, M.A., Guralnick, R., Lohman,
544 D.J., Pierce, N.E. & Kawahara, A.Y. (2018). A comprehensive and dated phylogenomic
545 analysis of butterflies. Current Biology, 28(5), 770–778.
546 Fan, X., Chiba, H., Huang, Z., Fei, W., Wang, M. & Sáfián, S. (2016). Clarification of the
547 phylogenetic framework of the tribe Baorini (Lepidoptera: Hesperiidae: Hesperiinae)
548 inferred from multiple gene sequences. PloS one, 11(7), e0156861.
549 Ferreira, M.A. & Suchard, M.A. (2008). Bayesian analysis of elapsed times in continuous‐time
550 Markov chains. Canadian Journal of Statistics, 36(3), 355–368.
551 Ghazanfar, S.A. & Fisher, M. (1998). Vegetation of the Arabian Peninsula. Kluwer Academic
552 Publishers, Dordrecht, 365 pp.
553 Griffin, D.L. (2002). Aridity and humidity: two aspects of the late Miocene climate of North
554 Africa and the Mediterranean. Palaeogeography, Palaeoclimatology,
555 Palaeoecology, 182(1–2), 65–91.
556 Guindon, S., Dufayard, J.F., Lefort, V., Anisimova, M., Hordijk, W. & Gascuel, O. (2010). New
557 algorithms and methods to estimate maximum-likelihood phylogenies: assessing the
558 performance of PhyML 3.0. Systematic Biology, 59(3), 307–321.
559 Hall, R. (2013). The palaeogeography of Sundaland and Wallacea since the Late
560 Jurassic. Journal of Limnology, 72(s2), 1.
561 Harzhauser, M., Kroh, A., Mandic, O., Piller, W.E., Göhlich, U., Reuter, M. & Berning, B.
562 (2007). Biogeographic responses to geodynamics: a key study all around the Oligo–
563 Miocene Tethyan Seaway. Zoologischer Anzeiger-A Journal of Comparative
564 Zoology, 246(4), 241–256.
565 Hoang, D.T., Chernomor, O., von Haeseler, A., Minh, B.Q. & Le, S.V. (2018). UFBoot2:
566 Improving the ultrafast bootstrap approximation. Molecular Biology and Evolution, 35(2),
567 518–522.
568 Hu, X., Garzanti, E., Wang, J., Huang, W., An, W. & Webb, A. (2016). The timing of India-Asia
569 collision onset–Facts, theories, controversies. Earth-Science Reviews, 160, 264–299.
570 Huang, H. (2011). Notes on the genera Caltoris Swinhoe, 1893 and Baoris Moore, [1881] from
571 China (Lepidoptera: Hesperiidae). Atalanta, 42(1–4), 201–220.
572 Jiang, W., Zhu, J., Song, C., Li, X., Yang, Y. & Yu, W. (2013). Molecular phylogeny of the
573 butterfly genus Polytremis (Hesperiidae, Hesperiinae, Baorini) in China. PloS one, 8(12),
574 e84098.
575 Kaliszewska, Z.A., Lohman, D.J., Sommer, K., Adelson, G., Rand, D.B., Mathew, J., Talavera,
576 G. & Pierce, N.E. (2015). When caterpillars attack: biogeography and life history evolution
577 of the Miletinae (Lepidoptera: Lycaenidae). Evolution, 69(3), 571–588.
578 Kalyaanamoorthy, S., Minh, B.Q., Wong, T.K., von Haeseler, A. & Jermiin, L.S. (2017).
579 ModelFinder: Fast model selection for accurate phylogenetic estimates. Nature Methods,
580 14(6), 587.
581 Katoh, K. & Standley, D.M. (2013). MAFFT multiple sequence alignment software version 7:
582 improvements in performance and usability. Molecular Biology and Evolution, 30(4), 772–
583 780.
584 Kawahara, A.Y., Breinholt, J.W., Espeland, M., Storer, C.G., Plotkin, D., Dexter, K.M.,
585 Toussaint, E.F.A., St Laurent, R.A., Brehm, G., Vargas, S., Forero, D., Pierce, N.E.,
586 Lohman, D.J. (2018) Phylogenetics of moth-like butterflies (Papilionoidea: Hedylidae)
587 based on a new 13-locus target capture probe set. Molecular Phylogenetics and Evolution,
588 127, 600–605.
589 Kergoat, G.J., Prowell, D.P., Le Ru, B.P., Mitchell, A., Dumas, P., Clamens, A.L., Condamine,
590 F.L. & Silvain, J.F. (2012). Disentangling dispersal, vicariance and adaptive radiation
591 patterns: a case study using armyworms in the pest genus Spodoptera (Lepidoptera:
592 Noctuidae). Molecular Phylogenetics and Evolution, 65(3), 855–870.
593 Kück, P. and Meusemann, K. (2010). FASconCAT: Convenient handling of data matrices.
594 Molecular Phylogenetics and Evolution, 56(3), 1115–1118.
595 Kück, P., Meusemann, K., Dambach, J., Thormann, B., von Reumont, B.M., Wägele, J.W. &
596 Misof, B. (2010). Parametric and non-parametric masking of randomness in sequence
597 alignments can be improved and leads to better resolved trees. Frontiers in Zoology, 7(1),
598 10.
599 Lanfear, R., Frandsen, P.B., Wright, A.M., Senfeld, T. & Calcott, B. (2016). PartitionFinder 2:
600 new methods for selecting partitioned models of evolution for molecular and
601 morphological phylogenetic analyses. Molecular Biology and Evolution, 34(3), 772–773.
602 Li, Y.P., Gao, K., Yuan, F., Wang, P. & Yuan, X.Q. (2017). Molecular systematics of the
603 butterfly tribe Baorini (Lepidoptera: Hesperiidae) from China. Journal of the Kansas
604 Entomological Society, 90(2), 100–108.
605 Matzke, N.J. (2018) BioGeoBEARS: BioGeography with Bayesian (and likelihood)
606 Evolutionary Analysis with R Scripts. version 1.1.1, published on GitHub on November
607 6, 2018.
608 Miller, K.G., Kominz, M.A., Browning, J.V., Wright, J.D., Mountain, G.S., Katz, M.E.,
609 Sugarman, P.J., Cramer, B.S., Christie-Blick, N. & Pekar, S.F. (2005). The Phanerozoic
610 record of global sea-level change. Science, 310, 1293–1298.
611 Nguyen, L.T., Schmidt, H.A., von Haeseler, A. & Minh, B.Q. (2014). IQ-TREE: a fast and
612 effective stochastic algorithm for estimating maximum-likelihood phylogenies. Molecular
613 Biology and Evolution, 32(1), 268–274.
614 Pirouz, M., Avouac, J.P., Hassanzadeh, J., Kirschvink, J.L. & Bahroudi, A. (2017). Early
615 Neogene foreland of the Zagros, implications for the initial closure of the Neo-Tethys and
616 kinematics of crustal shortening. Earth and Planetary Science Letters, 477, 168–182.
617 Price, M.N., Dehal, P.S. & Arkin, A.P. (2010). FastTree 2–approximately maximum-likelihood
618 trees for large alignments. PloS one, 5(3), e9490.
619 Ree, R.H. (2005). Detecting the historical signature of key innovations using stochastic models
620 of character evolution and cladogenesis. Evolution, 59(2), 257–265.
621 Ree, R.H. & Sanmartín, I. (2018). Conceptual and statistical problems with the DEC+ J model of
622 founder‐event speciation and its comparison with DEC via model selection. Journal of
623 Biogeography, 45(4), 741–749.
624 Ree, R.H. & Smith, S.A. (2008). Maximum likelihood inference of geographic range evolution
625 by dispersal, local extinction, and cladogenesis. Systematic Biology, 57(1), 4–14.
626 Rögl, F. (1998). Palaeogeographic considerations for Mediterranean and Paratethys seaways
627 (Oligocene to Miocene). Annalen des Naturhistorischen Museums in Wien. Serie A für
628 Mineralogie und Petrographie, Geologie und Paläontologie, Anthropologie und
629 Prähistorie, 279–310.
630 Sahoo, R.K., Lohman, D.J., Wahlberg, N., Müller, C.J., Brattström, O., Collins, S.C., Peggie, D.,
631 Aduse-Poku, K. and Kodandaramaiah, U. (2018) Evolution of Hypolimnas butterflies
632 (Nymphalidae): Out-of-Africa origin and Wolbachia-mediated introgression. Molecular
633 Phylogenetics and Evolution, 123, 50–58.
634 Sánchez‐Ramírez, S., Tulloss, R.E., Amalfi, M. & Moncalvo, J.M. (2015). Palaeotropical
635 origins, boreotropical distribution and increased rates of diversification in a clade of edible
636 ectomycorrhizal mushrooms (Amanita section Caesareae). Journal of Biogeography, 42(2),
637 351–363.
638 Seton, M., Müller, R.D., Zahirovic, S., Gaina, C., Torsvik, T., Shephard, G., Talsma, A., Gurnis,
639 M., Turner, M., Maus, S. & Chandler, M. (2012). Global continental and ocean basin
640 reconstructions since 200 Ma. Earth-Science Reviews, 113(3–4), 212–270.
641 Shapiro, B., Sibthorpe, D., Rambaut, A., Austin, J., Wragg, G.M., Bininda-Emonds, O.R., Lee,
642 P.L. & Cooper, A. (2002). Flight of the dodo. Science, 295(5560), 1683–1683.
643 Suchard, M.A., Lemey, P., Baele, G., Ayres, D.L., Drummond, A.J. & Rambaut, A. (2018)
644 Bayesian phylogenetic and phylodynamic data integration using BEAST 1.10. Virus
645 Evolution, 4(1), vey016.
646 Tamar, K., Metallinou, M., Wilms, T., Schmitz, A., Crochet, P.A., Geniez, P. & Carranza, S.
647 (2018). Evolutionary history of spiny‐tailed lizards (Agamidae: Uromastyx) from the
648 Saharo‐Arabian region. Zoologica Scripta, 47(2), 159–173.
649 Tang, J., Huang, Z., Chiba, H., Han, Y., Wang, M. & Fan, X. (2017). Systematics of the genus
650 Zinaida Evans, 1937 (Hesperiidae: Hesperiinae: Baorini). PloS one, 12(11), e0188883.
651 Toussaint, E.F.A., Condamine, F.L., Kergoat, G.J., Capdevielle-Dulac, C., Barbut, J., Silvain,
652 J.F. & Le Ru, B.P. (2012). Palaeoenvironmental shifts drove the adaptive radiation of a
653 noctuid stemborer tribe (Lepidoptera, Noctuidae, Apameini) in the Miocene. PloS one,
654 7(7), e41377.
655 Toussaint, E.F.A., Hall, R., Monaghan, M.T., Sagata, K., Ibalim, S., Shaverdo, H.V., Vogler,
656 A.P., Pons, J. & Balke, M. (2014). The towering orogeny of New Guinea as a trigger for
657 arthropod megadiversity. Nature Communications, 5, 4001.
658 Toussaint, E.F.A., Fikáček, M. & Short, A.E. (2016). India–Madagascar vicariance explains
659 cascade beetle biogeography. Biological Journal of the Linnean Society, 118(4), 982–991.
660 Toussaint, E.F.A. & Short, A. (2018a). Transoceanic Stepping–stones between Cretaceous
661 waterfalls? The Enigmatic Biogeography of pantropical Oocyclus cascade beetles.
662 Molecular Phylogenetics and Evolution, 127, 416–428.
663 Toussaint, E.F.A., Breinholt, J.W., Earl, C., Warren, A.D., Brower, A.V.Z., Yago, M., Dexter,
664 K.M., Espeland, M., Pierce, N.E., Lohman, D.J., Kawahara, A.Y. (2018b). Anchored
665 phylogenomics illuminates the skipper butterfly tree of life. BMC Evolutionary Biology,
666 18, 101.
667 Voris, H.K. (2000). Maps of Pleistocene sea levels in Southeast Asia: shorelines, river systems
668 and time durations. Journal of Biogeography, 27(5), 1153–1167.
669 Wahlberg, N. & Wheat, C.W. (2008). Genomic outposts serve the phylogenomic pioneers:
670 designing novel nuclear markers for genomic DNA extractions of Lepidoptera. Systematic
671 Biology, 57, 231–242.
672 Warren, A.D., Ogawa, J.R. & Brower, A.V. (2009). Revised classification of the family
673 Hesperiidae (Lepidoptera: Hesperioidea) based on combined molecular and morphological
674 data. Systematic Entomology, 34(3), 467–523.
675 Warren, B.H., Strasberg, D., Bruggemann, J.H., Prys‐Jones, R.P. & Thébaud, C. (2010). Why
676 does the biota of the Madagascar region have such a strong Asiatic flavour? Cladistics,
677 26(5), 526–538.
678 Wolfe, J.A. (1975). Some aspects of plant geography of the Northern Hemisphere during the late
679 Cretaceous and Tertiary. Annals of the Missouri Botanical Garden, 264–279.
680 Yumul Jr, G.P., Dimalanta, C.B., Maglambayan, V.B. & Marquez, E.J. (2008). Tectonic setting
681 of a composite terrane: a review of the Philippine island arc system. Geosciences Journal,
682 12(1), 7.
683 Zachos, J., Pagani, M., Sloan, L., Thomas, E. & Billups, K. (2001). Trends, rhythms, and
684 aberrations in global climate 65 Ma to present. Science, 292(5517), 686–693.
685 Zhang, Y.L., Xue, G.X. & Yuan, F. (2010). Descriptions of the female genitalia of three species
686 of Caltoris (Lepidoptera: Hesperiidae: Baorini) with a key to the species from China.
687 Proceedings of the Entomological Society of Washington, 112(4), 576–584.
688 Zhang, J., Santosh, M., Wang, X., Guo, L., Yang, X. & Zhang, B. (2012). Tectonics of the
689 northern Himalaya since the India–Asia collision. Gondwana Research, 21(4), 939–960.
690 Zhu, J.Q., Chiba, H. & Wu, L.W. (2016). Tsukiyamaia, a new genus of the tribe Baorini
691 (Lepidoptera, Hesperiidae, Hesperiinae). ZooKeys (555), 37.
692 TABLES
693 Table 1. Comparison of BEAST divergence time analyses.
Analysis  Relaxed clocks  Tree model   SS MLE        PS MLE        Baorini                 Clade I                 Clade II                Clade III               Clade IV                Clade V
A1        1               Yule         -127312.248   -127312.461   22.519 (18.277-26.284)  5.373 (3.378-8.031)     13.906 (11.661-16.214)  12.215 (10.048-14.222)  10.715 (8.625-12.701)   12.250 (10.189-14.071)
A2        1               Birth-death  -127319.429   -127320.4659  21.629 (17.145-25.808)  5.017 (3.137-7.547)     13.230 (10.716-15.774)  11.506 (9.246-13.682)   10.059 (7.924-12.198)   11.566 (9.366-13.621)
A3        2               Yule         -127146.005   -127146.829   22.839 (19.543-26.577)  5.868 (3.193-10.337)    15.105 (13.193-16.643)  13.139 (11.137-14.880)  11.323 (9.496-13.107)   13.282 (11.447-14.824)
A4        2               Birth-death  -127155.605   -127155.341   22.676 (18.794-26.654)  5.282 (2.872-9.538)     14.369 (11.995-18.373)  12.362 (10.014-14.395)  10.780 (8.695-12.674)   12.551 (10.456-14.435)
A5        5               Yule         -127256.942   -127256.656   22.258 (18.489-25.943)  5.246 (3.263-7.751)     13.736 (11.478-15.978)  12.189 (10.170-14.113)  10.400 (8.515-12.235)   12.130 (10.253-13.938)
A6        5               Birth-death  -127261.944   -127261.822   21.755 (17.937-25.543)  4.908 (3.057-7.324)     13.129 (10.827-15.351)  11.627 (9.604-13.619)   9.897 (8.004-11.774)    11.604 (9.624-13.460)
694
695 Notes: PS, path sampling; SS, stepping-stone sampling; MLE, marginal likelihood estimate from the path sampling and stepping-stone analyses
696 conducted in BEAST; the values in the “Clade” columns are median ages in millions of years estimated in BEAST along with 95% credibility
697 intervals.
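The marginal likelihood estimates in Table 1 can be compared with a short calculation. This is a minimal sketch, assuming the stepping-stone values are log marginal likelihoods and that the stray "-" marks in the extracted table are their negative signs; the function name `rank_by_marginal_likelihood` is hypothetical. It ranks the six analyses and reports twice the log Bayes factor of the best analysis over each alternative.

```python
# Stepping-stone marginal likelihood estimates (log scale) as read from
# Table 1; the negative signs are an assumption based on the "-" marks
# that were separated from the values during extraction.
SS_MLE = {
    "A1": -127312.248,
    "A2": -127319.429,
    "A3": -127146.005,
    "A4": -127155.605,
    "A5": -127256.942,
    "A6": -127261.944,
}


def rank_by_marginal_likelihood(mle):
    """Return the best analysis and 2*lnBF of the best over each analysis."""
    best = max(mle, key=mle.get)  # highest (least negative) log likelihood
    two_ln_bf = {
        label: round(2.0 * (mle[best] - value), 3)
        for label, value in mle.items()
    }
    return best, two_ln_bf


best, two_ln_bf = rank_by_marginal_likelihood(SS_MLE)

print("best analysis:", best)
for label, value in sorted(two_ln_bf.items(), key=lambda item: item[1]):
    print(label, value)
```

With these values the best-supported analysis is A3 (two relaxed clocks, Yule tree model), which is the configuration named in the Figure 3 legend. The same function can be applied to the path sampling column.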
698 FIGURE LEGENDS
699 Figure 1. Maximum likelihood tree of Baorini grass skippers based on anchored phylogenomics
700 Best scoring ML phylogenetic tree inferred in IQ-TREE based on the concatenated dataset of 13
701 loci comprised in the BUTTERFLY2.0 kit (Kawahara et al. 2018). The five main Baorini clades
702 are labelled (I-V). A historical drawing of Pelopidas mathias is presented from the book Indian
703 Insect Life by Harold Maxwell-Lefroy (1909).
704
705 Figure 2. Maximum likelihood tree of Baorini grass skippers based on an extended dataset
706 Best scoring ML phylogenetic tree inferred in IQ-TREE based on the concatenated dataset of 16
707 loci including ribosomal loci recovered from GenBank. The five main Baorini clades are labelled
708 (I-V). Green stars indicate species for which data was generated in this study while others for
709 which data was already available are not labelled. Photographs of Baorini species are given on
710 the right side of the figure. From top to bottom: Parnara ganga (credit: Antonio Giudici), Iton
711 semamora (credit: Antonio Giudici), Zenonia zeno (credit: Geoffrey Prosser), Zenonoida
712 discreta (credit: Antonio Giudici), Caltoris cahira (credit: Antonio Giudici), Baoris farri (Jee &
713 Rani Nature Photography), Pelopidas mathias (credit: Laitche), Borbo cinnara (credit: Hsu
714 Hong Lin), Gegenes pumilio (credit: Charles James Sharp).
715
716 Figure 3. Bayesian divergence time estimates and historical biogeography of Baorini skippers
717 Median age timetree from the BEAST analysis using two Bayesian relaxed clocks and a Yule
718 speciation tree model. 95% credibility intervals are shown as grey bars on all nodes. Nodal
719 support values are posterior probabilities (PP), and nodes with PP > 0.95 are labelled with an
720 asterisk. The biogeographic estimation of ancestral ranges from the DEC analysis performed in
721 BioGeoBEARS is presented with the most likely state at each node. Range expansion and
722 long-distance dispersal events are indicated following the inserted caption. The major geological
723 events that occurred in the past 40 million years and are relevant to the evolution of Baorini are
724 presented under the geological timescale. A historical drawing of Pelopidas mathias is presented
725 from the book Indian Insect Life by Harold Maxwell-Lefroy (1909).
[Figure 1: ML tree with outgroups (Hedylidae and non-Baorini Hesperiidae) and Baorini clades I to V. Branch support legend: SH-aLRT > 80 and UFBoot > 95; SH-aLRT > 80 or UFBoot > 95; SH-aLRT < 80 and UFBoot < 95.]
[Figure 2: ML tree of the extended dataset, Baorini clades I to V, scale bar 0.03. Legend: AHE data (BUTTERFLY2.0 kit); same branch support categories as Figure 1.]
[Figure 3: Timetree with Baorini clades I to V. Legend: range expansion; asterisk, BEAST PP ≥ 0.95; bars, BEAST 95% CI. Areas: Western Palearctic; Africa / Madagascar / Mascarenes; Arabian Peninsula; Oriental Region; Greater Sunda Islands / Philippines; Wallacea; Australia / New Guinea. Time axis: 30 to 0 Ma (Oligocene, Miocene, Pliocene, Pleistocene). Events shown: formation of Wallacea; Tethys Sea opened; formation of the Gomphotherium land bridge; C4 grassland emergence; intermittent Mascarenes/Seychelles subaerial island chain.]