Mitochondrial genomes resolve the phylogeny of Adephaga (Coleoptera) and confirm tiger beetles (Cicindelidae) as an independent family

Alejandro López-López1,2,3 and Alfried P. Vogler1,2

1: Department of Life Sciences, Natural History Museum, London SW7 5BD, UK
2: Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot SL5 7PY, UK
3: Departamento de Zoología y Antropología Física, Facultad de Veterinaria, Universidad de Murcia, Campus Mare Nostrum, 30100, Murcia, Spain

Corresponding author: Alejandro López-López ([email protected])

Abstract
The suborder Adephaga consists of several aquatic (‘Hydradephaga’) and terrestrial (‘Geadephaga’) families whose relationships remain poorly known. In particular, the position of Cicindelidae (tiger beetles) appears problematic, as recent studies have found them either within the Hydradephaga based on mitogenomes, or together with several unlikely relatives in Geadephaga based on 18S rRNA genes. We newly sequenced nine mitogenomes of representatives of Cicindelidae and three ground beetles (Carabidae), and conducted phylogenetic analyses together with 29 existing mitogenomes of Adephaga. Our results support a basal split of Geadephaga and Hydradephaga, and reveal Cicindelidae, together with Trachypachidae, as sister to all other Geadephaga, supporting their status as a family. We show that alternative arrangements of basal adephagan relationships coincide with increased rates of evolutionary change and with nucleotide compositional bias, but these confounding factors were overcome by the CAT-Poisson model of PhyloBayes. The mitogenome + 18S rRNA combined matrix supports the same topology only after removal of the hypervariable expansion segments. Dense taxon sampling of mitogenomes, analyzed with site-heterogeneous mixture models, produces well-supported trees of the Adephaga. Mitochondrial genomes are an increasingly valuable tool for resolving deep phylogenetic relationships, outperforming the previously most widely used ribosomal RNA markers.

Keywords
Coleoptera; Adephaga; Cicindelidae; phylogeny; mitochondrial genomes

Introduction
Mitochondrial genome sequences are now widely available, and with greater taxon sampling they provide an increasingly complete picture of deep relationships among and within the insect orders (Li et al., 2015; Simon and Hadrys, 2013; Song et al., 2016; Timmermans et al., 2016). However, the complexities of rate heterogeneity and compositional heterogeneity that are prevalent in mitochondrial sequence evolution confound phylogenetic inferences from these data (Pons et al., 2010; Song et al., 2010; Talavera and Vila, 2011). Several recent studies have shown that long-branch attraction is partly overcome through the use of Bayesian mixture models implemented in the PhyloBayes software, which allow multiple independent substitution processes with their own rate estimates and equilibrium frequencies, and whose optimal number is estimated from the data (Lartillot et al., 2007; Lartillot et al., 2009). The use of this software has resolved relationships that could not be estimated with other likelihood and Bayesian models (Song et al., 2016; Talavera and Vila, 2011). For example, in the Coleoptera (beetles), the mitochondrial sequence of Tetraphalerus, a member of the suborder Archostemata, was widely observed in a spurious position within the Polyphaga, but it was correctly placed outside of the other coleopteran suborders in trees generated with PhyloBayes (Timmermans et al., 2016; Timmermans et al., 2010).

Despite these advances in data generation and phylogenetic methodology, on various occasions mitochondrial genomes have failed to establish relationships correctly. In the current study we address the problem of basal relationships in the coleopteran suborder Adephaga, which were recently analyzed as part of a much larger study of coleopteran phylogeny based on some 250 mitogenomes (Timmermans et al., 2016).
That study was successful in resolving the relationships of the major superfamilies of Polyphaga (the suborder containing some 90% of all described beetle species), but relationships in Adephaga remained implausible and specifically failed to recover the widely established sister relationship of the terrestrial Geadephaga and aquatic Hydradephaga. This was a surprising finding given the evidence from 18S rRNA data (Maddison et al., 1999; Shull et al., 2001) and a combination of rRNA genes and single-copy nuclear markers (McKenna et al., 2015), but was in accordance with earlier morphological studies that consider the terrestrial groups, including the large family Carabidae (ground beetles), to be derived from within the aquatic lineages. These topologies would imply either a secondary loss of aquatic lifestyle leading to Geadephaga or the independent origins of various aquatic lineages, as proposed previously (Beutel, 1993; Kavanaugh, 1986). The alternative arrangement of a Geadephaga that is paraphyletic with respect to Hydradephaga has also been proposed, suggesting a single origin of aquatic lifestyle from a terrestrial ancestor near the base of the Carabidae. This scenario is supported by some characters in Trachypachidae, a lineage presumed to represent a basal split within the Geadephaga that, although entirely terrestrial, exhibits some characteristics that have been interpreted to show similarities to those of the extant aquatic lineages (Bell, 1966; Crowson, 1960; Hammond, 1979).

The question about the relationships among terrestrial and aquatic groups is further complicated by the poor understanding of the basal branches in Geadephaga, in particular with regard to the placement of the small families Cicindelidae, Rhysodidae and Paussidae, which are morphologically divergent from carabids and have been variously considered as a sublineage within Carabidae or as closely related lineages outside of the Carabidae + Trachypachidae clade (Bell, 1994; Beutel, 1992). The uncertainty about their phylogenetic position has also affected the taxonomic status of these lineages, in particular of the Cicindelidae (tiger beetles), which in the literature have been considered as a separate family or as a subfamily of Carabidae, without reaching a consensus (Bils, 1976; Crowson, 1981; Regenfuss, 1975). They are usually awarded family status due to their highly divergent morphology and adaptations of the larvae, which are sit-and-wait predators hunting from deep burrows in the ground (Cassola, 2001; Pearson and Vogler, 2001). It is still disputed whether these characteristics are derived from carabids, or if they point to an independent origin of the cicindelids leading to their peculiar lifestyle.

Mitochondrial genomes have to date produced a confusing picture of cicindelid relationships with other taxa, as the only available sequence (Habrodera) was placed as sister taxon to the representative of an aquatic family (Timmermans et al., 2016), i.e. in a highly improbable position distant from the Carabidae. Equally, the relationships of the aquatic families were highly unexpected, because the burrowing Noteridae were the sister to all other Adephaga, although they are usually considered related to the Dytiscoidea (Dytiscidae, Amphizoidae, Hygrobiidae and Aspidytidae), a monophyletic lineage that includes the great majority of all hydradephagan species. The remaining aquatic lineages, including the algal-feeding Haliplidae and surface-hunting Gyrinidae, were arranged in a comb-like fashion near the base of the Adephaga (Timmermans et al., 2016). These findings are clearly in contradiction to the topologies from nuclear genes (Maddison et al., 1999; McKenna et al., 2015; Shull et al., 2001), and mitochondrial loci may drive trees in combined analyses with nuclear genes (Bocak et al., 2014). Yet, the rRNA loci, which remain the most widely available nuclear markers, have themselves resulted in unexpected relationships, in particular in regard to the position of Cicindelidae in a derived lineage of carabids, together with Rhysodinae, Paussinae and Scaritini, to form the so-called CRPS quartet (Maddison et al., 1998, 1999). This locus thus groups some of the morphologically and ecologically most derived geadephagan lineages and places them near the species-rich and evolutionarily recent Harpalinae, which was unexpected from morphological analyses, but detailed data exploration could not attribute the CRPS quartet to long-branch attraction (Maddison et al., 1999).

The current study attempts to resolve the question about basal relationships in Adephaga and to reconcile the differences in topology obtained with mitochondrial and nuclear markers. We address specifically the key discrepancies between the results obtained with the two markers regarding the Geadephaga - Hydradephaga split, which is supported only by rRNA (Maddison et al., 1999; Shull et al., 2001) and single-copy nuclear genes (McKenna et al., 2015), and the position of Cicindelidae, which is placed in a derived position within Carabidae based on rRNA and outside of Carabidae by mitogenomes. Using eight additional mitogenomes for a more complete sample of cicindelids, supplemented with additional carabids including the CRPS quartet, and compiling newly available mitogenomes from other adephagan taxa, we present phylogenetic trees that clarify the basal adephagan relationships, and by combining these analyses with rRNA sequences we resolve the causes of topological incongruence between the two markers.

Material and Methods
DNA extractions from eight tiger beetle species (Omus cazieri, Platychile pallida, Pseudotetracha mendacia, Australicapitona hopei, Odontocheila sp., [...] subtiligrossum, Manticora tibialis and [...]), comprising a wide range of cicindelid phylogenetic diversity, and three other Adephaga, Metrius contractus (Paussinae), Broscus cephalotes (Broscini) and Scarites buparius (Scaritini), were retrieved from the frozen tissue collections of the Natural History Museum (London). The DNA concentration was determined for each sample using the Qubit dsDNA high-sensitivity kit (Invitrogen), and samples were pooled in equal concentrations to maximize assembly success. This pool was sequenced together with approx. 100 species on a single flow cell of the Illumina MiSeq (2x300 bp), following procedures described previously (Crampton-Platt et al., 2015). A total of 8529168 reads were obtained.

We extracted and assembled the mitogenomes following a modified version of the pipeline by Crampton-Platt et al. (2015), using FastQC (Andrews, 2010) to check the quality of the data and removing the adapters and indexes with Trimmomatic (Bolger et al., 2014). The reads were assembled into contigs with three assemblers: Celera Assembler (Myers et al., 2000), IDBA-UD (Peng et al., 2012) and Newbler (Margulies et al., 2005), which produced 18298, 51978 and 141176 contigs, respectively. These contigs were merged into supercontigs using Geneious (Kearse et al., 2012), and only those >1000 bp that showed similarity to mitogenomes of Coleoptera in Blast searches were retained. Among the resulting mitogenome assemblies we obtained 23 sequences containing the cox1 and cob regions, of which eleven could be assigned to particular species based on a set of existing bait sequences. We thus obtained eight species of Cicindelidae and three species of Carabidae to represent the Paussinae (Metrius), Scaritini (Scarites), and Broscini (Broscus).

Each of the identified contigs was manually checked for circularity and annotated using the MITOS web server (Bernt et al., 2013). The annotations were manually edited to correct possible mistakes. The complete sequences of the 11 protein-coding genes (atp6, cytb, cox1, cox2, cox3, nad1, nad2, nad3, nad4, nad5 and nad6) and two ribosomal RNA genes (rrnL and rrnS) were extracted from each mitochondrial genome. Two fragments of the nuclear 18S rRNA (18Sa and 18Sb) were obtained for each species from GenBank, corresponding to positions 32 to 720 and 777 to 1938 of Drosophila melanogaster.
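
As an illustration of the length filter described above (only supercontigs >1000 bp were retained before the Blast screen), the following Python sketch parses FASTA-formatted contigs and keeps those above the threshold. It is a minimal sketch and not the pipeline used in this study: the function names `parse_fasta` and `filter_by_length` are hypothetical, and the similarity search against coleopteran mitogenomes is not reproduced here.

```python
from typing import Dict, Iterable, Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name = None
    chunks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name = fields[0] if fields else ""
            chunks = []
        elif name is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(records: Iterable[Tuple[str, str]],
                     min_len: int = 1000) -> Dict[str, str]:
    """Keep only records strictly longer than min_len bases (hypothetical helper)."""
    return {name: seq for name, seq in records if len(seq) > min_len}
```

A call such as `filter_by_length(parse_fasta(open("supercontigs.fasta").read()))` (the file name is an assumption) would return the retained contigs keyed by identifier. The comparison is strict, so a contig of exactly 1000 bp is discarded, matching the ">1000 bp" wording.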
Where 18S sequences were not available for a particular species, related taxa were used to complete the data matrix, thereby creating chimerical terminals composed of closely related species. The sequences of each fragment were aligned to those of 30 other Adephaga whose mitochondrial genomes had been previously sequenced (Linard et al., pers. communication) (Supplementary Table 1), using the Muscle algorithm (Edgar, 2004) in Geneious and manually correcting some errors. Additional sequences from other suborders of Coleoptera were included as outgroups (Supplementary Table 1).

The sequences for the 15 gene fragments were concatenated to generate a matrix of 13948 bp (Supplementary Table 2). This matrix was submitted to PhyloBayes 4.1 (Lartillot et al., 2009), running four independent chains for 30000 generations and using a different partition for each fragment. Additional analyses were carried out on modified versions of the dataset: a) a data set including only the mitochondrial fragments; b) the same data set but also excluding Rhysodes, Meru and Noterus; c) a mitochondrial dataset excluding Rhysodes, Meru, Noterus and the Megacephalini; d) a data set in which the missing 18S sequences were filled with sequences from GenBank; e) the same data set but excluding Rhysodes, Meru and Noterus; f) a data set from which the hypervariable regions of the 18S fragments were removed; g) the same dataset excluding Rhysodes, Meru and Noterus; and h) the dataset without the hypervariable 18S regions including only one cicindelid (Cicindela campestris). For each analysis, the convergence of the chains was checked and a consensus tree was built. In some cases where one of the chains was stuck in a local maximum (largest discrepancy across all bipartitions > 0.3), that chain was isolated and removed before building the consensus tree. Tip-to-root distances were obtained with Dendroscope 3 (Huson and Scornavacca, 2012).

The original matrix was additionally analyzed using BEAST (Drummond and Rambaut, 2007) via the user interface BEAUti. Each gene was used as a separate partition and the best nucleotide substitution model for each was calculated in jMODELTEST, considering different parameters for each codon position. A strict clock was set for each partition, using normal distributions according to the rates calculated by Pons et al. (2010) for the protein-coding fragments and estimating the rate for the rRNA fragments (rrnL, rrnS, 18Sa and 18Sb) together with the tree. The analysis ran for 100 million generations, sampling a tree every 5000 steps. These trees were summarized with the TREEANNOTATOR software (distributed with BEAST), removing the first 2000 trees (10% of the roughly 20000 sampled) as burn-in.

The proportion of amino acids encoded by GC-rich codons (GARP) and the GC content were calculated in MEGA (Kumar et al., 2016), and the mean Ka (non-synonymous substitution rate) between each taxon and all other taxa was determined with DNASP (Librado and Rozas, 2009). The AliGROOVE software was used to assess bias in nucleotide composition and evolutionary rates; the program uses comparisons between pairs of species to assess heterogeneities in sequence divergence based on a model of evolution derived from the entire matrix, to identify divergent or misaligned sequences that may confound phylogenetic inference (Kuck et al., 2014).

Results

The complete mitochondrial genome of Manticora tibialis was recovered in all three assemblies. The mitogenomes for the other ten sequenced species were nearly complete. The gene order followed the presumed ancestral mitogenome gene order in insects (Cameron, 2014). However, the mitogenome of Omus cazieri showed two major rearrangements and constitutes only the second discovery of a gene rearrangement in Coleoptera involving a protein-coding gene (Figure 1). These rearrangements were observed independently in the output of the three assemblers used.

The AliGROOVE analysis found strongly divergent patterns in the two species of Megacephalini (Cicindelidae), Rhysodes (Rhysodinae) and Meru (Meruidae) (Figure 2). Generally, members of Cicindelidae differed from other species, including other cicindelids, whereas divergences within Carabidae and Hydradephaga more closely resembled the expected distributions. Tip-to-root distances were strikingly greater in Noterus, Meru and Rhysodes, which was equally evident in non-synonymous changes (Figure 3). The representatives of Cicindelidae generally showed slightly elevated rates, while most other species were slightly below the mean (Figure 3). Finally, the proportion of amino acids coded by GC-rich codons (Glycine, Alanine, Arginine and Proline, ‘GARP’) was elevated in Rhysodes and in Cicindelidae, in particular in Megacephalini, but lower in most other groups, with lowest scores for Meru, Noterus and Trechus (Carabidae) (Figure 3).

Tree topologies obtained from the combined data set of mitogenomes and the full set of 18S sequences using the CAT-Poisson model in PhyloBayes recovered a paraphyletic Hydradephaga, in agreement with other mitogenome studies (Timmermans et al., 2016), as well as the CRPS quartet of the 18S study of Maddison et al. (1999), although the latter also included the large carabid subfamily Harpalinae (Fig. 4). The recovery of the CRPS quartet thus placed the Cicindelidae as a derived member of Carabidae. The tree also revealed the extremely long terminal branches leading to Meru and Noterus, which separate near the base of the tree, and to Rhysodes (a member of the CRPS group) (Fig. 4). The search using the CAT-GTR model was overall very similar but produced a basal polytomy of Geadephaga, Dytiscoidea and other Hydradephaga, i.e. it no longer supported the paraphyly of Hydradephaga, although it lacked support for its monophyly and sister relationship to Geadephaga.


Further analyses based on the CAT-Poisson model sought to establish the causes of the conflicting analyses of mitogenomes and 18S data, by first removing the 18S data altogether. The resulting tree recovered the Geadephaga as paraphyletic with respect to Hydradephaga, with Noterus (Noteridae) occupying the basal branch, similar to the mitogenome tree of Coleoptera (Timmermans et al., 2016). Removal of the three long-branch taxa produced the reciprocal monophyly of Geadephaga and Hydradephaga, while Cicindelidae together with Trachypachidae split at the basal node of Geadephaga (Sup. Fig. 1). Finally, after either removing the 18S sequences of the chimerical terminals or retaining all 18S sequences but removing the hypervariable regions (Fig. 5), we found a) the reciprocal monophyly of Geadephaga and Hydradephaga; b) the split of Hydradephaga into Dytiscoidea (=Aspidytidae, Hygrobiidae, Dytiscidae) and all others (Haliplidae, (Gyrinidae, (Noteridae, Meruidae))); and c) the sister relationship of Cicindelidae + Trachypachidae to all Carabidae, including Rhysodinae, Paussinae and Scaritini, which however no longer showed close affinities, i.e. the CRPS quartet was no longer recovered.

The final analysis concerned the long-branch taxa (Rhysodes, Noterus, Meru), whose removal from the dataset lacking the 18S rRNA hypervariable regions had no discernible impact on the basal relationships. A further concern could be the over-representation of Cicindelidae, but removal of all representatives except for C. campestris produced a topology consistent with the larger dataset, although the basal nodes were not resolved, resulting in a polytomy of Carabidae + Cicindelidae + Trachypachidae. Unlike the PhyloBayes analysis, the topologies obtained with BEAST (Supplementary Fig. 3) showed the long-branch taxa Meru, Noterus and Rhysodes outside of all other ingroup taxa, and Cicindelidae as sister to an implausible clade consisting of Gyrinidae + (Carabidae + Hydradephaga), while Trachypachidae was sister to the carabid subfamily Carabinae. Removing the long-branch taxa resulted in the same improbable relationships. In summary, under the PhyloBayes analysis the mitogenome data recovered the basal separation of Geadephaga - Hydradephaga, placed the Cicindelidae (together with Trachypachidae) as sister to Carabidae, and removed the CRPS quartet, while 18S data concurred on the basal Geadephaga - Hydradephaga separation, but created the spurious position of Cicindelidae that was mainly driven by hypervariable, alignment-sensitive expansion segments (Sup. Fig. 2).

Discussion

This study was motivated by the inconsistent phylogenetic trees obtained from mitochondrial and 18S rRNA genes, and the wider question about the utility of mitochondrial genomes for beetle phylogenetics. Whole mitochondrial genomes can now be generated at low cost for dense taxon sampling (Cameron, 2014; Gillett et al., 2014), on par with the much more widely available rRNA genes. Greater taxon density may alleviate the effects of high rates of molecular evolution and variation in nucleotide composition that frequently compromise the utility of mitochondrial sequences. In this combined analysis of mitogenomes and 18S rRNA of the Adephaga we find that three highly divergent mitogenomes affect the recovery of the Hydradephaga - Geadephaga monophyly, while 18S rRNA is responsible for the erroneous recovery of the CRPS quartet. The preferred topology of Fig. 5 can be arrived at either by removing the three divergent mitogenomes (Rhysodes, Noterus, Meru), or by removing the expansion regions (or by removing several heterologous 18S rRNAs that are from close relatives of those with available mitogenomes). After removal of the expansion segments, the 18S does make an important contribution, as it compensates for the negative effect of the three divergent genomes in a joint analysis. Finally, mitogenomes only produced the preferred topology when analyzed with the PhyloBayes software, which is known to overcome problems of long-branch attraction (Lartillot et al., 2007; Talavera and Vila, 2011).

We attempted to dissect various biases of mitogenome evolution by establishing the effect of variation in evolutionary rates and compositional bias. The two factors were not correlated.
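
The compositional measures referred to in this section can be made concrete with a short Python sketch. It assumes that GARP denotes the residues with one-letter codes G, A, R and P (glycine, alanine, arginine and proline), whose codons are GC-rich; the function names are hypothetical, and the values reported in this study were obtained with MEGA, not with this code.

```python
# Residues assumed to make up the GARP class (one-letter codes G, A, R, P).
GARP_RESIDUES = frozenset("GARP")


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def garp_proportion(protein: str) -> float:
    """Fraction of GARP residues in a protein sequence.

    Gaps, stop symbols and the unknown residue X are ignored.
    """
    residues = [aa for aa in protein.upper() if aa.isalpha() and aa != "X"]
    if not residues:
        return 0.0
    return sum(aa in GARP_RESIDUES for aa in residues) / len(residues)
```

Applied per taxon to the concatenated protein-coding genes and their translations, these two functions would give values comparable in kind to the GC and GARP proportions discussed here. Ambiguous bases and alignment gaps are excluded from the denominator, which is one possible design choice among several.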
Using root-to-tip distances, greatly increased rates were found for Meru and Noterus, whose sister relationships are well established in morphological and molecular studies (Balke et al., 2008; Beutel et al., 2006; Dressler et al., 2011), while Rhysodes represents an independent lineage with greatly increased rates. The long branches are explained by a high number of non-synonymous substitutions (see Ka rate; Fig. 3). The Cicindelidae also show above-average branch lengths compared to other Adephaga. This seems to be correlated with their higher proportion of GC and the associated use of GC-rich codons (GARP), and is not linked to higher Ka; synonymous changes are thus mainly responsible for the increased apparent molecular rates in Cicindelidae. The AliGROOVE analysis also highlights the non-conformant composition of the long-branched and GC-rich lineages, in particular the two most GC-rich cicindelids (Pseudotetracha and Australicapitona) representing the subtribe Megacephalini, which are even more divergent in this analysis than the long-branch taxa. Interestingly, the GC content increases from the basal to some derived nodes, with highest proportions in the Cicindelina + Megacephalina clade. High GC content separates the Cicindelidae from all other taxa, followed closely by Rhysodes, while Meru + Noterus have the lowest GC content. The differences in GC content among these three long-branch taxa may reduce their susceptibility to being attracted to each other (while causing the paraphyly of Hydradephaga when Meru + Noterus are attracted to the outgroups).

The 18S locus has consistently produced an unexpected placement of Cicindelidae within Carabidae, together with other groups that have been difficult to place in part due to their unique ecology as ant associates (many Paussinae), feeding on slime molds in rotten wood (Rhysodinae) and burrowing underground (Scaritini). The CRPS quartet (which in the current study is extended to include the vast subfamily Harpalinae) was robust to various modelling approaches that simulate sequence evolution based on the actual rates and composition of the overall data set (Maddison et al., 1999). The CRPS quartet is also present in trees from 28S rRNA and from combined analysis of 18S + 28S rRNA with the nuclear wingless marker (Maddison et al., 2009), as well as the combined analysis of six single-copy nuclear markers and 18S rRNA (McKenna et al., 2015). In contrast, wingless alone did not recover the CRPS quartet, and instead finds Cicindelidae as sister taxon to Carabidae (that also includes Trachypachidae) (Maddison et al., 2009), which suggests that the rRNA markers drive this relationship. However, the CRPS quartet was also obtained in a separate analysis of McKenna et al.’s (2015) single-copy nuclear markers without the 18S rRNA marker (their Suppl. Fig. 5), which argues against the hypothesis that the CRPS quartet is an artifact produced by rRNA. We could not reproduce this result in a reanalysis of the data made available with their manuscript, while we readily obtained this grouping when adding the 18S rRNA partition to these data (not shown).
Thus, the rRNA signal is confounded, in particular by the expansion segments, which in the Carabidae and Cicindelidae are among the longest of any Coleoptera and which accumulate ‘simple sequences’ partly composed of short repeat motifs whose AT content is correlated with sequence length (Vogler et al., 1997). These sequences thus show a tendency to converge on similar sequence composition, leading to strong long-branch attraction. Likelihood models assume independence of sites and the absence of insertion and deletion events, and thus do not capture the full complexity of 18S rDNA evolution (Maddison et al., 1999).

Cicindelids are separated from carabids by several well-defined differences in morphological, ecological and behavioural features linked to the peculiar predatory lifestyle of adult and larval tiger beetles (Erwin, 1984). Previous studies have struggled to attribute these differences either to a long separate evolutionary history (Cassola, 2001) or to a highly derived state originating from within a carabid ancestral stock. The mitogenome and single-copy nuclear markers all support Cicindelidae (with or without the Trachypachidae) as sister to Carabidae, supporting the scenario of an early divergence. The position of Trachypachidae is important due to the apparently plesiomorphic traits shared with Hydradephaga (sometimes formalised as Glabricornia; Deuve, 1993), including similarities in female genitalia, abdominal muscles and coxal morphology that might have been present in the ancestral Adephaga (Bils, 1976; Deuve, 1993; Nichols, 1985) but were lost in Cicindelidae and Carabidae. Equally, the Cicindelidae possess presumed plesiomorphic adephagan characters, such as the production of [...] in the pygidial gland secretion (Ball, 1979).

The remaining members of the CRPS quartet (Rhysodinae, Paussinae and Scaritinae = RPS) were recovered as a monophyletic group even with the mitogenomes alone, supporting their position near the carabid subfamily Harpalinae. Basal relationships inside of Carabidae suggest the separation of branches leading to Carabinae first, then to Nebriinae and Elaphrinae, then [...], and finally the Harpalinae + RPS group and its related groups [...] and [...], which presents a useful framework for future work on Carabidae using mitogenomes. The fairly close relationship of the RPS group has been recognised in the older literature, e.g. suggesting that the rhysodines with their subcortical lifestyle are derived from a scaritine-like ancestor burrowing in the soil (Bell, 1994). However, this literature generally supported a grouping of the rhysodine and paussine/metriine lineages outside of Carabidae (e.g. Regenfuss, 1975).
The mitogenomes (and all other loci) support the notion that rhysodids, paussines and scaritines are derived carabids, although their common origin within the Harpalinae is not entirely conclusive and requires further taxon sampling, including a wider range of basal carabid groups and a wider range of Harpalinae (see Ober, 2002). The close association of these taxa is plausible based on existing literature, but their disparate and distinctive appearance is difficult to explain (Maddison et al., 1999). Consolidating their phylogenetic placement will be a first step towards understanding the apomorphic traits of these divergent geadephagan lineages, which can now be studied separately for the RPS group with their obscure and partially non-predatory lifestyles and for the agile, day-active hunting cicindelids.

Acknowledgements

We thank Alex Crampton-Platt for her valuable help with the assemblies and Peter Foster for his comments on the phylogenetic analyses. This work was funded by the NHM Biodiversity Initiative.

Bibliography
Andrews, S., 2010. FastQC: A quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc.
Balke, M., Ribera, I., Beutel, R., Viloria, A., Garcia, M., Vogler, A.P., 2008. Systematic placement of the recently discovered beetle family Meruidae (Coleoptera: Dytiscoidea) based on molecular data. Zool. Scr. 37, 647-650.
Bell, R.T., 1966. Trachypachus and the origin of Hydradephaga. Coleopts Bull. 20, 107-112.
Bell, R.T., 1994. Beetles that cannot bite: Functional morphology of the head of adult rhysodines (Coleoptera: Carabidae or Rhysodidae). Canadian Entomologist 126, 667-672.
Bernt, M., Donath, A., Juhling, F., Externbrink, F., Florentz, C., Fritzsch, G., Putz, J., Middendorf, M., Stadler, P.F., 2013. MITOS: Improved de novo metazoan mitochondrial genome annotation. Mol. Phylogenet. Evol. 69, 313-319.
Beutel, R.G., 1992. Phylogenetic analysis of thoracic structures of Carabidae (Coleoptera, Adephaga). Zeitschrift fur Zoologische Systematik und Evolutionsforschung 30, 53-74.
Beutel, R.G., 1993. Phylogenetic analysis of Adephaga (Coleoptera) based on characters of the larval head. Syst. Entomol. 18, 127-147.
Beutel, R.G., Balke, M., Steiner, W.E., 2006. The systematic position of Meruidae (Coleoptera, Adephaga) and the phylogeny of the smaller aquatic adephagan beetle families. Cladistics 22, 102-131.
Bils, W., 1976. Das Abdomenende weiblicher, terrestrisch lebender Adephaga (Coleoptera) und seine Bedeutung für die Phylogenie. Zoomorphologie 84, 113-193.
Bocak, L., Barton, C., Crampton-Platt, A., Chesters, D., Ahrens, D., Vogler, A.P., 2014. Building the Coleoptera tree-of-life for >8000 species: composition of public DNA data and fit with Linnaean classification. Syst. Entomol. 39, 97-110.
Bolger, A.M., Lohse, M., Usadel, B., 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114-2120.
Cameron, S.L., 2014. Insect mitochondrial genomics: Implications for evolution and phylogeny. Annu. Rev. Entomol. 59, 95-117.
Cassola, F., 2001. Studies on tiger beetles. CXXIII. Preliminary approach to the macrosystematics of the tiger beetles (Coleoptera: Cicindelidae). Russian Entomological Journal 10, 265-272.

12 411 Crampton-Platt, A., Timmermans, M., Gimmel, M.L., Kutty, S.N., Cockerill, T.D., Khen, C.V., 412 Vogler, A.P., 2015. Soup to tree: The phylogeny of beetles inferred by Mitochondrial 413 Metagenomics of a Bornean rainforest sample. Mol. Biol. Evol. 32, 2302-2316. 414 Crowson, R.A., 1960. The phylogeny of Coleoptera. Ann. Rev. Entomol. 5, 111-134. 415 Crowson, R.A., 1981. The biology of Coleoptera. Academic Press, London. 416 Deuve, T., 1993. L' et les genitalia des femelles de Coléoptères Adephaga. Mém. 417 Mus. natn. Hist. nat. 155, 1-184. 418 Dressler, C., Ge, S.Q., Beutel, R.G., 2011. Is Meru a specialized noterid (Coleoptera, 419 Adephaga)? Syst. Entomol. 36, 705-712. 420 Drummond, A.J., Rambaut, A., 2007. BEAST: Bayesian evolutionary analysis by sampling 421 trees. BMC Evol. Biol. 7, Art. 214. 422 Edgar, R.C., 2004. MUSCLE: multiple sequence alignment with high accuracy and high 423 throughput. Nucleic Acids Research 32. 424 Gillett, C., Crampton-Platt, A., Timmermans, M., Jordal, B.H., Emerson, B.C., Vogler, A.P., 425 2014. Bulk De Novo Mitogenome Assembly from Pooled Total DNA Elucidates the 426 Phylogeny of (Coleoptera: Curculionoidea). Mol. Biol. Evol. 31, 2223-2237. 427 Hammond, P.M., 1979. Wing-folding mechanisms of beetles with special reference to 428 investigations of Adephagan phylogeny. In: Erwin, T.L., Ball, G.E., Whitehead, D.R., 429 Halpern, A. (Eds.), Carabid beetles; their evolution, natural history, and classification. W. 430 Junk, The Hague, pp. 113-180. 431 Huson, D.H., Scornavacca, C., 2012. Dendroscope 3: An interactive tool for rooted 432 phylogenetic trees and networks. Syst. Biol. 61, 1061-1067. 433 Kavanaugh, D., 1986. A systematic review of Amphizoid beetles (Amphizoidae: Coleoptera) 434 and their phylogenetic relationships to other Adephaga. Proceedings of the California 435 Academy of Sciences 44, 67-109. 
436 Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., 437 Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P., Drummond, A., 438 2012. Geneious Basic: An integrated and extendable desktop software platform for the 439 organization and analysis of sequence data. Bioinformatics 28, 1647-1649. 440 Kuck, P., Meid, S.A., Gross, C., Wagele, J.W., Misof, B., 2014. AliGROOVE - visualization of 441 heterogeneous sequence divergence within multiple sequence alignments and detection 442 of inflated branch support. BMC Bioinformatics 15. 443 Kumar, S., Stecher, G., Tamura, K., 2016. MEGA7: Molecular Evolutionary Genetics 444 Analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33, 1870-1874. 445 Lartillot, N., Brinkmann, H., Philippe, H., 2007. Suppression of long-branch attraction 446 artefacts in the phylogeny using a site-heterogeneous model. BMC Evol. Biol. 7 447 (Suppl 1), S4.

13 448 Lartillot, N., Lepage, T., Blanquart, S., 2009. PhyloBayes 3: a Bayesian software package for 449 phylogenetic reconstruction and molecular dating. Bioinformatics 25, 2286-2288. 450 Li, H., Shao, R., Song, N., Song, F., Jiang, P., Li, Z., Cai, W., 2015. Higher-level phylogeny 451 of paraneopteran insects inferred from mitochondrial genome sequences. Scientific 452 reports 5. 453 Librado, P., Rozas, J., 2009. DnaSP v5: a software for comprehensive analysis of DNA 454 polymorphism data. Bioinformatics 25, 1451-1452. 455 Maddison, D.R., Baker, M.D., Ober, K.A., 1998. A preliminary phylogenetic analysis of 18S 456 ribosomal DNA of carabid beetles (Insecta: Coleoptera). In: Ball, G.E., Casale, A., Vigna- 457 Taglianti, A. (Eds.), Phylogeny and classification of Caraboidea (Coleoptera: Adephaga). 458 Museo Regionale di Scienze Naturali, Atti, Torino, , pp. 229-250. 459 Maddison, D.R., Baker, M.D., Ober, K.A., 1999. Phylogeny of carabid beetles as inferred 460 from 18S ribosomal DNA (Coleoptera: Carabidae). Syst. Entomol. 24, 103-138. 461 Maddison, D.R., Moore, W., Baker, M.D., Ellis, T.M., Ober, K.A., Cannone, J.J., Gutell, R.R., 462 2009. Monophyly of terrestrial adephagan beetles as indicated by three nuclear genes 463 (Coleoptera: Carabidae and Trachypachidae). Zool. Scr. 38, 43-62. 
464 Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., 465 Braverman, M.S., Chen, Y.J., Chen, Z.T., Dewell, S.B., Du, L., Fierro, J.M., Gomes, X.V., 466 Godwin, B.C., He, W., Helgesen, S., Ho, C.H., Irzyk, G.P., Jando, S.C., Alenquer, M.L.I., 467 Jarvie, T.P., Jirage, K.B., Kim, J.B., Knight, J.R., Lanza, J.R., Leamon, J.H., Lefkowitz, 468 S.M., Lei, M., Li, J., Lohman, K.L., Lu, H., Makhijani, V.B., McDade, K.E., McKenna, M.P., 469 Myers, E.W., Nickerson, E., Nobile, J.R., Plant, R., Puc, B.P., Ronan, M.T., Roth, G.T., 470 Sarkis, G.J., Simons, J.F., Simpson, J.W., Srinivasan, M., Tartaro, K.R., Tomasz, A., 471 Vogt, K.A., Volkmer, G.A., Wang, S.H., Wang, Y., Weiner, M.P., Yu, P.G., Begley, R.F., 472 Rothberg, J.M., 2005. Genome sequencing in microfabricated high-density picolitre 473 reactors. Nature 437, 376-380. 474 McKenna, D.D., Wild, A.L., Kanda, K., Bellamy, C.L., Beutel, R.G., Caterino, M.S., Farnum, 475 C.W., Hawks, D.C., Ivie, M.A., Jameson, M.L., Leschen, R.A.B., Marvaldi, A.E., McHugh, 476 J.V., Newton, A.F., Robertson, J.A., Thayer, M.K., Whiting, M.F., Lawrence, J.F., 477 Slipinski, A., Maddison, D.R., Farrell, B.D., 2015. The beetle tree of life reveals that 478 Coleoptera survived end- mass extinction to diversify during the 479 terrestrial revolution. Syst. Entomol. 40, 835-880. 480 Myers, E.W., Sutton, G.G., Delcher, A.L., Dew, I.M., Fasulo, D.P., Flanigan, M.J., Kravitz, 481 S.A., Mobarry, C.M., Reinert, K.H.J., Remington, K.A., Anson, E.L., Bolanos, R.A., Chou, 482 H.H., Jordan, C.M., Halpern, A.L., Lonardi, S., Beasley, E.M., Brandon, R.C., Chen, L., 483 Dunn, P.J., Lai, Z.W., Liang, Y., Nusskern, D.R., Zhan, M., Zhang, Q., Zheng, X.Q.,

14 484 Rubin, G.M., Adams, M.D., Venter, J.C., 2000. A whole-genome assembly of Drosophila. 485 Science 287, 2196-2204. 486 Nichols, S.W., 1985. Omophron and the origin of Hydradephaga (Insecta: Coleoptera: 487 Adephaga). Proceedings of the Academy of Natural Sciences of Philadelphia 137, 182- 488 201. 489 Ober, K.A., 2002. Phylogenetic relationships of the carabid subfamily Harpalinae 490 (Coleoptera) based on molecular sequence data. Mol. Phylogenet. Evol. 24, 228-248. 491 Pearson, D.L., Vogler, A.P., 2001. Tiger beetles: the evolution, ecology and diversity of the 492 cicindelids. Cornell University Press, Ithaca, NY. 493 Peng, Y., Leung, H.C.M., Yiu, S.M., Chin, F.Y.L., 2012. IDBA-UD: a de novo assembler for 494 single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 495 28, 1420-1428. 496 Pons, J., Ribera, I., Bertranpetit, J., Balke, M., 2010. Nucleotide substitution rates for the full 497 set of mitochondrial protein-coding genes in Coleoptera. Mol. Phylogenet. Evol. 56, 796- 498 807. 499 Regenfuss, H., 1975. Die Antennen-Putzeinrichtung der Adephaga (Coleoptera), parallele 500 evolutive Vervollkommnung einer komplexen Struktur. Zeitschrift fur zoologische 501 Systematik und Evolutionsforschung 13, 278-299. 502 Shull, V.L., Vogler, A.P., Baker, M.D., Maddison, D.R., Hammond, P.M., 2001. Sequence 503 alignment of 18S ribosomal RNA and the basal relationships of adephagan beetles: 504 evidence for monophyly of aquatic families and the placement of Trachypachidae. Syst. 505 Biol. 50, 945-969. 506 Simon, S., Hadrys, H., 2013. A comparative analysis of complete mitochondrial genomes 507 among . Mol. Phylogenet. Evol. 69, 393-403. 508 Song, F., Li, H., Jiang, P., Zhou, X., Liu, J., Sun, C., Vogler, A.P., Cai, W., 2016. Capturing 509 the phylogeny of Holometabola with mitochondrial genome data and Bayesian site- 510 heterogeneous mixture models. Genome Biology and Evolution, evw086. 
511 Song, H.J., Sheffield, N.C., Cameron, S.L., Miller, K.B., Whiting, M.F., 2010. When 512 phylogenetic assumptions are violated: base compositional heterogeneity and among-site 513 rate variation in beetle mitochondrial phylogenomics. Syst. Entomol. 35, 429-448. 514 Talavera, G., Vila, R., 2011. What is the phylogenetic signal limit from mitogenomes? The 515 reconciliation between mitochondrial and nuclear data in the Insecta class phylogeny. 516 BMC Evol. Biol. 11, 15. 517 Timmermans, M.J., Barton, C., Haran, J., Ahrens, D., Culverwell, C.L., Ollikainen, A., 518 Dodsworth, S., Foster, P.G., Bocak, L., Vogler, A.P., 2016. Family-level sampling of 519 mitochondrial genomes in Coleoptera: compositional heterogeneity and phylogenetics. 520 Genome Biology and Evolution 8, 161-175.

15 521 Timmermans, M.J.T.N., Dodsworth, S., Culverwell, C.L., Bocak, L., Ahrens, D., Littlewood, 522 D.T.J., Pons, J., Vogler, A.P., 2010. Why barcode? High-throughput multiplex sequencing 523 of mitochondrial genomes for molecular systematics. Nucleic Acids Research 38. 524 Vogler, A.P., Welsh, A., Hancock, J.M., 1997. Phylogenetic analysis of slippage-like 525 sequence variation in the V4 rRNA expansion segment in tiger beetles (Cicindelidae). 526 Mol. Biol. Evol. 14, 6-19. 527 528 529 530 531 532 FIGURES 533 534 Figure 1. Schematic diagram of the two rearrangements in the mitochondrial genome of 535 Omus cazieri. The arrows indicate the changes of mitochondrial coding genes relative to the 536 ancestral gene order. 537 538 Figure 2. AliGROOVE analysis for the mitochondrial fragments considering only the 539 third codon positions. The mean similarity score between sequences is represented by a 540 colored square, based on AliGROOVE scores from -1, indicating great difference in sequence 541 composition from the remainder of the data set (red coloring), to +1, indicating similarity to 542 all other comparisons (blue coloring). Relevant taxonomic groups have been separated by 543 thicker lines and different colors (orange: Carabidae, green: Cicindelidae, purple: 544 Trachypachidae, blue: Hydradephaga). The taxa corresponding to long branches are marked 545 with an asterisk. 546 547 Figure 3. Heterogeneity in molecular rates and nucleotide composition. The bar charts 548 display for each taxon, from top to bottom: a) the root-to-tip distance obtained from the tree 549 topology from mitogenomes and full-length 18S data for non-chimerical taxa (Supplementary

550 Figure 2); b) the mean Ka (non synonymous substitution rate) values in pairwise comparisons 551 between the focal taxon and all other taxa; c) the proportion of GC-rich amino acid codons

552 (GARP); and d) the CG content (%). The Ka values, GARP proportion and GC content were 553 calculated from the protein coding genes only. All values are given as differences to the mean 554 value of each parameter. The absolute values are given in Supplementary Table 3. The taxa 555 with long branches are marked with a star. 556 557 Figure 4. Tree topology based on mitogenomes and full-length 18S sequences. The figure 558 shows the posterior consensus tree obtained from the four Phylobayes chains using the CAT- 559 Poisson model. Node labels represent posterior probability. The length of the branches is 560 proportional to the genetic distance. The long branches of Rhysodes, Meru and Noterus 561 (dashed lines) are not to scale; the inset shows the tree with unmodified branch lengths. 562 Major taxonomic groups are shown by vertical bars. 563

Figure 5. Tree topology after removing the 18S hypervariable regions. All annotations are as in Fig. 4.

Figure 6. Simplified representation of the effects of changes in the dataset (left) on the topology (right): a) original dataset; b) only the mitochondrial fragments; c) only the mitochondrial fragments excluding the taxa with long branches (Rhysodes, Meru and Noterus); d) including 18S sequences from GenBank; e) removing taxa with long branches (Rhysodes, Meru and Noterus); and f) removing the hypervariable regions of the 18S fragment. The mitochondrial fragments are colored in purple, whereas the 18S fragments are red (our data) or dark red (from GenBank). CRPS quartet: Cicindelidae + Rhysodinae + Paussinae + Scaritinae.

SUPPLEMENTARY FIGURES

Supplementary Figure 1. Tree topology based on the mitochondrial fragments after removing three long-branch taxa. The figure shows the posterior consensus tree obtained from the four PhyloBayes chains using the CAT-Poisson model. Node labels represent posterior probability. The length of the branches is proportional to the genetic distance. The long branches of Rhysodes, Meru and Noterus (dashed lines) are not to scale: a version of the tree with unmodified branch lengths is provided in the lower right corner. Taxonomic assignments for each taxon are shown at the right.

Supplementary Figure 2. Tree topology from mitogenomes and full-length 18S data for non-chimerical taxa only. See legend for Suppl. Fig. 1 for further details.

Supplementary Figure 3. Tree topology from mitogenomes and full-length 18S using BEAST. The figure shows the posterior tree summarized from the BEAST analysis. Node labels represent posterior probability. The length of the branches is proportional to time. Taxonomic assignments for each taxon are shown at the right.

SUPPLEMENTARY TABLES

Supplementary Table 1: Lengths of the aligned fragments used in this work.

Supplementary Table 2: List of taxa analyzed in this work. The origin of the mitogenomes, from Linard et al. (pers. com.) or sequenced de novo for this study, is shown. The presence/absence of each marker in the alignment is detailed. The accession number for the 18S fragments that were retrieved from GenBank is provided. The outgroups are marked with an asterisk (*).

Supplementary Table 3: Distances from the root to the tip nodes, mean Ka (non-synonymous substitution rate) values between each taxon and the rest of the taxa (middle), and the proportion of GC-rich amino acid codons corresponding to each taxon.


● Mitochondrial genomes successfully resolve deep phylogenetic relationships
● Cicindelidae are an independent lineage, separated from Carabidae
● Geadephaga and Hydradephaga are reciprocally monophyletic groups
● Long branch lineages and 18S expansion segments distort the phylogenies


Supplementary Table 1: Lengths of the aligned fragments used in this work.

| Fragment | Length (bp) |
| --- | --- |
| 18Sa | 795 (original), 807 (including GenBank sequences), 690 (without hypervariable region) |
| 18Sb | 1340 (original), 1453 (including GenBank sequences), 843 (without hypervariable region) |
| atp6 | 678 |
| cob | 1122 |
| cox1 | 1392 |
| cox2 | 669 |
| cox3 | 774 |
| nad1 | 948 |
| nad2 | 945 |
| nad3 | 351 |
| nad4 | 1329 |
| nad5 | 1703 |
| nad6 | 545 |
| rrnL | 569 |
| rrnS | 788 |

Supplementary Table 2: List of taxa analyzed in this work. The origin of the mitogenomes, from Linard et al. (pers. com.) or sequenced de novo for this study, is shown. The presence/absence of each marker in the alignment is detailed. The accession number for the 18S fragments that were retrieved from GenBank is provided. The outgroups are marked with an asterisk (*).

| Species | Taxonomic group | Mitogenome origin | 18Sa | 18Sb | atp6 | cob | cox1 | cox2 | cox3 | nad1 | nad2 | nad3 | nad4 | nad5 | nad6 | rrnL | rrnS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Priacma serrata (*) | Archostemata | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | no | yes | yes | yes | yes | yes | no |
| Tetraphalerus bruchi (*) | Archostemata | Linard et al. | no | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Aspidytes niobe | Aspidytidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Broscus cephalotes | Carabidae: Broscinae | This work | AF012499 | AF012499 | yes | yes | yes | yes | yes | yes | no | yes | yes | yes | yes | yes | yes |
| Calosoma sp | Carabidae: Carabinae | Linard et al. | AF002800 | AF002800 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Carabus nemoralis | Carabidae: Carabinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | no | yes | yes | yes | yes | yes | yes |
| Cychrus caraboides | Carabidae: Carabinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Hexagonia terminalis | Carabidae: Ctenodactylinae | Linard et al. | no | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | no |
| Elaphrus cupreus | Carabidae: Elaphrinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | no |
| Lebia chlorocephala | Carabidae: Lebiinae | Linard et al. | no | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | no |
| Microlestes minutulus | Carabidae: Lebiinae | Linard et al. | JN619072 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | no |
| Loricera pilicornis | Carabidae: Loricerinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | no | yes | yes | yes | yes | yes | yes | no |
| Nebria brevicollis | Carabidae: Nebriinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Metrius contractus | Carabidae: Paussinae | This work | KP419168 | KP419168 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Abax parallelepipedus | Carabidae: Pterostichinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Amara communis | Carabidae: Pterostichinae | Linard et al. | AF002774 | AF002774 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Pterostichus niger | Carabidae: Pterostichinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Stomis pumicatus | Carabidae: Pterostichinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Scarites buparius | Carabidae: Scaritinae | This work | AF002795 | AF002795 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Bembidion laetum | Carabidae: Trechinae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Pogonus iridipennis | Carabidae: Trechinae | Linard et al. | GU556144 | GU556144 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | no |
| Trechus obtusus | Carabidae: Trechinae | Linard et al. | yes | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Habrodera capensis | Cicindelidae: Cicindelini | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Cicindela campestris | Cicindelidae: Cicindelini | This work | KP419048 | KP419048 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | no | no |
| Odontocheila sp | Cicindelidae: Cicindelini | This work | AF423041 | AF423041 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | no |
| Pogonostoma subtiligrossum | Cicindelidae: Collyridini | This work | AF432048 | AF432048 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Manticora tibialis | Cicindelidae: Manticorini | This work | AF423056 | AF423056 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Australicapitona hopei | Cicindelidae: Megacephalini | This work | AF423054 | AF423054 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Pseudotetracha mendacia | Cicindelidae: Megacephalini | This work | DQ152097 | no | yes | yes | yes | yes | yes | yes | no | yes | yes | yes | yes | no | no |
| Omus cazieri | Cicindelidae: Megacephalini | This work | AF012519 | AF012519 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Platychile pallida | Cicindelidae: Megacephalini | This work | AF423059 | AF423059 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Acilius sulcatus | Dytiscidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Colymbetes fuscus | Dytiscidae | Linard et al. | yes | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Hydroporus obscurus | Dytiscidae | Linard et al. | yes | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Hygrotus inaequalis | Dytiscidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Liopterus haemorrhoidalis | Dytiscidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Macrogyrus oblongus | Gyrinidae | Linard et al. | KP419155 | KP419155 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Haliplus lineatocollis | Haliplidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Hygrobia hermanni | Hygrobiidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Meru phyllisae | Meruidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | no | yes | yes | yes | yes | yes | yes |
| Hydroscapha granulum (*) | Myxophaga | Linard et al. | no | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Incoltorrida madagassica (*) | Myxophaga | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Noterus clavicornis | Noteridae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Habrocerus capillaricornis (*) | Polyphaga | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Ips sexdentatus (*) | Polyphaga | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Lucanus cervus (*) | Polyphaga | Linard et al. | no | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Psyllioides hispnus (*) | Polyphaga | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Rhinoncus pericarpius (*) | Polyphaga | Linard et al. | no | no | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Tenebrio molitor (*) | Polyphaga | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Rhysodes sp | Rhysodidae | Linard et al. | AF012521 | AF012521 | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Trachypachus holmbergi | Trachypachidae | Linard et al. | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes | yes |

Supplementary Table 3: Distances from the root to the tip nodes, mean Ka (non-synonymous substitution rate) values between each taxon and the rest of the taxa (middle), and the proportion of GC-rich amino acid codons corresponding to each taxon.

| Taxon | Distance to root | Mean Ka | GARP proportion | GC proportion |
| --- | --- | --- | --- | --- |
| Stomis pumicatus | 0.374 | 0.105674637 | 12.9924812 | 21.1 |
| Pterostichus niger | 0.404 | 0.112425707 | 13.17383404 | 22.2 |
| Amara communis | 0.370 | 0.104584294 | 12.92252468 | 20.8 |
| Microlestes minutulus | 0.406 | 0.104936227 | 13.42925659 | 22.1 |
| Lebia chlorocephala | 0.449 | 0.108496636 | 13.78483668 | 23.5 |
| Abax parallelepipedus | 0.416 | 0.114135197 | 13.62683438 | 24.1 |
| Hexagonia terminalis | 0.580 | 0.1088684 | 12.5 | 21.7 |
| Trechus obtusus | 0.509 | 0.119658838 | 13.4319348 | 21.6 |
| Rhysodes sp | 2.125 | 0.219799267 | 14.624 | 27.9 |
| Pogonus iridipennis | 0.346 | 0.104528023 | 13.38535414 | 21.3 |
| Bembidion laetum | 0.464 | 0.114287023 | 13.48753379 | 23 |
| Loricera pilicornis | 0.566 | 0.139801465 | 13.74455246 | 23.1 |
|  | 0.536 | 0.126745688 | 13.73844122 | 23.2 |
|  | 0.432 | 0.118503189 | 13.56438602 | 23.3 |
| Metrius contractus | 0.575 | 0.13471563 | 13.73840168 | 25.7 |
| Elaphrus cupreus | 0.360 | 0.10551118 | 13.54198014 | 22.8 |
| Nebria brevicollis | 0.397 | 0.110566679 | 13.68038741 | 23.1 |
|  | 0.364 | 0.108305434 | 14.26218709 | 24.6 |
| Calosoma sp | 0.367 | 0.103084605 | 13.39125225 | 23.9 |
| Cychrus caraboides | 0.325 | 0.10340415 | 13.64044272 | 22.3 |
| Habrodera capensis | 0.775 | 0.12682243 | 14.45856019 | 26.6 |
| Cicindela campestris | 0.819 | 0.126610879 | 14.38356164 | 28.2 |
| Odontocheila sp | 0.843 | 0.129545096 | 14.64721643 | 28.1 |
| Pseudotetracha mendacia | 0.945 | 0.139619031 | 16.04330709 | 31.5 |
| Australicapitona hopei | 0.972 | 0.133207375 | 15.14971835 | 32.6 |
| Pogonostoma subtiligrossum | 0.785 | 0.124128036 | 13.99041342 | 25 |
| Omus cazieri | 0.640 | 0.130474232 | 13.82915173 | 24.2 |
| Platychile pallida | 0.630 | 0.126468088 | 13.96664681 | 25.3 |
| Manticora tibialis | 0.849 | 0.150206452 | 14.43483448 | 27.8 |
| Trachypachus holmbergi | 0.312 | 0.106642522 | 13.39526521 | 22 |
| Liopterus haemorrhoidalis | 0.532 | 0.135383162 | 14.14020371 | 23.5 |
| Colymbetes fuscus | 0.421 | 0.114880953 | 14.24731183 | 25.4 |
| Acilius sulcatus | 0.428 | 0.120266356 | 13.84062313 | 23.4 |
| Hygrotus inaequalis | 0.730 | 0.140217494 | 14.33681073 | 27.2 |
| Hydroporus obscurus | 0.521 | 0.125510851 | 13.30137807 | 21.9 |
| Hygrobia hermanni | 0.517 | 0.138120251 | 13.6486082 | 24.5 |
| Aspidytes niobe | 0.366 | 0.114674587 | 13.31334333 | 22.1 |
| Noterus clavicornis | 1.608 | 0.204885556 | 12.03216826 | 21.1 |
| Meru phyllisae | 1.240 | 0.191085104 | 12.51803752 | 23.8 |
| Macrogyrus oblongus | 0.661 | 0.141685585 | 14.12259615 | 25.3 |
| Haliplus lineatocollis | 0.418 | 0.119516285 | 13.09559485 | 21.4 |
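The legend of Figure 3 states that GC content and the GARP proportion were calculated from the protein-coding genes and plotted as differences from the mean of each parameter. The following Python code is a minimal sketch of those three calculations, not the authors' actual pipeline. The function names are hypothetical, and the GARP definition used here (codons in the GGN, GCN, CGN and CCN families, following the invertebrate mitochondrial code in which AGA/AGG do not encode Arg) is an assumption about how the proportion may have been obtained.

```python
from statistics import mean

# Assumption: GARP codons are the four-fold families Gly (GGN), Ala (GCN),
# Arg (CGN) and Pro (CCN), identified by their first two bases.
GARP_PREFIXES = {"GG", "GC", "CG", "CC"}


def gc_content(seq):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return 100.0 * sum(base in "GC" for base in counted) / len(counted)


def garp_proportion(cds):
    """Percentage of complete, unambiguous codons in the G, A, R, P families."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    codons = [codon for codon in codons if set(codon) <= set("ACGT")]
    if not codons:
        return 0.0
    return 100.0 * sum(codon[:2] in GARP_PREFIXES for codon in codons) / len(codons)


def deviations_from_mean(values):
    """Return each taxon's value minus the mean across all taxa supplied."""
    centre = mean(values.values())
    return {taxon: value - centre for taxon, value in values.items()}


# Illustration only: GC proportions of the three long-branch taxa taken from
# Supplementary Table 3. The figure centres on the mean of all taxa, not on
# this three-taxon subset.
gc_values = {"Rhysodes sp": 27.9, "Noterus clavicornis": 21.1, "Meru phyllisae": 23.8}
for taxon, delta in deviations_from_mean(gc_values).items():
    print(f"{taxon}: {delta:+.2f}")
```

Centering each parameter on its mean puts distances, rates and compositional measures on a comparable footing, so taxa that deviate in several parameters at once stand out.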