<<

Varney et al. BMC Ecol Evo (2021) 21:6 BMC Ecology and Evolution https://doi.org/10.1186/s12862-020-01728-y

RESEARCH ARTICLE Open Access Assessment of mitochondrial genomes for heterobranch gastropod phylogenetics Rebecca M. Varney1, Bastian Brenzinger2, Manuel António E. Malaquias3, Christopher P. Meyer4, Michael Schrödl2,5 and Kevin M. Kocot1,6*

Abstract Background: is a diverse of marine, freshwater, and terrestrial gastropod molluscs. It includes such disparate taxa as nudibranchs, sea hares, bubble , pulmonate land snails and , and a number of (mostly small-bodied) poorly known snails and slugs collectively referred to as the “lower heterobranchs”. Evolutionary relationships within Heterobranchia have been challenging to resolve and the group has been subject to frequent and signifcant taxonomic revision. Mitochondrial (mt) genomes can be a useful molecular marker for phylogenetics but, to date, sequences have been available for only a relatively small subset of Heterobranchia. Results: To assess the utility of mitochondrial genomes for resolving evolutionary relationships within this clade, eleven new mt genomes were sequenced including representatives of several groups of “lower heterobranchs”. Maximum likelihood analyses of concatenated matrices of the thirteen protein coding genes found weak support for most higher-level relationships even after several taxa with extremely high rates of evolution were excluded. Bayes- ian inference with the CAT GTR model resulted in a reconstruction that is much more consistent with the current understanding of heterobranch+ phylogeny. Notably, this analysis recovered and Orbitestelloidea in a polytomy with a clade including all other heterobranchs, highlighting these taxa as important to understanding early heterobranch evolution. Also, dramatic gene rearrangements were detected within and between multiple . However, a single gene order is conserved across the majority of heterobranch clades. Conclusions: Analysis of mitochondrial genomes in a Bayesian framework with the site heterogeneous CAT GTR model resulted in a topology largely consistent with the current understanding of heterobranch phylogeny. However,+ mitochondrial genomes appear to be too variable to serve as good phylogenetic markers for robustly resolving a number of deeper splits within this clade. Keywords: Heterobranchia, , Mitochondrial genome, Mitogenomic

Background taxa, and can have gene order rearrangements that are Mitochondrial genomes are popular molecular mark- potentially phylogenetically informative. Phylogenetic ers for phylogenetics: they are relatively easy to analyses of mitochondrial genomes have clarifed rela- sequence, assemble, and annotate, typically have a mod- tionships within diverse groups of invertebrates such as erate level of sequence conservation that facilitates phy- crustaceans, echinoderms, sponges, hemichordates, and logenetic comparisons among relatively distantly related annelids, just to name a few [1–5]. However, the appli- cation of mitochondrial genomes to phylogenetics can be limited by diferences in evolutionary rates (which *Correspondence: [email protected] 1 Department of Biological Sciences, The University of Alabama, Campus can lead to long-branch attraction (LBA) artifacts and Box 870344, Tuscaloosa, AL 35487, USA incomplete lineage sorting [6]. Mitochondrial genome- Full list of author information is available at the end of the article based studies of molluscan evolutionary relationships

© The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativeco​ mmons​ .org/licen​ ses/by/4.0/​ . The Creative Commons Public Domain Dedication waiver (http://creativeco​ ​ mmons.org/publi​ cdoma​ in/zero/1.0/​ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Varney et al. BMC Ecol Evo (2021) 21:6 Page 2 of 14

have been variable in success. Molluscan mitochondrial lower heterobranchs, and the evolution of heterobranch genomes exhibit wide variation in size, organization, and mitochondrial genome organization. rate of evolution [7–9]. Mitochondrial genomes have substantially aided in clarifcation of relationships within Results clades such as Caudofoveata [10], Cephalopoda [11, 12], Genome assemblies and data matrix and some gastropod clades (e.g., [13–15]), but have had We sequenced genomic DNA libraries from eleven spe- limited success at resolving relationships within other cies of heterobranch gastropods and extracted mito- molluscan clades (e.g., [16]) and resolving deep mollus- chondrial sequences (Table 1, Additional fle 1: Table S1). can phylogeny [17, 18]. Despite high sequencing coverage, a single contiguous Heterobranchia is a -rich clade of gastropod mitochondrial genome was recovered for only two of the molluscs that encompasses a wide diversity of snails and eleven taxa. All of the newly sequenced mt genomes were slugs that occupy marine, freshwater, and terrestrial habi- incomplete to some degree, possibly due to difculties tats [19, 20]. Heterobranchs are thought to have diverged in sequencing through secondary structures associated from other gastropods approximately 380 million years with the 16S rDNA (which was absent from or incom- ago (mya; [21, 22]). Almost every clade within Hetero- plete in several of our assemblies) and the control region, branchia has been subject to signifcant and continued but most of the mitochondrial protein-coding genes taxonomic revision. Te name Heterobranchia was were obtained for all species. Alignments of amino acid coined by Burmeister (1837), but it is most commonly sequences were produced for the thirteen protein-coding attributed to Gray (1840) who used it to unite Opistho- genes obtained from the newly sequenced taxa and pub- branchia (e.g., sea slugs) and (e.g., land snails). licly available heterobranch mt genomes on NCBI (Addi- Tis group was later renamed to refect the tional fle 1: Table S1). After trimming each alignment to secondarily detorted arrangement of the cerebrovisceral remove ambiguously aligned positions, the concatenated commisures [23], but Heterobranchia was redefned data matrix totaled 4735 amino acid sites with 31.3% gaps to include Euthyneura and a grouping of taxa that are across 104 taxa (17 outgroups and 87 heterobranchs). generally referred to as the “lower Heterobranchia” or Allogastropoda [24, 25] including, at times, Pyramidel- Maximum likelihood analyses loidea, , Valvatoidea, Orbitestelloidea, A partitioned maximum likelihood (ML) analysis of , , , Tjaernoei- this data matrix using the best-ftting model for each idae, , Rhodopemorpha, and gene (Additional fle 2: Figure S1; see additional data [21, 22, 24–30]. has since been demon- on FigShare for more information) resulted in a tree strated to be a non-monophyletic group as sea clades with cristata (Valvatoidea) recovered as the sis- such as and Acochlidia share a more recent ter taxon to a clade containing all other heterobranchs common ancestor with the pulmonates than other sea with successive branching of Microdiscula slugs as do some “lower” heterobranchs like Pyramidel- (Orbitestelloidea), a clade composed of loidea and Glacidorbidae [8, 13, 22, 31, 32], reviewed by (Gymnosomata), Psilaxis radiatus (Architectonicoidea), [19]. atomus (), and mor- Phylogenetic analyses to date have been unable to rocayensis (Rissoelloidae), and then Rhopalocaulis gran- robustly resolve most relationships among major hetero- didieri (), which was the sister taxon of branch clades. However, most of these studies have been all remaining Heterobranchia. All members of the clade limited by taxon sampling. In particular, many of the containing C. limacina, P. radiatus, O. atomus, and R. “lower heterobranchs” were missing from most investiga- morrocayensis exhibited extremely long branches rela- tions of heterobranch phylogeny to date. Tese snails and tive to the other heterobranchs and it is well-established slugs are thought to represent a critical group to under- that Gymnosomata is nested within . standing heterobranch evolution as it has been debated Tus, we strongly suspected that this clade was the result whether they form a clade that is sister to all remaining of long-branch attraction. Tis, combined with very low heterobranchs or a “basal” paraphyletic grade. Here, we backbone support values, led us to re-run the analy- sequenced mitochondrial genomes from 11 heterobranch sis with the following unstable and long-branched taxa taxa, including several so-called lower heterobranchs and removed: C. limacina, P. radiatus, O. atomus, and R. select other understudied clades. Tese new data were morrocayensis (see Additional fle 10: Table S3). analyzed in combination with publicly available hetero- Te matrix with unstable and long-branched taxa branch and outgroup mitochondrial genomes to inves- removed totaled 4447 amino acid sites with 28.7% gaps tigate the utility of mitochondrial genomes for resolving across 99 taxa. In the resulting partitioned ML analy- higher-level heterobranch phylogeny, placement of the sis using the best-ftting model for each gene (Fig. 1), Varney et al. BMC Ecol Evo (2021) 21:6 Page 3 of 14

Table 1 Mitochondrial genomes sequenced in the present study and associated sources of samples Taxon mt contig Protein tRNAs (22) rRNAs (2) Complete Missing genes Sample accession Provenance length coding (bp) genes (13)

Acochlidium fjiense 14,098 13 20 2 Nearly trnS2, trnK ZSM Mol- Fiji, Viti Levu, Lami 20130988 River, collected by M Schrödl & E Schwabe, August 2006 14,599 12 17 1 Nearly trnl, atp8, trnS1, ZSM Mol- Antarctica, ANDEEP trnD, rrnL, trnY, 20081086 SYSTCO Expedition trnR on R/V Polarstern, ANTXXIV/2 cruise station PS 71/038- 04, collected by E Schwabe & H Flores, January 2008 Ilbia ilbi 13,944 12 21 2 Nearly atp8 MRG10019-03 Shoreham Beach 18 March 2017, collected by A Falconer Microdiscula 12,965 13 21 2 Yes MRG828-06 Dutton Way, Port- charopa land 2 March 2017, approx 2 m deep, collected by A Falconer Omalogyra atomus 12,413 10 20 1 No atp8, rrnS, trnY, ZSM Mol- France, Pyrénées- nad4l, trnV 20142011 Orientales, Banyuls-sur-Mer, from red in upper intertidal, collected by B Brenzinger & TP Neusser, July 2014 Psilaxis radiatus 12,154 12 15 1 No atp8, trnN, trnS1, ZMBN 94175 Museum of Zoology tncC, trnF, trnY, at the University of rrnL, trnQ, trnV Bergen conformis 14,017 12 22 2 Nearly nad4l None Malta, of Ġnejna Bay, 31 May 2017 Rissoella morrocay- 11,085 12 5 1 No trnS1, rrnL, trnI, ZMBN 99933 Museum of Zoology ensis trnN, trnC, trnF, at the University of trnY, trnQ, Bergen Runcina ornata 13,862 13 22 2 Yes ZMBN 87949 Museum of Zoology at the University of Bergen cf. cor- 14,614 13 21 1 Yes trnK, rrnL USNM 1442311 French Polynesia, ticalis Moorea, NW side Cook’s Bay, collected by C McKeon, G Paulay, J O’Donnell and C Meyer, 12 June 2006 Valvata cf. cristata 14,495 13 21 1 Nearly trnR, rrnL ZSM Mol- Germany, Munich, 20170210 pond on ZSM grounds, collected by B Brenzinger, March 2017

Yes: genome circularizes via overlapping ends, missing genes likely missed by annotator Nearly: all genes listed on a single scafold, but scafold does not contain a full mitochondrion No: mitochondrion either exists on two discontinuous scafolds or is missing multiple protein coding genes Varney et al. BMC Ecol Evo (2021) 21:6 Page 4 of 14

(See fgure on next page.) Fig. 1 Maximum likelihood phylogeny of heterobranch gastropods based on the reduced set of taxa following removal of both unstable leaves fagged by RogueNaRok and the four longest-branched taxa. Taxa for which new sequences were collected are shown in bold. The data set was trimmed with TrimAL with default settings, partitioned by gene in RAxML, and the PROTGAMMAAUTO was used to select the best-ftting model for each partition. Bootstrap support values are presented at each node

Valvata was again recovered as the sister taxon to a and, while in previous analyses R. morrocayensis had a clade composed of all other heterobranchs (bootstrap much longer branch than the other taxa in this clade, in support, bs = 100) followed by the successive branching the edge-unlinked tree, all four taxa had similarly elon- of Microdiscula (bs = 62) and Rhopalocaulis (bs = 100). gated branches (Additional fle 6: Figure S5). In this edge- Most major clades of Heterobranchia (Euthyneura unlinked analysis, the positions of and the sensu lato) were recovered with high support: Acteo- pair of ellobiods were recovered with relationships simi- noidea, , , Runcinida, Aplysi- lar to those of the initial partitioned ML tree (Additional ida, Siphonariida, Sacoglossa, and fle 2: Figure S1). We also analyzed the set of all taxa were all recovered with 100% bootstrap support, and except C. limacina to assess whether removal of this sin- with 99% bootstrap support. Of gle rapidly-evolving taxon would ‘release’ the other long- the family- and order-level taxa, Ellobioidea is the only branched taxa, which are traditionally considered to be one that was recovered as non-monophyletic, with lower heterobranchs, from this putative LBA artifact. Te Pedipes pedipes and myosotis falling well other three long-branched taxa remained in the same outside of the clade containing the rest of Ellobioidea, location with long branches (Additional fle 7: Figure S6). albeit with very low support. Support for relationships among most higher-level heterobranch clades was gen- Bayesian inference erally weak and a number of higher-level groupings Because of poor support for most nodes of interest in within Heterobranchia including Tectipleura, Euo- our maximum likelihood analyses, we also performed a pisthobranchia, , , Systel- Bayesian inference with the CAT + GTR + G4 model on lommatophora, and Amphipulmonata (sensu [20]) the same datasets, but only the analysis of the dataset were not monophyletic. However, Nudipleura (Nudi- with unstable and long-branched taxa removed reached branchia + Pleurobranchomorpha) and a clade com- convergence. Of the six chains that were run for over posed of Aplysiida + were recovered 60,000 generations, four converged according to the Phy- monophyletic with maximal support. loBayes bpcomp maxdif criterion (maxdif value = 0.29), To explore the impact of diferent partitioning schemes yielding a tree with a topology that is much more con- on tree topology, and to determine whether other mod- sistent with the current understanding of heterobranch els better mitigated the long-branch attraction of C. relationships (Fig. 2). Valvata and Microdiscula were limacina, P. radiatus, O. atomus, and R. morrocayensis, recovered in a polytomy with a clade that comprised all we ran several additional analyses. A partitioned analy- other heterobranchs, which received maximal support. sis with a mixed model (LG + C60 + G + F) yielded a tree Tis clade consisted of a polytomy of Nudipleura + Acte- (Additional fle 3: Figure S2) with the same clade of long- onoidea, Ringicula, and the remaining heterobranchs. branched taxa as sister to all remaining heterobranchs Nudipleura + was weakly supported but except Valvata and Microdiscula and did not vary sig- Nudipleura again received maximal support. nifcantly in any other way from the original ML tree Te largest recovered subclade within Heterobranchia, (Additional fle 2: Figure S1). A ML analysis with Lanfear Tectipleura, consisted of Euopisthobranchia (Cephalas- clustering (Additional fle 4: Figure S3) and a fully par- pidea, Runcinida, Aplysiida, and Umbraculoidea) and titioned ML analysis with resampling within partitions Panpulmonata (Siphonariida, Sacoglossa, Hygrophila, (Additional fle 5: Figure S4) both produced similar trees Ellobioidea, , Systellommatophora, and to the initial partitioned ML analysis (Additional fle 2: Stylommatophora), which were recovered reciprocally Figure S1), but with the two members of Ellobioidea that monophyletic and both clades received maximal support. did not form a clade with the rest of Ellobioidea (Myo- Within Euopisthobranchia, Cephalaspidea and Runci- sotella and Pedipes) falling outside Hygrophila and thus nida formed a (weakly supported) clade sister to a clade even farther from the remaining ellobioids. An analysis of Aplysiida + Umbraculoidea, the latter of which was with an edge-unlinked model altered the relationships strongly supported (posterior probability, pp = 0.98). within the long-branched clade, with O. atomus and P. Within Panpulmonata, Siphonariida was recovered radiatus as sister to R. morrocayensis and C. limacina, as the sister taxon to the rest of the clade (pp = 1) with Varney et al. BMC Ecol Evo (2021) 21:6 Page 5 of 14

100 Phasianella solida

100 delphinus limacina

99 Georissa bangueyensis 100 100 Clithon retropictus 100 Bellamya quadrata Cipangopaludina cathayensis

100 98 Pomacea canaliculata

100 Turritella bacillum

69 Tylomelania sarasinorum

31 Cymatium parthenopeum Ilyanassa obsoleta 94 53 Menathais tuberosa

100 Thais clavigera 100 100 Concholepas concholepas striatus Valvata cf. cristata Valvatoidea Microdiscula charopa Orbitestelloidea Rhopalocaulis grandidieri Systellommatophora Ringicula conformis Ringiculoidea gardineri festiva 100 49 quadricolor 80 100 Chromodorismagnifica 42 100 kubaryana 84 100 europaea Nudipleura Homoiodoris japonica 99 ocellata

95 Tritoniadiomedea 59 japonica 100 100 leonina novaezealandiae 62 100 sp. strigosa

100 physis Acteonoidea 72 100 undatus

100 Odontoglaja guamensis Sagaminopteron nigropunctatus 100 Cephalaspidea sp. 31 100 calyculata Runcina ornata Runcinida 47 100 Illbia ilbi dactylomela 63 100 100 Aplysia californica Aplysia kurodai Aplysiida 100 100 100 Tylodina cf. corticalis Umbraculoidea 96 gigas Siphonariida 100

53 Plakobranchus cf ocellatus

66 gracilis Elysiaornata 100 100 chlorotica Sacoglossa 95 sp. 100 fragilis 49 rhamphidia Amphiboloidea 43 dolabrata Acochlidium fijensis 100 chinense 92 bidentata 93 20 72 Ovatella vulcani Ellobioidea 100 tridentatum reticulatus

99 100 Platevindex mortoni peronii Systellommatophora 99 celtica

27 99 Pedipes pedipes Ellobioidea 26 glabrata 100 corneus 79 100 duryi Hygrophila balthica 43 59 100 pervia acuta Achatinafulica

88 nux putris 100 100 Achatinella sowerbyana

100 57 Pupillamuscorum

100 pusilla 93 Gastrocoptacristata 30 Arionrufus

100 incanum Cerion uva Stylommatophora 49 100 virgata

100 itala 96 obtusus

100 nemoralis 93 aspersa

100 100 mexicana cereolus Mastigeulotakiangsinensis 100 54 Aegista aubryana 100 Aegistadiversifamilia 85 Camaena cicatricosa 100 Camaena poyuensis

0.20 Varney et al. BMC Ecol Evo (2021) 21:6 Page 6 of 14

(See fgure on next page.) Fig. 2 Bayesian inference phylogeny of Heterobranch molluscs based on the reduced set of taxa following removal of both unstable leaves fagged by RogueNaRok and the four longest-branched taxa. Taxa for which new sequences were collected in the present study are shown in bold. The data set was trimmed with BMGE and trees were generated in PhyloBayes with four chains using the CAT GTR Γ4 substitution model. Tree shown is + + the majority rule posterior consensus tree. Posterior probabilities are presented at each node

Sacoglossa sister to all other taxa within that clade In the remaining heterobranchs, the clade spanning (pp = 0.96). Te remaining panpulmonates formed two Acteonoidea, Nudipleura, and the subclade including clades. One consisted of Stylommatophora, Systellom- Runcinida, Cephalaspidea, Umbraculoidea, Aplysiida, matophora, and Ellobioidea, although neither Systellom- Siphonariida, Sacoglossa, Amphiboloidea, Pyramidel- matophora nor Ellobioidea were recovered monophyletic. lidae, Hygrophila, Acochlidia, Systellommatophora, Rhopalocaulis (Systellommatophora) was recovered as Ellobioidea, and Stylommatophora, a relatively stable the sister taxon of Stylommatophora (pp = 0.71) and this mitochondrial gene order and orientation exists, referred clade was recovered in a polytomy with the ellobioids to here as the common heterobranch gene order [cox1- Pedipes and Myosotella that was maximally supported. 16SrRNA-nad6-nad5-nad1-nad4L-ctyB-cox2-atp8-atp6- Sister to this polytomy was a moderately well-supported 12SrRNA-nad3-nad4-cox3-nad2, with atp8-nad3 and clade (pp = 0.98) in which the remainder of Systellom- cox3 both reversed in direction from cox1]. All members matophora () was sister to the remaining of Nudipleura examined adhere to this common gene Ellobioidea. Sister to the Stylommatophora-Systellom- order except , in which a single gene matophora- Ellobioidea clade was a clade comprising (nad4) changed position, and all members of Acteonoidea Hygrophila (pp = 1.00), Pyramidella dolabrata (Pyra- adhere to the same order as well. Variation exists within midelloidea), Salinator rhamphidia (Amphiboloidea), the Cephalaspidea, with a shared rearrangement of cytB, and Acochlidium fjiense (Acochlidia). Salinator and Pyr- nad1, nad4L, and cox2 shared among three-fourths of its amidella formed a well-supported clade (pp = 0.99) but members, and Sagaminopteron nigropunctatum contain- otherwise, higher-level relationships in the Hygrophila- ing further rearrangements. Aplysiida adheres to the sta- -Amphiboloidea-Acochlidia clade were ble arrangement with the exception of Aplysia kurodai, in weakly supported. which the orientation of the 12S rRNA gene is reversed though its position remains the same. Both representatives of Siphonariida have diferent Mitochondrial gene order evolution internal mitochondrial gene rearrangements: Siphonaria A somewhat diagnostic gene arrangement exists for het- gigas reversed the positions of nad4 and nad3, while erobranchs relative to other gastropod clades, but many Siphonaria pectinata inserts cox2 between nad4L and heterobranch taxa and subclades have diferences in both cytB. All sacoglossans shared a common gene order, as gene organization and orientation in their mitochondrial do Pyramidella dolabrata and Acochlidium fjiensis. Te genomes (Fig. 3). Caenogastropods encode all mitochon- majority of Ellobioidea are consistent, excepting Myoso- drial genes in the same orientation, while all members of tella myosotis and Pedipes pedipes, which have diferent the clade comprising and single-gene transpositions than one another. Addition- share a single inversion of [12S rRNA, 16S rRNA, nad1, ally, the mt genome of P. pedipes is expanded, with more nad6, cytB, nad4L, nad4, nad5]. Across the diverse taxa intergenic space than other closely related taxa. Interest- used as outgroups in this study, no individual deviations ingly, these two taxa are those that fall together in a dif- in gene arrangement were found. ferent part of the phylogeny than the remaining members In contrast to this consistency, the taxa at the base of Ellobioidea, making this group paraphyletic. Te clade of our heterobranch tree all have remarkably diferent comprising Hygrophila was strongly supported, and mitochondrial gene arrangements from one another. all members within it share the common heterobranch Mitochondrial gene order within most of the “lower gene order except , which has a completely Heterobranchia” is variable: Psilaxis radiatus (Archi- unique gene arrangement. tectonicoidea), Omalogyra atomus (Omalogyridae), Within Stylommatophora, both members of Rissoella morrocayensis (Rissoellidae), and Valvata cf. Achatinella shared a single gene (cox2) moved to a dif- cristata () all have distinct gene orders from ferent position relative to other members of the clade. one another and from remaining heterobranchs, includ- Likewise, the taxa , Cepaea nemora- ing changes in both order and orientation. Microdiscula lis, and (syn. Helix aspersa) all share a charopa also has an entirely unique gene order. single gene (nad4) inserted at a diferent location in the Varney et al. BMC Ecol Evo (2021) 21:6 Page 7 of 14

100 Pomaceacanaliculata Marisacornuarietis 100 Tylomelania sarasinorum Turritella bacillum 99 Cymatium parthenopeum 100 Ilyanassa obsoleta 100 Conus striatus 100 Menathais tuberosa 100 100 Thaisclavigera 100 Concholepas concholepas 100 Cipangopaludina cathayensis 100 Bellamya quadrata

100 Titiscanialimacina Georissa bangueyensis 100 Clithon retropictus 100 Phasianella solida 100 Valvata cf. cristata Valvatoidea Microdiscula charopa Orbitestelloidea Ringicula conformis Ringiculoidea 100 Berthellina sp. Pleurobranchaea novaezealandiae Chromodoris magnifica 100 100 Chromodoris quadricolor Hypselodoris festiva 100 99 96 Nembrotha kubaryana 100 Roboastra europaea Nudipleura 63 Homoiodoris japonica 100 Notodoris gardineri Phyllidia ocellata 55 100

100 Sakuraeolis japonica 100 diomedea 100 100 Acteonoidea 100 Pupastrigosa 100 Runcinaornata Illbia ilbi Runcinida 54 100 Sagaminopteron nigropunctatus Odontoglaja guamensis Cephalaspidea 100 100 Smaragdinella calyculata 100 Bulla sp. Tylodina cf. corticalis Umbraculoidea 100 Aplysia vaccaria 98 Aplysia kurodai Aplysiida 100 100 Aplysia californica 99 Siphonariapectinata 100 Siphonariida Siphonaria gigas Placida sp. Thuridilla gracilis 100 99 Plakobranchus cf. ocellatus 100 100 Sacoglossa 100 100 Salinator rhamphidia Amphiboloidea 99 Pyramidella dolabrata Pyramidellidae Physellaacuta 100 99 100 Galba pervia 75 Hygrophila 99 100 100 Biomphalariaglabrata Acochlidium fijensis Acochlidiacea Platevindex mortoni 100 Peronia peronii 100 100 Systellommatophora 100 Onchidella borealis 98 Trimusculus reticulatus Ovatella vulcani 100 96 Ellobioidea 100 100 100 Pedipes pedipes Myosotella myosotis Ellobioidea Rhopalocaulis grandidieri Systellommatophora 100 fulica 99 Naesiotus nux 71 100 Cerion uva 56 Cerion incanum 100 100 100 100 Cylindrus obtusus 100 Helix aspersa 100 Cepaea nemoralis 100 Praticolella mexicana 100 Polygyracereolus 100 Camaena poyuensis 99 100 Camaena cicatricosa 99 Mastigeulota kiangsinensis 100 Aegistadiversifamilia 100 Aegista aubryana rufus Pupillamuscorum 100 Stylommatophora 57 98 cristata 100 Achatinella sowerbyana 100 Achatinella mustelina

0.50 Varney et al. BMC Ecol Evo (2021) 21:6 Page 8 of 14

Caenogastropoda COX1 COX2 atp8 atp6 12S 16S ND1 ND6 CYTB ND4L ND4 ND5COX3 ND3 ND2

Neritimorpha/Vetigastropoda COX1 COX2 atp8 atp6 ND5CND4ND4LCYTBND6 ND116S 12S OX3 ND3 ND2

Valvata cf. cristata COX1 COX2 ND6NND2CD4L ND4ND5 CYTB ND3 atp8 ND1 12S atp6 OX3 Microdiscula charopa COX1 COX3 ND4L ND2 atp6 ND1 ND4CND5CND6 YTB OX2 ND3 atp8 Ringicula conformis

Runcinaornata COX1 16S1ND6NND5CD1 ND4L CYTB COX2 atp8 atp6 2S ND3 ND4 OX3 ND2 Illbia ilbi

Cephalaspidea Tylodina cf. corticalis Aplysiida

Siphonaria pectinata COX1 16S1ND6NND5CD1 CYTB COX2 ND4L atp8 atp6 2S ND3 ND4 OX3 ND2

Siphonariagigas COX1 16S1ND6NND5CD1 ND4L CYTB COX2 atp8 atp6 2S ND4 ND3 OX3 ND2

Sacoglossa Salinator rhamphidia Pyramidella dolabrata

Hygrophila COX1 16S1ND6NND5CD1 ND4L CYTB COX2 atp8 atp6 2S ND3 ND4 OX3 ND2 Acochlidiumfijensis Onchidiidae Ellobioidea

Pedipes pedipes COX1 16S1ND6NND5ND1 ND4L CYTB COX2 atp8 atp6 2S ND3 COX3 D4 ND2

Myosotella myosotis COX1 16S1ND6NND5CD1 CYTB COX2 ND4L atp8 atp6 2S ND3 ND4 OX3 ND2 Rhopalocaulis grandidieri

COX1 16S1ND6NND5CD1 ND4L CYTB COX2 atp8 atp6 2S ND3 ND4 OX3 ND2 Stylommatophora

0.50 Fig. 3 Presumed ancestral mitochondrial gene order based on a TreeRex analysis of each major clade of heterobranch gastropods. Grey boxes spanning multiple clades indicate the common Heterobranch gene order shared among most taxa. Empty white boxes represent missing genes from sequenced mitochondrial contigs. Tree topology is taken from the BI tree presented in Fig. 2

mitochondrial genome. Arion rufus has the 12S rRNA mitochondrial genome evolutionary rate is too rapid to placed prior to atp8-atp6 instead of after it, but all other recover ancient radiations (e.g., [8, 17, 33]). members of Stylommatophora shared the common het- Our maximum likelihood-based analysis including all erobranch gene order. taxa failed to recover most recognized higher-level het- erobranch clades but did recover a maximally supported Discussion clade of taxa with extremely long branches near the base Heterobranch phylogeny of Heterobranchia. Tis clade includes taxa known to We sequenced mitochondrial genomes from eleven het- have brief lifespans, some of only a few months, which erobranch gastropods and investigated heterobranch may correlate with a more rapid accumulation of genetic evolutionary relationships using amino acid sequences changes [34]. To combat this putative artifact of long- from the thirteen mitochondrial protein-coding genes branch attraction, ML analyses of a dataset with long- as well as the evolution of heterobranch mitochondrial branched and unstable taxa excluded were performed. genome organization. Mitochondrial genomes can be Excluding these taxa resulted in a tree that exhibited useful in because of the func- an apparent mis-rooting within Heterobranchia rela- tional constraint that should, in theory, lead to a relatively tive to other studies (e.g., [22, 31, 32], reviewed by [19]) high degree of conservation across evolutionary time. with Panpulmonata paraphyletic with respect to a clade Tis has been demonstrated in diverse groups of ani- of opisthobranchs. Support for most higher-level het- mals where mitochondrial genomes have served as useful erobranch clades was weak in both maximum likelihood markers for molecular phylogenetics [1–3, 5]. However, analyses, although most order-level taxa were recovered in other animal lineages, it has been demonstrated that monophyletic with strong support. Varney et al. BMC Ecol Evo (2021) 21:6 Page 9 of 14

Clear long-branch attraction and weak support for ) [42–44], although unequivocal of Val- deep relationships within Heterobranchia initially led vatoidea and Orbitestelloidea—with minute, fragile and us to conclude that mitochondrial genomes have little often inconspicuous shells—are much younger (Cre- to no phylogenetic signal for deep nodes within Het- taceous to Eocene) (see [42, 43]). Architectonicoidea erobranchia. Additional maximum likelihood analyses is another candidate for an old group judging from that attempted to account for diferences in evolution- the presence of fossils in the [45]. Unfortu- ary rates between genes did not improve resolution of nately, most of the other lower heterobranchs we sam- these deeper nodes. However, although mostly weakly pled—O. atomus (Omalogyroidea), Psilaxis radiatus supported, a number of previously hypothesized rela- (Architectonicoidea), and Rissoella morrocayensis (Ris- tionships were recovered in all of our maximum like- soelloidae)—exhibited extremely long branches and the lihood analyses. Tese include Euthyneura (e.g. all Bayesian inference including these taxa (as well as an heterobranchs except Valvatoidea and Orbitestelloidea), analysis including these taxa but excluding C. limacina; Pyramidelloidea + Amphiboloidea, Nudipleura + Rin- data not shown) failed to converge. gicula (Ringipleura), Ringipleura + Acteonoidea (not Our Bayesian inference recovered Pyramidel- considering Rissoelloidae), Aplysiida + Umbraculoidea loidea + Amphiboloidea and Aplysiida + Umbraculoidea (not considering Gymnosomata and Tecosomata), and with strong support (pp ≥ 0.98). Tis analysis also recov- Cephalaspidea + Runcinida [22, 31, 32, 35–38]. ered a number of other heterobranch clades that have In order to determine if selecting a model that better been identifed in other studies but were not recovered accounts for site-specifc rate heterogeneity could help in the maximum likelihood analysis of this dataset. Most improve resolution, we conducted a Bayesian infer- notably among these is Panpulmonata. We recovered ence using the site heterogeneous CAT + GTR + G4 Siphonariida as the sister taxon of the remaining panpul- model. Tis analysis resulted in a topology that is much monates followed by Sacoglossa as the sister taxon to all more consistent with other studies examining hetero- other panpulmonates after that, all with strong support branch evolutionary relationships to date. Again, we (pp ≥ 0.99). Interestingly, support for the relative place- recovered Euthyneura to the exclusion of Valvatoidea ment of these two clades has been weak in most studies and Orbitestelloidea with maximal support. Whereas with the relevant taxon sampling to date (but see [31]). our maximum likelihood analyses recovered Val- Our results are inconsistent with most studies to date, vatoidea as the sister taxon to all other heterobranchs which have recovered these two taxa as a clade [22, 38] with moderate to weak support, our Bayesian infer- or with Sacoglossa, not Siphonariida, sister to the rest of ence recovered these two “lower heterobranchs” in a Panpulmonata [31, 32], reviewed by [20, 46]. polytomy with the rest of Heterobranchia. Tis is in Although Ringicula was previously recovered as the concordance with previous Sanger-sequencing based sister taxon of Nudipleura [32], which we recovered here approaches [20, 29] but neither clade was so far sam- in our maximum likelihood analyses, this relationship pled by phylogenomics [35] or mitogenomics [15]. was not supported in our Bayesian inference. Instead, Valvatoidea (= Ectobranchia) is a group of minute Nudipleura was recovered as the sister taxon of Acteo- freshwater and marine snails with discoidal to ovoid noidea, but this clade was weakly supported. Ringicula shells. Haszprunar et al. regarded Valvatoidea as the was recovered in a polytomy with this weakly supported earliest-branching heterobranch clade based on their Nudipleura-Acteonoidea clade (and a strongly supported broad, rhipidoglossate and unusual ectobranch clade consisting of all other heterobranchs), so although gill [39]. Tis phylogenetic position was favored by we fnd no support for the Ringipleura hypothesis in this Brenzinger et al. because a clade of all heterobranchs analysis, our Bayesian inference results are not incompat- except Valvatoidea is supported by the presence of cili- ible with Ringipleura. ated strips in the cavity [40]. Sperm ultrastru- All of our analyses failed to recover Ellobioidea as a cure is also consistent with their placement among the monophyletic group. A previous analysis that included lower heterobranchs [41]. Orbitestelloidea was consid- some of the Ellobioidea mitochondrial genomes analyzed ered to belong to Valvatoidea in the past. Our Bayesian herein, including those of the two taxa that were recov- analysis produced a polytomy containing these taxa, but ered separately from the rest of Ellobioidea in our analy- all our maximum likelihood analyses separated these ses (Pedipes pedipes and Myosotella myosotis), also failed taxa from one another with Valvatoidea sister to all to recover a monophyletic Ellobioidea [47]. However, other heterobranchs, as consistent with the most recent Dayrat et al. and Romero et al. sampled both of these classifcation [27]. Te record also coincides with species and recovered them as nested within Ellobioidea a greater age of “lower” heterobranchs (possibly present (although Dayrat et al. also recovered Trimusculus and in mid-Paleozoic) vs. Euthyneura (diversifying in the Otininae within Ellobioidea) [48, 49]. Varney et al. BMC Ecol Evo (2021) 21:6 Page 10 of 14

Evolution of heterobranch mitochondrial gene origin of Heterobranchia and later, even more so, at the organization origin of Euthyneura. Long-branch attraction, as discussed above, is likely responsible for the recovery of C. limacina in a clade Conclusions with the “lower heterobranchs” O. atomus, P. radia- Here, we sequenced 11 new heterobranch mitochon- tus, and R. morrocayensis. Often there is a correlation drial genomes including several “lower heterobranchs”. between a high rate of genome evolution and genome Tese new data were analyzed in combination with pub- rearrangements [50]; O. atomus and members of the licly available heterobranch and outgroup mitochon- genus Rissoella are known to have short life cycles [20, drial genomes using maximum likelihood and Bayesian 51]. C. limacina has a completely unique gene order, so inference. Results of maximum likelihood analyses with it is possible that some sequence diferences may be a site-homogeneous models indicated that even with the result of rearrangement and these in turn contributed to exclusion of exceptionally rapidly evolving taxa, mito- the misplacement of this taxon. Within Ellobioidea, the chondrial genomes have limited utility for resolving two members that are consistently recovered apart from most higher-level heterobranch relationships. How- the rest (Myosotella myosotis and Pedipes pedipes) both ever, Bayesian inference using the site-heterogeneous contain single gene transpositions (though difering from CAT + GTR + G model recovered most recognized one another). higher-level heterobranch clades including Tectipleura, However, this correlation does not hold for other Euopisthobranchia, and Panpulmonata. Unfortunately, isolated members of clades with extreme gene order most of the lower heterobranch taxa that we aimed to rearrangements. For example, Physella acuta has mito- place in a phylogenetic context exhibited extremely fast chondrial gene reordering so extensive that a minimum rates of evolution. Relationships within most hetero- of fve independent gene rearrangements are necessary to branch order-level clades that were broadly sampled (e.g., account for the diference between it and the remaining Nudipleura, Aplysiida, Sacoglossa, Hygrophila, Stylom- members of Hygrophila [52]. Despite this, P. acuta still matophora) were well-resolved and strongly supported. forms a clade with the rest of Hygrophila with very high Despite the relatively rapid rate of nucleotide evolution support in all analyses. Likewise, Sagaminopteron nigro- in heterobranch mitochondrial genomes, gene order punctatus forms a clade with the other cephalaspids with was found to be largely conserved across the group. very high support despite difering dramatically in gene Taken together, these results provide support for sev- arrangement from the other three members included in eral hypothesized heterobranch clades and highlight the the analysis, and the relationships among these taxa are non-euthyneuran clades Valvata and Orbitestelloidea as supported by recent transcriptome-based analyses [53]. interesting and important taxa to study with respect to Te variable relationship of evolutionary rate of gene understanding early heterobranch evolution. However, sequences and mitochondrial gene rearrangement could a lack of resolution and poor support for a number of be interesting to investigate in future studies. deeper nodes within Heterobranchia highlight limita- Te shared gene arrangement among the majority of tions of mitogenomic data for deep phylogeny, especially heterobranch taxa suggests this common gene order for rapidly evolving taxa like the long-branched lower emerged prior to the common ancestor of Nudipleura heterobranchs, and reveals the surprising degree of het- and remaining taxa. However, most taxa previously iden- erogeneity within even closely related molluscan taxa tifed as “lower heterobranchs,” as well as the additional that may in part be responsible for these limitations. taxa recovered at the base of the heterobranch tree in our analyses, have unique mitochondrial gene arrangements Methods relative to all other gastropods. Te diferences among DNA extraction, library preparation, and sequencing these taxa and between them and the main clade of het- DNA was extracted from specimens obtained from vari- erobranchs cannot be explained with stepwise changes, ous sources (Table 1) using the Omega Bio-tek EZNA but instead suggest multiple independent inversions and MicroElute Genomic DNA Kit (Omega Bio-tek, Nor- transpositions and may be due to a combination of long cross, GA) or with a MO-BIO Powermax Soil DNA Iso- evolutionary trajectories (since the mid-Paleozoic) [54] lation Kit. As most of the newly sequenced taxa were and derived ecologies and lifestyles in many subgroups, small-bodied, in most cases entire specimens were including the commonly observed abbreviation and placed directly into lysis bufer, and if size permitted, modifcation of life cycles by multiple evolution of pae- were ground with a sterile pestle prior to digestion to domorphic groups. Our results indicate that mitochon- break open shells. DNA concentration was measured drial genome organization started to deviate considerably using a Qubit 4 Fluorometer (Termo Fisher Scientifc, from the ancestral molluscan arrangement frst at the Waltham, MA) with the ds DNA HS kit. Samples that Varney et al. BMC Ecol Evo (2021) 21:6 Page 11 of 14

yielded too little DNA for library preparation (Rissoella (with default settings), partitioned-by-gene supermatrix morrocayensis and Omalogyra atomus) were amplifed using RAxML v8.2.4 [63] with the PROTGAMMAUTO with multiple strand displacement amplifcation using model, which automatically selects the best-ftting model the Illustra Single Cell GenomiPhi DNA Amplifcation for each partition, rapid bootstrapping, and selection of Kit (GE Healthcare, Chicago, IL). Dual-indexed sequenc- the best-scoring maximum likelihood tree in one run. ing libraries were prepared with the Illumina Nextera Kit Te number of bootstrap replicates was determined by (Illumina, San Diego, CA). Library size was assessed via the majority-rule consensus criterion (autoMRE). agarose gel following a test PCR (run with provided Illu- Leaf stability was assessed with RogueNaRok [64] using mina primers 1.1 and 2.1, run 95 °C for 10 min followed the majority rule consensus criterion. Four taxa (R. gran- by 40 cycles of [95° for 10 s, 60° for 30 s]). Pooled librar- dideri, P. pedipes, P. acuta, and M. myosotis) had a leaf ies were sequenced with a 2 × 100 bp paired-end TruSeq stability diference of < 0.75 and were considered to be 3000/4000 SBS kit on an Illumina HiSeq4000 (Macrogen, unstable by RogueNaRok (Additional fle 8: Table S2). South Korea) using 1/24 lane each. Tese taxa, along with the very long-branched taxa C. limacina, P. radiatus, O. atomus, and R. morrocayen- Assembly and annotation sis were removed and the remaining sequences for each De novo assemblies of reads were initially carried out gene were re-aligned, trimmed, and concatenated before with Spades 3.14.0 [55]. In the case of O. atomus, the reanalysis in RAxML as described above. longest mitochondrial genome contig produced by To attempt to combat the apparent long branch attrac- Spades was missing several mitochondrial genes and Ray tion among C. limacina, P. radiatus, O. atomus, and R. 2.2.0 [56] was used for assembly. Mitochondrial genomes morrocayensis, trees were also produced for the BMGE- were identifed by creating a BLAST database from trimmed matrix with a number of diferent models each set of assembled scafolds and querying that data- and/or analysis settings. We performed a series of ML base with the complete mitochondrion of Galba pervia analyses in IQ-TREE 2 [65] with 1000 rapid bootstraps (NCBI NC_018536.1) via BLASTN with an e-value cut- employing diferent models and partitioning schemes of of 1e-4. Te longest BLAST hits were annotated with including (1) the BMGE-trimmed dataset partitioned by the MITOS2 web server with default parameters and the gene with a partitioned mixed model (LG + C60 + G + F) invertebrate mitochondrial genetic code (5) [57]. and the best tree from RAxML provided as a starting tree (Additional fle 3: Figure S2); (2) the same BMGE- Data set construction trimmed dataset partitioned by gene and using Lanfear Coding sequences of the 13 mitochondrial protein- clustering to select optimal partitioning [66], resulting coding genes (cox1, cox2, cox3, atp6, atp8, nad1, nad2, in 5 partitions with diferent models (Additional fle 4: nad3, nad4, nad4L, nad5, nad6, cob) were extracted Figure S3); (3) a fully partitioned analysis of this matrix from the newly sequenced and annotated mitochondrial where PartitionFinder independently selected the best genomes, as well as those publicly available on NCBI model for each gene with the –GENESITE correction to (see Additional fle 1: Table S1). Single-gene alignments resample partitions and then sites within partitions [67] were simultaneously produced for DNA and amino acid (Additional fle 5: Figure S4); (4) an analysis of this matrix sequences with MACSE v1.2 [7] using the invertebrate with an edge-unlinked model to better account for het- genetic code (5). Alignments were trimmed with trimAL erotachy (GHOST) [68] (Additional fle 6: Figure S5). with default settings [58]. Trimmed alignments were We also ran a RAxML analysis on the original TrimAL- checked manually in MEGA 10.0.4 [59] and corrected by trimmed dataset but with C. limacina removed (Addi- hand if translations were initially out of frame. FAScon- tional fle 7: Figure S6. In all IQ-TREE 2 and RAxML CAT [60] was used to assemble the concatenated amino trees, the clade of four (or three, in the last analysis) long- acid supermatrix fle. In response to difculties with branched taxa persisted, and the overall tree topology did long-branched taxa (Additional fle 2: Figure S1) and in not change. keeping with recent recommendations to improve phy- Bayesian analysis logenetic analysis [61], the alignment was also trimmed with BMGE [62] trimming with default settings. Te Bayesian trees were generated with PhyloBayes-MPI v1.6 BMGE matrix was used for subsequent analyses. All data [69] with four chains and the CAT + GTR + Γ4 substitu- matrices are available online via FigShare. tion model. Two analyses were attempted based on the BMGE-trimmed matrix: (1) an analysis sampling all taxa; Maximum likelihood analyses and (2) an analysis excluding the taxa fagged as unsta- An initial maximum likelihood analysis (Supplemental ble in the initial maximum likelihood tree (C. limacina, P. Figure S1) was conducted on the initial TrimAL-trimmed radiatus, O. atomus, and R. morrocayensis). Varney et al. BMC Ecol Evo (2021) 21:6 Page 12 of 14

Mitochondrial gene order heterobranch mitochondrial genomes. Rearrangements shown on In light of the apparent heterogeneity in mitochondrial branches are delineated as T for transposition and TDL for tandem-dupli- gene sequences within and between clades, we examined cation-random-loss events. Nodes colored green as consistently recon- structed, red reconstructed with the fallback method (where P0 indicates gene order across major groups with TreeREx v1.85 [70]. the chosen solution is not better than other possible solutions). Te heterogeneity within several groups made it impos- Additional fle 9: Table S2. RogueNaRok leaf instability indices, run with sible to visualize all at once (Additional fle 8: Figure S7), the tree from Figure 1. Lsi_42_max represents the maximum leaf instabil- so nodes of major clades were collapsed and the inferred ity across four possible quartets, lsi_42_ent the entropy between the two most diferent quartets, and Lsi_42_dif the leaf stability diferences ancestral gene arrangement for each clade diagrammed between the two most common quartets. again with TreeREx (Additional fle 11: Figure S8). Syn- Additional fle 10: Figure S8: TreeRex output of a subset of representa- tenic blocks were visualized with Geneious R11 (Addi- tive mitochondrial genomes from heterobranch gastropods to showcase tional fle 10: Table S3). more general patterns in rearrangements between major clades. Rear- rangements shown on branches are delineated as T for transposition and TDL for tandem-duplication-random-loss events. Nodes colored green as Supplementary Information consistently reconstructed, red reconstructed with the fallback method The online version contains supplementary material available at https​://doi. (where P0 indicates the chosen solution is not better than other possible org/10.1186/s1286​2-020-01728​-y. solutions). Additional fle 11: Table S3. Mitochondrial gene orders in all taxa Additional fle 1: Table S1. Data downloaded from NCBI used in the from the present study, including outgroups, with < and > indicating present study. directionality and orange boxes indicating possible locations of gene rearrangements. Additional fle 2: Figure S1. Maximum likelihood phylogeny of het- erobranch gastropods based on the full set of available heterobranch mitochondrial genomes (including long-branched taxa). The data set Abbreviation was partioned by gene, trimmed with TrimAL with default settings, and mt: Mitochondrial. analyzed in RAxML with the PROTGAMMAAUTO setting to select the best- ftting model for each partition. Bootstrap support values are presented at Acknowledgements each node. The authors thank Audrey Falconer of the Museum Victoria, for Additional fle 3: Figure S2. Maximum likelihood phylogeny of het- generously providing specimens of Microdiscula charopa and Ilbia ilbi. We also erobranch gastropods based on the full set of available heterobranch thank Denise Akob for suggestions for Figure 3. We thank Deb Crocker and mitochondrial genomes (including long-branched taxa). The data set Robert Grifn for assistance with the University of Alabama High-Performance was partitioned by gene, trimmed with BMGE, and analyzed in IQ-TREE Computing cluster. 2 with the LG C60 G F mixed model. Bootstrap support values are presented at each+ node.+ + Authors’ contributions BB, MAEM, CPM, and MS collected and identifed specimens. RMV and CP Additional fle 4: Figure S3. Maximum likelihood phylogeny of het- performed DNA extraction. RMV performed sequencing library preparation erobranch gastropods based on the full set of available heterobranch and quality control. RMV analyzed the data with help from KMK. All authors mitochondrial genomes (including long-branched taxa). The data set was contributed to the writing of the manuscript. All authors read and approved partitioned by gene, trimmed with BMGE, and greedy Lanfear clustering the fnal manuscript. was applied in IQ-TREE 2 to determine the optimal partitioning scheme. Five partitions with independent models were applied. Bootstrap support Funding values are presented at each node. This work was funded by start-up funds to KMK from The University of Ala- Additional fle 5: Figure S4. Maximum likelihood phylogeny of het- bama College of Arts and Sciences and Department of Biological Sciences and erobranch gastropods based on the full set of available heterobranch Deutsche Forschungsgemeinschaft DFG BR5727/1-1 to BB. mitochondrial genomes (including long-branched taxa). The data set was partitioned by gene, trimmed with BMGE, and each partition was allowed Availability of data and materials to select its own optimal model via ModelFinder implemented in IQ-TREE The raw Illumina FASTQ reads generated in this study are available via NCBI 2. The analysis was run with the –GENESITE correction to facilitate resam- SRA under BioProject ID PRJNA626646. Mitochondrial sequences and annota- pling frst within partition and then within sites. Bootstrap support values tions are also available under NCBI BioProject PRJNA626646. Single gene are presented at each node. alignments, data matrices, and tree fles are available viaFigShare: https​://fgsh​ are.com/proje​cts/Asses​sment​_of_mitoc​hondr​ial_genom​es_for_heter​obran​ Additional fle 6: Figure S5. Maximum likelihood phylogeny of het- ch_gastr​opod_phylo​genet​ics/80765​. erobranch gastropods based on the full set of available heterobranch mitochondrial genomes (including long-branched taxa). The data set was Ethics approval and consent to participate trimmed with BMGE, concatenated into a supermatrix, and analyzed with All samples were invertebrate samples from existing museum collections. No an edge-unlinked model to better account for heterotachy (GHOST). The additional were collected for this study. analysis was run in IQ-TREE 2 with the –GENESITE correction to facilitate resampling frst within partition and then within sites. Bootstrap support Consent for publication values are presented at each node. Not applicable. Additional fle 7: Figure S6. Maximum likelihood phylogeny of hetero- branch gastropods based on the full set of available heterobranch mito- Competing interests chondrial genomes (including long-branched taxa) except for C. limacina. The authors declare that they have no competing interests. The data set was partitioned, trimmed with TrimAL with default settings, and concatenated into a supermatrix, and run in RAxML with -PROTGAM- Author details MAAUTO setting to select the best-ftting model. 1 Department of Biological Sciences, The University of Alabama, Campus Box 870344, Tuscaloosa, AL 35487, USA. 2 SNSB-Bavarian State Collection Additional fle 8: Figure S7. TreeRex output of a rearrangement of Zoology, Münchhausenstr. 21, 81247 München, Germany. 3 Department analysis highlighting the multiple inversions and transpositions across of Natural History, University Museum of Bergen, University of Bergen, Varney et al. BMC Ecol Evo (2021) 21:6 Page 13 of 14

5020 Bergen, Norway. 4 National Museum of Natural History, Smithsonian 20. Schrödl M, Stöger I. A review on deep molluscan phylogeny: old Institution, 10th St. & Constitution Ave. NW, Washington, D.C. 20560, USA. markers, integrative approaches, persistent problems. J Nat History. 5 BioGeoCenter LMU (Ludwig Maximillion University Munich), University 2014;48:2773–804. of Munich, Biozentrum, Großhaderner Str. 2, 82152 Planegg‑Martinsried, Ger- 21. Dinapoli A, Klussmann-Kolb A. The long way to diversity – Phylogeny and many. 6 Alabama Museum of Natural History, Campus Box 870344, Tuscaloosa, evolution of the Heterobranchia (: Gastropoda). Mol Phylogenet AL 35487, USA. Evol. 2010;55:60–76. 22. Jörger KM, Stöger I, Kano Y, Fukuda H, Knebelsberger T, Schrödl M. On Received: 19 May 2020 Accepted: 26 November 2020 the origin of Acochlidia and other enigmatic euthyneuran gastropods, with implications for the systematics of Heterobranchia. BMC Evol Biol. 2010;10:323. 23. Spengel J. Die Geruchsorgane und das Nervensystem der Mollusken. Zeitschrift für Wissenschaftliche Zoologie. 1881;35:372. References 24. Haszprunar G. The Heterobranchia–a new concept of the phylogeny of 1. Scouras A, Smith MJ. A novel mitochondrial gene order in the Crinoid the higher Gastropoda. J Zool Syst Evol Res. 1985;23:15–37. Echinoderm Florometra serratissima. Mol Biol Evol. 2001;18:61–73. 25. Ponder WF, Lindberg DR. Towards a phylogeny of gastropod mol- 2. Boore JL, Staton JL. The mitochondrial genome of the Sipunculid luscs: an analysis using morphological characters. Zool J Linn Soc. Phascolopsis gouldii supports its association with Annelida rather than 1997;119:83–265. Mollusca. Mol Biol Evol. 2002;19:127–37. 26. Warèn A, Bouchet P. New records, species, genera, and a new family of 3. Lavrov DV, Lang BF. Poriferan mtDNA and animal phylogeny based on gastropods from hydrothermal vents and hydrocarbon seeps. Zoolog Scr. mitochondrial gene arrangements. Syst Biol. 2005;54:651–9. 1993;22:1–90. 4. Li Y, Sun X, Hu X, Xun X, Zhang J, Guo X, et al. Scallop genome reveals 27. Bieler R, Ball AD, Mikkelsen PM. Marine Valvatoidea - Comments on molecular adaptations to semi-sessile life and neurotoxins. Nat Commun. anatomy and systematics with description of a new species from Florida 2017;8:1–11. (Heterobranchia: Cornirostridae). Malacologia. 1998;17:99. 5. Galaska MP, Li Y, Kocot KM, Mahon AR, Halanych KM. Conservation of 28. Bouchet P, Rocroi J-P. Classifcation and nomenclator of gastropod fami- mitochondrial genome arrangements in brittle stars (Echinodermata, lies. Malacologia: International Journal of Malacology. 2005. http://www. Ophiuroidea). Mol Phylogen Evol. 2019;130:115–20. vliz.be/en/imis?refd​ 78278​. Accessed 29 Nov 2017. = 6. Talavera G, Vila R. What is the phylogenetic signal limit from mitog- 29. Brenzinger B, Wilson NG, Schrödl M. Microanatomy of shelled Koloonella enomes? The reconciliation between mitochondrial and nuclear data in cf. minutissima (Laseron, 1951) (Gastropoda: ‘lower’ Heterobranchia: Mur- the Insecta class phylogeny. BMC Evol Biol. 2011;11:315. chisonellidae) does not contradict a sister-group relationship with enig- 7. Ranwez V, Harispe S, Delsuc F, Douzery EJP. MACSE: multiple alignment matic Rhodopemorpha slugs. J Molluscan Stud. 2014;2014(80):518–40. of coding sequences accounting for frameshifts and stop codons. PLoS 30. Wilson NG, Jörger KM, Brenzinger B, Schrödl M. Phylogenetic placement ONE. 2011;6:e22594. of the enigmatic worm-like Rhodopemorpha slugs as basal Hetero- 8. Wägele JW, Bartolomaeus T. Deep metazoan phylogeny: The backbone branchia. J Molluscan Stud. 2017;83:399–408. of the tree of life, new insights from analyses of molecules, morphology, 31. Kocot KM, Halanych KM, Krug PJ. Phylogenomics supports Panpulmo- and theory of data analysis. Berlin, Boston: De Gruyter; 2014. http://doi. nata: Opisthobranch paraphyly and key evolutionary steps in a major org/https​://doi.org/10.1515/97831​10277​524. radiation of gastropod molluscs. Mol Phylogenet Evol. 2013;69:764–71. 9. McCartney MA, Auch B, Kono T, Mallez S, Zhang Y, Obille A, et al. The 32. Kano Y, Brenzinger B, Nützel A, Wilson NG, Schrödl M. Ringiculid bubble genome of the zebra , Dreissena polymorpha: a resource for inva- snails recovered as the sister group to sea slugs (Nudipleura). Scientifc sive species research. bioRxiv. 2019. p. 696732. Reports. 2016;6:30908. 10. Mikkelsen NT. Phylogeny and systematics of Caudofoveata (Mollusca, 33. Podsiadlowski L, Mwinyi A, Lesný P, Bartolomaeus T. Mitochondrial gene Aplacophora). The University of Bergen; 2018. https​://bora.uib.no/handl​ order in Metazoa–theme and variations. In: Deep Metazoan Phylog- e/1956/17813​. Accessed 13 May 2020. eny: The Backbone of the Tree of Life, New insights from analyses of 11. Allcock AL, Cooke IR, Strugnell JM. What can the mitochondrial genome molecules, morphology, and theory of data analysis. Berlin, Boston: De reveal about higher-level phylogeny of the molluscan class Cephalop- Gruyter; 2014. http://doi.org/https​://doi.org/10.1515/97831​10277​524. oda? Zool J Linn Soc. 2011;161:573–86. 34. Fretter V. The Structure and life history of some minute prosobranchs of 12. Uribe JE, Zardoya R. Revisiting the phylogeny of Cephalopoda using rock pools: Skeneopsis (Fabricius), Omalogyra atomus (Philippi), complete mitochondrial genomes. J Molluscan Stud. 2017;83:133–44. Rissoella diaphana (Alder) and (Jefreys). J Mar Biol Assoc 13. Gaitán-Espitia JD, Nespolo RF, Opazo JC. The complete mitochondrial UK. 2019;27:597–632. genome of the land Cornu aspersum (: Mollusca): intra- 35. Klussmann-Kolb A, Dinapoli A, Kuhn K, Streit B, Albrecht C. From sea specifc divergence of protein-coding genes and phylogenetic consid- to land and beyond–new insights into the evolution of euthyneuran erations within Euthyneura. PLoS ONE. 2013;8:e67299. Gastropoda (Mollusca). BMC Evol Biol. 2008;8:57. 14. Williams ST, Foster PG, Littlewood DTJ. The complete mitochondrial 36. Malaquias MAE, Reid DG. Tethyan vicariance, relictualism and speciation: genome of a turbinid vetigastropod from MiSeq Illumina sequencing of evidence from a global molecular phylogeny of the opisthobranch genus genomic DNA and steps towards a resolved gastropod phylogeny. Gene. Bulla. J Biogeogr. 2009;36:1760–77. 2014;533:38–47. 37. Zapata F, Wilson NG, Howison M, Andrade SCS, Jörger KM, Schrödl M, 15. Uribe JE, Williams ST, Templado J, Abalde S, Zardoya R. Denser mitog- et al. Phylogenomic analyses of deep gastropod relationships reject enomic sampling improves resolution of the phylogeny of the . Proc R Soc B. 2014;281:20141739. superfamily (Gastropoda: Vetigastropoda). J Molluscan Stud. 38. Ayyagari VS, Sreerama K. Molecular phylogeny and evolution of Pulmo- 2017;83:111–8. nata (Mollusca: Gastropoda) on the basis of mitochondrial (16S, COI) and 16. Grande C, Templado J, Zardoya R. Evolution of gastropod mitochondrial nuclear markers (18S, 28S): an overview. J Genet. 2020;99:17. genome arrangements. BMC Evol Biol. 2008;8:61. 39. Haszprunar G, Speimann E, Hawe A, Heß M. Interactive 3D anatomy and 17. Stöger I, Schrödl M. Mitogenomics does not resolve deep molluscan afnities of the Hyalogyrinidae, basal Heterobranchia (Gastropoda) with a relationships (yet?). Mol Phylogenet Evol. 2013;69:376–92. rhipidoglossate radula. Org Divers Evol. 2011;11:201–36. 18. Stöger I, Kocot KM, Poustka AJ, Wilson NG, Ivanov D, Halanych KM, et al. 40. Brenzinger B, Haszprunar G, Schrödl M. At the limits of a successful body Monoplacophoran mitochondrial genomes: convergent gene arrange- plan - 3D microanatomy, histology and evolution of Helminthope (Mol- ments and little phylogenetic signal. BMC Evol Biol. 2016;16:274. lusca: Heterobranchia: Rhodopemorpha), the most worm-like gastropod. 19. Schrödl M, Jörger KM, Klussmann-Kolb A, Wilson NG. Bye bye “Opistho- Front Zool. 2013;10:37. branchia”! A review on the contribution of mesopsammic sea slugs to 41. Healy JM. Comparative sperm ultrastructure and spermiogenesis in basal euthyneuran systematics. Thalassas. 2011;27:101–12. heterobranch gastropods (Valvatoidea, Architectonicoidea, Rissoelloidea, Omalogyroidea, Pyramidelloidea) (Mollusca). Zoolog Scr. 1993;22:263–76. Varney et al. BMC Ecol Evo (2021) 21:6 Page 14 of 14

42. Frýda J, Nützel A, Wagner J. Paleozoic gastropod phylogeny. In: Phylogeny 57. Bernt M, Donath A, Jühling F, Externbrink F, Florentz C, Fritzsch G, et al. and evolution of the Mollusca. New York: University of California Press; MITOS: Improved de novo metazoan mitochondrial genome annotation. 2008. p. 239. Mol Phylogenet Evol. 2013;69:313–9. 43. Gründel J, Nützel A. On the early evolution (Late Triassic to Late Jurassic) 58. Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. trimAl: a tool for of the (Gastropoda: Heterobranchia), with a provi- automated alignment trimming in large-scale phylogenetic analyses. sional classifcation. Neues Jahrbuch für Geologie und Paläontologie. Bioinformatics. 2009;25:1972–3. 2012;264:31–59. 59. Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: Molecular evo- 44. Bandel K. Phylogeny of the Caecidae (). Institut, Univer- lutionary genetics analysis across computing platforms. Mol Biol Evol. sität Hamburg. 1996;79:53–115. 2018;35:1547–9. 45. Wägele H, Klussmann-Kolb A, Vonnemann V, Medina M, Heterobranchia 60. Kück P, Meusemann K. FASconCAT: Convenient handling of data matrices. I. The Opisthobranchia. In: Phylogeny and evolution of the Mollusca. Mol Phylogenet Evol. 2010;56:1115–8. Berkeley: University of California Press; 2008. 61. Anisimova M, Liberles DA, Philippe H, Provan J, Pupko T, von Haeseler A. 46. Cunha TJ, Giribet G. A congruent topology for deep gastropod relation- State-of the art methodologies dictate new standards for phylogenetic ships. Proc R Soc B Biol Sci. 2019;286:20182776. analysis. BMC Evol Biol. 2013;13:161. 47. White TR, Conrad MM, Tseng R, Balayan S, Golding R, de Frias Martins AM, 62. Criscuolo A, Gribaldo S. BMGE (Block Mapping and Gathering with et al. Ten new complete mitochondrial genomes of pulmonates (Mol- Entropy): a new software for selection of phylogenetic informative lusca: Gastropoda) and their impact on phylogenetic relationships. BMC regions from multiple sequence alignments. BMC Evol Biol. 2010;10:210. Evol Biol. 2011;11:295. 63. Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post- 48. Dayrat B, Conrad M, Balayan S, White TR, Albrecht C, Golding R, et al. analysis of large phylogenies. Bioinformatics. 2014;30:1312–3. Phylogenetic relationships and evolution of pulmonate gastropods (Mol- 64. Aberer AJ, Krompass D, Stamatakis A. Pruning rogue taxa improves lusca): New insights from increased taxon sampling. Mol Phylogenet Evol. phylogenetic accuracy: an efcient algorithm and webservice. Syst Biol. 2011;59:425–37. 2013;62:162–6. 49. Romero PE, Weigand AM, Pfenninger M. Positive selection on panpulmo- 65. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams M, von nate mitogenomes provide new clues on adaptations to terrestrial life. Haeseler A, et al. IQ-TREE 2: new models and efcient methods for phylo- BMC Evol Biol. 2016;16:164. genetic inference in the genomic era. Mol Biol Evol. 2020;37:1530–4. 50. Shao R, Dowton M, Murrell A, Barker SC. Rates of gene rearrangement 66. Lanfear R, Calcott B, Kainer D, Mayer C, Stamatakis A. Selecting optimal and nucleotide substitution are correlated in the mitochondrial genomes partitioning schemes for phylogenomic datasets. BMC Evol Biol. of insects. Mol Biol Evol. 2003;20:1612–9. 2014;14:82. 51. Wolstenholme DR. Animal mitochondrial DNA: structure and evolution. 67. Gadagkar SR, Rosenberg MS, Kumar S. Inferring species phylogenies from Int Rev Cytol. 1992;141:173–216. multiple genes: concatenated sequence tree versus consensus gene tree. 52. Nolan JR, Bergthorsson U, Adema CM. Physella acuta: atypical mitochon- J Exp Zool. 2005;304B:64–74. drial gene order among panpulmonates (Gastropoda). J Molluscan Stud. 68. Crotty S, Minh B, Bean N, Holland B, Tuke J, Jermiin L, et al. GHOST: 2014;80:388–99. recovering historical signal from heterotachously evolved sequence 53. Knutson VL, Brenzinger B, Schrödl M, Wilson NG, Giribet G. Most Ceph- alignments. Syst Biol. 2020;69:22. alaspidea have a shell, but transcriptomes can provide them with a 69. Lartillot N, Rodrigue N, Stubbs D, Richer J. PhyloBayes MPI: phylogenetic backbone (Gastropoda: Heterobranchia). Mol Phylogen Evol. 2020;9:22. reconstruction with infnite mixtures of profles in a parallel environment. 54. Frýda J, Racheboeuf PR, Frýdová B, Ferrová L, Mergl M, Berkyová S. Plat- Syst Biol. 2013;62:611–5. yceratid gastropods - stem group of patellogastropods, neritimorphs or 70. Bernt M, Merkle D, Middendorf M. An algorithm for inferring mitog- something else? Bull Geosci. 2009;1:107–20. enome rearrangements in a . In: Nelson CE, Vialette S, 55. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, editors. Comparative Genomics. Springer: Berlin; 2008. p. 143–57. et al. SPAdes: A new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19:455–77. 56. Boisvert S, Laviolette F, Corbeil J. Ray: Simultaneous assembly of reads Publisher’s Note from a mix of high-throughput sequencing technologies. J Comput Biol. Springer Nature remains neutral with regard to jurisdictional claims in pub- 2010;17:1519–33. lished maps and institutional afliations.

Ready to submit your research ? Choose BMC and benefit from:

• fast, convenient online submission • thorough peer review by experienced researchers in your field • rapid publication on acceptance • support for research data, including large and complex data types • gold Open Access which fosters wider collaboration and increased citations • maximum visibility for your research: over 100M website views per year

At BMC, research is always in progress.

Learn more biomedcentral.com/submissions