1 Revised version: November 21st, 2016
2
3
4 Denser mitogenomic sampling improves resolution of the phylogeny of the
5 superfamily Trochoidea (Gastropoda: Vetigastropoda)
6
7
8 Juan E. Uribe1, Suzanne T. Williams2, José Templado1, Samuel
9 Abalde1, and Rafael Zardoya1*
10
11 1Museo Nacional de Ciencias Naturales (MNCN-CSIC), José Gutiérrez
12 Abascal 2, 28006, Madrid, Spain
13 2Department of Life Sciences, Natural History Museum, Cromwell Rd,
14 London SW7 5BD, UK
15
16
17
18
19
20
21 *Correspondence: R. Zardoya; email: [email protected]
22
23 ABSTRACT
24 The great morphological and ecological diversity within the superfamily Trochoidea s.l.
25 (Gastropoda: Vetigastropoda) has in the past hindered the reconstruction of a robust
26 phylogeny for the group based on morphology. Moreover, previous molecular
27 phylogenies disagreed on the monophyly and internal relationships of Trochoidea s.l.,
28 as well as on its relative phylogenetic position within Vetigastropoda. In order to further
29 resolve the trochoidean and vetigastropod phylogenetic trees, we considerably increased
30 the representation of trochoidean families for which no previous mitochondrial (mt)
31 genomes were available: the complete mt genome of Cittarium pica (Tegulidae) and the
32 nearly complete mt genomes of Tectus virgatus (Tegulidae), Gibbula umbilicaris
33 (Trochidae), and Margarites vorticiferus (Margaritidae) were sequenced. In addition,
34 the nucleotide sequences of all protein coding and rRNA genes of Clanculus
35 margaritarius (Trochidae) and of Calliostoma zizyphinum (Calliostomatidae) were
36 derived from transcriptomic sequence data. The reconstructed phylogenetic trees using
37 probabilistic methods and Neomphalina as outgroup recovered with maximal support a
38 Trochoidea sensu Hickman & McLean, 1990 clade that included superfamilies
39 Angarioidea and Phasianelloidea deeply nested within superfamily Trochoidea sensu
40 Williams (2012). The families Trochidae and Calliostomatidae were the sister group to
41 the remaining trochoidean lineages. Of these, the family Margaritidae was sister to a
42 clade including Phasianelloidea + Angarioidea and Turbinidae + Tegulidae, this latter
43 family being paraphyletic (Cittarium and Tectus need to be assigned to a new family).
44 Gene order within newly determined mt genomes was very stable (with only few
45 rearrangements restricted to tRNA genes) and conformed to the vetigastropod and
46 gastropod consensus genome organizations.
47
48 Key words: Mitogenomic phylogeny, rearrangement, Vetigastropoda, Trochoidea,
49 Trochidae, Calliostomatidae, Margaritidae, Cittarium, Tectus.
50
51 INTRODUCTION
52 Trochoidea s.l. Rafinesque, 1815 (top shells, turban shells, and allies) is one of the
53 most ecologically and morphologically diverse lineages of marine gastropods and by far
54 the largest superfamily belonging to the subclass Vetigastropoda, with more than 2,000
55 living species grouped into about 500 recognized genera (Hickman, 1996; Geiger,
56 Nützel & Sasaki, 2008). The clade is distributed worldwide and is present throughout
57 all seas and oceans, at all latitudes and bathymetric ranges (Hickman & McLean, 1990;
58 Williams, Karube & Ozawa, 2008). Trochoideans play an important ecological role as a
59 predominant element in different marine communities such as intertidal rocky shores,
60 seagrass beds, or coral reefs, and they are also found in many other marine habitats
61 (Williams et al., 2008). They have a long fossil record that goes back to the Middle
62 Triassic, 228-245 million years ago, but the time of the origin of the group is certainly
63 much older (Hickman & McLean, 1990; Williams et al., 2008).
64 The taxonomic internal classification of Trochoidea has a long history of
65 controversy and instability. In their comprehensive morphological monograph on
66 trochacean gastropods, Hickman & McLean (1990) maintained the three families
67 traditionally recognized within the superfamily i.e., Trochidae, Turbinidae and
68 Skeneidae, and organized the different genera into various subfamilies and tribes based
69 on suites of shared morphological characters. Later, in the taxonomic classification of
70 gastropods proposed by Bouchet et al. (2005), the family Turbinidae (including the
71 subfamily Skeneinae) was classified within the superfamily Turbinoidea. However,
72 major changes to the systematics of Trochoidea were based on recent molecular
73 phylogenies (Geiger & Thacker, 2005; Williams & Ozawa, 2006; Kano, 2008; Williams
74 et al., 2008; Williams, 2012), which challenged the monophyly of the superfamily as
75 well as of several of the internal groups as defined by Hickman & McLean (1990), and
76 prompted important changes to the taxon composition and arrangement of families
77 (Williams, 2012). For instance, some taxa were transferred to the superfamily
78 Seguenzioidea (Verrill, 1884), newly redefined by Kano (2008), and a number of
79 minute skeneimorph genera were variously relocated to Seguenzioidea (Kano,
80 Chikyu & Warén, 2009; Haszprunar et al., 2016), to Neomphalina (Kunze et al., 2008), or
81 to the new family Crosseolidae of uncertain taxonomic position (Hickman, 2013).
82 Furthermore, several molecular studies redefined the family Turbinidae (Williams &
83 Ozawa, 2006), reinterpreted the superfamilies Angarioidea and Phasianelloidea
84 (Williams et al., 2008), and restricted Trochoidea to the families Trochidae, Turbinidae,
85 Solariellidae, Calliostomatidae, Liotiidae, Skeneidae, Margaritidae and Tegulidae
86 (Williams, 2012).
87 None of these taxonomic changes was definitive and the debates over the final
88 composition and internal phylogenetic relationships of Trochoidea remain more alive
89 than ever. Moreover, this question is directly related to resolving phylogenetic
90 relationships among the different superfamilies of Vetigastropoda. In this regard, some
91 studies recovered Phasianelloidea and/or Angarioidea in early-branching positions of
92 the Vetigastropoda tree after the divergence of Pleurotomarioidea (Williams & Ozawa,
93 2006; Kano, 2008; Williams et al., 2008; Aktipis & Giribet, 2012), whereas several
94 recent phylogenies grouped Phasianelloidea and/or Angarioidea with Trochoidea
95 (Zapata et al., 2014; Uribe et al., 2016; Lee et al., 2017; Wort, Fenberg & Williams,
96 2017). While earlier studies were based on few partial mitochondrial and nuclear genes
97 and a rather extensive lineage representation, later ones were based on phylogenomic
98 data but with reduced taxon sampling.
99 Phylogenetic analysis of complete mitochondrial (mt) genomes has resulted in good
100 resolution among vetigastropod superfamilies (e.g. Uribe et al., 2016); therefore,
101 mt genomes are good candidates to resolve phylogenetic relationships within Trochoidea. Until
102 recently, 22 complete or near-complete mt genomes were available for
103 Vetigastropoda, which represent the living superfamilies Fissurelloidea,
104 Lepetodriloidea, Seguenzioidea, Haliotoidea, Angarioidea, Phasianelloidea, and
105 Trochoidea (no mt genome has been sequenced for Pleurotomarioidea and
106 Lepetelloidea). However, the great diversity of Trochoidea was clearly
107 underrepresented, as mt genomes for only 12 species belonging to families Turbinidae,
108 Trochidae, and Tegulidae had been published (Uribe et al., 2016; Lee et al., 2017; Wort
109 et al., 2017). Here, we increased the number of complete mt genomes representing
110 different families within Trochoidea to test the monophyly and address internal
111 phylogenetic relationships of the superfamily (in particular the relative positions of
112 families Trochidae, Calliostomatidae, and Margaritidae, and of the genera Tectus and
113 Cittarium, of which only Trochidae was previously included in phylogenetic analyses),
114 as well as to resolve its relative phylogenetic position within Vetigastropoda. In
115 addition, the reconstructed phylogeny was used to determine whether trochoidean mt
116 genomes show rearrangements in their gene orders. During the review process of this
117 paper, Lee et al. (2016) published a related mitogenomic phylogenetic study, which
118 complemented our taxon sampling and enriched it at the family level. Therefore, the mt
119 genomes reported by Lee et al. (2016) were incorporated into our phylogenetic
120 analyses.
121
122 MATERIALS AND METHODS
123 Samples and DNA/RNA extraction
124 One specimen each of Cittarium pica (Tegulidae), Tectus virgatus (Tegulidae),
125 Gibbula umbilicaris (Trochidae), Clanculus margaritarius (Trochidae), Calliostoma
126 zizyphinum (Calliostomatidae), and Margarites vorticiferus (Margaritidae) was used for
127 this study (see Table 1 for details on the locality, collector, and voucher ID of each
128 sample; family assignment was based on WoRMS: accessed October 2016, Gofas,
129 2009). Samples of C. pica, T. virgatus, G. umbilicaris, and M. vorticiferus were stored
130 in 100% ethanol at -20 ºC, and total genomic DNA was isolated from up to 30 mg of
131 foot tissue following a standard phenol chloroform extraction.
132 Samples of C. margaritarius and C. zizyphinum were stored in RNALater at -80
133 ºC, and total RNA was isolated from mantle tissue using the RNeasy Fibrous Tissue
134 Mini Kit (Qiagen) according to the manufacturer’s instructions. Total RNA was
135 quantified and its integrity assessed using a Qubit® 2.0 Fluorometer RNA assay kit and
136 an Agilent 2200 Tapestation with a high sensitivity R6K Screen Tape, respectively.
137 The Dynabeads® mRNA DIRECT™ Micro Kit (Ambion, Life Technologies) was used to
138 isolate mRNA using the 100 ng-1 µg total RNA protocol.
139
140 PCR amplification and sequencing
141 Two alternative strategies were carried out to obtain mitogenomic sequence data.
142 For C. pica, T. virgatus, G. umbilicaris, and M. vorticiferus, complete or near-complete
143 mt genomes were PCR amplified and sequenced, whereas for C. margaritarius and C.
144 zizyphinum, transcriptomic sequence data was generated and mt protein-coding and
145 rRNA genes were identified.
146 For obtaining complete or near-complete mt genomes from genomic DNA a
147 three-step strategy was used. First, fragments of cox1, rrnL, and cox3 genes were
148 amplified using the primers respectively detailed in Folmer et al. (1994), Palumbi et al.
149 (1991), and Boore & Brown (2000). The standard PCR reactions contained 2.5 µl of
150 10x buffer, 1.5 µl of MgCl2 (25 mM), 0.5 µl of dNTPs (2.5 mM each), 0.5 µl of each
151 primer (10 mM), 0.5-1 µl (20-100 ng) of template DNA, 0.2 µl of Taq DNA polymerase
152 5PRIME (Hamburg, Germany), and sterilized distilled water up to 25 µl. The PCR
153 temperature and cycle conditions used were: a denaturation step at 94° C for 60 s;
154 45 cycles of denaturation at 94° C for 30 s, annealing at 44° C (cox1) or 52° C (rrnL
155 and cox3) for 60 s and extension at 72° C for 90 s; a final extension step at 72° C for 5
156 min. Second, the amplified PCR fragments were sequenced using Sanger sequencing,
157 and new primers were designed (see Suppl. Mat. for primer sequences) for amplifying
158 outwards from the short fragments in the next step. Third, the remaining mtDNA was
159 amplified in two to three overlapping fragments by long PCR using the newly designed
160 primers. The long PCR reaction contained 2.5 µl of 10x LA Buffer II (Mg2+ plus), 3 µl
161 of dNTPs (2.5 mM each), 0.5 µl of each primer (10 mM), 0.5-1 µl (20-100 ng) of
162 template DNA, 0.2 µl TaKaRa LA Taq DNA polymerase (5 units/µl), and sterilized
163 distilled water up to 25 µl. The following PCR conditions were used: a denaturing step
164 at 94° C for 60 s; 45 cycles of denaturation at 98° C for 10 s, annealing at 53° C for 30 s
165 and extension at 68° C for 60 s per kb; and a final extension step at 68 °C for 12 min.
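The reaction set-up above reduces to simple volume arithmetic. The following Python sketch, which is only an illustration and not part of the original protocol, computes the water needed to reach 25 µl for both reactions and the long-PCR extension time at 60 s per kb. The reagent volumes are taken from the text; the function names and the 6 kb fragment length are assumptions introduced for the example.

```python
# Reagent volumes (µl) for the two 25 µl reactions, as listed in the text.
# Template DNA (0.5-1 µl) is passed separately because it varies.
STANDARD_PCR = {
    "10x buffer": 2.5,
    "MgCl2 (25 mM)": 1.5,
    "dNTPs (2.5 mM each)": 0.5,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "Taq DNA polymerase": 0.2,
}
LONG_PCR = {
    "10x LA Buffer II (Mg2+ plus)": 2.5,
    "dNTPs (2.5 mM each)": 3.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "LA Taq DNA polymerase": 0.2,
}
FINAL_VOLUME_UL = 25.0


def water_volume(reagents, template_ul, final_ul=FINAL_VOLUME_UL):
    """Return the volume of water (µl) needed to reach the final volume."""
    water = final_ul - sum(reagents.values()) - template_ul
    if water < 0:
        raise ValueError("reagents exceed the final reaction volume")
    return round(water, 2)


def extension_seconds(fragment_kb, seconds_per_kb=60):
    """Return the long-PCR extension time for a fragment of the given size."""
    return fragment_kb * seconds_per_kb


if __name__ == "__main__":
    for template in (0.5, 1.0):
        print(template,
              water_volume(STANDARD_PCR, template),
              water_volume(LONG_PCR, template))
    # Hypothetical 6 kb fragment, used only to illustrate the calculation.
    print(extension_seconds(6))
```

With 0.5 µl of template, the standard reaction needs 18.8 µl of water and the long PCR reaction 17.8 µl; each additional 0.5 µl of template reduces the water by the same amount.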
166 Long-PCR products were purified by ethanol precipitation. Overlapping
167 fragments from the same mt genome were pooled together in equimolar concentrations
168 and subjected to massive parallel sequencing. For each mt genome, a separate indexed
169 library was constructed using the NEXTERA XT DNA library prep Kit (Illumina, San
170 Diego, CA, USA) and sequenced in a single lane of Illumina MiSeq V2 500 at Sistemas
171 Genómicos (Valencia, Spain).
172 Transcriptomes of C. margaritarius and C. zizyphinum were sequenced using the
173 following procedure: Illumina libraries were prepared for each transcriptome using the
174 ScriptSeq™ v2 RNA-Seq Library Preparation Kit from Epicentre (Epicentre
175 Biotechnologies, Madison, WI, USA), size checked with an Agilent 2200 Tapestation
176 and quantified using qPCR. The libraries were loaded onto one fifth of a MiSeq V2 500
177 cycle sequencing run. Each library was run twice.
178
179 Assembly and annotation
180 The reads corresponding to the different PCR amplified mt genomes were sorted
181 using the corresponding library indices. Adapter sequences were removed using
182 SeqPrep (St John, 2011). Assembly was performed using the TRUFA webserver
183 (Kornobis et al., 2015). The quality (randomness) of the sequencing was checked using
184 FastQC v.0.10.1 (Andrews, 2010). Reads were trimmed and filtered out according to
185 their quality scores using PRINSEQ v.0.20.3 (Schmieder & Edwards, 2011). Filtered
186 reads were used for de novo assembly of mt genomes, searching for contigs with a
187 minimum length of 3 kb. The complete or nearly complete sequence of each mt genome
188 was finally assembled by overlapping the various contigs in Sequencher 5.0.1. The
189 assembled sequence was used as reference to map the original (raw) reads with a
190 minimum identity of 99% using Geneious® 8.0.3 to estimate coverage.
191 Genome annotation was performed by setting a limit of nucleotide identity of
192 75% to previously reported vetigastropod mt genomes (Uribe et al., 2016) using
193 Geneious® 8.0.3. The annotated 13 mt protein-coding genes were further corroborated
194 by identifying the corresponding open reading frames using the invertebrate
195 mitochondrial code. The transfer RNA (tRNA) genes were further identified with
196 tRNAscan-SE 1.21 (Schattner, Brooks & Lowe, 2005), which infers cloverleaf
197 secondary structures. The ribosomal RNA (rRNA) genes were identified by sequence
198 comparison with previously reported vetigastropod mt genomes, and assumed to extend
199 to the boundaries of adjacent genes (Boore, Macey & Medina, 2005). GenBank
200 accession numbers of each newly sequenced mt genome are provided in Table 1.
201 Transcriptomes of C. margaritarius and C. zizyphinum were assembled with
202 Galaxy (Giardine et al., 2005; Blankenberg et al., 2010; Goecks, Nekrutenko & Taylor,
203 2010) as outlined in Williams et al. (in review). Reads for the two separate sequencing
204 runs were concatenated and filtered for reads that contained all but three identical bases.
205 Trimmomatic (Lohse et al., 2012) was then used with the initial ILLUMINACLIP step
206 and a sliding window to trim reads (averaging across four bases and requiring an
207 average quality score of 24), and to remove all reads with a length of less than 30 bases.
208 Transcriptome assembly was performed using Trinity (Grabherr et al., 2011), with
209 default settings and a minimum contig length of 200 bases. Open reading frames were
210 identified using the program TransDecoder (Haas et al., 2013). The mt protein coding
211 and rRNA genes of C. margaritarius and C. zizyphinum were extracted from the
212 corresponding transcriptomes in Geneious by using published amino acid sequences for
213 each mitochondrial gene from Bolma rugosa (GenBank KT207824; Uribe et al., 2016)
214 to identify matching sequences in the dataset of assembled contigs using the tBLASTx
215 option; the new contig was then used as a reference sequence against the original reads
216 to obtain full length genes. Gene boundaries for rRNA genes were determined by
217 comparison with other vetigastropod sequences.
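The sliding-window step described above can be sketched in Python as follows. This is a simplified illustration of the idea (a window of four bases, a minimum mean quality of 24, and a minimum retained length of 30 bases, all taken from the text), not a reimplementation of Trimmomatic; in particular, the handling of reads shorter than the window is an assumption made for the example, and the function names are hypothetical.

```python
def sliding_window_keep(qualities, window=4, min_avg=24):
    """Return how many leading bases to keep.

    The read is scanned from the 5' end; it is cut at the start of the
    first window whose mean quality falls below min_avg.
    """
    n = len(qualities)
    if n < window:
        # Assumption for this sketch: judge short reads by their overall mean.
        return n if n and sum(qualities) / n >= min_avg else 0
    for start in range(n - window + 1):
        if sum(qualities[start:start + window]) / window < min_avg:
            return start
    return n


def trim_read(seq, qualities, window=4, min_avg=24, min_len=30):
    """Trim one read; return None if it ends up shorter than min_len."""
    keep = sliding_window_keep(qualities, window, min_avg)
    if keep < min_len:
        return None
    return seq[:keep], qualities[:keep]


if __name__ == "__main__":
    # Toy read: 32 high-quality bases followed by 8 low-quality bases.
    quals = [35] * 32 + [10] * 8
    read = "A" * 40
    trimmed = trim_read(read, quals)
    print(len(trimmed[0]) if trimmed else "discarded")
```

In the toy read, the first window with a mean below 24 starts at position 30, so 30 bases are kept and the read just passes the length filter.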
218
219 Sequence alignment
220 The nucleotide sequences of the 13 protein coding and two rRNA genes encoded
221 in the newly determined complete or nearly complete mt genomes were aligned each
222 separately with the corresponding orthologous sequences of all vetigastropod complete
223 or nearly complete mt genomes available at NCBI (www.ncbi.nlm.nih.gov/; see Table
224 1). The complete mt genome of Chrysomallon squamiferum (Neomphalina) was used as
225 outgroup following Uribe et al. (2016). Each protein-coding gene was aligned with
226 TranslatorX (Abascal, Zardoya & Telford, 2010) using the deduced amino acid
227 sequence as a guide, whereas rRNA genes were aligned separately using MAFFT v7
228 (Katoh & Standley, 2013) with default parameters. Ambiguously aligned positions were
229 removed using Gblocks v.0.91b (Castresana, 2000) with the following settings:
230 minimum sequence for flanking positions: 85%; maximum contiguous non-conserved
231 positions: 8; minimum block length: 10; gaps in final blocks: no. The generated single
232 alignments were concatenated using Geneious® 8.0.3.
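Concatenating single-gene alignments amounts to joining the aligned sequences of each taxon in a fixed gene order while recording where each gene starts and ends, so that partitions can be defined later. The Python sketch below illustrates this under stated assumptions: it is not how Geneious performs the operation, the function name and toy data are hypothetical, and padding taxa that lack a gene with "?" is one common convention rather than the authors' documented choice.

```python
def concatenate(alignments):
    """Concatenate per-gene alignments into one supermatrix.

    alignments: dict mapping gene name -> dict of taxon -> aligned sequence.
    Returns (matrix, partitions), where matrix maps taxon -> concatenated
    sequence and partitions is a list of (gene, start, end) tuples with
    1-based inclusive coordinates. Taxa missing a gene are padded with '?'.
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    partitions = []
    position = 0
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "?" * length))
        partitions.append((gene, position + 1, position + length))
        position += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


if __name__ == "__main__":
    # Toy alignments with placeholder taxa; not real data.
    toy = {
        "cox1": {"taxonA": "ATGC", "taxonB": "ATGA"},
        "rrnL": {"taxonA": "GG-T"},
    }
    matrix, partitions = concatenate(toy)
    print(matrix)
    print(partitions)
```

Recording the coordinates at concatenation time keeps the partition definitions consistent with the matrix even when some taxa lack particular genes.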
233
234 Phylogenetic analyses
235 Phylogenetic relationships were reconstructed using Bayesian inference (BI;
236 Huelsenbeck & Ronquist, 2001) and maximum likelihood (ML; Felsenstein, 1981). BI
237 analyses were conducted using MrBayes v3.1.2 (Ronquist & Huelsenbeck, 2003) and
238 running four simultaneous Monte Carlo Markov chains (MCMC) for 10 million
239 generations, sampling every 1,000 generations, and discarding the first 25% generations
240 as burn-in (as judged by plots of ML scores and low SD of split frequencies) to prevent
241 sampling before reaching stationarity. Two independent BI runs were performed to
242 increase the chance of adequate mixing by the MCMC and to increase the chance of
243 detecting failure to converge, as determined using Tracer v1.6 (Rambaut & Drummond,
244 2007). ML analyses were conducted with RAxML v7.3.1 (Stamatakis, 2006) and
245 default parameters using the rapid hill-climbing algorithm and 10,000 bootstrap
246 pseudoreplicates.
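The number of trees contributing to the posterior summary follows directly from the run settings. The short Python sketch below works through that arithmetic for the values given in the text (10 million generations, sampling every 1,000 generations, 25% burn-in, two runs). It ignores any sample taken at generation 0, which is an assumption of this sketch, and the function name is hypothetical.

```python
def mcmc_sample_counts(generations=10_000_000, sample_every=1_000,
                       burnin_fraction=0.25, runs=2):
    """Return sample counts implied by the MCMC settings."""
    samples_per_run = generations // sample_every
    discarded_per_run = int(samples_per_run * burnin_fraction)
    kept_per_run = samples_per_run - discarded_per_run
    return {
        "samples_per_run": samples_per_run,
        "discarded_per_run": discarded_per_run,
        "kept_per_run": kept_per_run,
        "kept_total": kept_per_run * runs,
    }


if __name__ == "__main__":
    print(mcmc_sample_counts())
```

Under these settings each run yields 10,000 samples, of which 7,500 are retained after burn-in, giving 15,000 samples across the two runs.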
247 The program Partition Finder (Lanfear et al., 2012) was used to select best
248 partition schemes and best-fit models of substitution according to the Bayesian
249 information criterion (BIC; Schwarz, 1978). For protein-coding genes, the partitions
250 tested were: all genes combined, all genes separated except atp6-atp8 and nad4-nad4L,
251 and genes grouped by subunits (atp, cox, cob and nad). In addition, the three above
252 partition schemes were tested considering first, second, and third codon positions
253 separated. For the mt rRNA genes, the two genes combined or separated were tested.
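Selection under the BIC compares schemes by BIC = -2 lnL + k ln(n), where lnL is the log-likelihood, k the number of free parameters, and n the number of sites; the scheme with the lowest value is preferred. The Python sketch below illustrates the comparison only and does not reproduce Partition Finder. The alignment length of 11,475 positions is the value reported in this paper, whereas the log-likelihoods and parameter counts are hypothetical numbers chosen for the example.

```python
import math


def bic(log_likelihood, n_parameters, n_sites):
    """Bayesian information criterion: -2 lnL + k ln(n)."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_sites)


def best_scheme(schemes, n_sites):
    """Return the name of the scheme with the lowest BIC.

    schemes: dict mapping scheme name -> (log_likelihood, n_parameters).
    """
    return min(schemes, key=lambda name: bic(*schemes[name], n_sites))


if __name__ == "__main__":
    n_sites = 11_475  # final alignment length reported in the text
    # Hypothetical log-likelihoods and parameter counts, for illustration only.
    schemes = {
        "all_genes_combined": (-100_500.0, 10),
        "by_codon_position": (-100_000.0, 30),
    }
    for name, (lnl, k) in schemes.items():
        print(name, round(bic(lnl, k, n_sites), 1))
    print(best_scheme(schemes, n_sites))
```

The penalty term k ln(n) means that a more heavily partitioned scheme is preferred only when its gain in likelihood outweighs the cost of its additional parameters.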
254
255 RESULTS AND DISCUSSION
256 Sequencing and assembly
257 The nucleotide sequences and gene arrangement of the complete mt genome of C.
258 pica and the nearly complete mt genomes of T. virgatus, G. umbilicaris, and M.
259 vorticiferus were determined (see annotation and main features in Suppl. Mat.). In
260 Tectus and Gibbula a fragment of about 3kb between rrnL and cox3 genes could not be
261 PCR amplified. In the case of Margarites, a shorter fragment of about 2kb between rrnS
262 and cox3 genes was missing (Fig. 1). In addition, the nucleotide sequences of all protein
263 coding and rRNA genes of C. margaritarius and of C. zizyphinum were derived from
264 transcriptomic sequence data. The number of reads, mean coverage, and sequence
265 length (bp) of each complete or nearly complete mt genome are: C. pica (165,292,
266 1,390x and 17,949); T. virgatus (205,498, 2,218x and 13,891); G. umbilicaris (142,074,
267 1,666x and 12,885); and M. vorticiferus (290,484, 2,858x and 15,254). The GenBank
268 accession number and coverage for each of the mitochondrial genes of C. margaritarius
269 and C. zizyphinum are shown in Suppl. Mat.
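As a consistency check on the figures above, mean coverage is approximately the total number of mapped bases divided by the sequence length, so the reported values imply a mean mapped read length. The Python sketch below performs this back-calculation with the numbers from the text. It assumes that all reported reads were mapped and that coverage was computed in this way, neither of which is stated explicitly, and the function name is hypothetical.

```python
# (number of reads, mean coverage, sequence length in bp), as reported above.
ASSEMBLIES = {
    "C. pica": (165_292, 1_390, 17_949),
    "T. virgatus": (205_498, 2_218, 13_891),
    "G. umbilicaris": (142_074, 1_666, 12_885),
    "M. vorticiferus": (290_484, 2_858, 15_254),
}


def implied_mean_read_length(reads, coverage, length_bp):
    """Mean read length implied by coverage = reads * read_length / length."""
    return coverage * length_bp / reads


if __name__ == "__main__":
    for species, (reads, coverage, length_bp) in ASSEMBLIES.items():
        value = implied_mean_read_length(reads, coverage, length_bp)
        print(f"{species}: {value:.1f} bases")
```

For all four genomes the implied mean read length is close to 150 bases, which indicates that the reported read counts, coverages, and sequence lengths are mutually consistent.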
270
271 Mitochondrial genome organization
272 Genome organization could only be determined for those mt genomes that were
273 amplified by long PCR (all but C. margaritarius and C. zizyphinum). These mt
274 genomes share the same gene order with regard to the relative position of protein-
275 coding genes, and only minor changes affecting individual tRNA genes were observed
276 (Fig. 1). The consensus gene order for Trochoidea s.l. (including Phasianelloidea and
277 Angarioidea) is the same as that observed in Haliotoidea and Seguenzioidea but not in
278 Fissurelloidea and Lepetodriloidea (see Lee et al., 2016 and Uribe et al., 2016 for
279 further information and discussion on vetigastropod mt gene arrangements). Moreover,
280 this consensus gene order conforms to the genome arrangement of the hypothetical
281 ancestor of gastropods (Fig. 1; Uribe et al., 2016). With respect to this gastropod
282 ancestral gene order, the mt genome of T. virgatus showed a translocation of the trnQ to
283 a new relative position between cob and nad6 genes in the minor strand (Fig. 1). The mt
284 genome of G. umbilicaris had an inversion of the trnT gene from major to minor strand
285 (Fig. 1). The mt genome of M. vorticiferus showed a translocation of the trnM to a new
286 relative position between nad6 and trnP genes in the minor strand (Fig. 1). Finally, the
287 mt genomes of Tegula, Bolma, Lunella, Cittarium, Phasianella, and Angaria showed
288 rearrangements affecting trnG and trnE genes, and in some instances, one or both genes
289 were missing (Fig. 1; Lee et al., 2016; Uribe et al., 2016). It is not possible to infer the
290 exact evolution of these rearrangements given that this part of the mt genome could not
291 be sequenced in Tectus, Margarites, and Gibbula, and is not available for Clanculus and
292 Calliostoma (Fig. 1). However, it is important to note that these two genes are located at
293 the end of the ancestral MCYWQGE tRNA gene cluster, and just before the
294 hypothesized control region of gastropod mt genomes, which is known to act as hotspot
295 of gene order rearrangements (Duarte, De Azeredo-Espin & Junqueira, 2008).
296
297 Phylogenetic relationships among vetigastropod superfamilies and within Trochoidea
298 s.l.
299 A molecular phylogeny of Vetigastropoda was reconstructed using probabilistic
300 methods. The final alignment was 11,475 positions long. The best partition scheme was
301 the one having all protein-coding genes combined (but with each codon position
302 analyzed separately) and the two rRNA genes combined. The best-fit model for the
303 different partitions was GTR+I+G. The ML (-lnL = 17,998.18) and BI phylogenetic
304 analyses arrived at the same topology using Neomphalina as outgroup (Fig. 2). The
305 superfamily Lepetodriloidea was recovered as sister group of the remaining
306 vetigastropods, although only with moderate statistical support (61% BP, 0.94 BPP).
307 This lineage was recovered as sister group of Seguenzioidea + Haliotoidea in previous
308 mt genome phylogenies (Lee et al., 2016; Uribe et al., 2016; Wort et al., 2017). The
309 next lineage that branched off was Fissurelloidea, whose members exhibited relatively
310 long branches (Fig. 2). Fissurelloidea has been normally recovered as sister group of the
311 remaining vetigastropod lineages in previous mt genome phylogenies, and the
312 possibility of a long-branch attraction effect by the outgroup cannot be dismissed (Lee
313 et al., 2016; Uribe et al., 2016; Wort et al., 2017). The superfamilies Seguenzioidea and
314 Haliotoidea formed a well-supported clade (84% BP, 1 BPP), which was the sister
315 group of Trochoidea s.l. (including Phasianelloidea and Angarioidea), and this
316 relationship received relatively high support (75% BP, 1 BPP). The mt genome
317 phylogenies clearly differed from a recent phylogeny based on nuclear transcriptomic
318 data in which phylogenetic relationships within Vetigastropoda were fully resolved (all
319 nodes received maximal statistical support; Zapata et al., 2014). In the study of Zapata et al. (2014),
320 Seguenzioidea was recovered as the sister group of a clade in which Lepetodriloidea
321 was sister to Lepetelloidea and Haliotoidea was sister to Trochoidea s.l. (including
322 Phasianelloidea); however, no representatives of Fissurelloidea were included (Zapata et
323 al., 2014). Here, we did not incorporate a representative of the superfamily
324 Pleurotomarioidea, which in other phylogenies is placed as sister group of the remaining
325 vetigastropod lineages (Kano, 2008; Williams et al., 2008; Zapata et al., 2014) or even
326 unrelated to Vetigastropoda (Aktipis & Giribet, 2012). Other missing superfamilies
327 were Lepetelloidea and Scissurelloidea. Several molecular phylogenies based on partial
328 gene sequences recovered a close relationship between Scissurelloidea and
329 Lepetodriloidea (Yoon & Kim 2005; Williams & Ozawa 2006; Kano, 2008), whereas
330 the relative phylogenetic position of Lepetelloidea remains controversial (Aktipis &
331 Giribet, 2012).
332 The main focus of the present phylogenetic analysis was Trochoidea s.l. This
333 clade received maximal support and included Trochoidea, Phasianelloidea and
334 Angarioidea sensu Williams et al. (2008) (Fig. 2). The recognition of Phasianelloidea and
335 Angarioidea as valid superfamilies different from Trochoidea sensu Hickman &
336 McLean (1990) was based on phylogenetic analyses of partial mt and nuclear genes that
337 placed these two lineages in early diverging positions in the vetigastropod tree
338 (Williams et al., 2008; Aktipis & Giribet, 2012; see also the position of Phasianelloidea
339 in Kano, 2008). However, our results are in agreement with more recent phylogenies
340 based on mt (Lee et al., 2016; Uribe et al., 2016; Wort et al., 2017) and nuclear (Zapata
341 et al., 2014) genomic data sets, which also recovered a clade grouping Trochoidea
342 together with Phasianelloidea and Angarioidea (the latter was missing in Zapata et al.,
343 2014). Interestingly, Phasianelloidea and Angarioidea show relatively long branches in
344 the mt genome phylogenies (but shorter than non-trochoidean taxa; this study; Lee et
345 al., 2016; Uribe et al., 2016), which in previous studies with different taxon sampling
346 and molecular markers (18S and 28S also showed long branches for Phasianelloidea
347 and in particular for the family Areneidae of Angarioidea in Williams et al., 2008) may
348 have produced a long-branch attraction effect and the pulling of these two lineages to
349 more basal positions. The recovery of a monophyletic Trochoidea sensu Hickman &
350 McLean (1990), which is supported here, has been shown to be particularly sensitive to
351 the choice of mt and nuclear genes used in the phylogenetic analyses and the use of
352 amino acids versus nucleotides (Wort et al., 2017).
353 The representation of Trochoidea in recent phylogenomic analyses has been rather
354 limited, a situation that has been addressed in the present phylogenetic analysis with the
355 inclusion of representatives of the families Margaritidae, Trochidae and Calliostomatidae
356 as well as exemplars from the genera Tectus and Cittarium (see also Lee et al., 2016).
357 The reconstructed phylogenetic tree recovered the family Trochidae as sister group of
358 the family Calliostomatidae with maximal statistical support, and this clade was the
359 sister group of the remaining trochoidean lineages, which formed a monophyletic group
360 with high support (70% BP; 1 BPP). The family Trochidae sensu Williams et al. (2008)
361 is the largest and most diverse in terms of diet and habitat and comprises up to 10
362 subfamilies, more than 600 known species and more than 60 genera. Therefore, the
363 representation in our study is still quite incomplete with only representatives of the
364 subfamilies Cantharidinae (Gibbula), Stomatellinae (Stomatella) and Trochinae
365 (Clanculus). While Trochidae species are mostly herbivores or detritivores (Hickman &
366 McLean, 1990), the members of the family Calliostomatidae constitute a uniform
367 group of carnivorous snails that can be distinguished from Trochidae by their distinct
368 feeding adaptations, which resulted in differences in their alimentary tracts and radular
369 morphology (Hickman & McLean, 1990; Marshall, 1995). The family Margaritidae was
370 sister of a maximally supported clade including Phasianelloidea + Angarioidea and
371 Turbinidae + (paraphyletic) Tegulidae (Fig. 2). The family Margaritidae, historically
372 included as a subfamily within Trochidae, was recognized for the first time at familial
373 rank by Williams (2012). It could represent an early radiation that diverged from the
374 tropical and subtropical groups by adaptation to cold waters (high latitudes and deep
375 waters). The family Tegulidae has a long controversial taxonomic history due to the
376 unusual distribution of character states of its members. Hickman & McLean (1990)
377 retained the group as a subfamily within Trochidae, emphasizing the evolutionary
378 conservativeness of conchological characters, such as the oblique aperture and
379 interrupted peristome of the shell and the short growing edge of the operculum. Later,
380 Hickman (1996) suggested that Tegula and allies represented an enigmatic group
381 located somewhere between Trochidae and Turbinidae, and Williams (2012) finally
382 raised it to familial rank. Our results support Tegulidae as a distinct lineage closely
383 related to Turbinidae (Fig. 2). The reconstructed phylogeny also suggests that the
384 genera of this family should be redefined since the speciose genus Tegula has proved to
385 be non-monophyletic (or alternatively genera Omphalius and Chlorostoma need to be
386 assigned to Tegula). Nearly thirty species are grouped today within this genus (Bouchet,
387 2011) based on similarity of shell characters.
388 The recovered internal phylogenetic relationships of Trochoidea s.l. are fully
389 congruent with the five-gene tree of Williams (2012), who did not include
390 Phasianelloidea and Angarioidea in her phylogenetic analysis. In particular, it is worth
391 noting that Cittarium and Tectus (and possibly Rochia; Williams, 2012) need to be
392 assigned to a new family. Hickman & McLean (1990) included Tectus within Trochinae
393 and Cittarium within Gibbulinae. More recently, Bouchet et al. (2005) assigned both
394 genera to Tegulidae. It was Williams (2012) who first recovered Tectus, Cittarium, and
395 Rochia as a distinct clade, although with low support. Although not formally described,
396 she suggested a familial rank for this clade pending further studies. Our results support
397 her suggestion and highlight the need to study the morphological and anatomical
398 peculiarities of these genera with respect to other trochoidean families.
399 To summarize, the recovered phylogeny prompts a redefinition of Trochoidea
400 sensu Williams et al. (2008), supporting instead the hypothesis of Hickman & McLean
401 (1990). However, to be complete, this redefinition should await further
402 mitogenomic studies including missing families such as Skeneidae, Solariellidae, and
403 Liotiidae.
404
405
406 SUPPLEMENTARY MATERIAL
407 Supplementary material is available at Journal of Molluscan Studies online.
408
409 ACKNOWLEDGEMENTS
410 We thank Yasunori Kano and one anonymous reviewer for their insightful comments on
411 a previous version of the paper. We are grateful to Jesus Marco and Luis Cabellos who
412 provided access to the supercomputer Altamira at the Institute of Physics of Cantabria
413 (IFCA-CSIC), member of the Spanish Supercomputing Network, for performing
414 phylogenetic analyses. This work was supported by the Spanish Ministry of Science and
415 Innovation (CGL2010-18216 and CGL2013-45211-C2-2-P to RZ; BES-2011-051469 to
416 JEU; BES‐2014‐069575 to SA) and by funding from the NHM Department of Life
417 Sciences to STW.
418
419
REFERENCES
ABASCAL, F., ZARDOYA, R. & TELFORD, M.J. 2010. TranslatorX: multiple alignment
of nucleotide sequences guided by amino acid translations. Nucleic Acids Research,
38: W7-13.
AKTIPIS, S.W. & GIRIBET, G. 2012. Testing relationships among the vetigastropod
taxa: a molecular approach. Journal of Molluscan Studies, 78: 12-27.
ANDREWS, S. 2010. FastQC.
http://www.bioinformatics.babraham.ac.uk/projects/fastqc/.
BLANKENBERG, D., VON KUSTER, G., CORAOR, N., ANANDA, G., LAZARUS, R.,
MANGAN, M., NEKRUTENKO, A. & TAYLOR, J. 2010. Galaxy: a web-based genome
analysis tool for experimentalists. Current Protocols in Molecular Biology, 19.10: 1-21.
BOORE, J.L. & BROWN, W.M. 2000. Mitochondrial genomes of Galathealinum,
Helobdella, and Platynereis: sequence and gene arrangement comparisons indicate
that Pogonophora is not a phylum and Annelida and Arthropoda are not sister taxa.
Molecular Biology and Evolution, 17: 87-106.
BOORE, J.L., MACEY, J.R. & MEDINA, M. 2005. Sequencing and comparing whole
mitochondrial genomes of animals. Methods in Enzymology, 395: 311-348.
BOUCHET, P., ROCROI, J.P., FRÝDA, J., HAUSDORF, B., PONDER, W.F.,
VALDÉS, Á. & WARÉN, A. 2005. Classification and nomenclator of gastropod
families. Malacologia, 47: 1-397.
BOUCHET, P. 2011. Tegula Lesson, 1832. In: MolluscaBase (2016). Accessed
through: World Register of Marine Species at
http://www.marinespecies.org/aphia.php?p=taxdetails&id=413467 on 2016-11-18.
CASTRESANA, J. 2000. Selection of conserved blocks from multiple alignments for
their use in phylogenetic analysis. Molecular Biology and Evolution, 17: 540-552.
DUARTE, G.T., DE AZEREDO-ESPIN, A.M.L. & JUNQUEIRA, A.C.M. 2008. The
mitochondrial control region of blowflies (Diptera: Calliphoridae): a hot spot for
mitochondrial genome rearrangements. Journal of Medical Entomology, 45: 667.
FELSENSTEIN, J. 1981. Evolutionary trees from DNA sequences: a maximum
likelihood approach. Journal of Molecular Evolution, 17: 368-376.
FOLMER, O., BLACK, M., HOEH, W., LUTZ, R. & VRIJENHOEK, R. 1994. DNA
primers for amplification of mitochondrial cytochrome c oxidase subunit I from
diverse metazoan invertebrates. Molecular Marine Biology and Biotechnology, 3: 294-299.
GEIGER, D.L., NÜTZEL, A. & SASAKI, T. 2008. Vetigastropoda. In: Phylogeny and
Evolution of the Mollusca (Ponder, W.F. & Lindberg, D.R., eds), pp. 297-330.
University of California Press, Berkeley.
GEIGER, D.L. & THACKER, C.E. 2005. Molecular phylogeny of Vetigastropoda
reveals non-monophyletic Scissurellidae, Trochoidea, and Fissurelloidea. Molluscan
Research, 25: 47-55.
GIARDINE, B., RIEMER, C., HARDISON, R.C., BURHANS, R., ELNITSKI, L., SHAH,
P., ZHANG, Y., BLANKENBERG, D., ALBERT, I., TAYLOR, J., MILLER, W., KENT,
W.J. & NEKRUTENKO, A. 2005. Galaxy: a platform for interactive large-scale
genome analysis. Genome Research, 15: 1451-1455.
GOECKS, J., NEKRUTENKO, A. & TAYLOR, J. 2010. Galaxy: a comprehensive
approach for supporting accessible, reproducible, and transparent computational
research in the life sciences. Genome Biology, 11: R86.
GOFAS, S. 2009. Trochoidea Rafinesque, 1815. In: MolluscaBase (2016). Accessed
through: World Register of Marine Species at
http://www.marinespecies.org/aphia.php?p=taxdetails&id=156489 on 2016-11-18.
GRABHERR, M.G., HAAS, B.J., YASSOUR, M., LEVIN, J.Z., THOMPSON, D.A.,
AMIT, I., ADICONIS, X., FAN, L., RAYCHOWDHURY, R., ZENG, Q., CHEN, Z.,
MAUCELI, E., HACOHEN, N., GNIRKE, A., RHIND, N., DI PALMA, F., BIRREN,
B.W., NUSBAUM, C., LINDBLAD-TOH, K., FRIEDMAN, N. & REGEV, A. 2011.
Full-length transcriptome assembly from RNA-Seq data without a reference genome.
Nature Biotechnology, 29: 644-652.
HAAS, B.J., PAPANICOLAOU, A., YASSOUR, M., GRABHERR, M., BLOOD, P.D.,
BOWDEN, J., COUGER, M.B., ECCLES, D., LI, B., LIEBER, M., MACMANES,
M.D., OTT, M., ORVIS, J., POCHET, N., STROZZI, F., WEEKS, N., WESTERMAN,
R., WILLIAM, T., DEWEY, C.N., HENSCHEL, R., LEDUC, R.D., FRIEDMAN, N. &
REGEV, A. 2013. De novo transcript sequence reconstruction from RNA-seq using
the Trinity platform for reference generation and analysis. Nature Protocols, 8: 1494-1512.
HASZPRUNAR, G., KUNZE, T., BRÜCKNER, M. & HES, M. 2016. Towards a sound
definition of Skeneidae (Mollusca, Vetigastropoda): 3D interactive anatomy of the
type species, Skenea serpuloides (Montagu, 1808) and comments on related taxa.
Organisms Diversity & Evolution: 1-19.
HICKMAN, C.S. 1996. Phylogeny and patterns of evolutionary radiation in trochoidean
gastropods. In: Origin and Evolutionary Radiation of the Mollusca (Taylor, J.D.,
ed), pp. 177–198. Oxford University Press, Oxford.
HICKMAN, C.S. 2013. Crosseolidae, a new family of skeneiform microgastropods and
progress toward definition of monophyletic Skeneidae. American Malacological
Bulletin, 31: 1-16.
HICKMAN, C.S. & MCLEAN, J.H. 1990. Systematic revision and suprageneric
classification of trochacean gastropods. Natural History Museum of Los Angeles
County, Science Series, 35: 1-169.
HUELSENBECK, J. & RONQUIST, F. 2001. MrBayes: Bayesian inference of
phylogenetic trees. Bioinformatics, 17: 754-755.
KANO, Y. 2008. Vetigastropod phylogeny and a new concept of Seguenzioidea:
independent evolution of copulatory organs in the deep-sea habitats. Zoologica
Scripta, 37: 1-21.
KANO, Y., CHIKYU, E. & WARÉN, A. 2009. Morphological, ecological and molecular
characterization of the enigmatic planispiral snail genus Adeuomphalus
(Vetigastropoda: Seguenzioidea). Journal of Molluscan Studies, 75: 397-418.
KATOH, K. & STANDLEY, D.M. 2013. MAFFT Multiple Sequence Alignment Software
Version 7: Improvements in Performance and Usability. Molecular Biology and
Evolution, 30: 772-780.
KORNOBIS, E., CABELLOS, L., AGUILAR, F., FRÍAS-LÓPEZ, C., ROZAS, J.,
MARCO, J. & ZARDOYA, R. 2015. TRUFA: A User-Friendly Web Server for de
novo RNA-seq Analysis Using Cluster Computing. Evolutionary Bioinformatics, 11:
97-104.
KUNZE, T., BECK, F., BRÜCKNER, M., HES, M. & HASZPRUNAR, G. 2008.
Skeneimorph gastropods in Neomphalina and Vetigastropoda - A preliminary report.
Zoosymposia, 1: 119-131.
LANFEAR, R., CALCOTT, B., HO, S.Y.W. & GUINDON, S. 2012. PartitionFinder:
combined selection of partitioning schemes and substitution models for phylogenetic
analyses. Molecular Biology and Evolution, 29: 1695-1701.
LEE, H., SAMADI, S., PUILLANDRE, N., TSAI, M.-H., DAI, C.-F. & CHEN, W.-J. 2016.
Eight new mitogenomes for exploring the phylogeny and classification of
Vetigastropoda. Journal of Molluscan Studies, 82: 534-541.
LOHSE, M., BOLGER, A.M., NAGEL, A., FERNIE, A.R., LUNN, J.E., STITT, M. &
USADEL, B. 2012. RobiNA: a user-friendly, integrated software solution for
RNA-Seq-based transcriptomics. Nucleic Acids Research, 40: W622-W627.
MARSHALL, B.A. 1995. Calliostomatidae (Gastropoda: Trochoidea) from New
Caledonia, the Loyalty Islands, and the northern Lord Howe Rise. In: (Bouchet P.,
ed.) Résultats des Campagnes Musorstom. Mémoires du Muséum National
d’Histoire Naturelle, 14: 382–458.
PALUMBI, S., MARTIN, A., MCMILLAN, W., STICE, L. & GRABOWSKI, G. 1991. The
simple fool’s guide to PCR. http://palumbi.stanford.edu/SimpleFoolsMaster.pdf: 1-45.
RAMBAUT, A. & DRUMMOND, A.J. 2007. Tracer v1.4, Available from
http://beast.bio.ed.ac.uk/Tracer.
RONQUIST, F. & HUELSENBECK, J.P. 2003. MrBayes 3: Bayesian phylogenetic
inference under mixed models. Bioinformatics, 19: 1572-1574.
SCHATTNER, P., BROOKS, A.N. & LOWE, T.M. 2005. The tRNAscan-SE, snoscan
and snoGPS web servers for the detection of tRNAs and snoRNAs. Nucleic Acids
Research, 33: W686-689.
SCHMIEDER, R. & EDWARDS, R. 2011. Quality control and preprocessing of
metagenomic datasets. Bioinformatics, 27: 863-864.
SCHWARZ, G. 1978. Estimating the dimension of a model. The Annals of Statistics, 6:
461-464.
ST JOHN, J. 2011. SeqPrep. Available from https://github.com/jstjohn/SeqPrep.
STAMATAKIS, A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic
analyses with thousands of taxa and mixed models. Bioinformatics, 22: 2688-2690.
URIBE, J.E., KANO, Y., TEMPLADO, J. & ZARDOYA, R. 2016. Mitogenomics of
Vetigastropoda: insights into the evolution of pallial symmetry. Zoologica Scripta, 45:
145-159.
VERRILL, A.E. 1884. Second catalogue of Mollusca recently added to the fauna of the
New England coast and the adjacent parts of the Atlantic, consisting mostly of
deep-sea species, with notes on others previously recorded. Transactions of the
Connecticut Academy of Science, 6: 139-294.
WILLIAMS, S.T. 2012. Advances in molecular systematics of the vetigastropod
superfamily Trochoidea. Zoologica Scripta, 41: 571-595.
WILLIAMS, S.T., KARUBE, S. & OZAWA, T. 2008. Molecular systematics of
Vetigastropoda: Trochidae, Turbinidae and Trochoidea redefined. Zoologica Scripta,
37: 483-506.
WILLIAMS, S.T. & OZAWA, T. 2006. Molecular phylogeny suggests polyphyly of both
the turban shells (family Turbinidae) and the superfamily Trochoidea (Mollusca:
Vetigastropoda). Molecular Phylogenetics and Evolution, 39: 33-51.
WORT, E., FENBERG, P. & WILLIAMS, S.T. 2017. Testing the contribution of
individual genes in mitochondrial genomes for assessing phylogenetic relationships
in Vetigastropoda. Journal of Molluscan Studies, In press.
YOON, S.H. & KIM, W. 2005. Phylogenetic relationships among six vetigastropod
subgroups (Mollusca, Gastropoda) based on 18S rDNA sequences. Molecules and
Cells, 19: 283–288.
ZAPATA, F., WILSON, N.G., HOWISON, M., ANDRADE, S.C.S., JÖRGER, K.M.,
SCHRÖDL, M., GOETZ, F.E., GIRIBET, G. & DUNN, C.W. 2014. Phylogenomic
analyses of deep gastropod relationships reject Orthogastropoda. Proceedings of
the Royal Society of London, Biological Sciences, 281: 20141739.

Legend to Figures

Figure 1. Gene orders of selected Trochoidea s.l. mitochondrial genomes. The
consensus genome organization is shown for each lineage as well as the
ancestral gene order for Gastropoda. The genes encoded in the major and
minor strands are shown in the top and bottom lines, respectively. Gene
rearrangements (restricted to tRNA genes) are highlighted by colours.
Translocated genes are in green. Inverted genes are in blue. Genes for which
the exact rearrangement could not be inferred are in orange. Striped boxes
indicate regions not sequenced. The gene order of Clanculus and Calliostoma
could not be determined because sequence data were derived from RNA-Seq.

Figure 2. Phylogenetic relationships among vetigastropod superfamilies and within
Trochoidea s.l. based on 29 mitochondrial (protein-coding + rRNA) genes.
The reconstructed ML phylogram using Neomphalina as outgroup is shown.
Numbers at nodes are statistical support values for ML (BP)/BI (BPP). An
asterisk indicates maximal support in ML (100% BP) and BI (1 BPP).
Vetigastropod superfamilies and trochoidean families are indicated.

[Figure 1 image: gene-order diagrams for Tegula brunnea and T. lividomaculata (Tegulidae), Bolma and Lunella (Turbinidae), Cittarium and Tectus (unassigned), Phasianella (Phasianelloidea), Angaria (Angarioidea), Margarites (Margaritidae), Gibbula (Trochidae) and the ancestral Gastropoda; gene order undetermined for Clanculus (Trochidae) and Calliostoma (Calliostomatidae).]

[Figure 2 image: ML phylogram of Trochoidea s.l. (Tegulidae, Turbinidae, unassigned Tectus + Cittarium, Phasianelloidea, Angarioidea, Margaritidae, Trochidae, Calliostomatidae) together with Haliotoidea, Seguenzioidea, Fissurelloidea and Lepetodriloidea, with Neomphalina (Chrysomallon squamiferum) as outgroup; node labels give BP/BPP support.]

Table 1. Mitochondrial (mt) DNA data analyzed in this study. Length in bp, GenBank accession number, museum voucher, sampling location, and name of collector are provided.

New mt data
Species | Family | Superfamily | bp | Acc. No. | Voucher | Location | Collector
Cittarium pica | Unassigned | Trochoidea | 17,949 | KY212109 | MNCN/ADN: 91331 | Guanahacabibes, Bolivar, Cuba | José Templado
Tectus virgatus* | Unassigned | Trochoidea | 15,956 | KY205709 | MNCN/ADN: 91332 | Aqaba, Jordan | José Templado
Margarites vorticiferus* | Margaritidae | Trochoidea | 15,253 | KY205708 | NHMUK: 20110451 | Amchitka I., Cannikin, USA | Piotr Kuklinski
Gibbula umbilicaris* | Trochidae | Trochoidea | 13,269 | KY205707 | MNCN/ADN: 86692 | El Mohon, Murcia, SE Spain | José Templado §
Clanculus margaritarius† | Trochidae | Trochoidea | — | — | NHMUK 20150502 | Kitahama, Shirahama, Nishimuro-gun, Wakayama Pref., Japan | Tomo Nakano
Calliostoma zizyphinum† | Calliostomatidae | Trochoidea | — | — | NHMUK 20160315 | Shetland Islands, 60° 14.9'N, 01° 5.1'W, UK | Piotr Kuklinski

GenBank mt data
Species | Family | Superfamily | bp | GenBank Acc. No. | Reference
Chlorostoma argyrostomum | Tegulidae | Trochoidea | 17,780 | KX298892 | Lee et al., 2016
Omphalius nigerrimus | Tegulidae | Trochoidea | 17,755 | KX298895 | Lee et al., 2016
Tegula brunnea | Tegulidae | Trochoidea | 17,690 | NC_016954 | Simison, 2011 (unpublished)
Tegula lividomaculata | Tegulidae | Trochoidea | 17,375 | NC_029367 | Uribe et al., 2016
Astralium haematragum | Tegulidae | Trochoidea | 16,310 | KX298891 | Lee et al., 2016
Gibbula umbilicalis* | Trochidae | Trochoidea | 16,277 | KX646541 | Wort et al., 2017
Stomatella planulata | Trochidae | Trochoidea | 17,151 | KX298894 | Lee et al., 2016
Bolma rugosa | Turbinidae | Trochoidea | 17,432 | NC_029366 | Uribe et al., 2016
Lunella aff. cinerea | Turbinidae | Trochoidea | 17,670 | KF700096 | Williams et al., 2014
Lunella granulata | Turbinidae | Trochoidea | 17,190 | KX298890 | Lee et al., 2016
Phasianella australis* | Phasianellidae | Phasianelloidea | 18,397 | KX298888 | Lee et al., 2016
Phasianella solida | Phasianellidae | Phasianelloidea | 16,698 | KR297251 | Uribe et al., 2016
Angaria delphinus | Angariidae | Angarioidea | 19,554 | KX298893 | Lee et al., 2016
Angaria neglecta | Angariidae | Angarioidea | 19,470 | KR297248 | Uribe et al., 2016
Haliotis rubra | Haliotidae | Haliotoidea | 16,907 | NC_005940 | Maynard et al., 2005
Haliotis tuberculata | Haliotidae | Haliotoidea | 16,521 | NC_013708 | VanWormhoudt et al., 2009
Granata lyrata | Chilodontidae | Seguenzioidea | 17,632 | NC_028708 | Uribe et al., 2016
Fissurella volcano | Fissurellidae | Fissurelloidea | 17,575 | NC_016953 | Simison, 2011 (unpublished)
Diodora graeca | Fissurellidae | Fissurelloidea | 17,209 | KT207825 | Uribe et al., 2016
Variegemarginula punctata* | Fissurellidae | Fissurelloidea | 14,440 | KX298889 | Lee et al., 2016
Lepetodrilus nux* | Lepetodrilidae | Lepetodriloidea | 16,353 | LC107880 | Nakajima et al., 2016
Lepetodrilus schrolli* | Lepetodrilidae | Lepetodriloidea | 15,579 | KR297250 | Uribe et al., 2016
Chrysomallon squamiferum | Peltospiridae | Neomphaloidea | 15,388 | AP013032 | Nakagawa et al., 2014

* Nearly complete mt genomes.
† The GenBank Acc. No. of each mt gene is shown in Supplementary Data 3 along with coverage data.
§ Voucher is a different specimen from another locality.

Supplementary Data 1. Amplification strategy. Long PCR and primer walking primers

Cittarium pica (long PCR)
Primer | Sequence 5'-3' | Fragment (bp)
Citt-cox1-F | TGGTTAATTCCTCTGATATTGGGAGCTCC | cox1-rrnL (11436)
TROmt16sF | GATAACAGCGTAATCTTTCTGGAGAGATC |
TROmt16sR | AAGCTCAACAGGGTCTTCTTGTCCC | rrnL-cox3 (3439)
85CPcox3R | CATAGACACCATCTGAGATAGTTAACGG |
85CPcox3F | GAGCTTATTTTCATAGAAGTCTCGCTTC | cox3-cox1 (3580)
Citt-cox1-R | GCAGGATCAAAGAAGGATGTGTTAAAATTTC |

Margarites vorticiferus (long PCR)
Primer | Sequence 5'-3' | Fragment (bp)
Alecox3F_UJ | CTGAGCATATTTCCATAGAAGCCTGGC | cox3-cox1 (3074)
Alecox1R | CTGATCAAGTGAATAGTGGTAGGCGTTC |
Alecox1F | CTTAGTTTTCGGGATTTGAGCAGGCC | cox1-rrnS (12686)
Ale12SF | TTTAAATCCTTCCAGGGGAACCTGTCC |

Gibbula umbilicaris (long PCR)
Primer | Sequence 5'-3' | Fragment (bp)
GVcox3F | TTTCCACAGAAGACTTGCTCCTACTCC | cox3-cox1 (2867)
G2-cox1-r | AATAGAAGAAACACCYGCTAAGTGAAGGGA |
G2-cox1-F | CCGGTGCTATTACTATGCTGCTCACTGA | cox1-rrnL (9870)
TROmt16sF | GATAACAGCGTAATCTTTCTGGAGAGATC |

Tectus virgatus (long PCR)
Primer | Sequence 5'-3' | Fragment (bp)
TVcox3F | GTATTTCCACAGAAGGTTGGCTTCTGC | cox3-cox1 (3567)
TVcox1R | GAAGAGATAGCAGCAACAAAATGGCCGT |
TVcox1F | GCATTTCCGCGACTTAATAACATGAGATT | cox1-rrnL (10646)
TROmt16sF | GATAACAGCGTAATCTTTCTGGAGAGATC |

Supplementary Data 2. Mitochondrial genome features

Cittarium pica
Gene | Type | Start | Stop | Length | Start codon | Stop codon | Strand
cox1 | CDS | 1 | 1539 | 1539 | ATG | TAA | forward
cox2 | CDS | 1669 | 2364 | 696 | ATG | TAG | forward
trnD | tRNA | 2535 | 2605 | 71 | | | forward
atp8 | CDS | 2608 | 2785 | 178 | –– | TAA | forward
atp6 | CDS | 2975 | 3670 | 696 | ATG | TAA | forward
trnF | tRNA | 3730 | 3799 | 70 | | | reverse
nad5 | CDS | 3937 | 5683 | 1747 | ATG | TAA | reverse
trnH | tRNA | 5684 | 5754 | 71 | | | reverse
nad4 | CDS | 5805 | 7202 | 1398 | ATG | TAA | reverse
nad4L | CDS | 7196 | 7495 | 300 | ATG | TAG | reverse
trnT | tRNA | 7600 | 7669 | 70 | | | forward
trnS (tga) | tRNA | 7803 | 7870 | 68 | | | reverse
cob | CDS | 7902 | 9041 | 1140 | ATG | TAA | reverse
nad6 | CDS | 9231 | 9737 | 507 | ATG | TAA | reverse
trnP | tRNA | 9741 | 9810 | 70 | | | reverse
nad1 | CDS | 9953 | 10903 | 951 | ATG | TAA | reverse
trnL (taa) | tRNA | 10905 | 10972 | 68 | | | reverse
trnL (tag) | tRNA | 11281 | 11348 | 68 | | | reverse
rrnL | rRNA | 11349 | 13012 | 1664 | | | reverse
trnV | tRNA | 13013 | 13086 | 74 | | | reverse
rrnS | rRNA | 13087 | 14153 | 1067 | | | reverse
trnM | tRNA | 14154 | 14222 | 69 | | | reverse
trnY | tRNA | 14312 | 14379 | 68 | | | reverse
trnC | tRNA | 14381 | 14447 | 67 | | | reverse
trnW | tRNA | 14455 | 14521 | 67 | | | reverse
trnQ | tRNA | 14539 | 14607 | 69 | | | reverse
trnG | tRNA | 14618 | 14686 | 69 | | | forward
cox3 | CDS | 14740 | 15519 | 780 | ATG | TAA | forward
trnK | tRNA | 15718 | 15777 | 60 | | | forward
trnA | tRNA | 15778 | 15845 | 68 | | | forward
trnR | tRNA | 15906 | 15974 | 69 | | | forward
trnN | tRNA | 15989 | 16061 | 73 | | | forward
trnI | tRNA | 16109 | 16179 | 71 | | | forward
nad3 | CDS | 16184 | 16537 | 354 | ATG | TAA | forward
trnS (cgt) | tRNA | 16626 | 16692 | 67 | | | forward
nad2 | CDS | 16696 | 17949 | 1254 | ATG | T–– | forward

Margarites vorticiferus
Gene | Type | Start | Stop | Length | Start codon | Stop codon | Strand
cox1 | CDS | 2522 | 4066 | 1545 | ATG | TAA | forward
cox2 | CDS | 4117 | 4844 | 728 | ATA | TAA | forward
trnD | tRNA | 4875 | 4941 | 67 | | | forward
atp8 | CDS | 4943 | 5122 | 180 | ATG | TAG | forward
atp6 | CDS | 5200 | 5895 | 696 | ATG | TAA | forward
trnF | tRNA | 5928 | 5993 | 66 | | | reverse
nad5 | CDS | 6012 | 7765 | 1754 | ATG | TAG | reverse
trnH | tRNA | 7766 | 7831 | 66 | | | reverse
nad4 | CDS | 7894 | 9282 | 1389 | GTG | TAA | reverse
nad4L | CDS | 9276 | 9575 | 300 | ATG | TAG | reverse
trnT | tRNA | 9635 | 9705 | 71 | | | forward
trnS (tga) | tRNA | 9710 | 9776 | 67 | | | reverse
cob | CDS | 9786 | 10925 | 1140 | ATG | TAA | reverse
nad6 | CDS | 10997 | 11500 | 504 | ATG | TAA | reverse
trnM | tRNA | 11500 | 11563 | 64 | | | reverse
trnP | tRNA | 11641 | 11709 | 69 | | | reverse
nad1 | CDS | 11769 | 12716 | 948 | ATG | TAA | reverse
trnL (taa) | tRNA | 12718 | 12785 | 68 | | | reverse
trnL (tag) | tRNA | 12811 | 12878 | 68 | | | reverse
rrnL | rRNA | 12879 | 14458 | 1580 | | | reverse
trnV | tRNA | 14459 | 14527 | 69 | | | reverse
rrnS | rRNA | 14528 | 15253 | 727 | | | reverse
cox3 | CDS | 1 | 483 | 483 | –– | TAA | forward
trnK | tRNA | 520 | 578 | 59 | | | forward
trnA | tRNA | 579 | 650 | 72 | | | forward
trnR | tRNA | 690 | 751 | 62 | | | forward
trnN | tRNA | 752 | 821 | 70 | | | forward
trnI | tRNA | 828 | 894 | 67 | | | forward
nad3 | CDS | 897 | 1250 | 354 | ATG | TAG | forward
trnS (cgt) | tRNA | 1271 | 1338 | 68 | | | forward
nad2 | CDS | 1343 | 2499 | 1157 | TGT | TAA | forward

Gibbula umbilicaris
Gene | Type | Start | Stop | Length | Start codon | Stop codon | Strand
cox1 | CDS | 2419 | 3954 | 1536 | ATG | TAA | forward
cox2 | CDS | 3983 | 4675 | 693 | ATG | TAA | forward
trnD | tRNA | 4706 | 4770 | 65 | | | forward
atp8 | CDS | 4772 | 4936 | 165 | ATG | TAA | forward
atp6 | CDS | 4981 | 5679 | 699 | ATG | TAG | forward
trnF | tRNA | 5712 | 5776 | 65 | | | reverse
nad5 | CDS | 5801 | 7555 | 1755 | ATG | TAA | reverse
trnH | tRNA | 7556 | 7623 | 68 | | | reverse
nad4 | CDS | 7700 | 9088 | 1389 | ATG | TAG | reverse
nad4L | CDS | 9082 | 9381 | 300 | ATG | TAG | reverse
trnT | tRNA | 9406 | 9474 | 69 | | | reverse
trnS (tga) | tRNA | 9503 | 9569 | 67 | | | reverse
cob | CDS | 9601 | 10740 | 1140 | ATG | TAA | reverse
nad6 | CDS | 10856 | 11362 | 507 | ATG | TAA | reverse
trnP | tRNA | 11366 | 11431 | 66 | | | reverse
nad1 | CDS | 11494 | 12435 | 942 | ATG | TAG | reverse
trnL (taa) | tRNA | 12437 | 12504 | 68 | | | reverse
trnL (tag) | tRNA | 12530 | 12597 | 68 | | | reverse
rrnL | rRNA | 12598 | 13269 | 671 | | | reverse
cox3 | CDS | 1 | 479 | 479 | –– | TAA | forward
trnK | tRNA | 513 | 570 | 58 | | | forward
trnA | tRNA | 571 | 638 | 68 | | | forward
trnR | tRNA | 641 | 709 | 69 | | | forward
trnN | tRNA | 716 | 782 | 67 | | | forward
trnI | tRNA | 784 | 850 | 67 | | | forward
nad3 | CDS | 855 | 1208 | 354 | ATG | TAA | forward
trnS (cgt) | tRNA | 1218 | 1285 | 68 | | | forward
nad2 | CDS | 1289 | 2418 | 1130 | ATG | T–– | forward

Tectus virgatus
Gene | Type | Start | Stop | Length | Start codon | Stop codon | Strand
cox1 | CDS | 2979 | 4514 | 1536 | ATG | TAA | forward
cox2 | CDS | 4586 | 5276 | 691 | ATG | TAA | forward
trnD | tRNA | 5453 | 5520 | 68 | | | forward
atp8 | CDS | 5521 | 5706 | 186 | ATG | TAG | forward
atp6 | CDS | 5866 | 6561 | 696 | ATG | TAA | forward
trnF | tRNA | 6596 | 6665 | 70 | | | reverse
nad5 | CDS | 6782 | 8515 | 1734 | ATG | TAA | reverse
trnH | tRNA | 8516 | 8584 | 69 | | | reverse
nad4 | CDS | 8682 | 10079 | 1398 | ATG | TAA | reverse
nad4L | CDS | 10073 | 10372 | 300 | ATG | TAA | reverse
trnT | tRNA | 10429 | 10497 | 69 | | | forward
trnS (tga) | tRNA | 10553 | 10618 | 66 | | | reverse
cob | CDS | 10629 | 11768 | 1140 | ATG | TAA | reverse
trnQ | tRNA | 11767 | 11831 | 65 | | | reverse
nad6 | CDS | 11834 | 12340 | 507 | ATG | TAA | reverse
trnP | tRNA | 12345 | 12415 | 71 | | | reverse
nad1 | CDS | 12497 | 13447 | 951 | ATG | TAG | reverse
trnL (taa) | tRNA | 13449 | 13516 | 68 | | | reverse
trnL (tag) | tRNA | 13561 | 13628 | 68 | | | reverse
rrnL | rRNA | 13629 | 15163 | 1534 | | | reverse
trnV | tRNA | 15164 | 15231 | 67 | | | reverse
rrnS | rRNA | 15232 | 15956 | 724 | | | reverse
cox3 | CDS | 1 | 546 | 546 | –– | TAA | forward
trnK | tRNA | 734 | 791 | 58 | | | forward
trnA | tRNA | 792 | 859 | 68 | | | forward
trnR | tRNA | 916 | 985 | 70 | | | forward
trnN | tRNA | 1030 | 1098 | 69 | | | forward
trnI | tRNA | 1146 | 1215 | 70 | | | forward
nad3 | CDS | 1219 | 1572 | 354 | ATG | TAG | forward
trnS (cgt) | tRNA | 1703 | 1769 | 67 | | | forward
nad2 | CDS | 1774 | 2978 | 1205 | ATG | T–– | forward

Supplementary Data 3. Coverage based on unpaired reads for mitochondrial genes obtained from NGS shotgun sequencing and their respective GenBank accession numbers

Gene | Calliostoma zizyphinum: Mean | Minimum | Maximum | Acc. No. | Clanculus margaritarius: Mean | Minimum | Maximum | Acc. No.
COX1 | 7291.8 | 269 | 28585 | KY200866 | 12286.6 | 530 | 32082 | KY200867
COX2 | 9657.6 | 2342 | 26004 | KY200869 | 48166.1 | 7545 | 112145 | KY200868
COX3 | 5034.7 | 195 | 11993 | KY200871 | 6472.2 | 256 | 10505 | KY200870
CYTB | 2744.7 | 127 | 5458 | KY200872 | 13055.9 | 671 | 52237 | KY200873
ND1 | 953.8 | 16 | 1757 | KY200875 | 3740.5 | 1521 | 6799 | KY200874
ND2 | 162 | 60 | 6810 | KY200876 | 10042.5 | 652 | 25779 | KY200877
ND3 | 5823.2 | 176 | 9325 | KY200879 | 4870.9 | 283 | 7895 | KY200878
ND4 | 1130.2 | 76 | 2404 | KY200880 | 4858 | 1107 | 8186 | KY200881
ND4L | 600.6 | 19 | 941 | KY200883 | 3175.6 | 508 | 4804 | KY200882
ND5 | 460 | 131 | 1076 | KY200885 | 3018.9 | 525 | 11168 | KY200884
ND6 | 972.4 | 105 | 1754 | KY200887 | 15338.5 | 1433 | 31569 | KY200886
ATP6 | 2486.5 | 122 | 3949 | KY200888 | 9141.6 | 1037 | 20583 | KY200889
ATP8 | 27.9 | 18 | 34 | KY200891 | 20252.5 | 8564 | 24420 | KY200890
12S | 6258 | 20 | 14620 | KY200892 | 8161.1 | 408 | 15373 | KY200893
16S | 6608 | 1296 | 13496 | KY200894 | 21247.6 | 1492 | 65018 | KY200895