Genome
A complement to DNA barcoding reference library for identification of fish from the Northeast Pacific
Journal: Genome
Manuscript ID gen-2020-0192.R1
Manuscript Type: Article
Date Submitted by the Author: 18-Mar-2021
Complete List of Authors: Turanov, Sergei; A.V. Zhirmunsky National Scientific Center of Marine Biology, Laboratory of Molecular Systematic; Far Eastern State Technical Fisheries University, Chair of Water Biological Resources and Aquaculture
Kartavtsev, Yuri; A.V. Zhirmunsky Institute of Marine Biology FEB RAS, Lab of Molecular Systematics
Keyword: barcoding gap, Enophrys, Albatrossia, Coryphaenoides, deep sea fish
Is the invited manuscript for consideration in a Special Issue?: Not applicable (regular submission)
© The Author(s) or their Institution(s) Page 1 of 23 Genome
A complement to DNA barcoding reference library for identification of fish from the
Northeast Pacific

Sergei V. Turanov1,2,*, Yuri Ph. Kartavtsev1

1Laboratory of Molecular Systematic, A.V. Zhirmunsky National Scientific Center of Marine
Biology, Far Eastern Branch, Russian Academy of Sciences, 690041 Vladivostok, Russia
2Chair of Water Biological Resources and Aquaculture, Far Eastern State Technical Fisheries
University, 690087 Vladivostok, Russia
*Corresponding author; [email protected]; 17, Palchevsky St., Vladivostok 690041, Russia
Abstract
The seas of the North Pacific Ocean are characterized by a large variety of fish fauna,
including endemic species. Molecular genetic methods, often based on DNA barcoding approaches,
have recently been used to determine species boundaries and identify cryptic diversity within these
species. This study complements the DNA barcode library of fish from the Northeast Pacific area.
A library based on 154 sequences of the mitochondrial COI gene from 44 species was assembled
and analyzed. It was found that 39 species (89%) can be unambiguously identified by the clear
thresholds forming a barcoding gap. Deviations from the standard 2% threshold value resulted in
detection of the species Enophrys lucasi in the sample, which is not typical for the eastern part of
the Bering Sea. This barcoding gap also made it possible to identify naturally occurring low values
of interspecific divergence of eulittoral taxa Aspidophoroides and the deep-sea genus
Coryphaenoides. Synonymy of the genus Albatrossia in favor of the genus Coryphaenoides is
suggested based on both the original and previously published data.
Keywords: COI; barcoding gap; Enophrys; Albatrossia; Coryphaenoides; deep sea fish.

1. Introduction
The seas of the North Pacific Ocean are characterized by rich biotopes and contain a wide
variety of endemic hydrobionts that have attracted the attention of taxonomists of different
specializations. Data from recent studies applying integrated approaches (Turanov and Kartavtsev
2014; Turanov et al. 2016; Moreva et al. 2017; Skurikhina et al. 2018; Smé et al. 2019; Turanov
2019; Chernyshev 2020; Jung et al. 2020; Stonik and Efimova 2020; Skriptsova and Kalita 2020)
show that the species diversity of fish and other aquatic organisms in this region is underestimated.
Molecular genetic approaches are among the leading supporting tools in
biodiversity studies.
Molecular genetic techniques not only help to document existing diversity (Hebert et al.
2003a, 2003b) and discover cryptic species (Bickford et al. 2007; Hubert and Hanner 2015), but
they can also be used to identify taxonomic discrepancies or cases of intentional substitution in the
commercial distribution of fish and shellfish (Galimberti et al. 2013; Khaksar et al. 2015;
Nedunoori et al. 2017). The comprehensive nature of molecular genetic methods enables the
development of solutions to identify single species (Pfleger et al. 2016; Schenekar et al. 2020b;
Yusishen et al. 2020), as well as the use of rapid analysis technologies to monitor species diversity
(Lecaudey et al. 2019; Belevich et al. 2020; Schenekar et al. 2020a). However, the development of
unified methods has been hindered by the lack of verified DNA barcode databases of living
organisms in the regions of the world where such methods could be implemented (McGee et al. 2019;
Weigand et al. 2019; Schenekar et al. 2020a).
DNA barcoding was originally developed to facilitate taking inventory of the entire species
diversity (Hebert et al. 2003a, 2003b), but has now evolved into a global initiative to ensure the
speed and quality of monitoring and conservation measures (DeSalle and Goldstein 2019). There
are still limitations related to both the methodology and conceptual issues of evolutionary biology
and species definition (Meyer and Paulay 2005; Krishnamurthy and Francis 2012; Collins and
Cruickshank 2013; DeSalle and Goldstein 2019), but DNA barcoding is nevertheless
extremely useful. Further development is still required, especially for studying the biotic diversity
of the Northeast Pacific Ocean. Previously, this approach has been shown to be reliable for
detecting cryptic species diversity, and limitations have been identified regarding the applicability
of a strict threshold for many perch-like fish species (Turanov et al. 2016) from this region. This
paper provides an update to the reference barcode database of fish from the Far Eastern seas of
Russia, with taxonomic comments.
2. Material and methods
Fish specimens were collected using gillnets (Sea of Japan) and bottom trawls (Sea of
Okhotsk and Bering Sea) during the period from 2007 to 2011 (Fig. 1). Species identification was
conducted according to the most commonly used identification keys for the area (Lindberg and
Krasyukova 1987; Nakabo 2002) and subsequently adjusted to the current nomenclature (Parin et al.
2014; Fricke et al. 2019). The sample consisted of 154 specimens representing 44 species from 33
genera, 15 families and 6 orders. Each species was represented by one to five specimens. Voucher
specimens of the fish investigated are kept under corresponding numbers in the museum of the
NSCMB FEB RAS (Supplement S1, file gen-2020-0192.R1suppla). A piece of skeletal muscle tissue was taken from each
specimen and stored in 95% ethanol. Total DNA was isolated from this tissue using a K-Sorb
commercial kit (Syntol, Moscow).
The samples were then genotyped for a fragment of the mitochondrial COI gene using a
cocktail of universal C_FishF1t1–C_FishR1t1 primers (Ivanova et al. 2007). The PCR reaction
mixture (total volume 25 µl) included 1 µl of total DNA solution (20–150 ng), 5 µl of ready-made
PCR mixture ScreenMix (Eurogen, Moscow), 0.4 mM of primer solution, and deionized water up to
the final volume. The thermal cycling conditions consisted of preheating at 94 °C for 2 min, followed by 30
cycles of denaturation at 94 °C for 40 s, annealing at 52 °C for 40 s,
and elongation at 72 °C for 1 min, with a final elongation for 10 min. To evaluate the PCR results,
electrophoresis of amplicons was performed in 1% agarose gel stained with ethidium bromide and
visualized under UV light. The amplified COI fragments were purified by alcohol precipitation and
then sequenced with appropriate primers (Ivanova et al. 2007) using the BrightDye™ Terminator
Cycle Sequencing Kit v3.1 (NimaGen). Capillary electrophoresis of the fragments was performed
on an ABI Prism 3130 Genetic Analyzer (Applied Biosystems, USA). The
consensus sequences from the obtained chromatograms were assembled using Geneious software
(Kearse et al. 2012). Sequence alignment and subsequent correction of the reading frame (if
necessary) were performed in MEGA 7 (Kumar et al. 2016) using the MUSCLE algorithm (Edgar
2004). During the alignment, the closest matches from the output data of BLAST (Altschul et al.
1990) in GenBank (Benson et al. 2018) were used as reference sequences. The sequences with all
the necessary information and photographs showing live coloration were placed in BOLD
(Ratnasingham and Hebert 2007, 2013) in a project called FFES and uploaded to GenBank
(Supplement S1, file gen-2020-0192.R1suppla). The genetic distances (p-distances) as well as their corrected values according to
the two-parameter Kimura model (K2P; Kimura 1980) were calculated using the BOLD workbench. The
upper conditional threshold value of intraspecific genetic distances was assumed to be the minimum
value of distances within a genus between different species. The BarcodingR package (Zhang et al.
2017) was used to calculate and plot a graph reflecting the barcoding gap, i.e. the presence of a
threshold between intraspecific and interspecific genetic distances (Meyer and Paulay 2005; Meier
et al. 2006, 2008). We also used the BIN (Barcode Index Number; Ratnasingham and Hebert 2013)
discordance report provided by the BOLD workbench. To test the assumption of
genetic differentiation between species with extraordinarily low interspecific genetic distances, we
used the gene flow Fst indices with a permutation test based on 10,000 replicates in DnaSP 5
(Librado and Rozas 2009). In addition to the distance-based criteria for species delimitation, we
used a topological approach (i.e., construction of phylogenetic trees and identification of
monophyletic clusters corresponding to species groups pre-defined by morphological features). For
this purpose, a neighbor-joining (NJ) tree was constructed in MEGA 7 using the K2P
model, based on available sequences. The robustness of the tree topology was estimated based on
1,000 pseudo-replicates of the non-parametric bootstrap test. The Bayesian inference (BI) topology
was also inferred in MrBayes 3.2.7 (Ronquist et al. 2012). The simultaneous selection
of the optimal model of nucleotide substitutions among those implemented in MrBayes and the
partition scheme, incorporating the codon positions, was performed based on AIC in
PartitionFinder 2.0 (Guindon et al. 2010; Lanfear et al. 2012, 2014). According to the scheme
proposed by PartitionFinder, the first and second positions of the codons together made up a
partition that is separate from the third position. The optimal model for the first and second
positions of the codons in the matrix was GTR+G+I, while for the third position it was GTR+G. For
the first partition, a model with six substitution parameters was set, taking into account the
proportion of invariable sites (I), as well as Γ-distribution of rate variation among sites
(nst=6, pinvar=est, rates=invgamma). An equivalent model was applied to the second partition,
excluding the proportion of invariant sites. The parameters for the different partitions were
unlinked. The search for tree topology and marginal values of posterior probability was carried out by
running four Markov chains for 1,000,000 generations. The chains were sampled by the Metropolis
algorithm from the probability distribution once every 100 generations. The first 25% of trees,
corresponding to the burn-in step, were discarded. A consensus tree was generated based on the
remaining 15,002 trees. The convergence indices (ESS, PSRF) indicated sufficient sampling of
all parameters and a sufficient number of generations.
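The p-distances and their K2P-corrected values described above were obtained from the BOLD workbench. The following Python sketch is not that implementation; it only illustrates the two quantities for a single pair of aligned sequences. With P the proportion of transitions and Q the proportion of transversions, the p-distance is P + Q and the K2P distance is -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)] (Kimura 1980). The function name `p_and_k2p`, the toy sequences and the pairwise deletion of gaps and ambiguous bases are assumptions made for illustration.

```python
import math

# Purine<->purine and pyrimidine<->pyrimidine changes are transitions;
# every other mismatch between unambiguous bases is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def p_and_k2p(seq1, seq2):
    """Return (p-distance, K2P distance) for two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        n += 1
        if a != b:
            if (a, b) in TRANSITIONS:
                ts += 1
            else:
                tv += 1
    if n == 0:
        raise ValueError("no comparable sites")
    P, Q = ts / n, tv / n
    p_dist = P + Q
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined (argument <= 0).
    k2p = -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))
    return p_dist, k2p


# Toy example (hypothetical sequences, not data from this study):
# one transition (C -> T) over ten compared sites.
p, d = p_and_k2p("ACGTACGTAC", "ACGTACGTAT")
print(f"p-distance = {p:.4f}, K2P = {d:.4f}")
```

For closely related sequences the two values are nearly identical; the correction grows as divergence increases.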
Ethics approval
Collection of specimens was conducted during a commercial fishing trip in accordance with
all applicable laws and the specimens were delivered to the authors in frozen form.
3. Results
The phylogenetic NJ-tree (Fig. 3 and 4) demonstrates high or fairly reliable support for
species and genus clusters with the formation of 45 monophyletic groups (including species
represented by one sequence). The species Enophrys diceraus is divided into two distinct clusters. The
only genus forming paraphyletic lineages is Coryphaenoides. At the same time, the phylogenetic
relationships cannot be considered at levels above the genus, due to the low information capacity
of the short COI sequences in combination with the construction method. This is reflected in the
polyphyly among the families Cottidae and Agonidae of the order Scorpaeniformes. The BI-tree
topology (Supplement S2, file gen-2020-0192.R1supplb) provides similar resolution to that of the NJ analysis of species and
genus branches. The main exception is that C. acrolepis forms a stem-group in relation to A.
pectoralis, whereas in the NJ-topology there is a clear bifurcation between sequences of the species
with shallow divergence (Fig. 3). The same is true for the pair of species A. bartoni and A. olrikii,
respectively (see Fig. 4 and Supplement S2). In general, the BI-tree topology is more stable and
contains only one polytomous node indicating the uncertainty of the position of Osmeriformes
order relative to the other taxa. The families Cottidae and Agonidae on this tree are represented by
monophyletic clades.
Only one of the 154 sequences in the present dataset (B. nigripinnis, FFES016-18) was not
barcode compliant and hence was not included in any BIN. The remaining sequences were
assigned to 44 BINs. Among these, 33 BINs (75%, 129 sequences) were concordant, 2 BINs
(4.5%) qualified as discordant, and 9 BINs (20.5%) were singletons. One of the
discordant BINs was reported due to a genus-level conflict, which came from comparison of A.
pectoralis and C. acrolepis sharing a single BIN (BOLD:AAC7497). The other discordance was
caused by a species-level conflict between A. olrikii and A. bartoni, which share a common BIN
(BOLD:AAA9928). Interestingly, sequences of E. diceraus were split into a pair of valid BINs,
BOLD:AAJ0725 and BOLD:AAE3573, allocated among singletons and concordant BINs,
respectively.
Gene flow estimates calculated among phylogroups of A. pectoralis, C. acrolepis and C.
cinereus revealed a pairwise Fst value of 0.75 between A. pectoralis and C. acrolepis, whereas the values for C.
cinereus against A. pectoralis and C. acrolepis were 0.99 and 0.96, respectively. These results
imply that these phylogroups have become significantly differentiated and can be considered as
separate species. Similar results were found for the pair A. olrikii and A. bartoni, for which the
pairwise Fst value was close to 0.77. In both cases, the groups compared had no shared haplotypes.
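The Fst values above were computed in DnaSP, which offers several estimators; the text does not state which one was used. As an illustration only, the Python sketch below assumes the sequence-based estimator Fst = 1 - Hw/Hb (mean within-group over mean between-group pairwise differences) and a label-shuffling permutation test. All function names and the toy haplotypes are hypothetical and are not data from this study.

```python
import random
from itertools import combinations


def n_diff(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Mean pairwise difference among sequences of one group."""
    if len(pop) < 2:
        raise ValueError("each group needs at least two sequences")
    pairs = list(combinations(pop, 2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop1, pop2):
    """Mean pairwise difference between sequences of two groups."""
    total = sum(n_diff(a, b) for a in pop1 for b in pop2)
    return total / (len(pop1) * len(pop2))


def hudson_fst(pop1, pop2):
    """Fst = 1 - Hw/Hb (assumed estimator; DnaSP may use another)."""
    hw = (mean_within(pop1) + mean_within(pop2)) / 2
    hb = mean_between(pop1, pop2)
    return 1 - hw / hb if hb > 0 else 0.0


def permutation_p(pop1, pop2, replicates=10_000, seed=1):
    """Observed Fst and the share of label shuffles with Fst >= observed."""
    rng = random.Random(seed)
    observed = hudson_fst(pop1, pop2)
    pooled = list(pop1) + list(pop2)
    hits = 0
    for _ in range(replicates):
        rng.shuffle(pooled)
        if hudson_fst(pooled[:len(pop1)], pooled[len(pop1):]) >= observed:
            hits += 1
    # +1 in numerator and denominator avoids reporting p = 0
    return observed, (hits + 1) / (replicates + 1)


# Toy haplotypes (hypothetical): two groups with no shared haplotypes.
pop_a = ["AAAAAAAAAA", "AAAAAAAAAT", "AAAAAAAAAA"]
pop_b = ["GGGGGAAAAA", "GGGGGAAAAT", "GGGGGAAAAA"]
fst, p_value = permutation_p(pop_a, pop_b, replicates=1000)
print(f"Fst = {fst:.3f}, permutation p = {p_value:.3f}")
```

With only three sequences per group the permutation test has very little power; the toy data merely show how a high Fst arises when between-group differences greatly exceed within-group differences.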
4. Discussion
This study describes the nucleotide variability of the mitochondrial COI gene from 44
marine fish species collected in the Far Eastern seas. This variability can be used to determine the
applicability of these sequences for identifying species, thus contributing to the global reference
barcode database of fish from the Northeast Pacific (Steinke et al. 2009; Mecklenburg et al. 2011;
Zhang and Hanner 2011; Wang et al. 2012; Turanov et al. 2016, 2019; Kartavtsev et al. 2016).
Our results showed that 39 species (89%) can be unambiguously identified to the species
level, and that the data obtained do not show any discrepancies between molecular genetic criteria
and morphological features. For the remaining 5 species (11%), COI sequences exhibit some
deviations at the species level according to either distance-based or topological criteria. When the
complete dataset was subjected to distribution analysis of intra- and interspecific genetic distances,
the results indicated the absence of a barcoding gap (Fig. 2A). The overlap of genetic distance
values had two causes: low divergence at the interspecific level, and the presence of an
additional species-level phylogroup resulting in high intraspecific values. Deviations of this kind are
the most common when DNA barcode libraries are analyzed (Hubert and Hanner 2015), if the
conventional threshold for distinguishing between intra- and interspecific variability is set at 2%
(Ward 2009). Establishing a universal threshold for delimitation is advantageous in rapid
assessments of species boundaries for the majority of known taxa (Ratnasingham and Hebert 2013).
In addition, generalizations about the evolution of mitochondrial fragments indicate that
intraspecific variability caused by synonymous substitutions due to evolutionary features of the
vertebrate mitochondrial genome usually does not exceed 0.5% (Stoeckle and Thaler 2018).
However, this measure should not limit the distance criteria for species delineation, as the strict
threshold approach is not applicable to all species (DeSalle and Goldstein 2019) and may
misrepresent their natural boundaries (Meyer and Paulay 2005; Bagley et al. 2019). For example,
the proportion of cases of poly- and paraphyly in taxonomic studies involving mitochondrial
molecular genetic data can reach 23% (Funk and Omland 2003). These numbers are
comparable with earlier results obtained for other taxonomic groups (Turanov et al. 2016). In this
paper, the relatively low number of deviations seems to be a result of a high proportion of
singletons (20.5%) and genera represented in the sample by a single species (38.6%).
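The distinction drawn above between a barcoding gap and a fixed 2% threshold can be sketched in a few lines of Python. This is a minimal illustration, not the BarcodingR or BOLD procedure: a gap is taken to exist when the largest intraspecific distance is smaller than the smallest interspecific distance. The function names, specimen labels and distance values are hypothetical.

```python
def barcoding_gap(distances, species_of):
    """Summarize intra- vs interspecific distances.

    distances:  {(specimen_a, specimen_b): genetic distance}
    species_of: {specimen: species name}
    """
    intra, inter = [], []
    for (a, b), d in distances.items():
        if species_of[a] == species_of[b]:
            intra.append(d)
        else:
            inter.append(d)
    if not intra or not inter:
        raise ValueError("need at least one intra- and one interspecific pair")
    max_intra, min_inter = max(intra), min(inter)
    return {"max_intra": max_intra,
            "min_inter": min_inter,
            "gap": max_intra < min_inter}


def lumped_by_fixed_threshold(distances, species_of, threshold=0.02):
    """Interspecific pairs that a fixed threshold would treat as one species."""
    return [pair for pair, d in distances.items()
            if species_of[pair[0]] != species_of[pair[1]] and d < threshold]


# Toy data (hypothetical values): species A and B are closely related.
toy_species = {"s1": "A", "s2": "A", "s3": "B", "s4": "B", "s5": "C"}
toy_distances = {
    ("s1", "s2"): 0.002, ("s3", "s4"): 0.004,   # intraspecific
    ("s1", "s3"): 0.015, ("s1", "s4"): 0.016,   # A vs B, below 2%
    ("s2", "s3"): 0.015, ("s2", "s4"): 0.017,
    ("s1", "s5"): 0.120, ("s3", "s5"): 0.130,   # vs distant species C
}

print(barcoding_gap(toy_distances, toy_species))
print(lumped_by_fixed_threshold(toy_distances, toy_species))

# One deep intraspecific pair is enough to close the gap.
with_deep_lineage = dict(toy_distances)
with_deep_lineage[("s3", "s4")] = 0.029
print(barcoding_gap(with_deep_lineage, toy_species))
```

In the first toy set a gap exists even though the A and B pairs fall below 2%, so a fixed threshold would lump them; in the second, a single divergent intraspecific pair removes the gap. These correspond to the two causes of overlap named in the text.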
The barcoding gap becomes clearly marked in a reduced data set that excludes the
sequences of the five deviant species (Fig. 2B). The specific features of their variability require
additional discussion. The E. diceraus cluster includes two BINs of the genus Enophrys
(Fig. 4). This genus has four valid species, distributed in the North Pacific Ocean from the Sea of
Japan in the west to the coastal waters of southern California in the east (Parin et al. 2014; Pietsch
and Orr 2015; Mecklenburg et al. 2018; Burton and Lea 2019). A recent study found cryptic
diversity within E. diceraus, such that COI gene sequences of individuals from the Sea
of Japan and Sea of Okhotsk were separated by a divergence of 2.89% (Moreva et al. 2017).
Moreover, the COI gene tree topology placed another species, E. lucasi, between these phylogroups
(ibid., Fig. 3). This species is common in the western part of the Bering Sea, along the Aleutian
Islands and in the Gulf of Alaska. Recent data on the differentiation of the species E. diceraus and E.
lucasi indicate that they are remarkably similar in morphological features, but clearly differ based
on the divergence of COI sequences (Mecklenburg et al. 2011). BIN BOLD:AAJ0725 from our
samples belongs to E. lucasi. Hence, this deviation (high genetic distance) is caused by the
erroneous identification of a single specimen of a species that is rare in the given region (Parin et al. 2014).
A pair of species, A. bartoni and A. olrikii, which share a common BIN, demonstrate a
mutual divergence value of 0.015 and clear clustering according to species affinity (Fig. 4). Their
common BIN profile in BOLD also shows a clear bimodal distribution of genetic distance values,
indicating that the BIN comprises two species. In this case, the clear threshold value
adopted by BOLD does not reflect the natural species boundaries of taxa, and the topological
criterion of identity is seen as more reliable. This also appears to be true for the cluster of taxa
Albatrossia and Coryphaenoides (Fig. 3). The genus Coryphaenoides includes many species
adapted to life in deep waters, and is extremely widely distributed (Iwamoto and Stein 1974;
Iwamoto and Sazonov 1988; Cohen et al. 1990; Parin et al. 2014). The validity of the genus
Albatrossia is controversial. Some authors consider it valid (Iwamoto and Sazonov 1988; Cohen et
al. 1990; Parin et al. 2014), while others believe it is a member of the genus Coryphaenoides
(Iwamoto and Stein 1974). It is noteworthy that in the only work examining the molecular
phylogenetic relationships of different representatives of the genus Coryphaenoides, A. pectoralis is
considered to be a representative of this genus; the topological position of this species in the
corresponding reconstruction precludes any other interpretation (see Fig. 5 in Morita 1999). Our
data (Fig. 3) fully support this conclusion and suggest that the genus Albatrossia is a synonym of
the genus Coryphaenoides.
When forming reference nucleotide sequence sets, one of which is presented in this
paper, special attention should be paid to the nucleotide divergence patterns of those species
whose natural boundaries of variability cannot be described by a simple threshold value when
the barcoding gap is set. Carefully analyzed and curated in this way, the library will provide the
basis for noninvasive approaches to biodiversity monitoring using eDNA (Valentini et al. 2016).
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal
relationships that could influence the work reported in this paper.
Acknowledgements
This research was partially supported by a Grant of the President of the Russian Federation
(MK-305.2019.4), Far Eastern Branch of the Russian Academy of Sciences in the framework of the
Federal Program of Base Research (18-4-040) and Ministry of Science and Higher Education of the
Russian Federation (agreement number 075-15-2020-796, grant number 13.1902.21.0012).
References
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment
search tool. J. Mol. Biol. 215(3): 403–410. doi:10.1016/S0022-2836(05)80360-2.
Bagley, J.C., de Aquino, P.D.P.U., Breitman, M.F., Langeani, F., and Colli, G.R. 2019. DNA
barcode and minibarcode identification of freshwater fishes from Cerrado headwater streams
in Central Brazil. J. Fish Biol. doi:10.1111/jfb.14098.
Belevich, T.A., Ilyash, L.V., Milyutina, I.A., Logacheva, M.D., and Troitsky, A.V. 2020.
Photosynthetic Picoeukaryotes Diversity in the Underlying Ice Waters of the White Sea,
Russia. Diversity 12(3): 93. Multidisciplinary Digital Publishing Institute.
doi:10.3390/d12030093.
Benson, D.A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Ostell, J., Pruitt, K.D., and Sayers,
E.W. 2018. GenBank. Nucleic Acids Res. 46(D1): D41–D47. doi:10.1093/nar/gkx1094.
Bickford, D., Lohman, D.J., Sodhi, N.S., Ng, P.K.L., Meier, R., Winker, K., Ingram, K.K., and Das,
I. 2007. Cryptic species as a window on diversity and conservation. Trends Ecol. Evol. 22(3):
148–155. doi:10.1016/j.tree.2006.11.004.
Burton, E.J., and Lea, R.N. 2019. Annotated checklist of fishes from Monterey Bay National Marine
Sanctuary with notes on extralimital species. ZooKeys. doi:10.3897/zookeys.887.38024.
Chernyshev, A.V. 2020. Nemerteans from the Far Eastern Seas of Russia. Russ. J. Mar. Biol. 46(3):
141–153. doi:10.1134/S1063074020030049.
Cohen, D.M., Inada, T., Iwamoto, T., and Scialabba, N. 1990. FAO species catalogue. Vol. 10.
Gadiform fishes of the world (Order Gadiformes). An annotated and illustrated catalogue of
cods, hakes, grenadiers and other gadiform fishes known to date.
Collins, R.A., and Cruickshank, R.H. 2013. The seven deadly sins of DNA barcoding. Mol. Ecol.
Resour. 13(6): 969–975. doi:10.1111/1755-0998.12046.
DeSalle, R., and Goldstein, P. 2019. Review and Interpretation of Trends in DNA Barcoding. Front.
Ecol. Evol. doi:10.3389/fevo.2019.00302.
Edgar, R.C. 2004. MUSCLE: Multiple sequence alignment with high accuracy and high throughput.
Nucleic Acids Res. 32(5): 1792–1797. doi:10.1093/nar/gkh340.
Fricke, R., Eschmeyer, W.N., and van der Laan, R. 2019. Catalog of fishes: Genera, species,
references. Inst. Biodivers. Sci. Sustain. Calif. Acad. Sci. [accessed 1 Febr. 2018].
Funk, D.J., and Omland, K.E. 2003. Species-level paraphyly and polyphyly: frequency, causes, and
consequences, with insights from animal mitochondrial DNA. Annu. Rev. Ecol. Evol. Syst.
34(1): 397–423. Annual Reviews.
Galimberti, A., De Mattia, F., Losa, A., Bruni, I., Federici, S., Casiraghi, M., Martellos, S., and
Labra, M. 2013. DNA barcoding as a new tool for food traceability. Food Res. Int. 50(1): 55–
63. doi:10.1016/j.foodres.2012.09.036.
Guindon, S., Dufayard, J.F., Lefort, V., Anisimova, M., Hordijk, W., and Gascuel, O. 2010. New
algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the
performance of PhyML 3.0. Syst. Biol. doi:10.1093/sysbio/syq010.
Harrington, B. 2004–2005. Inkscape. http://www.inkscape.org.
Hebert, P.D.N., Cywinska, A., Ball, S.L., and deWaard, J.R. 2003a. Biological identifications
through DNA barcodes. Proc. Biol. Sci. 270(1512): 313–321.
Hebert, P.D.N., Ratnasingham, S., and deWaard, J.R. 2003b. Barcoding animal life: cytochrome c
oxidase subunit 1 divergences among closely related species. Proc. Biol. Sci. 270 Suppl: S96–S99.
Hubert, N., and Hanner, R. 2015. DNA Barcoding, species delineation and taxonomy: a historical
perspective. DNA Barcodes 3(1): 44–58.
Ivanova, N.V., Zemlak, T.S., Hanner, R.H., and Hebert, P.D.N. 2007. Universal primer cocktails for
fish DNA barcoding. Mol. Ecol. Notes 7(4): 544–548.
Iwamoto, T., and Sazonov, Y.I. 1988. A review of the southeastern Pacific Coryphaenoides (sensu
lato) (Pisces, Gadiformes, Macrouridae). Proc. Calif. Acad. Sci.
Iwamoto, T., and Stein, D.L. 1974. A systematic review of the rattail fishes (Macrouridae:
Gadiformes) from Oregon and adjacent waters. Occas. Pap. Calif. Acad. Sci.
doi:10.5962/bhl.part.15932.
Jung, D.-W., Gosliner, T.M., Choi, T.-J., Kil, H.-J., Chichvarkhin, A., Goddard, J.H.R., and Valdés,
Á. 2020. The return of the clown: pseudocryptic speciation in the North Pacific clown
nudibranch, Triopha catalinae (Cooper, 1863) sensu lato identified by integrative taxonomic
approaches. Mar. Biodivers. 50(5): 84. doi:10.1007/s12526-020-01107-2.
Kartavtsev, Y.P., Rozhkovan, K.V., and Masalkova, N.A. 2016. Phylogeny based on two mtDNA
genes (Co-1, Cyt-B) among Sculpins (Scorpaeniformes, Cottidae) and some other scorpionfish
in the Russian Far East. Mitochondrial DNA Part A 27(3): 2225–2240. Taylor & Francis.
doi:10.3109/19401736.2014.984164.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper,
A., Markowitz, S., and Duran, C. 2012. Geneious Basic: an integrated and extendable desktop
software platform for the organization and analysis of sequence data. Bioinformatics 28(12):
1647–1649. Oxford University Press.
Khaksar, R., Carlson, T., Schaffner, D.W., Ghorashi, M., Best, D., Jandhyala, S., Traverso, J., and
Amini, S. 2015. Unmasking seafood mislabeling in U.S. markets: DNA barcoding as a unique
technology for food authentication and quality control. Food Control 56: 71–76.
doi:10.1016/j.foodcont.2015.03.007.
Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through
comparative studies of nucleotide sequences. J. Mol. Evol. 16(2): 111–120.
Krishnamurthy, P.K., and Francis, R.A. 2012. A critical review on the utility of DNA barcoding in
biodiversity conservation. Biodivers. Conserv. 21(8): 1901–1919. Springer.
Kumar, S., Stecher, G., and Tamura, K. 2016. MEGA7: molecular evolutionary genetics analysis
version 7.0 for bigger datasets. Mol. Biol. Evol. 33(7): 1870–1874. Society for Molecular
Biology and Evolution.
Lanfear, R., Calcott, B., Ho, S.Y.W., and Guindon, S. 2012. PartitionFinder: combined selection of
partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29(6):
1695–1701. Oxford University Press.
Lanfear, R., Calcott, B., Kainer, D., Mayer, C., and Stamatakis, A. 2014. Selecting optimal
partitioning schemes for phylogenomic datasets. BMC Evol. Biol. 14(1): 82. BioMed Central.
Lecaudey, L.A., Schletterer, M., Kuzovlev, V.V., Hahn, C., and Weiss, S.J. 2019. Fish diversity
assessment in the headwaters of the Volga River using environmental DNA metabarcoding.
Aquat. Conserv. Mar. Freshw. Ecosyst. 29(10): 1785–1800. John Wiley & Sons, Ltd.
doi:10.1002/aqc.3163.
Librado, P., and Rozas, J. 2009. DnaSP v5: a software for comprehensive analysis of DNA
polymorphism data. Bioinformatics 25(11): 1451–1452. Oxford University Press.
Lindberg, G.U., and Krasyukova, Z.V. 1987. Fishes of the Sea of Japan and adjacent parts of the
Sea of Okhotsk and the Yellow Sea. Part 5. Teleostomi. Osteichthyes. Actinopterygii. XXX.
Scorpeniformes. (CLXXVI. Fam. Scorpaenidae – CXCIV. Fam. Liparididae). Nauka,
Leningrad.
McGee, K.M., Robinson, C.V., and Hajibabaei, M. 2019. Gaps in DNA-Based Biomonitoring
Across the Globe. Front. Ecol. Evol. doi:10.3389/fevo.2019.00337.
Mecklenburg, C.W., Lynghammer, A., Johannesen, E., Byrkjedal, I., Christiansen, J.S., Dolgov,
A.V., Karamushko, O.V., Mecklenburg, T.A., Møller, P.R., Steinke, D., and Wienerroither, R.M.
2018. Marine fishes of the Arctic region. In Conservation of Arctic Flora and Fauna.
Mecklenburg, C.W., Møller, P.R., and Steinke, D. 2011. Biodiversity of arctic marine fishes:
taxonomy and zoogeography.
Meier, R., Shiyang, K., Vaidya, G., and Ng, P.K.L. 2006. DNA barcoding and taxonomy in Diptera:
A tale of high intraspecific variability and low identification success. Syst. Biol.
doi:10.1080/10635150600969864.
Meier, R., Zhang, G., and Ali, F. 2008. The use of mean instead of smallest interspecific distances
exaggerates the size of the “barcoding gap” and leads to misidentification. Syst. Biol.
doi:10.1080/10635150802406343.
Meyer, C.P., and Paulay, G. 2005. DNA barcoding: Error rates based on comprehensive sampling.
PLoS Biol. 3(12): 1–10. doi:10.1371/journal.pbio.0030422.
Moreva, I., Radchenko, O., Petrovskaya, A., and Borisenko, S. 2017. Molecular genetic and
karyological analysis of antlered sculpins of Enophrys diceraus group (Cottidae). Russ. J.
Genet. 53(97): 1030–1041.
Morita, T. 1999. Molecular Phylogenetic Relationships of the Deep-Sea Fish Genus
Coryphaenoides (Gadiformes: Macrouridae) Based on Mitochondrial DNA. Mol. Phylogenet.
Evol. doi:10.1006/mpev.1999.0661.
346 Nakabo, T. 2002. Fishes of Japan: with pictorial keys to the species. Tokai University Press.
347 Nedunoori, A., Turanov, S. V., and Kartavtsev, Y.P. 2017. Fish product mislabeling identified in
348 the Russian far east using DNA barcoding. Gene Reports 8: 144–149.
349 doi:10.1016/j.genrep.2017.07.006.
350 Parin, N.V., Evseenko, S.A., and Vasil’eva, E.D. 2014. Fishes of the Russian Seas: Annotated
351 Catalogue. KMK Scientific Press, Moscow.
352 Pfleger, M.O., Rider, S.J., Johnston, C.E., and Janosik, A.M. 2016. Saving the doomed: Using
353 eDNA to aid in detection of rare sturgeon for conservation (Acipenseridae). Glob. Ecol.
354 Conserv. 8: 99–107. doi:10.1016/j.gecco.2016.08.008.
355 Pietsch, T.W., and Orr, J.W. 2015. Fishes of the Salish Sea: A compilation and distributional
356 analysis. NOAA Prof. Pap. NMFS 18.
357 Ratnasingham, S., and Hebert, P.D.N. 2007. The Barcode of Life Data System
358 (www.barcodinglife.org). Mol. Ecol. Notes.
359 Ratnasingham, S., and Hebert, P.D.N. 2013. A DNA-Based Registry for All Animal Species: The
360 Barcode Index Number (BIN) System. PLoS One 8(7).
361 Ronquist, F., Teslenko, M., Van Der Mark, P., Ayres, D.L., Darling, A., Höhna, S., Larget, B., Liu,
362 L., Suchard, M.A., and Huelsenbeck, J.P. 2012. Mrbayes 3.2: Efficient bayesian phylogenetic
363 inference and model choice across a large model space. Syst. Biol. 61(3): 539–542.
364 doi:10.1093/sysbio/sys029.
365 Schenekar, T., Schletterer, M., Lecaudey, L.A., and Weiss, S.J. 2020a. Reference databases, primer
366 choice, and assay sensitivity for environmental metabarcoding: Lessons learnt from a re-
367 evaluation of an eDNA fish assessment in the Volga headwaters. River Res. Appl. 36(7):
368 1004–1013. John Wiley & Sons, Ltd. doi:10.1002/rra.3610.
369 Schenekar, T., Schletterer, M., and Weiss, S.J. 2020b. Development of a TaqMan qPCR protocol
370 for detecting Acipenser ruthenus in the Volga headwaters from eDNA samples. Conserv.
371 Genet. Resour. 12(3): 395–397. doi:10.1007/s12686-020-01128-w.
372 Schlitzer, R. 2016. Ocean Data View. Available from http://odv.awi.de.
373 Skriptsova, A.V., and Kalita, T.L. 2020. A re-evaluation of Palmaria (Palmariaceae, Rhodophyta) in
374 the North-West Pacific. Eur. J. Phycol. 55(3): 266–274. Taylor & Francis.
375 doi:10.1080/09670262.2020.1714081.
376 Skurikhina, L.A., Oleinik, A.G., Kukhlevsky, A.D., Kovpak, N.E., Frolov, S.V., and Sendek, D.S.
377 2018. Phylogeography and demographic history of the Pacific smelt Osmerus dentex inferred
378 from mitochondrial DNA variation. Polar Biol. 41(5): 877–896. Springer.
379 Smé, N.A., Lyon, S., Mueter, F., Brykov, V., Sakurai, Y., and Gharrett, A.J. 2019. Examination of
380 saffron cod Eleginus gracilis (Tilesius 1810) population genetic structure. Polar Biol.: 1–15.
381 Springer.
382 Steinke, D., Zemlak, T.S., Boutillier, J.A., and Hebert, P.D.N. 2009. DNA barcoding of Pacific
383 Canada’s fishes. doi:10.1007/s00227-009-1284-0.
384 Stoeckle, M.Y., and Thaler, D.S. 2018. Why should mitochondria define species? Hum. Evol. 33(1–
385 2): 1–30.
386 Stonik, I.V., and Efimova, K.V. 2020. Attheya (Bacillariophyta) from the northwestern Sea of
387 Japan: a description of two subgenera based on molecular and morphological data. Phycologia
388 59(3): 227–237. Taylor & Francis. doi:10.1080/00318884.2020.1732801.
389 Turanov, S.V., Balanov, A.A., and Shelekhov, V.A. 2019. Species of the genus Ammodytes
390 (Ammodytidae) in the northwestern part of the Sea of Japan. J. Appl. Ichthyol.
391 Turanov, S.V., and Kartavtsev, Y.P. 2014. The taxonomic composition and distribution of sand
392 lances from the genus Ammodytes (Perciformes: Ammodytidae) in the North Pacific. Russ. J.
393 Mar. Biol. 40(4).
394 Turanov, S.V. 2019. Building and analysis of the reference nucleotide sequence data base of the
395 mitochondrial COI gene for delimitation of sand lances species (Uranoscopiformes:
396 Ammodytidae) from the Northern Hemisphere. Russ. J. Mar. Biol. 45(1).
397 Turanov, S.V., Kartavtsev, Y.P., Lipinsky, V.V., Zemnukhov, V.V., Balanov, A.A., Lee, Y.-H., and
398 Jeong, D. 2016. DNA-barcoding of perch-like fishes (Actinopterygii: Perciformes) from far-
399 eastern seas of Russia with taxonomic remarks for some groups. Mitochondrial DNA 27(2).
400 doi:10.3109/19401736.2014.945525.
401 Valentini, A., Taberlet, P., Miaud, C., Civade, R., Herder, J., Thomsen, P.F., Bellemain, E., Besnard,
402 A., Coissac, E., Boyer, F., Gaboriaud, C., Jean, P., Poulet, N., Roset, N., Copp, G.H., Geniez,
403 P., Pont, D., Argillier, C., Baudoin, J.M., Peroux, T., Crivelli, A.J., Olivier, A., Acqueberge,
404 M., Le Brun, M., Møller, P.R., Willerslev, E., and Dejean, T. 2016. Next-generation
405 monitoring of aquatic biodiversity using environmental DNA metabarcoding. Mol. Ecol.
406 doi:10.1111/mec.13428.
407 Wang, Z.-D., Guo, Y.-S., Liu, X.-M., Fan, Y.-B., and Liu, C.-W. 2012. DNA barcoding South
408 China Sea fishes. doi:10.3109/19401736.2012.710204.
409 Ward, R.D. 2009. DNA barcode divergence among species and genera of birds and fishes. Mol.
410 Ecol. Resour. 9(4): 1077–1085. doi:10.1111/j.1755-0998.2009.02541.x.
411 Weigand, H., Beermann, A.J., Čiampor, F., Costa, F.O., Csabai, Z., Duarte, S., Geiger, M.F.,
412 Grabowski, M., Rimet, F., Rulik, B., Strand, M., Szucsich, N., Weigand, A.M., Willassen, E.,
413 Wyler, S.A., Bouchez, A., Borja, A., Čiamporová-Zaťovičová, Z., Ferreira, S., Dijkstra,
414 K.D.B., Eisendle, U., Freyhof, J., Gadawski, P., Graf, W., Haegerbaeumer, A., van der Hoorn,
415 B.B., Japoshvili, B., Keresztes, L., Keskin, E., Leese, F., Macher, J.N., Mamos, T., Paz, G.,
416 Pešić, V., Pfannkuchen, D.M., Pfannkuchen, M.A., Price, B.W., Rinkevich, B., Teixeira,
417 M.A.L., Várbíró, G., and Ekrem, T. 2019. DNA barcode reference libraries for the monitoring
418 of aquatic biota in Europe: Gap-analysis and recommendations for future work.
419 doi:10.1016/j.scitotenv.2019.04.247.
420 Yusishen, M.E., Eichorn, F.-C., Anderson, W.G., and Docker, M.F. 2020. Development of
421 quantitative PCR assays for the detection and quantification of lake sturgeon (Acipenser
422 fulvescens) environmental DNA. Conserv. Genet. Resour. 12(1): 17–19. doi:10.1007/s12686-
423 018-1054-8.
424 Zhang, A.B., Hao, M.D., Yang, C.Q., and Shi, Z.Y. 2017. BarcodingR: an integrated R package for
425 species identification using DNA barcodes. Methods Ecol. Evol. doi:10.1111/2041-
426 210X.12682.
427 Zhang, J.-B., and Hanner, R. 2011. DNA barcoding is a useful tool for the identification of marine
428 fishes from Japan. doi:10.1016/j.bse.2010.12.017.
430 Figure captions
431 Fig. 1. Map of fish sampling localities across the study area of the Northeast Pacific.
432 Sampling sites are indicated by shaded circles. The map was generated with Ocean Data View
433 (Schlitzer, 2016) and edited and compiled in Inkscape (Harrington, 2004-2005).
434 Fig. 2. The results of DNA barcoding gap analysis of COI-genotyped fish specimens from
435 the Far Eastern seas. Barplots show the distribution of intraspecific (grey bars) and interspecific
436 (black bars) genetic distance variation based on the K2P substitution model. A and B represent
437 results for full (154 sequences) and restricted (130 sequences) datasets, respectively. The restricted
438 dataset excludes E. diceraus, A. olrikii, A. bartoni, C. acrolepis and A. pectoralis.
439 Fig. 3. Midpoint-rooted NJ phylogenetic tree reconstructed based on the partial COI
440 sequences of 154 fish specimens from the Far Eastern seas using K2P-distance. The triangle
441 indicates the collapsed cluster, fully shown separately in Fig. 4. Values in the nodes represent
442 bootstrap support measures higher than 50%. Intraspecies clusters are collapsed. Grey area
443 represents clusters with extraordinary patterns of divergence.
444 Fig. 4. Part of the entire NJ phylogenetic tree represented in Fig. 3. The tree is reconstructed
445 based on the partial COI sequences of 154 fish specimens from the Far Eastern seas using K2P-
446 distance. The tree is collapsed and rooted at the midpoint. Values in the nodes represent bootstrap
447 support measures higher than 50%. Intraspecies clusters are collapsed. Grey areas represent clusters
448 with extraordinary patterns of divergence.