Genome

A complement to DNA barcoding reference library for identification of fish from the Northeast Pacific

Journal: Genome

Manuscript ID gen-2020-0192.R1

Manuscript Type: Article

Date Submitted by the Author: 18-Mar-2021

Complete List of Authors: Turanov, Sergei; A.V. Zhirmunsky National Scientific Center of Marine Biology, Laboratory of Molecular Systematic; Far Eastern State Technical Fisheries University, Chair of Water Biological Resources and Aquaculture
Kartavtsev, Yuri; A.V. Zhirmunsky Institute of Marine Biology FEB RAS, Lab of Molecular Systematics

Keyword: barcoding gap, Enophrys, Albatrossia, Coryphaenoides, deep sea fish

Is the invited manuscript for consideration in a Special Issue?: Not applicable (regular submission)

© The Author(s) or their Institution(s) Page 1 of 23 Genome

A complement to DNA barcoding reference library for identification of fish from the Northeast Pacific

Sergei V. Turanov1,2,*, Yuri Ph. Kartavtsev1

1 Laboratory of Molecular Systematic, A.V. Zhirmunsky National Scientific Center of Marine Biology, Far Eastern Branch, Russian Academy of Sciences, 690041 Vladivostok, Russia

2 Chair of Water Biological Resources and Aquaculture, Far Eastern State Technical Fisheries University, 690087 Vladivostok, Russia

* Corresponding author; [email protected]; 17, Palchevsky St., Vladivostok 690041, Russia


Abstract

The seas of the North Pacific Ocean are characterized by a large variety of fish fauna, including endemic species. Molecular genetic methods, often based on DNA barcoding approaches, have been recently used to determine species boundaries and identify cryptic diversity within these species. This study complements the DNA barcode library of fish from the Northeast Pacific area. A library based on 154 sequences of the mitochondrial COI gene from 44 species was assembled and analyzed. It was found that 39 species (89%) can be unambiguously identified by the clear thresholds forming a barcoding gap. Deviations from the standard 2% threshold value resulted in detection of the species Enophrys lucasi in the sample, which is not typical for the eastern part of the Bering Sea. This barcoding gap also made it possible to identify naturally occurring low values of interspecific divergence of eulittoral taxa Aspidophoroides and the deep-sea Coryphaenoides. Synonymy of the genus Albatrossia in favor of the genus Coryphaenoides is suggested based on both the original and previously published data.

Keywords: COI; barcoding gap; Enophrys; Albatrossia; Coryphaenoides; deep sea fish.

1. Introduction

The seas of the North Pacific Ocean are characterized by rich biotopes, and contain a wide endemic variety of hydrobionts that have attracted the attention of taxonomists of different specializations. Data from recent studies applying integrated approaches (Turanov and Kartavtsev 2014; Turanov et al. 2016; Moreva et al. 2017; Skurikhina et al. 2018; Smé et al. 2019; Turanov 2019; Chernyshev 2020; Jung et al. 2020; Stonik and Efimova 2020; Skriptsova and Kalita 2020) show that the species diversity of fish and other aquatic organisms in this region is underestimated. Molecular genetic approaches are among the leading supportive tools in biodiversity studies.

Molecular genetic techniques not only help to document existing diversity (Hebert et al. 2003a, 2003b) and discover cryptic species (Bickford et al. 2007; Hubert and Hanner 2015), but they can also be used to identify taxonomic discrepancies or cases of intentional substitution in the commercial distribution of fish and shellfish biota (Galimberti et al. 2013; Khaksar et al. 2015; Nedunoori et al. 2017). The comprehensive nature of molecular genetic methods enables development of solutions to identify single species (Pfleger et al. 2016; Schenekar et al. 2020b; Yusishen et al. 2020), as well as the use of rapid analysis technologies to monitor species diversity (Lecaudey et al. 2019; Belevich et al. 2020; Schenekar et al. 2020a). However, the development of unified methods has been hindered by the lack of a verified DNA barcode database of living organisms in the region of the world where such methods could be implemented (McGee et al. 2019; Weigand et al. 2019; Schenekar et al. 2020a).

DNA barcoding was originally developed to facilitate taking inventory of the entire species diversity (Hebert et al. 2003a, 2003b), but has now evolved into a global initiative to ensure the speed and quality of monitoring and conservation measures (DeSalle and Goldstein 2019). Limitations remain, related to both the methodology and conceptual issues of evolutionary biology and species definition (Meyer and Paulay 2005; Krishnamurthy and Francis 2012; Collins and Cruickshank 2013; DeSalle and Goldstein 2019); DNA barcoding is nevertheless extremely useful. Further development is still required, especially for studying the biotic diversity of the Northeast Pacific Ocean. Previously, this approach has been shown to be reliable for detecting cryptic species diversity, and limitations have been identified regarding the applicability of a strict threshold for many perch-like fish species (Turanov et al. 2016) from this region. This paper provides an update to the reference barcode database of fish from the Far Eastern seas of Russia, with taxonomic comments.

2. Material and methods

Fish specimens were collected using gillnets (Sea of Japan) and bottom trawls (Sea of Okhotsk and Bering Sea) during the period from 2007 to 2011 (Fig. 1). Species identification was conducted according to the most commonly used identification keys for the area (Lindberg and Krasyukova 1987; Nakabo 2002) and subsequently adjusted to the current nomenclature (Parin et al. 2014; Fricke et al. 2019). The sampling consisted of 154 specimens representing 44 species from 33 genera, 15 families and 6 orders. Each species had between one and five specimens. Voucher specimens of the fish investigated are kept under corresponding numbers in the museum of the NSCMB FEB RAS (Supplement S1, file gen-2020-0192.R1suppla). A piece of skeletal muscle tissue was taken from each specimen and stored in 95% ethanol. Total DNA was isolated from this tissue using a K-Sorb commercial kit (Syntol, Moscow).

The samples were then genotyped by a fragment of the mitochondrial COI gene using a cocktail of universal C_FishF1t1–C_FishR1t1 primers (Ivanova et al. 2007). The PCR reaction mixture (total volume 25 µl) included 1 µl of total DNA solution (20–150 ng), 5 µl of ready-made PCR mixture ScreenMix (Eurogen, Moscow), 0.4 mM of primer solution, and deionized water up to the final volume. The thermal cycling conditions consisted of preheating at 94 °C for 2 min, followed by 30 cycles of denaturation at 94 °C for 40 s, annealing at 52 °C for 40 s, and elongation at 72 °C for 1 min, with a final elongation for 10 min. To evaluate the PCR results, electrophoresis of amplicons was performed in 1% agarose gel stained with ethidium bromide and visualized under UV light. The amplified COI fragments were purified by alcohol precipitation and then sequenced with appropriate primers (Ivanova et al. 2007) using the BrightDye™ Terminator Cycle Sequencing Kit v3.1 (NimaGen). Capillary electrophoresis of the fragments was performed on an ABI Prism 3130 DNA Genetic Analyzer sequencer (Applied Biosystems, USA).

The consensus sequences from the obtained chromatograms were assembled using Geneious software (Kearse et al. 2012). Sequence alignment and subsequent correction of the reading frame (if necessary) were performed in MEGA 7 (Kumar et al. 2016) using the MUSCLE algorithm (Edgar 2004). During the alignment, the closest matches from the output data of BLAST (Altschul et al. 1990) in GenBank (Benson et al. 2018) were used as reference sequences. The sequences with all the necessary information and pictures with lifetime coloration were placed in BOLD (Ratnasingham and Hebert 2007, 2013) in a project called FFES and uploaded to GenBank (Supplement S1, file gen-2020-0192.R1suppla).

The genetic distances (p-distances) as well as their corrected values according to the two-parameter Kimura model (Kimura 1980) were calculated using the BOLD workbench. The upper conditional threshold value of intraspecific genetic distances was assumed to be the minimum value of distances within a genus between different species. The BarcodingR package (Zhang et al. 2017) was used to calculate and plot a graph reflecting the barcoding gap or the presence of a threshold between intraspecific and interspecific genetic distances (Meyer and Paulay 2005; Meier et al. 2006, 2008). We also used the BIN (Barcode Index Number; Ratnasingham and Hebert 2013) discordance report information provided by the BOLD workbench. To test the assumption of genetic differentiation between species with extraordinarily low interspecific genetic distances, we used the gene flow Fst indices with a permutation test based on 10,000 replicates in DnaSP 5 (Librado and Rozas 2009).

In addition to the distance-based criteria for species delimitation, we used a topological approach (i.e., construction of phylogenetic trees and identification of monophyletic clusters corresponding to species groups pre-defined by morphological features). For this purpose, a neighbor joining (NJ) tree was constructed in the program MEGA 7 using the K2P model, based on available sequences. The robustness of the tree topology was estimated based on the results of 1,000 pseudoreplicates of the non-parametric bootstrap test. The Bayesian inference (BI) topology was also inferred in the program MrBayes 3.2.7 (Ronquist et al. 2012). The simultaneous selection of the optimal model of nucleotide substitutions among those implemented in MrBayes and the partition scheme, incorporating the codons, was performed based on AIC in the program PartitionFinder 2.0 (Guindon et al. 2010; Lanfear et al. 2012, 2014). According to the scheme proposed by PartitionFinder, the first and second positions of the codons together made up a partition that is separate from the third position. The optimal model for the first and second positions of the codons in the matrix was GTR+G+I, while for the third position it was GTR+G. For the first partition, a model with six parameters of substitutions was set, taking into account the proportion of invariable sites (I), as well as Γ-distribution of variability frequencies between sites (nst=6, pinvar=est, rates=invgamma). An equivalent model was applied to the second partition, excluding the proportion of invariant sites. The parameters for the different partitions were unlinked. The search for tree topology and marginal values of posterior probability was carried out by launching four Markov chains in 1,000,000 generations. The frequency of sampling by Metropolis algorithm from the probability distribution was 1 per 100 generations. The first 25% of trees corresponding to the burn-in step were discarded. A consensus tree was generated based on the remaining 15,002 trees. The convergence indices (ESS, PSRF) indicated sufficient sampling from all parameters and a sufficient number of generations.
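The count of 15,002 retained trees can be checked against the stated sampling settings. The short sketch below does this arithmetic under two assumptions that the text does not state: that the analysis used two independent runs (the MrBayes default), and that the burn-in count is rounded down to a whole tree. The function name `retained_trees` is hypothetical.

```python
def retained_trees(ngen, samplefreq, burnin_frac, nruns):
    """Count trees left after discarding a relative burn-in from each run."""
    # Generation 0 is sampled as well, hence the +1.
    samples_per_run = ngen // samplefreq + 1
    # Assumption: the burn-in count is rounded down to a whole tree.
    discarded_per_run = int(samples_per_run * burnin_frac)
    return nruns * (samples_per_run - discarded_per_run)


# ngen, samplefreq and burnin_frac come from the text;
# nruns=2 is an assumed default, not stated in the source.
total = retained_trees(ngen=1_000_000, samplefreq=100, burnin_frac=0.25, nruns=2)
print(total)
```

Under these assumptions each run yields 10,001 sampled trees, of which 7,501 are kept, and the two runs together give the 15,002 trees reported above.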

Ethics approval

Collection of specimens was conducted during a commercial fishing trip in accordance with all applicable laws and the specimens were delivered to the authors in frozen form.

3. Results

The phylogenetic NJ-tree (Fig. 3 and 4) demonstrates high or fairly reliable support for species and genus clusters with the formation of 45 monophyletic groups (including species represented by one sequence). The species Enophrys diceraus is divided into two distinct clusters. The only genus forming paraphyletic lineages is Coryphaenoides. At the same time, the phylogenetic relationships cannot be considered on the level above the genus, due to the low information capacity of the short COI sequences in combination with the construction method. This is reflected in the polyphyly among the families Cottidae and Agonidae of the order Scorpaeniformes. The BI-tree topology (Supplement S2, file gen-2020-0192.R1supplb) provides similar resolution to that of the NJ analysis of species and genus branches. The main exception is that C. acrolepis forms a stem-group in relation to A. pectoralis, whereas in the NJ-topology there is a clear bifurcation between sequences of the species with shallow divergence (Fig. 3). The same is true for the pair of species A. bartoni and A. olrikii, respectively (see Fig. 4 and Supplement S2). In general, the BI-tree topology is more stable and is represented by only one polytomic node indicating the uncertainty of the position of the order Osmeriformes relative to the other taxa. The families Cottidae and Agonidae on this tree are represented by monophyletic clades.

Only one of the 154 sequences in the present dataset (B. nigripinnis, FFES016-18) was not barcode compliant and hence was not included in any BINs. The remaining sequences were assigned to 44 BINs. Among these, 33 BINs (75%, 129 sequences) were concordant, 2 BINs (4.5%) qualified as discordant, and 9 BINs (20.5%) were determined as singletons. One of the discordant BINs was reported due to a genus-level conflict, which came from comparison of A. pectoralis and C. acrolepis sharing the single BIN (BOLD:AAC7497). Another discordance was caused by a species-level conflict of A. olrikii and A. bartoni with a common BIN (BOLD:AAA9928). Interestingly, sequences of E. diceraus were split into a pair of valid BINs, BOLD:AAJ0725 and BOLD:AAE3573, allocated among singletons and concordant BINs, respectively.

Gene flow estimates calculated among phylogroups of A. pectoralis, C. acrolepis and C. cinereus revealed a pairwise Fst value of 0.75 between A. pectoralis and C. acrolepis, whereas the values for C. cinereus against A. pectoralis and C. acrolepis were 0.99 and 0.96, respectively. These results imply that these phylogroups have become significantly differentiated and can be considered as separate species. Similar results were found for the pair A. olrikii and A. bartoni, for which the pairwise Fst value was close to 0.77. In both cases, the groups compared had no shared haplotypes.
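To illustrate how a sequence-based pairwise Fst and its permutation test can be obtained, the following minimal sketch uses the estimator of Hudson et al. (1992), Fst = 1 - Hw/Hb, where Hw is the mean number of pairwise differences within groups and Hb the mean number between groups. This is an assumption made for illustration only: DnaSP offers several estimators and the text does not say which one was used. The toy sequences and the names `hudson_fst` and `permutation_test` are hypothetical and do not come from the dataset.

```python
import itertools
import random


def n_diff(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(group):
    """Mean pairwise differences within one group (needs >= 2 sequences)."""
    pairs = list(itertools.combinations(group, 2))
    return sum(n_diff(a, b) for a, b in pairs) / len(pairs)


def mean_between(g1, g2):
    """Mean pairwise differences between sequences of two groups."""
    return sum(n_diff(a, b) for a in g1 for b in g2) / (len(g1) * len(g2))


def hudson_fst(g1, g2):
    """Fst = 1 - Hw/Hb (Hudson et al. 1992), assumed estimator."""
    hw = (mean_within(g1) + mean_within(g2)) / 2
    hb = mean_between(g1, g2)
    return 1 - hw / hb


def permutation_test(g1, g2, n_perm=1000, seed=0):
    """Share of label permutations with Fst at least as large as observed."""
    rng = random.Random(seed)
    observed = hudson_fst(g1, g2)
    pooled = list(g1) + list(g2)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if hudson_fst(pooled[:len(g1)], pooled[len(g1):]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical toy alignments, not data from this study.
group_a = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
group_b = ["ACGAACCTAC", "ACGAACCTAC", "ACGAACCTAG"]
fst, p_value = permutation_test(group_a, group_b)
print(round(fst, 3), p_value)
```

In this sketch, two groups with no shared haplotypes and little within-group variation give an Fst well above zero, which is the pattern described for the species pairs above.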

4. Discussion

This study describes the nucleotide variability of the mitochondrial COI gene from 44 marine fish species collected in the Far Eastern seas. This variability can be used to determine the applicability of these sequences for identifying species, thus contributing to the global reference barcode database of fish from the Northeast Pacific (Steinke et al. 2009; Mecklenburg et al. 2011; Zhang and Hanner 2011; Wang et al. 2012; Turanov et al. 2016, 2019; Kartavtsev et al. 2016).

Our results showed that 39 species (89%) can be unambiguously identified to the species level, and that the data obtained do not show any discrepancies between molecular genetic criteria and morphological features. For the remaining 5 species (11%), COI sequences exhibit some deviations at the species level according to either distance-based or topological criteria. When the complete dataset was subjected to distribution analysis of intra- and interspecific genetic distances, the results indicated the absence of a barcoding gap (Fig. 2A). The overlap of genetic distance values had two causes: low divergence at the interspecific level, and the presence of an additional species-level phylogroup resulting in high intraspecific values. Deviations of this kind are the most common when DNA barcode libraries are analyzed (Hubert and Hanner 2015), if the conventional threshold for distinguishing between intra- and interspecific variability is set at 2% (Ward 2009). Establishing a universal threshold for delimitation is advantageous in rapid assessments of species boundaries for the majority of known taxa (Ratnasingham and Hebert 2013). In addition, generalizations about the evolution of mitochondrial fragments indicate that intraspecific variability caused by synonymous substitutions due to evolutionary features of the vertebrates' mitochondrial genome usually does not exceed 0.5% (Stoeckle and Thaler 2018). However, this measure should not limit the distance criteria for species delineation, as the strict threshold approach is not applicable to all species (DeSalle and Goldstein 2019) and may misrepresent their natural boundaries (Meyer and Paulay 2005; Bagley et al. 2019). For example, the proportion of cases of poly- and paraphyly in taxonomic studies involving molecular genetic data of mitochondrial nature can reach 23% (Funk and Omland 2003). These numbers are comparable with earlier results obtained for other taxonomic groups (Turanov et al. 2016). In this paper, the relatively low number of deviations seems to be a result of a high proportion of singletons (20.5%) and genera represented in the sampling by a single species (38.6%).
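The distance-based criterion discussed above can be made concrete with a small sketch. It computes the Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions, and then tests whether the largest intraspecific distance is smaller than the smallest interspecific distance. This is a simplified illustration, not the BOLD or BarcodingR implementation: it compares all species at once rather than within genera, and the species labels, sequences and function names are hypothetical.

```python
import itertools
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def barcoding_gap(records):
    """Return (max intraspecific, min interspecific, gap present?)."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in itertools.combinations(records, 2):
        d = k2p_distance(s1, s2)
        (intra if sp1 == sp2 else inter).append(d)
    return max(intra), min(inter), max(intra) < min(inter)


# Hypothetical toy records (species label, aligned sequence).
records = [
    ("sp_A", "ACGTACGTACGTACGTACGT"),
    ("sp_A", "ACGTACGTACGTACGTACGC"),
    ("sp_B", "TCGTACGTAAGTACGGACGA"),
    ("sp_B", "TCATACGTAAGTACGGACGA"),
]
max_intra, min_inter, has_gap = barcoding_gap(records)
print(max_intra, min_inter, has_gap)
```

Using the smallest rather than the mean interspecific distance follows the caution of Meier et al. (2008). A dataset like the complete one described above, with either a very close species pair or a misidentified specimen, would make `has_gap` false.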

The barcoding gap is clearly marked when creating a reduced data set that excludes the sequences of five deviant species (Fig. 2B). The specific features of their variability require additional discussion. The cluster of E. diceraus species includes two BINs of the genus Enophrys (Fig. 4). This genus has four valid species, distributed in the North Pacific Ocean from the Sea of Japan in the west to the coastal waters of southern California in the east (Parin et al. 2014; Pietsch and Orr 2015; Mecklenburg et al. 2018; Burton and Lea 2019). A recent study found cryptic diversity within the E. diceraus species, such that COI gene sequences of individuals from the Sea of Japan and Sea of Okhotsk were separated by a divergence of 2.89% (Moreva et al. 2017). Moreover, the COI gene tree topology placed another species, E. lucasi, between these phylogroups (ibid., Fig. 3). This species is common in the western part of the Bering Sea, along the Aleutian Islands and in the Gulf of Alaska. Recent data on the differentiation of the species E. diceraus and E. lucasi indicate that they are remarkably similar in morphological features, but clearly differ based on the divergence of COI sequences (Mecklenburg et al. 2011). BIN BOLD:AAJ0725 from our samples belongs to E. lucasi. Hence, this deviation (high genetic distance) is caused by the erroneous identification of a single specimen of a species that is rare in the given region (Parin et al. 2014).

A pair of species, A. bartoni and A. olrikii, which share a common BIN, demonstrate a mutual divergence value of 0.015 and a clear clustering according to species affinity (Fig. 4). Their common BIN profile in BOLD also shows a clear bimodal distribution of genetic distance values, indicating that there are two species in its composition. In this case, the clear threshold value adopted by BOLD does not reflect the natural species boundaries of taxa, and the topological criterion of identity is seen as more reliable. This appears to be true also for the cluster of taxa Albatrossia and Coryphaenoides (Fig. 3). The genus Coryphaenoides includes many species adapted to life in deep waters, and is extremely widely distributed (Iwamoto and Stein 1974; Iwamoto and Sazonov 1988; Cohen et al. 1990; Parin et al. 2014). The validity of the genus Albatrossia is controversial. Some authors consider it valid (Iwamoto and Sazonov 1988; Cohen et al. 1990; Parin et al. 2014), while others believe it is a member of the genus Coryphaenoides (Iwamoto and Stein 1974). It is noteworthy that in the only work examining the molecular phylogenetic relationships of different representatives of the genus Coryphaenoides, A. pectoralis is considered to be a representative of this genus; the topological position of this species in the corresponding reconstruction precludes any other interpretation (see Fig. 5 in Morita 1999). Our data (Fig. 3) fully support this conclusion and suggest that the genus Albatrossia is a synonym of the genus Coryphaenoides.

When forming reference nucleotide sequence sets, one of which is presented in this paper, special attention should be paid to the nucleotide divergence patterns of those species whose natural boundaries of variability cannot be described by a simple threshold value when the barcoding gap is set. Carefully analyzed and supervised in this way, the library will provide the basis for noninvasive approaches to biodiversity monitoring using eDNA (Valentini et al. 2016).

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could influence the work reported in this paper.

Acknowledgements

This research was partially supported by a Grant of the President of the Russian Federation (MK-305.2019.4), the Far Eastern Branch of the Russian Academy of Sciences in the framework of the Federal Program of Base Research (18-4-040) and the Ministry of Science and Higher Education of the Russian Federation (agreement number 075-15-2020-796, grant number 13.1902.21.0012).

References

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., and Lipman, D.J. 1990. Basic local alignment search tool. J. Mol. Biol. 215(3): 403–410. doi:10.1016/S0022-2836(05)80360-2.

Bagley, J.C., de Aquino, P.D.P.U., Breitman, M.F., Langeani, F., and Colli, G.R. 2019. DNA barcode and minibarcode identification of freshwater fishes from Cerrado headwater streams in Central Brazil. J. Fish Biol. doi:10.1111/jfb.14098.

Belevich, T.A., Ilyash, L.V., Milyutina, I.A., Logacheva, M.D., and Troitsky, A.V. 2020. Photosynthetic Picoeukaryotes Diversity in the Underlying Ice Waters of the White Sea, Russia. Diversity 12(3): 93. Multidisciplinary Digital Publishing Institute. doi:10.3390/d12030093.

Benson, D.A., Cavanaugh, M., Clark, K., Karsch-Mizrachi, I., Ostell, J., Pruitt, K.D., and Sayers, E.W. 2018. GenBank. Nucleic Acids Res. 46(D1): D41–D47. doi:10.1093/nar/gkx1094.

Bickford, D., Lohman, D.J., Sodhi, N.S., Ng, P.K.L., Meier, R., Winker, K., Ingram, K.K., and Das, I. 2007. Cryptic species as a window on diversity and conservation. Trends Ecol. Evol. 22(3): 148–155. doi:10.1016/j.tree.2006.11.004.

Burton, E.J., and Lea, R.N. 2019. Annotated checklist of fishes from Monterey Bay National Marine Sanctuary with notes on extralimital species. Zookeys. doi:10.3897/zookeys.887.38024.

Chernyshev, A.V. 2020. Nemerteans from the Far Eastern Seas of Russia. Russ. J. Mar. Biol. 46(3): 141–153. doi:10.1134/S1063074020030049.

Cohen, D.M., Inada, T., Iwamoto, T., and Scialabba, N. 1990. FAO Catalogue of Species Vol.10. FAO species Cat. Vol. 10 Gadiform Fishes world (Order ) An Annot. Illus. Cat. cods, hakes, other gadiform fishes known to date.

Collins, R.A., and Cruickshank, R.H. 2013. The seven deadly sins of DNA barcoding. Mol. Ecol. Resour. 13(6): 969–75. doi:10.1111/1755-0998.12046.

DeSalle, R., and Goldstein, P. 2019. Review and Interpretation of Trends in DNA Barcoding. Front. Ecol. Evol. doi:10.3389/fevo.2019.00302.

Edgar, R.C. 2004. MUSCLE: Multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32(5): 1792–1797. doi:10.1093/nar/gkh340.

Fricke, R., Eschmeyer, W.N., and van der Laan, R. 2019. Catalog of fishes: Genera, species, references. Inst. Biodivers. Sci. Sustain. Calif. Acad. Sci. [accessed 1 Febr. 2018].

Funk, D.J., and Omland, K.E. 2003. Species-level paraphyly and polyphyly: frequency, causes, and consequences, with insights from mitochondrial DNA. Annu. Rev. Ecol. Evol. Syst. 34(1): 397–423. Annual Reviews.

Galimberti, A., De Mattia, F., Losa, A., Bruni, I., Federici, S., Casiraghi, M., Martellos, S., and Labra, M. 2013. DNA barcoding as a new tool for food traceability. Food Res. Int. 50(1): 55–63. doi:10.1016/j.foodres.2012.09.036.

Guindon, S., Dufayard, J.F., Lefort, V., Anisimova, M., Hordijk, W., and Gascuel, O. 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst. Biol. doi:10.1093/sysbio/syq010.

Harrington, B. 2004–2005. Inkscape. http://www.inkscape.org.

Hebert, P.D.N., Cywinska, A., Ball, S.L., and deWaard, J.R. 2003a. Biological identifications through DNA barcodes. Proc. Biol. Sci. 270(1512): 313–321.

Hebert, P.D.N., Ratnasingham, S., and deWaard, J.R. 2003b. Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc. Biol. Sci. 270 Suppl: S96–S99.

Hubert, N., and Hanner, R. 2015. DNA Barcoding, species delineation and : a historical perspective. DNA Barcodes 3(1): 44–58.

Ivanova, N.V., Zemlak, T.S., Hanner, R.H., and Hebert, P.D.N. 2007. Universal primer cocktails for fish DNA barcoding. Mol. Ecol. Notes 7(4): 544–548.

Iwamoto, T., and Sazonov, Y.I. 1988. A review of the southeastern Pacific Coryphaenoides (sensu lato) (Pisces, Gadiformes, ). Proc. Calif. Acad. Sci.

Iwamoto, T., and Stein, D.L. 1974. A systematic review of the rattail fishes (Macrouridae: Gadiformes) from Oregon and adjacent waters. Occas. Pap. Calif. Acad. Sci. doi:10.5962/bhl.part.15932.

Jung, D.-W., Gosliner, T.M., Choi, T.-J., Kil, H.-J., Chichvarkhin, A., Goddard, J.H.R., and Valdés, Á. 2020. The return of the clown: pseudocryptic speciation in the North Pacific clown nudibranch, Triopha catalinae (Cooper, 1863) sensu lato identified by integrative taxonomic approaches. Mar. Biodivers. 50(5): 84. doi:10.1007/s12526-020-01107-2.

Kartavtsev, Y.P., Rozhkovan, K.V., and Masalkova, N.A. 2016. Phylogeny based on two mtDNA genes (Co-1, Cyt-B) among Sculpins (Scorpaeniformes, Cottidae) and some other scorpionfish in the Russian Far East. Mitochondrial DNA Part A 27(3): 2225–2240. Taylor & Francis. doi:10.3109/19401736.2014.984164.

Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., and Duran, C. 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28(12): 1647–1649. Oxford University Press.

Khaksar, R., Carlson, T., Schaffner, D.W., Ghorashi, M., Best, D., Jandhyala, S., Traverso, J., and Amini, S. 2015. Unmasking seafood mislabeling in U.S. markets: DNA barcoding as a unique technology for food authentication and quality control. Food Control 56: 71–76. doi:10.1016/j.foodcont.2015.03.007.

Kimura, M. 1980. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16(2): 111–120.

Krishnamurthy, P.K., and Francis, R.A. 2012. A critical review on the utility of DNA barcoding in biodiversity conservation. Biodivers. Conserv. 21(8): 1901–1919. Springer.

Kumar, S., Stecher, G., and Tamura, K. 2016. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol. Biol. Evol. 33(7): 1870–1874. Society for Molecular Biology and Evolution.

Lanfear, R., Calcott, B., Ho, S.Y.W., and Guindon, S. 2012. PartitionFinder: combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29(6): 1695–1701. Oxford University Press.

Lanfear, R., Calcott, B., Kainer, D., Mayer, C., and Stamatakis, A. 2014. Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evol. Biol. 14(1): 82. BioMed Central.

Lecaudey, L.A., Schletterer, M., Kuzovlev, V.V., Hahn, C., and Weiss, S.J. 2019. Fish diversity assessment in the headwaters of the Volga River using environmental DNA metabarcoding. Aquat. Conserv. Mar. Freshw. Ecosyst. 29(10): 1785–1800. John Wiley & Sons, Ltd. doi:10.1002/aqc.3163.

Librado, P., and Rozas, J. 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25(11): 1451–1452. Oxford University Press.

Lindberg, G.U., and Krasyukova, Z.V. 1987. Fishes of the Sea of Japan and adjacent parts of the Sea of Okhotsk and the Yellow Sea. Part 5. Teleostomi. Osteichthyes. . XXX. Scorpeniformes. (CLXXVI. Fam. Scorpaenidae – CXCIV. Fam. Liparididae). Nauka, Leningrad.

McGee, K.M., Robinson, C.V., and Hajibabaei, M. 2019. Gaps in DNA-Based Biomonitoring Across the Globe. Front. Ecol. Evol. doi:10.3389/fevo.2019.00337.

Mecklenburg, C.W., Lynghammer, A., Johannesen, E., Byrkjedal, I., Christiansen, J.S., Dolgov, A.V., Karamushko, O.V., Mecklenburg, T.A., Møller, P.R., Steinke, D., and Wienerroither, R.M. 2018. Marine fishes of the Arctic region. In Conservation of Arctic Flora and Fauna.

Mecklenburg, C.W., Møller, P.R., and Steinke, D. 2011. Biodiversity of arctic marine fishes: taxonomy and zoogeography.

Meier, R., Shiyang, K., Vaidya, G., and Ng, P.K.L. 2006. DNA barcoding and taxonomy in diptera: A tale of high intraspecific variability and low identification success. Syst. Biol. doi:10.1080/10635150600969864.

Meier, R., Zhang, G., and Ali, F. 2008. The use of mean instead of smallest interspecific distances exaggerates the size of the “barcoding gap” and leads to misidentification. doi:10.1080/10635150802406343.

Meyer, C.P., and Paulay, G. 2005. DNA barcoding: Error rates based on comprehensive sampling. PLoS Biol. 3(12): 1–10. doi:10.1371/journal.pbio.0030422.

340 Moreva, I., Radchenko, O., Petrovskaya, A., and Borisenko, S. 2017. Molecular genetic and

341 karyological analysis of antlered sculpins of Enophrys diceraus group (Cottidae). Russ. J.

342 Genet. 53(97): 1030–1041.

343 Morita, T. 1999. Molecular Phylogenetic Relationships of the Deep-Sea Fish Genus

344 Coryphaenoides (Gadiformes: Macrouridae) Based on Mitochondrial DNA. Mol. Phylogenet.

345 Evol. doi:10.1006/mpev.1999.0661.


346 Nakabo, T. 2002. Fishes of Japan: with pictorial keys to the species. Tokai University Press.

347 Nedunoori, A., Turanov, S. V., and Kartavtsev, Y.P. 2017. Fish product mislabeling identified in

348 the Russian far east using DNA barcoding. Gene Reports 8: 144–149.

349 doi:10.1016/j.genrep.2017.07.006.

350 Parin, N.V., Evseenko, S.A., and Vasil’eva, E.D. 2014. Fishes of the Russian Seas: Annotated

351 Catalogue. KMK Scientific Press, Moscow.

352 Pfleger, M.O., Rider, S.J., Johnston, C.E., and Janosik, A.M. 2016. Saving the doomed: Using

353 eDNA to aid in detection of rare sturgeon for conservation (Acipenseridae). Glob. Ecol.

354 Conserv. 8: 99–107. doi:10.1016/j.gecco.2016.08.008.

355 Pietsch, T.W., and Orr, J.W. 2015. Fishes of the Salish Sea: A compilation and distributional

356 analysis. NOAA Prof. Pap. NMFS 18.

357 Ratnasingham, S., and Hebert, P.D.N. 2007. The Barcode of Life Data System

358 (www.barcodinglife.org). Mol. Ecol. Notes.

359 Ratnasingham, S., and Hebert, P.D.N. 2013. A DNA-Based Registry for All Animal Species: The

360 Barcode Index Number (BIN) System. PLoS One 8(7).

361 Ronquist, F., Teslenko, M., Van Der Mark, P., Ayres, D.L., Darling, A., Höhna, S., Larget, B., Liu,

362 L., Suchard, M.A., and Huelsenbeck, J.P. 2012. MrBayes 3.2: Efficient Bayesian phylogenetic

363 inference and model choice across a large model space. Syst. Biol. 61(3): 539–542.

364 doi:10.1093/sysbio/sys029.

365 Schenekar, T., Schletterer, M., Lecaudey, L.A., and Weiss, S.J. 2020a. Reference databases, primer

366 choice, and assay sensitivity for environmental metabarcoding: Lessons learnt from a re-

367 evaluation of an eDNA fish assessment in the Volga headwaters. River Res. Appl. 36(7):

368 1004–1013. John Wiley & Sons, Ltd. doi:10.1002/rra.3610.

369 Schenekar, T., Schletterer, M., and Weiss, S.J. 2020b. Development of a TaqMan qPCR protocol

370 for detecting Acipenser ruthenus in the Volga headwaters from eDNA samples. Conserv.

371 Genet. Resour. 12(3): 395–397. doi:10.1007/s12686-020-01128-w.


372 Schlitzer, R. 2016. Ocean Data View. Available from http://odv.awi.de.

373 Skriptsova, A. V, and Kalita, T.L. 2020. A re-evaluation of Palmaria (Palmariaceae, Rhodophyta) in

374 the North-West Pacific. Eur. J. Phycol. 55(3): 266–274. Taylor & Francis.

375 doi:10.1080/09670262.2020.1714081.

376 Skurikhina, L.A., Oleinik, A.G., Kukhlevsky, A.D., Kovpak, N.E., Frolov, S. V, and Sendek, D.S.

377 2018. Phylogeography and demographic history of the Pacific smelt Osmerus dentex inferred

378 from mitochondrial DNA variation. Polar Biol. 41(5): 877–896. Springer.

379 Smé, N.A., Lyon, S., Mueter, F., Brykov, V., Sakurai, Y., and Gharrett, A.J. 2019. Examination of

380 saffron cod Eleginus gracilis (Tilesius 1810) population genetic structure. Polar Biol.: 1–15.

381 Springer.

382 Steinke, D., Zemlak, T.S., Boutillier, J.A., and Hebert, P.D.N. 2009. DNA barcoding of Pacific

383 Canada’s fishes. doi:10.1007/s00227-009-1284-0.

384 Stoeckle, M.Y., and Thaler, D.S. 2018. Why should mitochondria define species? Hum. Evol. 33(1–

385 2): 1–30.

386 Stonik, I. V, and Efimova, K. V. 2020. Attheya (Bacillariophyta) from the northwestern Sea of

387 Japan: a description of two subgenera based on molecular and morphological data. Phycologia

388 59(3): 227–237. Taylor & Francis. doi:10.1080/00318884.2020.1732801.

389 Turanov, S. V, Balanov, A.A., and Shelekhov, V.A. 2019. Species of the genus Ammodytes

390 (Ammodytidae) in the northwestern part of the Sea of Japan. J. Appl. Ichthyol.

391 Turanov, S. V, and Kartavtsev, Y.P. 2014. The taxonomic composition and distribution of sand

392 lances from the genus Ammodytes (Perciformes: Ammodytidae) in the North Pacific. Russ. J.

393 Mar. Biol. 40(4).

394 Turanov, S.V. 2019. Building and analysis of the reference nucleotide sequence data base of the

395 mitochondrial COI gene for delimitation of sand lances species (Uranoscopiformes:

396 Ammodytidae) from the Northern Hemisphere. Russ. J. Mar. Biol. 45(1).

397 Turanov, S.V., Kartavtsev, Y.P., Lipinsky, V.V., Zemnukhov, V.V., Balanov, A.A., Lee, Y.-H., and


398 Jeong, D. 2016. DNA-barcoding of perch-like fishes (Actinopterygii: Perciformes) from far-

399 eastern seas of Russia with taxonomic remarks for some groups. Mitochondrial DNA 27(2).

400 doi:10.3109/19401736.2014.945525.

401 Valentini, A., Taberlet, P., Miaud, C., Civade, R., Herder, J., Thomsen, P.F., Bellemain, E., Besnard,

402 A., Coissac, E., Boyer, F., Gaboriaud, C., Jean, P., Poulet, N., Roset, N., Copp, G.H., Geniez,

403 P., Pont, D., Argillier, C., Baudoin, J.M., Peroux, T., Crivelli, A.J., Olivier, A., Acqueberge,

404 M., Le Brun, M., Møller, P.R., Willerslev, E., and Dejean, T. 2016. Next-generation

405 monitoring of aquatic biodiversity using environmental DNA metabarcoding. Mol. Ecol.

406 doi:10.1111/mec.13428.

407 Wang, Z.-D., Guo, Y.-S., Liu, X.-M., Fan, Y.-B., and Liu, C.-W. 2012. DNA barcoding South

408 China Sea fishes. doi:10.3109/19401736.2012.710204.

409 Ward, R.D. 2009. DNA barcode divergence among species and genera of birds and fishes. Mol.

410 Ecol. Resour. 9(4): 1077–1085. doi:10.1111/j.1755-0998.2009.02541.x.

411 Weigand, H., Beermann, A.J., Čiampor, F., Costa, F.O., Csabai, Z., Duarte, S., Geiger, M.F.,

412 Grabowski, M., Rimet, F., Rulik, B., Strand, M., Szucsich, N., Weigand, A.M., Willassen, E.,

413 Wyler, S.A., Bouchez, A., Borja, A., Čiamporová-Zaťovičová, Z., Ferreira, S., Dijkstra,

414 K.D.B., Eisendle, U., Freyhof, J., Gadawski, P., Graf, W., Haegerbaeumer, A., van der Hoorn,

415 B.B., Japoshvili, B., Keresztes, L., Keskin, E., Leese, F., Macher, J.N., Mamos, T., Paz, G.,

416 Pešić, V., Pfannkuchen, D.M., Pfannkuchen, M.A., Price, B.W., Rinkevich, B., Teixeira,

417 M.A.L., Várbíró, G., and Ekrem, T. 2019. DNA barcode reference libraries for the monitoring

418 of aquatic biota in Europe: Gap-analysis and recommendations for future work.

419 doi:10.1016/j.scitotenv.2019.04.247.

420 Yusishen, M.E., Eichorn, F.-C., Anderson, W.G., and Docker, M.F. 2020. Development of

421 quantitative PCR assays for the detection and quantification of lake sturgeon (Acipenser

422 fulvescens) environmental DNA. Conserv. Genet. Resour. 12(1): 17–19. doi:10.1007/s12686-

423 018-1054-8.


424 Zhang, A.B., Hao, M.D., Yang, C.Q., and Shi, Z.Y. 2017. BarcodingR: an integrated R package for

425 species identification using DNA barcodes. Methods Ecol. Evol. doi:10.1111/2041-

426 210X.12682.

427 Zhang, J.-B., and Hanner, R. 2011. DNA barcoding is a useful tool for the identification of marine

428 fishes from Japan. doi:10.1016/j.bse.2010.12.017.


430 Figure captions

431 Fig. 1. Map of fish sampling localities across the study area of the Northeast Pacific.

432 Sampling sites are indicated by shaded circles. The map was generated with Ocean Data View

433 (Schlitzer, 2016) and edited and compiled in Inkscape (Harrington, 2004-2005).

434 Fig. 2. The results of DNA barcoding gap analysis of COI-genotyped fish specimens from

435 the Far Eastern seas. Barplots show the distribution of intraspecific (grey bars) and interspecific

436 (black bars) genetic distance variation based on the K2P substitution model. A and B represent

437 results for full (154 sequences) and restricted (130 sequences) datasets, respectively. The restricted

438 dataset excludes E. diceraus, A. olrikii, A. bartoni, C. acrolepis and A. pectoralis.

439 Fig. 3. Midpoint-rooted NJ phylogenetic tree reconstructed based on the partial COI

440 sequences of 154 fish specimens from the Far Eastern seas using K2P-distance. The triangle 441 indicates the collapsed cluster, fully shownDraft separately in Fig. 4. Values in the nodes represent 442 bootstrap support measures higher than 50%. Intraspecies clusters are collapsed. Grey area

443 represents clusters with extraordinary patterns of divergence.

444 Fig. 4. Part of the entire NJ phylogenetic tree represented in Fig. 3. The tree is reconstructed

445 based on the partial COI sequences of 154 fish specimens from the Far Eastern seas using K2P-

446 distance. The tree is collapsed and rooted at the midpoint. Values in the nodes represent bootstrap

447 support measures higher than 50%. Intraspecies clusters are collapsed. Grey areas represent clusters

448 with extraordinary patterns of divergence.
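
As an illustration of the quantities plotted in Fig. 2, the following minimal Python sketch computes pairwise K2P distances and compares the largest intraspecific distance with the smallest interspecific one. It is not the software used in this study. The function names (k2p_distance, barcoding_gap) and the input format, a list of (species, sequence) pairs, are assumptions introduced here for illustration only, and the sequences are assumed to be aligned and of equal length.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned sequences.

    d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    where P and Q are the proportions of transitions and transversions.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # purine<->purine or pyrimidine<->pyrimidine is a transition
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the pair is too divergent (argument <= 0)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def barcoding_gap(records):
    """Return (max intraspecific, min interspecific, gap present?).

    records: list of (species_name, aligned_sequence) tuples (hypothetical format).
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        target = intra if sp1 == sp2 else inter
        target.append(k2p_distance(s1, s2))
    max_intra = max(intra, default=0.0)
    min_inter = min(inter, default=float("inf"))
    return max_intra, min_inter, min_inter > max_intra
```

Under these assumptions, a gap is reported when the smallest interspecific distance exceeds the largest intraspecific distance across the whole dataset. Running the function on the full and on the restricted set of records would correspond, in spirit, to panels A and B of Fig. 2.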
