Quick viewing(Text Mode)

Molecular Phylogeny of Historical Micro-Invertebrate Specimens Using

Molecular Phylogeny of Historical Micro-Invertebrate Specimens Using

bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

1 Molecular phylogeny of historical micro-invertebrate specimens

2 using de novo sequence assembly

3

4 Running title: Phylogeny of historical micro-invertebrates

5

6 1,*Orr R.J.S, 1Sannum M. M., 2Boessenkool, S., 1Di Martino E., 3Gordon D.P., 4Mello H.,

7 5Obst M., 1Ramsfjell M.H., 4Smith A.M. & 1,2,*Liow, L.H.

8

9 1Natural History Museum, University of Oslo, Oslo, Norway

10 2Centre for Ecological and Evolutionary Synthesis, Department of Biosciences, University of

11 Oslo, Oslo, Norway

12 3National Institute of Water and Atmospheric Research, Wellington, New Zealand

13 4Department of Marine Science, University of Otago, Dunedin, New Zealand

14 5Department of Marine Sciences, University of Gothenburg, Gothenburg, Sweden

15

16

17

18

19

20

21

22

23 *corresponding authors

24

1 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

25 Abstract

26 Resolution of relationships at lower taxonomic levels is crucial for answering many

27 evolutionary questions, and as such, sufficiently varied species representation is vital. This

28 latter goal is not always achievable with relatively fresh samples. To alleviate the difficulties

29 in procuring rarer taxa, we have seen increasing utilization of historical specimens in building

30 molecular phylogenies using high throughput sequencing. This effort, however, has mainly

31 focused on large-bodied or well-studied groups, with small-bodied and under-studied taxa

32 under-prioritized. Here, we present a pipeline that utilizes both historical and contemporary

33 specimens, to increase the resolution of phylogenetic relationships among understudied and

34 small-bodied metazoans, namely, cheilostome bryozoans. In this study, we pioneer

35 sequencing of air-dried bryozoans, utilizing a recent library preparation method for low DNA

36 input. We use the de novo mitogenome assembly from the target specimen itself as reference

37 for iterative mapping, and the comparison thereof. In doing so, we present mitochondrial and

38 ribosomal RNA sequences of 43 cheilostomes representing 37 species, including 14 from

39 historical samples ranging from 50 to 149 years old. The inferred phylogenetic relationships

40 of these samples, analyzed together with publicly available sequence data, are shown in a

41 statistically well-supported 65 taxa and 17 genes cheilostome tree. Finally, the

42 methodological success is emphasized by circularizing a total of 27 mitogenomes, seven from

43 historical cheilostome samples. Our study highlights the potential of utilizing DNA from

44 micro-invertebrate specimens stored in natural history collections for resolving phylogenetic

45 relationships between species.

46

47 Key words: historical DNA, cheilostome bryozoans, circularized mitochondria, genomics,

48 genome-skimming, museum collections

2 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

49 Introduction

50 Robust phylogenetic hypotheses are crucial to understanding many biological processes,

51 ranging from those contributing to population history to those creating macroevolutionary

52 patterns. The development of methods for phylogenetic estimation and high throughput

53 sequencing (HTS) have, in combination, improved our understanding of the relationships

54 among extant organisms. These advances are frequently applied to the estimation of

55 relationships among higher taxa, e.g. distantly related phyla (Laumer et al., 2019;

56 Simion et al., 2017), families of flowering plants (Léveillé-Bourret, Starr, Ford, Moriarty

57 Lemmon, & Lemmon, 2017) and orders of birds (Jarvis et al., 2014). While these deeper

58 branches have received significant attention, those at the leaves (e.g. genera and species) often

59 remaining largely unresolved, especially for taxa that are understudied. For the resolution of

60 relationships at lower taxonomic levels, crucial as a backbone for answering many

61 evolutionary questions, a rich and broad species representation is vital.

62

63 Traditionally, researchers have sequenced relatively freshly collected specimens. However,

64 there is an increasing utilization of historical specimens for molecular phylogenetic

65 reconstruction in the absence of fresh tissue (e.g. Jarvis et al 2014). Historical specimens from

66 museum or other institutional collections are valuable sources of information representing

67 organisms that may be difficult or impossible to sample in contemporary populations (Holmes

68 et al., 2016). Most studies that use historical specimens for phylogenetic reconstruction tend

69 to focus on larger-bodied species that are recently extinct (Anmarkrud & Lifjeld, 2017; Bunce

70 et al., 2009; Mitchell et al., 2014; Sharko et al., 2019), well-studied groups (Billerman &

71 Walsh, 2019; Jarvis et al., 2014) and organisms of economic importance (Bonanomi,

72 Therkildsen, Hedeholm, Hemmer-Hansen, & Nielsen, 2014; Larson et al., 2007; Larsson et

3 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

73 al., 2019), although there are a few reports on invertebrate and other small-bodied taxa (Der

74 Sarkissian et al., 2017; Kistenich et al., 2019; Sproul & Maddison, 2017). When employing

75 HTS in such historical studies, nucleotide reads are typically mapped to a closely related

76 reference genome, or one from the same species (Sharko et al., 2019). Alternatively, complete

77 sequences of target regions from related species and/or genera (Anmarkrud & Lifjeld, 2017;

78 Billerman & Walsh, 2019) are used to design baits or probes (Derkarabetian, Benavides, &

79 Giribet, 2019; Ruane & Austin, 2017). While these approaches are excellent for groups whose

80 sequences are relatively well-understood, they are unfeasible for clades with low inter-genus

81 sequence conservation, or for those lacking sequence data from closely related reference

82 organisms.

83

84 Here, we present a pipeline to use historical material from lesser-studied, small-bodied

85 organisms for the purpose of reconstructing molecular phylogenetic relationships. Our target

86 organisms are cheilostomes, the most species-rich order of the phylum , with ca.

87 6500 described extant species, representing about 80% of the living species diversity of the

88 phylum (Bock & Gordon, 2013). Cheilostomes are lightly- to heavily-calcified, sessile,

89 colonial metazoans common in benthic marine habitats. Most species are encrusting, while

90 fewer are erect, and most are small (colony size c.1 cm2, module size c. 500 µm x 200 µm),

91 and live on hard substrates that may be overgrown by other fouling organisms (including

92 other bryozoans, hydroids, foraminiferans and tube worms).

93

94 Systematic relationships among cheilostome bryozoans remain largely based on

95 morphological characters (Bock & Gordon, 2013) with molecular phylogenies being

96 restricted to recently collected, ethanol-preserved samples where genetic data was obtained

4 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

97 using PCR-based methods (Fuchs, Obst, & Sundberg, 2009; Knight, Gordon, & Lavery,

98 2011; Orr, Waeschenbach, et al., 2019; Waeschenbach, Taylor, & Littlewood, 2012), or more

99 recently, by a HTS genome-skimming approach (Orr, Haugen, et al., 2019). While these

100 studies have improved our understanding of the phylogenetic relationships among

101 cheilostomes, many key taxa, potentially filling important phylogenetic positions, are hard to

102 procure and remain unfeatured in sequencing projects. Contributing to the advancement of

103 historical DNA methods (Billerman & Walsh, 2019; Knapp & Hofreiter, 2010), we pioneer

104 the sequencing of air-dried bryozoan specimens (i.e. never preserved in ethanol) that have

105 been stored up to 150 years since collection. To do so, we employ a recently developed DNA

106 library preparation method (SMARTer ThruPLEX, Takara) to amplify and sequence

107 historical samples with low DNA concentrations. We bypass the need for primers/probes, a

108 clear advantage for cheilostomes which are known to have low inter-genus sequence

109 conservation (Orr, Haugen, et al., 2019; Orr, Waeschenbach, et al., 2019; Waeschenbach et

110 al., 2012). The de novo assembled sequences from the target colony itself were used as a

111 reference for iterative mapping, avoiding assumptions about what could be justifiably used as

112 a reference. In doing so, we generate mitochondrial and ribosomal RNA sequences of 43

113 cheilostome colonies representing 37 species and 30 genera, with 14 of these from historical

114 samples ranging from 50 to 149 years old. As a derivative of the demonstration of our

115 pipeline, we also present a well-supported cheilostome tree using 65 taxa and 17 genes.

116 Finally, the success of the methodology we employed is highlighted by the circularization of

117 seven historical cheilostome mitochondrial genomes. This approach emphasizes the potential

118 for analyzing DNA from micro-invertebrate samples stored in natural history collections,

119 especially for phylogenetic reconstruction of many hitherto inaccessible cheilostome

120 genomes.

5 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

121 Materials and Methods

122 Twenty-one dried historical cheilostome samples and 29 recently collected samples, of which

123 26 were ethanol-preserved and 3 dried, were targeted (Table S1). We selected the historical

124 samples to represent a spread of collection dates while including those that are

125 phylogenetically previously resolved and those that are currently enigmatic. Recently

126 collected samples were selected for their potential for phylogenetic verification of the

127 historical specimens. Each colony was subsampled for DNA isolation and scanning electron

128 microscopy (SEM), using a Hitachi TM4040PLus. For microscopy, we bleached subsamples

129 in diluted household bleach for a few hours to overnight, removing soft tissues in order to

130 document skeletal morphology. SEMs of dried samples were taken both pre- and post-

131 bleaching. All physical vouchers are stored at the Natural History Museum of Oslo,

132 University of Oslo, and SEMs are available in the Online Supplementary Information (SI).

133 Other metadata of the samples are reported in Table S1.

134

135 DNA isolation, sequencing, assembly and annotation

136 Genomic DNA for all 50 specimens was isolated using the DNeasy Blood and Tissue kit

137 (QIAGEN, Germantown, MD, USA). Colonies were homogenized in lysis buffer, using a

138 pestle, in the presence of proteinase-K prior to a 24-hour 37oC incubation period. DNA was

139 eluted in pre-heated 65oC Tris-Cl buffer (10 mM) and incubated at 37oC for 10 minutes.

140 Recovered DNA was quantified prior to library preparation using a Qubit 2.0 fluorometer

141 (ThermoFisher, US).

142

143 DNA from the historical specimens were isolated in a laboratory designed for handling

144 samples with low DNA concentrations (sensi-lab, NHM, Oslo). DNA extractions were

6 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

145 performed inside a hood equipped with UV lights and all equipment was bleached and UV-

146 sterilized prior to use. Samples were vortexed twice in nuclease-free H2O, air-dried, then

147 subject to UV for 10 minutes to minimize surface contaminants. These treated samples were

148 subsequently crushed with a stainless-steel mortar and pestle (Gondek, Boessenkool, & Star,

149 2018) prior to DNA isolation.

150

151 Isolated DNA samples with >15ng were prepared with the standard KAPA HyperPrep kit

152 (Roche, USA) by the Norwegian Sequencing Centre (NCS, Oslo, Norway), while those with

153 <15ng were prepared using the SMARTer® ThruPLEX® DNA-Seq Kit (Takara Bio Inc,

154 Japan) in the sensi-lab at NHM Oslo (Table S1). Genomic DNA up to 150 bp in read length

155 was pair-end (PE) sequenced on an Illumina HiSeq4000 at the NCS. The non-historical DNA

156 samples were fragmented to a 350 bp insert size, whilst no such step was performed on the

157 historical samples with already short DNA length, giving a variable insert size (Table S1). A

158 blank control taken during the DNA extraction of the historical samples was also sequenced

159 (library preparation: SMARTer® ThruPLEX® DNA-Seq Kit).

160

161 Illumina HiSeq reads were quality checked using FastQC v.0.11.8 (Andrews, 2010), then

162 quality- and adapter-trimmed using TrimGalore v0.4.4 (Krueger, 2015). All samples were de

163 novo assembled with SPAdes 3.11.1 (Bankevich et al., 2012) using k-mers of 21,33, 55, 77,

164 99, and 127. The mitogenome and rRNA operon of each sample were identified with blastn

165 (Altschul, Gish, Miller, Myers, & Lipman, 1990) against our own database of unpublished

166 cheilostome sequences. For each historical sample, the SPAdes de novo assembled

167 mitogenome sequence was used as the reference (seed) input for its own iterative mapped

168 assembly using GetOrganelle (Jin et al., 2019) and NOVOplasty 3.7 (Dierckxsens, Mardulyn,

7 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

169 & Smits, 2016), both under default settings. In addition, a mitogenome sequence from a taxon

170 phylogenetically related to the historical sample in question was also provided for a second

171 iterative assembly (Fig S1).

172

173 Annotation and alignments

174 Mitogenomes from the separate assembly methods, for each of the samples were annotated

175 with Mitos2 using a metazoan reference and invertebrate genetic code (Bernt et al., 2013) to

176 identify two rRNA genes (rrnL and rrnS) and 13 protein coding genes (atp6, atp8, cox1, cox2,

177 cox3, cob, nad1, nad2, nad3, nad4, nad4l, nad5, and nad6). In addition, two rRNA operon

178 genes (ssu (18s) and lsu (28s)) were identified and annotated using RNAmmer (Lagesen et

179 al., 2007). Further, mitogenes and rRNA operons from 27 bryozoan taxa (Table S2), obtained

180 from NCBI, were aligned with our samples to compile a broader taxon sample representing a

181 cheilostome ingroup and ctenostome outgroup. A minimum gene number of four per taxon, to

182 avoid possible effects of missing-data on phylogenetic inference (Philippe et al., 2004; Wiens

183 & Morrill, 2011), was set for this study (see Table 1 and S1 for the number of genes included

184 for each taxon). MAFFT (Katoh & Standley, 2013) was used for alignment with default

185 parameters: for the four rRNA genes (nucleotide) the Q-INS-i model, considering secondary

186 RNA structure, was utilized; for the 13 protein-coding genes, in amino acid format, the G-

187 INS-I model was used. The 17 separate alignments were edited manually using Mesquite v3.1

188 (Maddison & Maddison, 2017). Ambiguously aligned characters were removed from each

189 alignment using Gblocks (Talavera & Castresana, 2007) with least stringent parameters. The

190 single-gene alignments were concatenated using the catfasta2phyml perl script (Nylander,

191 2010). An initial mitochondrial supermatrix, consisting of up to five assemblies per sample

192 (Fig. S1) was used to investigate possible differences among assembly methods using a

8 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

193 phylogenetic analysis (see next section). Using the best combination of assembled genes from

194 this initial supermatrix of historical samples, we then created a downstream final supermatrix,

195 where each sample (now including all samples in Table 1) was represented only once in the

196 final phylogenetic analysis. Here, the two rRNA operon genes were also included (Fig. 1).

197 The alignments (both masked and unmasked) are available through Dryad

198 (https://doi.org/???/dryad.???).

199

200 Phylogenetic reconstruction

201 Maximum likelihood (ML) phylogenetic analyses were carried out for each single gene

202 alignment using the “AUTO” parameter in RAxML v8.0.26 (Stamatakis, 2006) to establish

203 the evolutionary model with the best fit. The general time reversible (GTR+G) was the

204 preferred model for the four rRNA genes (18s, 28s, rrnS and rrnL), and MtZoa+G for all 13

205 protein coding genes. The concatenated datasets, divided into rRNA and protein gene

206 partitions, each with its own separate gamma distribution were analyzed using RAxML. The

207 topology with the highest likelihood score of 100 heuristic searches was chosen. Bootstrap

208 values were calculated from 500 pseudo-replicates.

209

210 Bayesian inference (BI) was performed using a modified version of MrBayes incorporating

211 the MtZoa evolutionary model (Huelsenbeck & Ronquist, 2001; Tanabe, 2016) for the

212 samples represented in the final supermatrix (Fig .1). The dataset was executed, as before,

213 with rRNA and protein gene partitions under their separate gamma distributions. Two

214 independent runs, each with three heated and one cold Markov Chain Monte Carlo (MCMC)

215 chain, were initiated from a random starting tree. The MCMC chains were run for 20,000,000

216 generations with trees sampled every 1,000th generation. The posterior probabilities and mean

9 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

217 marginal likelihood values of the trees were calculated after the burnin phase. The average

218 standard deviation of split frequencies between the two runs was <0.01, indicating

219 convergence of the MCMC chains.

220

221

222 Results

223 We successfully sequenced and assembled 43 cheilostome colonies representing 37 species

224 from 30 genera (Table 1). Fourteen of these come from historical samples, each representing a

225 separate genus. The oldest known historical sample we assembled was of Myriapora truncata

226 (BLEED 1197) from 1871. The 14 successfully assembled historical cheilostome samples had

227 a total DNA input range of 1.3 – 304 ng for Illumina library preparation (Table S1). Of the 21

228 sequenced historical samples, one failed to provide any sequence read data. Of the 20

229 assembled historical samples, one was removed from the dataset based on the minimum gene

230 number used for phylogenetic inference (Table S1), a result of low sequencing depth. For an

231 additional four of the 20 assembled samples, no mitogenome or rRNA operon genes for the

232 intended cheilostome target were identified with blastn, despite adequate sequencing depth

233 (Table S1). And for a remaining sample, the assembly provided a limited contig number with

234 no cheilostome target identifiable. The negative control provided 3.6 million reads that upon

235 assembly gave 50 contigs >500 bp, with a 1,561 bp maximum. All assembled contigs from

236 the negative control were attributed to contamination from Canis familiaris and Homo

237 sapiens.

238

10 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

239 Our mitochondrial comparative phylogeny of three assembly methods (GetOrganelle,

240 NOVOplasty and SPAdes) with different seeds for the historical samples demonstrated

241 congruence (Fig. S1). The nodes subtending each sample are fully supported monophylies.

242 The only observed discrepancy between the three assembly methods is seen for Porella

243 compressa (BLEED 1188), where the sequence assembled with NOVOplasty using a modern

244 P. concinna (BLEED 579) sample as seed, groups with the latter species, rather than itself.

245 The NOVOplasty assembly from Porella compressa (BLEED 1188) recovered only two

246 genes (nad3 and nad6) representing 124 (3%) of 4144 inferred characters. The SPAdes

247 assembly, in contrast, recovered five of 15 mitochondrial genes representing 1295 or 33% of

248 the inferred characters. We also observe variation in the genes that are recovered between

249 assembly methods for other specimens. The de novo method using SPAdes recovered the

250 most mitochondrial genes per sample on average (Fig. S1). This was followed by the iterative

251 method of NOVOplasty, with its results independent of seed input. Lastly, GetOrganelle

252 consistently recovered the fewest mitochondrial genes per sample. Additionally, GetOrganelle

253 failed to assemble in approximately 30% of cases. Despite these differences, the phylogeny

254 demonstrates congruence among the assembly methods, with all samples, but the sequence

255 from P. compressa using NOVOplasty, being monophyletic across assemblies.

256

257 Given the phylogenetic congruence of the assembly methods, we concatenated genes from

258 these separate assemblies to form a final matrix with each sample represented only once (see

259 methods and Table 1) to infer the phylogeny presented in Fig. 1. Specifically, the input for

260 each specimen to this supermatrix utilized the assembly method that gave the highest number

261 of annotated genes. When there is an equal number of recovered genes among assemblies for

262 a given sample, prioritization was as follows: the de novo assembly of SPAdes, before that of

11 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

263 NOVOplasty utilizing a SPAdes seed, and lastly NOVOplasty using a seed from a close

264 relative (Fig. S1).

265 In presenting our main phylogeny we only highlight relationships represented by families

266 constituting two or more genera receiving at least moderate support (>70 bootstrap support).

267 The ingroup (cheilostomes) is monophyletic and separated from the ctenostome outgroup

268 with full support 100 bootstrap (BS) / 1.00 Posterior Probability (PP) (Fig. 1). The first clade

269 to diverge within the cheilostomes (Clade X, Fig. 1) is a highly supported (99 BS /1.00 PP)

270 grouping including Calpensia, Steginoporella, Scruparia, Membranipora and Electra, with

271 the latter two genera, forming a fully supported monophyly. The remaining cheilostome taxa

272 form a fully supported group divided into two main sub-clades. The first sub-clade (Clade Y,

273 Fig. 1) is moderately supported (77 BS / 1.00 PP), and further split into two groups. The basal

274 of these two groupings is a fully supported lineage of two bugulids. The candids comprising

275 Cradoscrupocellaria and Emma have moderate support (88 BS / 1.00 PP), but place

276 paraphyletic to a highly supported (95 BS / 1.00 PP) monophyly of two Caberea species,

277 from the same family. The terminal of these two groupings receives moderate support (84 BS

278 / 1.00 PP) with a better resolved internal branching pattern than the previous clade. In

279 ascending order, we recover a lineage of three hippothoids with high support (90 BS / 1.00

280 PP), the adeonids represented by five genera with full support, and a flustrid clade, again fully

281 supported and harboring three genera. The latter two family groupings additionally form a

282 highly supported (92 BS / 1.00 PP) monophyly. The second main sub-clade (Clade Z, Fig. 1)

283 is highly supported (98 BS / 1.00 PP), and further splits into multiple lineages. Again, in

284 ascending order, a fully supported lineage of three catenicellid genera that form a highly

285 supported (99 BS / 1.00 PP) sister relationship with Myriapora truncata, and excluded from

286 terminal relationships with full support. We find a moderately supported (78 BS / 0.99 PP)

12 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

287 grouping of Fenestrulina, Microporella and Tessaradoma excluded from the terminal split

288 with (98 BS / 1.00 PP). A highly supported (96 BS/ 1.00 PP) monophyly of four celleporids

289 moderately (75 BS / 1.00 PP) excluded from the subsequent split. A fully supported

290 philodoporid grouping comprises two genera and four species. And finally, a monophyly

291 comprising a fully supported grouping of three Parasmittina species is observed and highly

292 supported (97 BS / 1.00 PP) clade formed by the species Oshurkovia littoralis with members

293 of Porella. The placements of Watersipora subtorquata and Cryptosula pallasiana are only

294 weakly supported within this final clade.

295 Of the 14 historical samples, Bugula neritina (BLEED 1200), Electra pilosa (BLEED 1178),

296 Oshurkovia littoralis (BLEED 1194) and Securiflustra securifrons (BLEED 1179) all form

297 fully supported monophyletic intraspecies relationships. Caberea ellisii (BLEED 1190) and

298 Parasmittina jeffreysi (BLEED 1202) form highly (95 BS/ 1.00 PP) and fully supported

299 monophylies, respectively, within their respective genera. Further, Porella compressa

300 (BLEED 1188) places within a genus monophyly, albeit without support. Hippothoa sp.

301 (BLEED1187), Omalosecosa ramulosa (BLEED1177) and Turbicellepora smitti

302 (BLEED1182), and Cradoscrupocellaria reptans (BLEED1192), are all highly supported

303 within their corresponding families. The three remaining taxa, Calpensia nobilis

304 (BLEED1184), Myriapora truncata (BLEED1197) and Tessaradoma boreale (BLEED1183)

305 all lack the inclusion of a close taxonomic relative in our phylogenetic inference. However,

306 for M. truncata (BLEED1197) we present a phylogeny of cox1 (Fig. S2) demonstrating that

307 our historical sample forms a fully supported monophyly with that of a M. truncata cox1

308 sequence obtained from NCBI (ATX63952).

309

13 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

310 In total, mitochondrial genomes of 27 of 43 samples were circularized with a size range of

311 13700-19704 bp (Table 1). Of these, seven were from the 14 historical samples, with the

312 oldest being that of Parasmittina jeffreysi (BLEED 1202, 14260 bp, collected in 1901; Fig.

313 2).

314

315

316 Discussion

317 Continued explorative expeditions and the discovery of yet unknown organisms will provide

318 optimal organic material for DNA sequencing, and subsequent analyses of extant species.

319 However, natural history collections are unique sources that can provide historical material

320 for organisms that may be difficult or impossible to sample from contemporary populations.

321 Experimentation with laboratory techniques and bioinformatic tools of such historical

322 samples can expand the information space for our general understanding of the biology,

323 including the genetics and phylogenetics, of both well- and under-studied species. In this

324 study, we expanded on our knowledge of a small-bodied and understudied phylum, by

325 applying HTS and a combination of de novo and iterative assembly methods.

326

327 Overcoming challenges of sequencing understudied groups

328 While the cheilostome molecular tree is very far from complete (Fuchs et al., 2009; Knight et

329 al., 2011; Orr, Haugen, et al., 2019; Orr, Waeschenbach, et al., 2019; Waeschenbach et al.,

330 2012), we have contributed to this community endeavor by sequencing a number of species

331 that have never been sequenced before and by demonstrating that it is possible, without much

332 extra effort, to extract sequence data of many genes from old, air-dried samples. We utilized

333 de novo and iterative methods to assemble our cheilostome sequence data to overcome the

14 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

334 challenges of reference-based assemblies, due to the high degree of observed sequence

335 variability among cheilostomes. This is especially important because systematic relationships

336 remain largely based on morphological characters for cheilostomes (Bock & Gordon, 2013)

337 which we know can sometimes be misleading (Orr, Haugen, et al., 2019; Orr, Waeschenbach,

338 et al., 2019), hence seed choice is fraught with difficulties. By using de novo assembled

339 sequences from the target colony itself as a reference for iterative mapping, we circumvent

340 the difficulties of reference-choice. To explore the validity of our approach, we also explored

341 the robustness of the phylogenetic placement by using both nominal conspecifics and

342 congenerics as references where available (Fig. S1). Encouragingly, we find that in the

343 majority of the cases, even historical samples can be used as their own reference for iterative

344 mapping assemblies, at least for the samples we studied. Phylogenetic comparison of these

345 assemblies consistently demonstrated fully supported monophylies, independent of method

346 and the supplied reference. While SPAdes, in general, recovered the most complete

347 assemblies and mitochondrial genes, the output of NOVOplasty was almost comparable, and

348 even better in some cases. We propose that the iterative method employed in NOVOplasty

349 could supplement that of the de novo, beckoning integration into mitogenome assembly

350 pipelines, and removing the necessity for a reference sequence from a closely related species.

351 Conversely, the use of GetOrganelle was superfluous for our dataset, recovering the fewest

352 genes per sample, and having the highest failure rate. The iterative methods of GetOrganelle

353 and NOVOplasty have previously been compared for chloroplast assemblies (de Oliveira,

354 Duijm, & Steege, 2020; Freudenthal et al., 2019) and combined for plastid genome

355 construction (Ma & Lu, 2019; Tan et al., 2019; Yan, Zhang, Zhao, & Yuan, 2019). Results

356 from these previous comparative analyses, in contrast to our own, indicate GetOrganelle may

357 be more suited to chloroplast assemblies. Additional iterative methods, such as MitoZ (Meng,

15 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

358 Li, Yang, & Liu, 2019), have not been evaluated here and could be included in future studies.

359 Finally, the pipeline we describe in the present study proved suitable for circularizing

360 mitochondrial genomes (27 samples, ranging from 13700-19704 bp; Table 1), including from

361 historical samples >100 years old (Fig. 2). While detailed mitogenome comparisons are

362 beyond the scope of this current study, they are now available for further genome evolution

363 research.

364

365 An expanded cheilostome phylogeny

366 The phylogeny presented here is the most broadly taxonomically sampled cheilostome tree to

367 date. In our discussion of the topology we focus on the relationships that have received at

368 least moderate support (see results). Our tree (Fig. 1) lends statistical support to previous

369 work on the molecular phylogeny of cheilostomes (Knight et al., 2011; Orr, Haugen, et al.,

370 2019; Orr, Waeschenbach, et al., 2019; Waeschenbach et al., 2012), but also demonstrates

371 well-supported branching patterns for new relationships. In summary, the most basal

372 cheilostome clade (Clade X, Fig. 1) consists of Electra, Membranipora and Scruparia,

373 reminiscent of the topology depicted by Waeschenbach et al. (2012), but we add two newly

374 sequenced genera, namely, Calpensia and Steginoporella to this grouping. Calpensia

375 (currently placed in Microporidae) is sister to Steginoporella, based on 14 and 16 genes

376 respectively (Fig 1). They share a frontal, depressed, pseudoporous cryptocyst, which in

377 Calpensia and Thalamoporella, the sister of Steginoporella (Knight et al., 2011), bears two

378 opesiules while in Steginoporella are paired opercular indentations, outlined by the median

379 process, forming opesiule-like lateral openings in some species. This result calls for a revision

380 of the systematic placement of Calpensia, likely in the superfamily Thalamoporelloidea,

381 where additional molecular sequence data from Thalamoporella could prove useful.

16 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

382 Presently, only partial cox1 and rrnL sequence data are available each from two different

383 Thalamoporella species, and as such, were not included in this study. In the first main

384 cheilostome clade (Clade Y, Fig. 1) we add the genera Arachnopusia, Bicellariella, Caberea,

385 Emma, Hincksina, Hippothoa, Patsyella and Rhabdozoum to that previously shown (Orr,

386 Haugen, et al., 2019; Waeschenbach et al., 2012). Hippothoa, the morphologically highly

387 distinct hippothoids, places with high confidence with the previously sequenced Antarctothoa

388 and Celleporella (Knight et al., 2011). Hincksina places with commonly occurring members

389 of the family Flustridae (namely Flustra and Securiflustra). Even though Hincksina is in need

390 of revision with regards to its relationship with Gregarinidra, the placement of this sample

391 within the Flustridae seems beyond doubt. Moving on to the second main clade (Clade Z, Fig.

392 1) of cheilostomes represented in our topology, we place Cornuticella, Myriapora,

393 Omalosecosa, Parasmittina, Porella, Pterocella, Stephanollona, Terminocella, Tessaradoma

394 and Turbicellepora for the first time. Myriapora places as sister to the catenicellids, a position

395 that may be the result of a limited taxon sample. Catenicellidae have a fundamentally

396 gymnocystal-shield, with the earliest, Maastrichtian, species having a costate frontal area (A.

397 Cheetham, pers. comm. In Banta & Wass, 1979). In contrast, Myriaporidae have a

398 pseudoporous-lepralioid-shield and, in addition, a vastly different astogeny and colony form

399 is observed between these two families (Ferretti, Magnino, & Balduzzi, 2007; Wass, 1983).

400 Tessaradoma boreale (family Tessaradomidae) places between Microporella and

401 Fenestrulina, two genera that are superficially similar but are now found to belong to separate

402 families (Orr, Waeschenbach, et al., 2019). Tessaradoma, prior to the introduction of the

403 family Tessaradomidae (also including the genus Smithsonius), has been considered within

404 Microporellidae (MacGillivary, 1895) despite these families having contrasting frontal shield

405 forms. A key feature of Tessaradoma, which partly explains its attributions to

17 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

406 Microporellidae (among other families), is a suboral opening in the frontal shield interpreted

407 as a spiramen (Gordon, 1993). However, broader taxon sampling is needed to confirm the

408 affinity of Tessaradoma to Microporellidae, prompting a future inclusion of Taylorius, as

409 previously suggested (Orr, Waeschenbach, et al., 2019), Smithsonius, and lastly

410 Siphonicytara, which may have descended from a Beisselina-like (Tessaradomidae) ancestor

411 (Gordon & Taylor, 2015).

412

413 Verifying the placement of historical samples

414 The recent introduction of HTS and genome-skimming to bryozoology has brought overdue

415 support to phylogenetic relationships, but this approach is not without its limitations (Orr,

416 Haugen, et al., 2019; Orr, Waeschenbach, et al., 2019). Cheilostome bryozoans live in close

417 proximity with other fouling/encrusting organisms, including other bryozoan species. As

418 such, the isolation and subsequent amplification of solely target DNA is challenging. In our

419 study we verified that the targeted bryozoan in 11 of our historical samples was sequenced

420 successfully via phylogenetic placement (Fig. 1). However, Calpensia (BLEED 1184),

421 Myriapora (BLEED 1197) and Tessaradoma (BLEED 1184) cannot be verified the same

422 way. The placement of these three specimens are challenging to validate as we lack close

423 taxonomic relatives in our phylogenetic inference to support their respective positions (Fig.

424 1). However, other information, beyond relative phylogenetic proximity, can help to verify

425 specific samples. Firstly, as discussed earlier, Calpensia shares important morphological traits

426 with Steginoporella and its relatives even though it currently nominally belongs to the

427 phylogenetically more derived, but one-size-fits all family Microporidae (Gray, 1848; Jullien,

428 1888). Secondly, the Myriapora cox1 gene has previously been amplified utilizing barcoding

429 primers (Cahill et al., 2017). We can therefore confirm the placement of Myriapora (BLEED

18 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

430 1197) in the main phylogeny (Fig. 1) based on 17 genes, by showing its cox1 affinity to that

431 of Myriapora cox1 from NCBI (Fig. S2). Lastly, the position of Tessaradoma (BLEED 1184)

432 as sister to Microporella remains a working hypothesis until increased taxon sampling can

433 either confirm or reject this position. We emphasize that nothing can substitute for the

434 continued accumulation of broadly taxonomically sampled sequence and morphological data

435 for aiding phylogenetic understanding. Lastly, and considering the confirmed phylogenetic

436 placement of all but one of our 14 historical samples, contaminant issues can be considered

437 negligible. Our negative control provided only a limited read number, that upon assembly

438 gave few and short contigs attributed to mammal DNA, which were filtered out during the

439 assembly pipeline. Considering the encrusting and fouling nature of bryozoans in general,

440 contamination especially from the same Order is a concern independent of sample age. Here,

441 we have treated our cheilostome samples as a metagenomes, as already demonstrated in (Orr,

442 Haugen, et al., 2019).

443

444 Sample age and quality versus sequencing success

445 The aim of our study was to demonstrate the possibility of sequencing historical, air-dried

446 cheilostome material. As it was not our aim to systematically evaluate the lower limit of total

447 DNA that could be amplified and HTS sequenced for bryozoans, we merely confirm success

448 down to 1.3ng total DNA (see Table S1), although total DNA as low as 50pg is theoretically

449 possible to amplify using the SMARTer ThruPLEX Illumina library preparation. Of the 21

450 historical samples we sequenced, only three samples had few or no reads, despite apparently

451 successful library preparation. A single sample had too few assembled contigs to identify the

452 cheilostome target. For four samples, one or more contaminants were sequenced instead

453 (analyses not shown) and subsequently identified and removed from analyses (Table S1). As

19 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

454 mentioned, contamination is common also for recently collected samples, highlighting the

455 necessity of robust filtering steps in the bioinformatic pipeline. It may be expected that non-

456 encrusting but erect colonies could be more easily freed of contaminants, resulting in

457 improved assemblies and subsequent phylogenetic placement. However, this idea is not

458 supported in the present study given the distribution of failed and successful samples among

459 erect and encrusting colony forms (Table S2, Fisher’s Exact Test p >>0.05). While we did not

460 systematically study the effect of sample age, as multiple factors could not be controlled for

461 (including species, quality and volume of starting material and treatment during and after

462 collecting and/or first preservation), there appears to be no simple relationship between age or

463 preservation method and the sequencing success for cheilostome bryozoans (Table S1). Given

464 these observations, we encourage researchers to attempt the sequencing of historical

465 cheilostome samples of any age or species if ample material is available for experimentation.

466

467 The importance of natural history collections

468 Natural history collections are treasure-troves of past information that can potentially shed

469 light on many questions that we may not even have yet thought of, with tools that are

470 continuously being developed. We have capitalized on historical samples stored at the

471 Museum of Natural History at the University of Oslo, accumulated during periods where

472 future sequencing technologies or phylogenetic estimation techniques belonged to the realm

473 of science fiction (Short, Dikow, & Moreau, 2018; Yeates, Zwick, & Mikheyev, 2016). Many

474 more such collections exist throughout the world and our relatively simple pipeline widens

475 the possibilities of extracting useful sequence information from many more under-studied

476 phyla and small-bodied invertebrates, beyond cheilostome bryozoans. This work illustrates

20 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

477 the importance of natural history collections, and the need to store and maintain these for

478 future generations and technological advances.

479

480

481 Acknowledgements

482 We thank Rønning A.H. for help in the NHM Oslo collections, Thorbek B.L.G. for assistance

483 in the sensi-lab, Mills S. for handling and export of the New Zealand NIWA (National

484 Institute of Water and Atmospheric Research) samples, and Rust S. for sorting samples from

485 the University of Otago. Sequencing was performed at the Norwegian Sequencing Centre

486 (University of Oslo) and analyses done on the project node nn9244k on the Abel server

487 (University of Oslo). This project has received funding from the European Research Council

488 (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant

489 agreement No 724324 to LHL) and MMS has received student funding from NHM Oslo.

490 Field work in Sweden was supported by Assemble Plus grant to RJSO (grant number

491 730984). MO was supported by SYNTHESYS Plus grant (grant number 823827).

492

493

494 References 495 Altschul, S. F., Gish, W., Miller, W., Myers, E. W., & Lipman, D. J. (1990). Basic local 496 alignment search tool. Journal of Molecular Biology, 215(3), 403-410. 497 doi:http://dx.doi.org/10.1016/S0022-2836(05)80360-2 498 Andrews, S. (2010). FastQC: a quality control tool for high throughput sequence data. 499 Retrieved from https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ 500 Anmarkrud, J. A., & Lifjeld, J. T. (2017). Complete mitochondrial genomes of eleven extinct 501 or possibly extinct bird species. Molecular Ecology Resources, 17(2), 334-341. 502 doi:10.1111/1755-0998.12600 503 Bankevich, A., Nurk, S., Antipov, D., Gurevich, A. A., Dvorkin, M., Kulikov, A. S., . . . 504 Pevzner, P. A. (2012). SPAdes: a new genome assembly algorithm and its applications 505 to single-cell sequencing. Journal of Computational Biology, 19(5), 455-477. 506 doi:10.1089/cmb.2012.0021

21 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

507 Banta, W. C., & Wass, R. E. (1979). Catenicellid Cheilostome Broyozoa. I. Frontal walls. 508 Australian Journal of Zoology Supplementary Series, 27. doi:10.1071/AJZS068 509 Bernt, M., Donath, A., Jühling, F., Externbrink, F., Florentz, C., Fritzsch, G., . . . Stadler, P. F. 510 (2013). MITOS: Improved de novo metazoan mitochondrial genome annotation. 511 Molecular Phylogenetics and Evolution, 69(2), 313-319. 512 doi:https://doi.org/10.1016/j.ympev.2012.08.023 513 Billerman, S. M., & Walsh, J. (2019). Historical DNA as a tool to address key questions in 514 avian biology and evolution: A review of methods, challenges, applications, and 515 future directions. Molecular Ecology Resources, 19(5), 1115-1130. doi:10.1111/1755- 516 0998.13066 517 Bock, P. E., & Gordon, D. (2013). Phylum Bryozoa Ehrenberg, 1831. Zootaxa, 3703(1), 67- 518 74. 519 Bonanomi, S., Therkildsen, N. O., Hedeholm, R. B., Hemmer-Hansen, J., & Nielsen, E. E. 520 (2014). The use of archived tags in retrospective genetic analysis of fish. Mol Ecol 521 Resour, 14(3), 616-621. doi:10.1111/1755-0998.12211 522 Bunce, M., Worthy, T. H., Phillips, M. J., Holdaway, R. N., Willerslev, E., Haile, J., . . . 523 Cooper, A. (2009). The evolutionary history of the extinct ratite moa and New 524 Zealand Neogene paleogeography. Proceedings of the National Academy of Sciences, 525 106(49), 20646-20651. doi:10.1073/pnas.0906660106 526 Cahill, A. E., De Jode, A., Dubois, S., Bouzaza, Z., Aurelle, D., Boissin, E., . . . Chenuil, A. 527 (2017). A multispecies approach reveals hot spots and cold spots of diversity and 528 connectivity in invertebrate species with contrasting dispersal modes. Molecular 529 Ecology, 26(23), 6563-6577. doi:10.1111/mec.14389 530 de Oliveira, S. M., Duijm, E., & Steege, H. t. (2020). Chloroplast genome of the nutmeg tree: 531 Myristica fragrans Houtt. (Myristicaceae). bioRxiv, 2020.2002.2025.964122. 532 doi:10.1101/2020.02.25.964122 533 Der Sarkissian, C., Pichereau, V., Dupont, C., Ilsøe, P. C., Perrigault, M., Butler, P., . . . 534 Orlando, L. (2017). Ancient DNA analysis identifies marine mollusc shells as new 535 metagenomic archives of the past. Molecular Ecology Resources, 17(5), 835-853. 536 doi:10.1111/1755-0998.12679 537 Derkarabetian, S., Benavides, L. R., & Giribet, G. (2019). Sequence capture phylogenomics 538 of historical ethanol-preserved museum specimens: Unlocking the rest of the vault. 539 Molecular Ecology Resources, 19(6), 1531-1544. doi:10.1111/1755-0998.13072 540 Dierckxsens, N., Mardulyn, P., & Smits, G. (2016). NOVOPlasty: de novo assembly of 541 organelle genomes from whole genome data. Nucleic Acids Research, 45(4), e18-e18. 542 doi:10.1093/nar/gkw955 543 Ferretti, C., Magnino, G., & Balduzzi, A. (2007). Morphology of the larva and ancestrula of 544 Myriapora truncata (Bryozoa, ). Italian Journal of Zoology, 74(4), 545 341-350. doi:10.1080/11250000701629572 546 Freudenthal, J. A., Pfaff, S., Terhoeven, N., Korte, A., Ankenbrand, M. J., & Förster, F. 547 (2019). The landscape of chloroplast genome assembly tools. bioRxiv, 665869. 548 doi:10.1101/665869 549 Fuchs, J., Obst, M., & Sundberg, P. (2009). The first comprehensive molecular phylogeny of 550 Bryozoa (Ectoprocta) based on combined analyses of nuclear and mitochondrial 551 genes. Molecular Phylogenetics and Evolution, 52, 225–233. 552 Gondek, A. T., Boessenkool, S., & Star, B. (2018). A stainless-steel mortar, pestle and sleeve 553 design for the efficient fragmentation of ancient bone. BioTechniques, 64(6), 266-269. 554 doi:10.2144/btn-2018-0008

22 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

555 Gordon, D. P. (1993). Bryozoan frontal shields: studies on umbonulomorphs and impacts on 556 classification. Zoologica Scripta, 22(2), 203-221. doi:10.1111/j.1463- 557 6409.1993.tb00352.x 558 Gordon, D. P., & Taylor, P. D. (2015). Bryozoa of the Early Eocene Tumaio Limestone, 559 Chatham Island, New Zealand. Journal of Systematic Palaeontology, 13(12), 983- 560 1070. doi:10.1080/14772019.2014.991905 561 Gray, J. E. (1848). Part 1. Centroniae or radiated . In List of the specimens of British 562 animals in the collection of the British Museum (pp. 1-173). 563 Greiner, S., Lehwark, P., & Bock, R. (2019). OrganellarGenomeDRAW (OGDRAW) version 564 1.3.1: expanded toolkit for the graphical visualization of organellar genomes. Nucleic 565 Acids Research, 47(W1), W59-W64. doi:10.1093/nar/gkz238 566 Holmes, M. W., Hammond, T. T., Wogan, G. O. U., Walsh, R. E., LaBarbera, K., Wommack, 567 E. A., . . . Nachman, M. W. (2016). Natural history collections as windows on 568 evolutionary processes. Molecular Ecology, 25(4), 864-881. doi:10.1111/mec.13529 569 Huelsenbeck, J. P., & Ronquist, F. (2001). MrBayes: Bayesian inference of phylogenetic 570 trees. Bioinformatics, 17(8), 754-755. doi:10.1093/bioinformatics/17.8.754 571 Jarvis, E. D., Mirarab, S., Aberer, A. J., Li, B., Houde, P., Li, C., . . . Zhang, G. (2014). 572 Whole-genome analyses resolve early branches in the tree of life of modern birds. 573 Science, 346(6215), 1320-1331. doi:10.1126/science.1253451 574 Jin, J.-J., Yu, W.-B., Yang, J.-B., Song, Y., dePamphilis, C. W., Yi, T.-S., & Li, D.-Z. (2019). 575 GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle 576 genomes. bioRxiv, 256479. doi:10.1101/256479 577 Jullien, J. (1888). Bryozoaires. Mission scientifique du Cap Horn 1882-1883. In (Vol. 6, pp. 578 1-92). Zoologie. 579 Katoh, K., & Standley, D. M. (2013). MAFFT Multiple equence Alignment Software Version 580 7: Improvements in performance and usability. Molecular Biology and Evolution, 581 30(4), 772-780. doi:10.1093/molbev/mst010 582 Kistenich, S., Halvorsen, R., Schrøder-Nielsen, A., Thorbek, L., Timdal, E., & Bendiksby, M. 583 (2019). DNA Sequencing Historical Lichen Specimens. Frontiers in Ecology and 584 Evolution, 7(5). doi:10.3389/fevo.2019.00005 585 Knapp, M., & Hofreiter, M. (2010). Next Generation Sequencing of Ancient DNA: 586 Requirements, Strategies and Perspectives. Genes, 1(2), 227-243. 587 doi:10.3390/genes1020227 588 Knight, S., Gordon, D. P., & Lavery, S. D. (2011). A multi-locus analysis of phylogenetic 589 relationships within cheilostome bryozoans supports multiple origins of ascophoran 590 frontal shields. Molecular Phylogenetics and Evolution, 61(2), 351-362. 591 doi:http://dx.doi.org/10.1016/j.ympev.2011.07.005 592 Krueger, F. (2015). TrimGalore. Retrieved from 593 https://www.bioinformatics.babraham.ac.uk/projects/trim_galore/ 594 Lagesen, K., Hallin, P., Rødland, E. A., Staerfeldt, H.-H., Rognes, T., & Ussery, D. W. 595 (2007). RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic 596 Acids Research, 35(9), 3100-3108. doi:10.1093/nar/gkm160 597 Larson, G., Cucchi, T., Fujita, M., Matisoo-Smith, E., Robins, J., Anderson, A., . . . Dobney, 598 K. (2007). Phylogeny and ancient DNA of Sus provides insights into neolithic 599 expansion in Island Southeast Asia and Oceania. Proceedings of the National 600 Academy of Sciences, 104(12), 4834-4839. doi:10.1073/pnas.0607753104 601 Larsson, P., Oliveira, H. R., Lundström, M., Hagenblad, J., Lagerås, P., & Leino, M. W. 602 (2019). Population genetic structure in Fennoscandian landrace rye (Secale cereale L.)

23 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

603 spanning 350 years. Genetic Resources and Crop Evolution, 66(5), 1059-1071. 604 doi:10.1007/s10722-019-00770-0 605 Laumer, C. E., Fernández, R., Lemer, S., Combosch, D., Kocot, K. M., Riesgo, A., . . . 606 Giribet, G. (2019). Revisiting metazoan phylogeny with genomic sampling of all 607 phyla. Proceedings of the Royal Society B: Biological Sciences, 286(1906), 20190831. 608 doi:doi:10.1098/rspb.2019.0831 609 Léveillé-Bourret, É., Starr, J. R., Ford, B. A., Moriarty Lemmon, E., & Lemmon, A. R. 610 (2017). Resolving Rapid Radiations within Angiosperm Families Using Anchored 611 Phylogenomics. Systematic Biology, 67(1), 94-112. doi:10.1093/sysbio/syx050 612 Ma, Q., & Lu, Y. (2019). The complete chloroplast genome of Eichhornia crassipes 613 (Pontederiaceae) and phylogeny of commelinids. Mitochondrial DNA Part B, 4(2), 614 3186-3187. doi:10.1080/23802359.2019.1667901 615 MacGillivary, P. H. (1895). A monograph of the Tertiary Polyzoa of Victoria. In 616 Transactions of the Royal Society of Victoria (Vol. 4, pp. 1-166). 617 Maddison, W. P., & Maddison, D. R. (2017). Mesquite: a modular system for evolutionary 618 analysis. Version 3.1 http://mesquiteproject.org. 619 Meng, G., Li, Y., Yang, C., & Liu, S. (2019). MitoZ: a toolkit for animal mitochondrial 620 genome assembly, annotation and visualization. Nucleic Acids Research, 47(11), e63- 621 e63. doi:10.1093/nar/gkz173 622 Mitchell, K. J., Llamas, B., Soubrier, J., Rawlence, N. J., Worthy, T. H., Wood, J., . . . 623 Cooper, A. (2014). Ancient DNA reveals elephant birds and kiwi are sister taxa and 624 clarifies ratite bird evolution. Science, 344(6186), 898-900. 625 doi:10.1126/science.1251981 626 Nylander, J. A. A. (2010). catfasta2phyml. Retrieved from 627 https://github.com/nylander/catfasta2phyml 628 Orr, R. J. S., Haugen, M. N., Berning, B., Bock, P., Cumming, R. L., Florence, W. K., . . . 629 Liow, L. H. (2019). A genome-skimmed phylogeny of a widespread bryozoan family, 630 Adeonidae. BMC Evolutionary Biology, 19(1), 235. doi:10.1186/s12862-019-1563-4 631 Orr, R. J. S., Waeschenbach, A., Enevoldsen, E. L. G., Boeve, J. P., Haugen, M. N., Voje, K. 632 L., . . . Liow, L. H. (2019). Bryozoan genera Fenestrulina and Microporella no longer 633 confamilial; multi-gene phylogeny supports separation. Zoological Journal of the 634 Linnean Society, 186(1), 190-199. doi:10.1093/zoolinnean/zly055 635 Philippe, H., Snell, E. A., Bapteste, E., Lopez, P., Holland, P. W. H., & Casane, D. (2004). 636 Phylogenomics of Eukaryotes: Impact of Missing Data on Large Alignments. 637 Molecular Biology and Evolution, 21(9), 1740-1752. doi:10.1093/molbev/msh182 638 Ruane, S., & Austin, C. C. (2017). Phylogenomics using formalin-fixed and 100+ year-old 639 intractable natural history specimens. Molecular Ecology Resources, 17(5), 1003- 640 1008. doi:10.1111/1755-0998.12655 641 Sharko, F. S., Rastorguev, S. M., Boulygina, E. S., Tsygankova, S. V., Ibragimova, A. S., 642 Tikhonov, A. N., & Nedoluzhko, A. V. (2019). Molecular phylogeny of the extinct 643 Steller's sea cow and other Sirenia species based on their complete mitochondrial 644 genomes. Genomics, 111(6), 1543-1546. 645 doi:https://doi.org/10.1016/j.ygeno.2018.10.012 646 Short, A. E. Z., Dikow, T., & Moreau, C. S. (2018). Entomological Collections in the Age of 647 Big Data. Annual Review of Entomology, 63(1), 513-530. doi:10.1146/annurev-ento- 648 031616-035536 649 Simion, P., Philippe, H., Baurain, D., Jager, M., Richter, D. J., Di Franco, A., . . . Manuel, M. 650 (2017). A Large and Consistent Phylogenomic Dataset Supports Sponges as the Sister

24 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

651 Group to All Other Animals. Current Biology, 27(7), 958-967. 652 doi:10.1016/j.cub.2017.02.031 653 Sproul, J. S., & Maddison, D. R. (2017). Sequencing historical specimens: successful 654 preparation of small specimens with low amounts of degraded DNA. Molecular 655 Ecology Resources, 17(6), 1183-1201. doi:10.1111/1755-0998.12660 656 Stamatakis, A. (2006). RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses 657 with thousands of taxa and mixed models. Bioinformatics, 22(21), 2688-2690. 658 doi:10.1093/bioinformatics/btl446 659 Talavera, G., & Castresana, J. (2007). Improvement of phylogenies after removing divergent 660 and ambiguously aligned blocks from protein sequence alignments. Syst Biol, 56(4), 661 564-577. doi:10.1080/10635150701472164 662 Tan, W., Gao, H., Yang, T., Jiang, W., Zhang, H., Yu, X., & Tian, X. (2019). The complete 663 chloroplast genome of a medicinal resource plant (Rumex crispus). Mitochondrial 664 DNA Part B, 4(2), 2800-2801. doi:10.1080/23802359.2019.1660254 665 Tanabe, A. S. (2016). mrbayes5d. Retrieved from https://github.com/astanabe/mrbayes5d 666 Waeschenbach, A., Taylor, P. D., & Littlewood, D. T. J. (2012). A molecular phylogeny of 667 bryozoans. Molecular Phylogenetics and Evolution, 62, 718-735. 668 Wass, R. E. (1983). Early astogeny in the Catenicellidae (Bryozoa, Cheilostomata). 669 Alcheringa: An Australasian Journal of Palaeontology, 7(1), 41-48. 670 doi:10.1080/03115518308619632 671 Wiens, J. J., & Morrill, M. C. (2011). Missing Data in Phylogenetic Analysis: Reconciling 672 Results from Simulations and Empirical Data. Systematic Biology, 60(5), 719-731. 673 doi:10.1093/sysbio/syr025 674 Yan, M., Zhang, X., Zhao, X., & Yuan, Z. (2019). The complete mitochondrial genome 675 sequence of sweet cherry (Prunus avium cv. ‘summit’). Mitochondrial DNA Part B, 676 4(1), 1996-1997. doi:10.1080/23802359.2019.1617082 677 Yeates, D. K., Zwick, A., & Mikheyev, A. S. (2016). Museums are biobanks: unlocking the 678 genetic potential of the three billion specimens in the world's biological collections. 679 Current Opinion in Insect Science, 18, 83-88. 680 doi:https://doi.org/10.1016/j.cois.2016.09.009 681 682 Author contributions: RJSO, SB and LHL designed the study, HM, AMS, DPG, MO, 683 RJSO, LHL, MHR collected the material, RJSO and MMS performed the lab work and 684 bioinformatics, MHR and EDM took the SEMs, DPG, EDM, MO identified the taxa, RJSO 685 and LHL wrote the first draft of the manuscript which all authors commented and agreed on. 686 687 Table 1. Samples generated and analyzed in this study: BLEED stands for Bryozoan Lab for Ecology, 688 Evolution and Development and BLEED numbers are numerical tags for the specimens. Abbreviations for 689 countries: NZ = New Zealand, SWE = Sweden, AUS = Australia, NOR = Norway, IT = Italy and UK = United 690 Kingdom. The size of the mitogenome, in base pairs (bp), is only shown if closed/circularized, otherwise the cell 691 is labelled as “NO”. The columns “rRNA” and “Mt” represent the number of rRNA and mitochondrial genes 692 annotated and used in phylogenetic inference, with a maximum of 2 and 15, respectively. An * succeeding the 693 year of collection indicates an air-dried historical sample, defined here as 50 years or older. Multiple historical 694 samples have “<1970” indicated because the collection year was unstated. However, taxonomic identifications 695 were made in 1970 for these specimens, implying that they must have been collected then, or earlier. For an 696 expanded overview of metadata see Table S1 and for additional taxa used, but not generated during this study, 697 Table S2.

25 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

Genus Species BLEED Year Country rRNA Mt Mt size (bp) Accession Antarctothoa delta 703 2018 NZ 2 12 14114 ?? Arachnopusia unicornis 183 2009 NZ 2 15 NO ?? Arachnopusia unicornis 221 2011 NZ 2 15 NO ?? Bicellariella ciliata 560 2018 SWE 2 15 19704 ?? Bugula neritina 1200 1960* AUS 2 15 15414 ?? Caberea angusta 160 2009 NZ 2 13 NO ?? Caberea ellisii 1190 <1970* NOR 0 14 17589 ?? Calpensia nobilis 1184 <1970* IT 2 14 14049 ?? Celleporina sinuata 328 2015 NZ 1 14 15651 ?? Cornuticella trapezoidea 1687 2018 NZ 2 14 14590 ?? Cradoscrupocellaria reptans 1192 <1970* NO 1 15 NO ?? Electra pilosa 819 2018 NO 2 14 13970 ?? Electra pilosa 1172 2018 NO 2 13 13965 ?? Electra pilosa 1178 <1970* UNKNOWN 0 4 NO ?? Electra scuticifera 98 2015 NZ 2 12 13804 ?? Emma tricellata 196 2015 NZ 2 14 16275 ?? Flustra foliacea 568 2018 SWE 2 15 16802 ?? Hincksina sp. 61 2009 NZ 2 14 15726 ?? Hippothoa sp. 1187 1875* UK 0 15 NO ?? Membranipora membranacea 816 2018 NO 2 15 14739 ?? Membranipora membranacea 1163 2018 SWE 2 15 14736 ?? Myriapora truncata 1197 1871* IT 2 15 NO ?? Omalosecosa ramulosa 1177 <1970* NO 1 8 NO ?? Oshurkovia littoralis 1194 <1970* UK 2 13 13973 ?? Parasmittina aotea 86 2010 NZ 2 13 14232 ?? Parasmittina jeffreysi 1202 1901* Arctic 2 15 14260 ?? Parasmittina solenosmilioides 1267 2015 NZ 2 15 17725 ?? Patsyella acanthodes 131 2011 NZ 2 14 NO ?? Porella compressa 1188 <1970* NO 1 5 NO ?? Porella concinna 579 2018 SWE 2 13 14277 ?? Porella concinna 1169 2018 SWE 2 14 14278 ?? Pterocella scutella 104 2015 NZ 2 13 14025 ?? Rhabdozoum wilsoni 695 2018 NZ 0 15 NO ?? Rhynchozoon angulatum 694 2018 NZ 2 13 14372 ?? Rhynchozoon zealandicum 127 2015 NZ 2 12 14092 ?? Securiflustra securifrons 551 2018 SWE 2 8 NO ?? Securiflustra securifrons 1179 <1970* NO 2 15 17457 ?? Steginoporella perplexa 100 2011 NZ 2 12 13700 ?? Stephanollona scintillans 679 2018 NZ 1 12 NO ?? Stephanollona sp. 170 2009 NZ 2 8 NO ?? Terminocella n.sp. 194 2010 NZ 2 15 NO ??

26 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

Tessaradoma boreale 1183 <1970* NO 2 13 14211 ?? Turbicellepora smitti 1182 <1970* NO 0 14 NO ?? 698

699

700

701

702

703

704

27 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

*Porella concinna BLEED579 *Porella concinna BLEED1169 *Porella compressa BLEED1188 (<1970) 9 7/1.00 Oshurkovia littoralis 5 0/0.99 *Oshurkovia littoralis BLEED1194 (<1970)

Cryptosula pallasiana Clade Z 5 4/1.00 *Parasmittina jeffreysi BLEED1202 (1901) 9 7/1.00 *Parasmittina solenosmilioides BLEED1267 *Parasmittina aotea BLEED86 Watersipora subtorquata 7 5/1.00 *Rhynchozoon angulatum BLEED694 *Rhynchozoon zealandicum BLEED127 *Stephanollona scintillans BLEED679 9 8/1.00 *Stephanollona sp. BLEED170 *Omalosecosa ramulosa BLEED1177 (<1970) *Turbicellepora smitti BLEED1182 (<1970) 9 5/1.00 *Celleporina sinuata BLEED328 9 6/1.00 Galeopsis porcellanicus Microporella cf.ciliata Microporella sp.

9 Microporella sp. nov. 2

8/1.00 8 1/1.00 *Tessaradoma boreale BLEED1183 (<1970) 7 8/0.99 Fenestrulina malusii Fenestrulina sp. nov. 1 *Cornuticella trapezoidea BLEED1687 8 9/1.00 Terminocella n. sp. BLEED194 *Pterocella scutella BLEED104 9 9/1.00

*Myriapora truncata BLEED1197 (1871) Cheilostomes *Securiflustra securifrons BLEED1179 (<1970) *Securiflustra securifrons BLEED551 *Hincksina sp. BLEED61 Flustra foliacea *Flustra foliacea BLEED568 9 2/1.00 foliifera fascialis 5 5/0.97 Reptadeonella brasiliensis 9 5/1.00 Laminopora jellyae 9 1/1.00 8 Cucullipora sp. 4/1.00 Adeonella pallasii

*Antarctothoa delta BLEED703 Clade Y Celleporella hyalina 9 0/1.00 *Hippothoa sp. BLEED1187 (1875)

7 *Caberea ellisii BLEED1190 (<1970)

7/1.00 9 5/1.00 *Caberea angusta BLEED160 *Rhabdozoum wilsoni BLEED695 5 2/0.92 *Emma tricellata BLEED196 8 8/1.00 *Cradoscrupocellaria reptans BLEED1192 (<1970) *Patsyella acanthodes BLEED131 *Arachnopusia unicornis BLEED183 *Arachnopusia unicornis BLEED221 5 9/0.91 Callopora lineata Bugula neritina *Bugula neritina BLEED1200 (1960) *Bicellariella ciliata BLEED560 *Electra pilosa BLEED1172 8 2/- *Electra pilosa BLEED1178 (<1970) Electra pilosa *Electra pilosa BLEED819 Clade X *Electra scuticifera BLEED98 Membranipora membranacea *Membranipora membranacea BLEED1163 *Membranipora membranacea BLEED816 Membranipora grandicella 9

9/1.00 Scruparia chelata *Calpensia nobilis BLEED1184 (<1970) *Steginoporella perplexa BLEED100 Alcyonidium mytili Flustrellidra hispida Anguinella palmata Ctenostomes Paludicella sp. Amathia citrina 0.1 705 706 Figure 1. The inferred phylogeny of cheilostomes based on 17 genes from historical and recently sampled 707 material. Maximum likelihood topology of 65 cheilostome ingroup taxa and 5 ctenostome outgroup taxa with 708 8324 nucleotide and amino acid characters inferred using RAxML (100 heuristic searches and bootstrap of 500 709 pseudoreplicates). The numbers on the internal nodes are ML bootstrap values (BS from RAxML) followed by 710 posterior probabilities (PP from MrBayes). Circles indicate 100 BP and 1.00 PP. Only BS >50 and PP >0.95 are 711 shown, dash indicates values below this. * indicates taxa generated in this study. Bold font indicates historical 712 samples with sampling year in brackets. Clade X (blue), Clade Y (red), and Clade X (brown) discussed in the 713 text are highlighted. See Table S1 for gene list.

28 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

714

cox3

rrnS nad1

nad3

nad6

atp8 cox2

cox1

cob Parasmittina jeffreysi B mitochondrial genome 14,260 bp

nad2

rrnL

atp6

nad4l A nad5 C 715 nad4 716 Figure 2. The circularized mitochondrial genome of Parasmittina jeffreysi (BLEED 1202). The figure shows 717 (A) the circularised 14260 bp mitogenome of P. jeffreysi (BLEED1202, collected in 1901) with gene order 718 constructed and annotated using OGDRAW v1.3.1 (Greiner, Lehwark, & Bock, 2019); the SEM (B) of the same 719 sample, and a photograph (C) from the 1901 Norwegian FRAM II expedition where this sample was collected. 720 The FRAM II picture is used with permission of the Norwegian Polar Institute.

721

722

29 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

Oshurkovia littoralis BLEED1194 Novo (NCBI) 13 *

Oshurkovia littoralis BLEED1194 SPAdes * 13 Oshurkovia littoralis BLEED1194 Novovo (SPAdes) 13 100 Oshurkovia littoralis BLEED1194 GO (SPAdes) 13 Oshurkovia littoralis BLEED1194 GO (NCBI) 13 Oshurkovia littoralis NCBI 5* Porella compressa BLEED1188 SPAdes * 98 1 90 Porella compressa BLEED1188 Novo (SPAdes) Porella compressa BLEED1188 GO (SPAdes) 1 94 Porella compressa BLEED1188 NOVO (BLEED579) 2 Porella concinna BLEED579 66 100 Porella concinna BLEED1169 Parasmittina jeffreysi BLEED1202 Novo (SPAdes) 15 100 Parasmittina jeffreysi BLEED1202 Novo (BLEED1267) 15 95 * 97 15 Parasmittina jeffreysi BLEED1202 SPAdes * Parasmittina solenosmilioides BLEED1267 95 Parasmittina aotea BLEED86 82 Cryptosula pallasiana Watersipora subtorquata 100 Rhynchozoon angulatum BLEED694 100 Rhynchozoon zealandicum BLEED127 100 Stephanollona scintillans BLEED679 Stephanollona sp. BLEED170 93 Turbicellepora smitti BLEED1182 Novo (SPAdes) 14 Turbicellepora smitti BLEED1182 GO (SPAdes) 14 100 Turbicellepora smitti BLEED1182 Novo (BLEED328) 14 14* Turbicellepora smitti BLEED1182 SPAdes * 0.1 100 Omalosecosa ramulosa BLEED1177 Novo (SPAdes) 2

95 100 * Omalosecosa ramulosa BLEED1177 SPAdes 8 * Omalosecosa ramulosa BLEED1177 Novo (BLEED328) 2 Celleporina sinuata BLEED328 100 88 Galeopsis porcellanicus Tessaradoma boreale BLEED1183 GO (SPAdes) 9 Tessaradoma boreale BLEED1183 GO (BLEED694) 9 100 Tessaradoma boreale BLEED1183 SPAdes 2 Tessaradoma boreale BLEED1183 Novo (BLEED694) 14 76 14* Tessaradoma boreale BLEED1183 Novo (SPAdes) * 100 Microporella sp. 100 100 Microporella cf. ciliata 90 Microporella sp. nov. 2 100 Fenestrulina sp. nov. 1 Fenestrulina malusii 15* Myriapora truncata BLEED1197 Novo (BLEED104) * Myriapora truncata BLEED1197 GO (BLEED104) 5 100 Myriapora truncata BLEED1197 SPAdes 5 Myriapora truncata BLEED1197 Novo (NCBI COI) 13 13 100 Myriapora truncata BLEED1197 Novo (SPAdes) 99 Terminocella n. sp. BLEED194 100 Cornuticella trapezoidea BLEED1687 Pterocella scutella BLEED104 Cradoscrupocellaria reptans BLEED1192 GO (SPAdes) 14 62 14* Cradoscrupocellaria reptans BLEED1192 SPAdes * 100 Cradoscrupocellaria reptans BLEED1192 Novo (BLEED196) 2 97 Cradoscrupocellaria reptans BLEED1192 Novo (SPAdes) 2 Emma tricellata BLEED196 54 100 Caberea ellisii BLEED1190 SPAdes 14 88 * 15 Caberea ellisii BLEED1190 Novo (SPAdes) * Caberea angusta BLEED160 Rhabdozoum wilsoni BLEED695 Bugula neritina BLEED1200 Novo (NCBI) 14 14* Bugula neritina BLEED1200 SPAdes * Bugula neritina BLEED1200 Novo (SPAdes) 14 100 Bugula neritina BLEED1200 GO (NCBI) 14 14 100 Bugula neritina BLEED1200 GO (SPAdes) Bugula neritina NCBI 100 Bicellariella ciliata BLEED560 100 Patsyella acanthodes BLEED131 100 Arachnopusia unicornis BLEED221 Arachnopusia unicornis BLEED183 Callopora lineata 15* Securiflustra securifrons BLEED1179 Novo (SPAdes) * Securiflustra securifrons BLEED1179 Novo (BLEED551) 15 100 Securiflustra securifrons BLEED1179 SPAdes 13 100 Securiflustra securifrons BLEED1179 GO (SPAdes) 11 100 Securiflustra securifrons BLEED551 100 Hincksina sp. BLEED61 100 Flustra foliacea BLEED568 Flustra foliacea 85 55 Adeona foliifera fascialis 96 Reptadeonella brasiliensis 98 Laminopora jellyae 100 Cucullipora sp. 79 Adeonella pallasii 100 Celleporella hyalina Antarctothoa delta BLEED703 96 15* Hippothoa sp. BLEED1187 SPAdes * Amathia citrina 4* Electra pilosa BLEED1178 SPAdes * 71 Electra pilosa BLEED1178 Novo (BLEED1172) 2 Electra pilosa BLEED1178 Novo (SPAdes) 2 100 Electra pilosa BLEED1172 100 Electra pilosa 100 Electra pilosa BLEED819 Electra scuticifera BLEED98 100 Membranipora membranacea BLEED816 100 Membranipora membranacea BLEED1163 100 Membranipora membranacea Membranipora grandicella Calpensia nobilis BLEED1184 SPAdes 13 14* Calpensia nobilis BLEED1184 Novo (BLEED100) * Calpensia nobilis BLEED1184 Novo (SPAdes) 13 100 Calpensia nobilis BLEED1184 GO (SPAdes) 13 92 100 Calpensia nobilis BLEED1184 GO (BLEED100) 5 Steginoporella perplexa BLEED100 Scruparia chelata rrnL atp6 atp8 cox1 cox2 cox3 cytb rrnS 50 Anguinella palmata nad1nad2nad3nad4nad4lnad5nad6 TOTAL Paludicella sp. 71 Flustrellidra hispida 723 50 Alcyonidium mytili 724 Figure S1. Phylogeny comparing assembly methods of historical samples based. Maximum likelihood 725 topology of 109 taxa and 4144 nucleotide and amino acid characters inferred using RAxML (20 heuristic 726 searches and bootstrap of 100 pseudoreplicates). Only BootStrap values BS >50 are shown. The three different 727 assembly methods are highlighted as follows: NOVOplasty (“Novo” blue font), GetOrganelle (“GO” red font), 728 and SPAdes (“SPAdes” green font). The reference/seed used is in brackets following the assembly method with 729 “SPAdes” indicating the de novo assembly of the same sample was utilized. Note that not all methods were 730 applicable to all samples. The table to the right of the tree shows the individual and total genes recovered (of a 731 maximum of 15) for each sample and method, with a colored box indicating a positive result. The gene order per 732 column is as follows: atp6, atp8, cox1, cox2, cox3, cytb, nad1, nad2, nad3, nad4, nad4l, nad5, nad6, rrnL and 733 rrnS. The * to the left and right of a row indicate that the result from this assembly method was utilized in the 734 main phylogeny shown in Fig. 1.

30 bioRxiv preprint doi: https://doi.org/10.1101/2020.03.30.015669; this version posted March 31, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

100 Bugula neritina 7 2 Bugula neritina BLEED1200 (1960) Bicellariella ciliata BLEED560 Celleporella hyalina Antarctothoa delta BLEED703 Callopora lineata Emma tricellata BLEED196 100 Arachnopusia unicornis BLEED221 Arachnopusia unicornis BLEED183 Cradoscrupocellaria reptans BLEED1192 (<1970) Patsyella acanthodes BLEED131 Rhabdozoum wilsoni BLEED695 Caberea ellisii BLEED1190 (<1970) Caberea angusta BLEED160 9 2 Securiflustra securifrons BLEED551 7 9 Securiflustra securifrons BLEED1179 (<1970) 9 1 Hincksina sp. BLEED61 100 Flustra foliacea Flustra foliacea BLEED568 9 7 Microporella sp. 9 8 Microporella cf. ciliata Microporella sp.nov. 2 9 8 Oshurkovia littoralis BLEED1194 (<1970) 100 Oshurkovia littoralis Porella concinna BLEED579 100 Porella concinna BLEED1169 5 1 Watersipora subtorquata Parasmittina aotea BLEED86 6 6 Parasmittina jeffreysi BLEED1202 (1901) 6 5 Reptadeonella brasiliensis 8 2 Laminopora jellyae 6 1 Adeona foliifera fascialis Adeonella pallasii Cucullipora sp. Hippothoa sp. BLEED1187 (1875) 9 4 Terminocella n. sp. BLEED194 100 Pterocella scutella BLEED104 Cornuticella trapezoidea BLEED1687 100 Myriapora truncata BLEED1197 (1871) Myriapora truncata ATX63952 Tessaradoma boreale BLEED1183 (<1970) Galeopsis porcellanicus Celleporina sinuata BLEED328 100 Rhynchozoon angulatum BLEED694 9 7 Rhynchozoon zealandicum BLEED127 100 Stephanollona scintillans BLEED679 Stephanollona sp. BLEED170 5 9 100 Omalosecosa ramulosa BLEED1177 (<1970) Turbicellepora smitti BLEED1182 (<1970) 6 4 Cryptosula pallasiana Parasmittina solenosmilioides BLEED1267 Amathia citrina Flustrellidra hispida Electra pilosa BLEED1178 (<1970) Electra pilosa BLEED1172 9 9 Electra pilosa 9 3 Electra pilosa BLEED819 Electra scuticifera BLEED98 8 6 Membranipora membranacea BLEED816 Membranipora membranacea BLEED1163 9 8 Membranipora membranacea Membranipora grandicella Scruparia chelata Paludicella sp. 9 1 Steginoporella perplexa BLEED100 Calpensia nobilis BLEED1184 (<1970) Alcyonidium mytili Anguinella palmata

0.1

735 736 Figure S2. Phylogeny of cox1 confirming the placement of Myriapora truncata (BLEED1197) from 1871. 737 Maximum likelihood topology of 68 taxa and 490 amino acid characters inferred using RAxML (20 heuristic 738 searches and bootstrap of 100 pseudoreplicates). Only Bootstrap values BS >50 are shown. The M. truncata 739 (BLEED1197) from 1871 and the NCBI sequence (ATX63952) are shown in bold, as is the BS support value for 740 this branch.

741 742

31