Molecular Ecology

A process of convergent amplification and tissue-specific expression dominates the evolution of toxin and toxin-like genes in sea anemones

Journal: Molecular Ecology

Manuscript ID: MEC-18-1374.R1

Manuscript Type: Original Article

Date Submitted by the Author: 09-Mar-2019

Complete List of Authors:

Surm, Joachim; Queensland University of Technology Faculty of Health; Institute of Health and Biomedical Innovation

Smith, Hayden; Queensland University of Technology Faculty of Science and Engineering; Queensland University of Technology Institute for Future Environments

Madio, Bruno; University of Queensland Institute for Molecular Bioscience

Undheim, Eivind; University of Queensland Centre for Advanced Imaging

King, Glenn F; University of Queensland Institute for Molecular Bioscience

Hamilton, Brett; University of Queensland Centre for Advanced Imaging; University of Queensland Centre for Microscopy and Microanalysis

van der Burg, Chloé; Queensland University of Technology Faculty of Health; Institute of Health and Biomedical Innovation

Pavasovic, Ana; Queensland University of Technology, School of Biomedical Sciences

Prentis, Peter

Keywords: Venom, Cnidaria, RNA-seq, phylogenetics, mass spectrometry imaging, selective pressure

Page 1 of 50 Molecular Ecology

1 Article

2 A process of convergent amplification and tissue-specific expression

3 dominates the evolution of toxin and toxin-like genes in sea

4 anemones

5

6 Joachim M. Surm1,2*, Hayden L. Smith3,4, Bruno Madio5, Eivind A. B. Undheim6, Glenn F. King5, Brett

7 R. Hamilton6,7, Chloé A. van der Burg1,2, Ana Pavasovic1, and Peter J. Prentis3,4

8

9 1School of Biomedical Sciences, Faculty of Health, Queensland University of Technology

10 2Institute of Health and Biomedical Innovation, Queensland University of Technology

11 3School of Earth, Environmental and Biological Sciences, Science and Engineering Faculty,

12 Queensland University of Technology

13 4Institute for Future Environments, Queensland University of Technology

14 5Institute for Molecular Bioscience, University of Queensland

15 6Centre for Advanced Imaging, University of Queensland

16 7Centre for Microscopy and Microanalysis, University of Queensland

17 *Correspondence: [email protected];

18

19 Abstract

20 Members of Cnidaria are an ancient group of venomous animals and rely on a number

21 of specialised tissues to produce toxins in order to fulfil a range of ecological roles including prey

22 capture, defence against predators, digestion, and aggressive encounters. However, few

23 comprehensive analyses of the evolution and expression of toxin genes currently exist for cnidarian

24 species. In this study, we use genomic and transcriptomic sequencing data to examine gene copy

25 number variation and selective pressure on toxin gene families in phylum Cnidaria. Additionally, we

26 use quantitative RNA-seq and mass spectrometry imaging to understand expression patterns and

27 tissue localisation of toxin production in sea anemones. Using genomic data, we demonstrate that the

28 first large scale expansion and diversification of known toxin genes occurs in phylum Cnidaria, a

29 process we also observe in other venomous lineages, which we refer to as convergent amplification.

30 Our analyses of selective pressure on toxin gene families reveal that purifying selection

31 is the dominant mode of evolution for these genes and that phylogenetic inertia is an important

32 determinant of toxin gene complement in this group. The gene expression and tissue localisation data

33 revealed that specific genes and proteins from toxin gene families show strong patterns of tissue and

34 developmental-phase specificity in sea anemones. Overall, convergent amplification and phylogenetic

35 inertia have strongly influenced the distribution and evolution of the toxin complement observed in sea

36 anemones, while the production of venoms with different compositions across tissues is related to

37 the functional and ecological roles undertaken by each tissue type.

38 Keywords

39 Venom, Cnidaria, RNA-seq, phylogenetics, mass spectrometry imaging, selective pressure

40

41 1. Introduction

42 Venomous animals rely on their toxins for a range of ecological processes, including prey

43 capture, defence against predators, and intra- and interspecific aggression (Casewell, Wüster, Vonk,

44 Harrison, & Fry, 2013; Fry et al., 2009). Toxins are primarily gene-encoded peptides and proteins that

45 evolved from ancestral “house-keeping” molecules that perform functions unrelated to venom

46 production in the body (Casewell et al., 2013). Venomous taxa have evolved multiple times during

47 metazoan evolution, and the genes that encode peptide and protein toxins are often considered to

48 evolve rapidly under positive Darwinian selection, enhanced by a genetic redundancy generated

49 through gene duplication events (Casewell et al., 2013; Fry et al., 2009; Sunagar & Moran, 2015).

50 New evidence suggests that the evolution of toxin and toxin-like (TTL) genes is dominated by

51 purifying selection in ancient venomous lineages such as cnidarians, coleoids, and (Jouiaei

52 et al., 2015; Pineda et al., 2014; Ruder et al., 2013; Sunagar & Moran, 2015; Sunagar et al., 2013;

53 Undheim et al., 2014a, 2014b). This observation, however, does not account for gene age within these

54 taxa. Consequently, this calls for a comprehensive analysis of selective pressures on widespread gene

55 families (i.e., those shared in venomous lineages across a broad taxonomic distribution) versus those

56 that are lineage-specific (i.e., gene families restricted to a particular phylum or order) to better

57 understand venom evolution in ancient lineages. Importantly, a lack of positive selection on TTL genes

58 indicates other evolutionary processes may play a key role in venom evolution in ancient taxa.

59 Cnidarians are the oldest venomous metazoan lineage (Erwin et al., 2011; Menon, McIlroy, &

60 Brasier, 2013; Park et al., 2012) and they are defined by their envenomation system, which consists of

61 specialised cells called cnidocytes (Fautin, 2009; Fautin & Mariscal, 1991; Kass-Simon & Scappaticci,

62 2002). Cnidocytes are distributed throughout the body, but they vary in density and morphology

63 across tissues (Beckmann & Özbek, 2012; David et al., 2008; Fautin, 2009; Fautin & Mariscal, 1991;

64 Özbek, 2010). This envenomation system is unique, and it allows cnidarians to produce toxins across

65 multiple tissues (Macrander, Broe, & Daly, 2016), whereas most venomous lineages show restricted

66 expression of toxin genes within one or more isolated glands (Dutertre et al., 2014; Fingerhut et al.,

67 2018; Gao et al., 2018; Modica, Lombardo, Franchini, & Oliverio, 2015; Undheim et al., 2015; Walker

68 et al., 2018). In sea anemone species, three tissue types have become highly specialised for different

69 ecological roles associated with venom delivery: acrorhagi are inflatable aggressive organs used in

70 intraspecific aggressive encounters, tentacles are used in prey capture and defence, and mesenteric

71 filaments are multifunctional morphological structures used principally in digestion and killing of prey

72 (Fautin & Mariscal, 1991; Kass-Simon & Scappaticci, 2002; Macrander, Brugler, & Daly, 2015; Prentis,

73 Pavasovic, & Norton, 2018). While venoms in sea anemones are by far the most well studied among

74 cnidarians, toxin gene expression and protein localisation patterns across these functionally distinct

75 tissues remain largely unexplored. Significantly, as evidence supports that changes in TTL gene

76 expression may generate different venom profiles (Amazonas et al., 2018), we hypothesise that in sea

77 anemones, TTL gene expression varies among tissue types and correlates with their distinct ecological

78 functions.

79 In this paper, we comprehensively surveyed the TTL gene complement in both venomous and

80 non-venomous taxa using comparative genomics, which revealed that the total TTL gene repertoire

81 has expanded in the majority of known venomous lineages investigated. We refer to this process as

82 convergent amplification and it is the result of convergent recruitment (Fry et al., 2009) followed by

83 an increase in copy number of toxin-encoding genes. Our results indicate that the first evidence of

84 convergent amplification is observed in phylum Cnidaria, the oldest extant venomous lineage. As with

85 all cnidarians, sea anemones (actiniarians) share a common venomous ancestor, and their TTL genes

86 are the best studied among all cnidarian groups. Consequently, we performed a fine scale comparative

87 analysis on actiniarian transcriptomes, to systematically investigate the toxin gene complement and

88 selective forces acting on widespread and lineage-specific TTL gene families in this group. Finally,

89 functional genomic analyses were performed on a candidate sea anemone, Actinia tenebrosa, using

90 quantitative RNA-seq and mass spectrometry imaging (MSI), to investigate whether functionally

91 distinct tissue types generate different venom profiles consistent with their ecological functions.

92

93 2. Materials and methods

94 2.1 Identification of TTL genes

95 The first aim of this study was to comprehensively investigate the distribution, copy number

96 and evolution of TTL genes and gene families across Metazoan taxa. In particular, we wanted to

97 examine if the first large expansion of TTL genes and gene families occurred in phylum Cnidaria as it is

98 the oldest extant venomous lineage. To achieve these aims we first identified candidate genes in gene

99 sets and predicted protein sets from the sequenced genomes of representative taxa from the following

100 phyla and taxonomic groupings: Ctenophora, Porifera, Placozoa, Cnidaria, Ecdysozoa, Lophotrochozoa,

101 and Deuterostomia. As only a few genomes currently exist for Ctenophora, Porifera and Placozoa, we

102 combined them into an artificial group called CPP. A list of the genomes from the specific species used

103 in this study can be found in Supplementary Table 1, and it includes four species from CPP, five species

104 from Cnidaria, 24 species from Ecdysozoa, eight species from Lophotrochozoa, and nine species from

105 Deuterostomia.

106 BLASTP was performed to identify TTL candidate genes in all predicted proteomes from

107 transcriptomes and genomes against the manually curated Swiss-Prot database (accessed 18/04/18)

108 (e value < 1e-05). Significant queries with top BLAST annotations from proteins in the Tox-Prot database

109 (Jungo & Bairoch, 2005) were considered candidate TTL genes. The presence of a signal peptide in

110 these candidate TTL proteins was examined using SignalP (Petersen, Brunak, Heijne, & Nielsen, 2011).

111 We grouped candidate TTLs with a signal peptide into protein families using their top BLAST hit. This

112 hit description included the sequence similarity with other proteins (family and domains) as described

113 in the Swiss-Prot knowledgebase.
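The candidate selection described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in the study: it assumes BLASTP results have already been parsed into (query, subject, e value) tuples, and the names `toxprot_ids`, `has_signal_peptide` and `family_of` are hypothetical stand-ins for the Tox-Prot accession list, the SignalP output and the Swiss-Prot family annotation, respectively.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-5  # threshold stated in the text


def top_hits(blast_rows, cutoff=E_VALUE_CUTOFF):
    """Return the best (lowest e value) hit per query below the cutoff.

    blast_rows: iterable of (query_id, subject_id, e_value) tuples.
    """
    best = {}
    for query, subject, e_value in blast_rows:
        if e_value >= cutoff:
            continue  # not significant
        if query not in best or e_value < best[query][1]:
            best[query] = (subject, e_value)
    return best


def candidate_ttl_families(blast_rows, toxprot_ids, has_signal_peptide, family_of):
    """Group candidates by the family of their top hit.

    A query is kept only if its top hit is a Tox-Prot entry (hypothetical
    set `toxprot_ids`) and it carries a predicted signal peptide
    (hypothetical set `has_signal_peptide`).
    """
    families = defaultdict(list)
    for query, (subject, _) in top_hits(blast_rows).items():
        if subject in toxprot_ids and query in has_signal_peptide:
            families[family_of[subject]].append(query)
    return dict(families)
```

A query whose best hit is a non-toxin Swiss-Prot entry is discarded even if it also has a weaker hit to a Tox-Prot sequence, which mirrors the "top BLAST annotation" criterion in the text.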

114 In order to determine the distribution and expansion of TTL gene families, we compared the

115 number of different protein families and copy number in each species. This was based on the number

116 of predicted proteins that received significant BLAST hits against the Tox-Prot database. This data was

117 used to determine the extent of shared and lineage-specific TTL genes across the broad metazoan

118 groupings listed above. Gene families found in multiple metazoan groups were considered shared,

119 while those found in a single metazoan group were classified as lineage-specific. To further examine

120 the distribution of lineage-specific TTL gene families within previously defined metazoan groups, we

121 reported copy number variation across the representative species.
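The shared versus lineage-specific classification can be illustrated with a minimal Python sketch. The input layout (a dictionary of copy numbers per family and metazoan group) and the family and group labels in the usage example are assumptions made for illustration only.

```python
def classify_families(counts):
    """Label each gene family by how many metazoan groups contain it.

    counts: {family: {group: copy_number}}
    Returns {family: "shared" | "lineage-specific" | "absent"}.
    """
    labels = {}
    for family, by_group in counts.items():
        groups_present = [g for g, n in by_group.items() if n > 0]
        if len(groups_present) > 1:
            labels[family] = "shared"
        elif len(groups_present) == 1:
            labels[family] = "lineage-specific"
        else:
            labels[family] = "absent"
    return labels


# Hypothetical copy-number table for two families across three groupings.
example = {
    "family_1": {"CPP": 0, "Cnidaria": 12, "Ecdysozoa": 3},
    "family_2": {"CPP": 0, "Cnidaria": 5, "Ecdysozoa": 0},
}
print(classify_families(example))
```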

122 Phylum Cnidaria was selected to compare the frequency of lineage-specific TTL gene families

123 in a phylum as it shares a venomous common ancestor and is the oldest extant venomous group. The

124 distribution and expansion of TTL gene families within this phylum was compared using the number

125 of predicted proteins that received significant BLAST hits against the

126 Tox-Prot database. The genomes investigated included a medusozoan (hydrozoan, Hydra vulgaris) and

127 four anthozoans. The four anthozoans consisted of three scleractinians (Acropora digitifera,

128 Stylophora pistillata, and Orbicella faveolata) and one actiniarian (Exaiptasia pallida).

129 Although candidates identified in this study had a top BLAST hit to a Tox-Prot database

130 sequence, it is unlikely all TTLs identified are functional toxins (Madio, Undheim, & King, 2017; von

131 Reumont, Undheim, Jauss, & Jenner, 2017). Some TTL gene families, such as sea anemone 8 toxin,

132 have not been functionally validated as toxins. Sea anemone 8 toxin has been observed in multiple

133 toxin studies and we have included this putative TTL protein to remain consistent with previous

134 literature (Macrander et al., 2016; Madio et al., 2017; Oliveira, Fuentes-Silva, & King, 2012).

135

136 2.2 Comparative genomic and phylogenetic analyses

137 2.2.1 Comparative analysis of TTL gene families in Actiniarian species

138 We undertook a fine scale analysis of TTL genes in actiniarian species to better understand

139 whether phylogenetic inertia and/or ecological similarity among species influenced the distribution of

140 TTL gene families. Actiniarians were selected as they have multiple well described and validated TTL

141 gene families (Honma & Shiomi, 2006; Jouiaei et al., 2015; Norton, 1991, 2009; Prentis et al., 2018;

142 Shiomi, 2009), whose distribution and evolution are not well understood (Daly, 2016) (see

143 Supplementary Table 2.1 for full list of gene families). Fourteen transcriptomes were used in this

144 analysis from three different superfamilies (Baumgarten et al., 2015; Dnyansagar et al., 2018;

145 Macrander et al., 2016; Madio et al., 2017; Schwaiger et al., 2014; Sorek et al., 2018; van der Burg,

146 Prentis, Surm, & Pavasovic, 2016), including Actinioidea, Metridioidea, and Edwardsioidea (Rodríguez

147 et al., 2014). Raw reads were retrieved from the sequence read archive and converted to FASTQ files.

148 The Trinity software package version 2.0.6 was used to assemble the majority of the transcriptomes,

149 with Trinity 2.2.0 used to assemble Aiptasia diaphana, Edwardsiella carnea, Nematostella vectensis,

150 and E. pallida (see Supplementary Tables 3, 4, 5, and 6 for transcriptome assembly statistics), with the

151 data used after Trimmomatic quality filtering (Bolger, Lohse, & Usadel, 2014; Grabherr et al., 2011).

152 BUSCO was used to validate the quality and completeness of the transcriptomes (Simão, Waterhouse,

153 Ioannidis, Kriventseva, & Zdobnov, 2015; Waterhouse et al., 2018).

154 Downstream RNA-seq analysis was performed using software leveraged in the Trinity package

155 version 2.2.0 (Haas et al., 2013). Individual reads were mapped back to reference transcriptome

156 assemblies independently for each species using Bowtie2 and abundance estimated using RSEM (Li &

157 Dewey, 2011). Normalised abundance estimates were calculated as fragments per kilobase of

158 transcript per million mapped (FPKM). Transcripts with FPKM values of zero were removed as

159 assembly artefacts. ORFfinder was used to identify open reading frames encoding for proteins > 25

160 amino acid residues in length to produce a predicted proteome for the 14 transcriptomes (Haas et al.,

161 2013). CD-HIT was then used to cluster 100% identical proteins for each individual proteome to

162 remove redundancy (Fu, Niu, Zhu, Wu, & Li, 2012). TTL candidate genes were identified as above. In

163 order to determine the distribution and expansion of TTL gene families in actiniarians, we compared

164 39 different protein families and copy number in each species. Principal component analysis (PCA) was

165 performed on a log2-transformed and median-centred matrix of TTL gene copy number to cluster species.
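For readers unfamiliar with the abundance measure used above, the following Python sketch shows how an FPKM value is derived and how transcripts with zero abundance are discarded. It is a simplified illustration: RSEM estimates fragment counts probabilistically and uses an effective transcript length, whereas this sketch assumes plain counts and the full transcript length. All function and variable names are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def drop_unexpressed(fpkm_by_transcript):
    """Remove transcripts with an FPKM of zero, treated as assembly artefacts."""
    return {t: v for t, v in fpkm_by_transcript.items() if v > 0}


# 500 fragments on a 2 kb transcript in a library of 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))
```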

166 To investigate whether genome-wide expansions of gene families could have confounded our

167 results we examined a second gene family and genome-wide patterns of gene duplication in sea

168 anemones. The second gene family investigated was Green Fluorescent Proteins (GFP). Candidate

169 genes from this family were identified by performing BLASTPs against a custom protein database

170 generated from functionally characterised proteins in the Swiss-Prot database that contain a GFP Pfam

171 (PF01353) (e value < 1e-05). Sea anemone queries that received a significant hit were then

172 manually examined to ensure they contained a GFP Pfam domain using HMMER 3.1b2 against the

173 Pfam database (e value < 1e-05). To further examine genome-wide duplication across the taxa

174 examined we used BUSCO on the predicted proteome of each species to determine the amount of

175 duplication in complete single-copy orthologs present in the transcriptome of each sea anemone

176 species.

177 To evaluate if gene families showed a taxonomically restricted distribution, we constructed a

178 species tree for the 14 transcriptomes investigated. Single-copy orthologous genes from the 14

179 actiniarian transcriptomes were identified using OrthoMCL (Li, Stoeckert, & Roos, 2003) (e value < 1e-

180 05, inflation 1.5). From this, 1,004 single-copy orthologous genes were aligned using MAFFT. Aligned

181 orthologs were concatenated with the final alignment consisting of approximately 269,033 amino

182 acids. The concatenated protein alignment was imported into IQ-TREE (Nguyen et al., 2015) to

183 determine the best-fit model of protein evolution. The JTT model with Gamma rate heterogeneity,

184 invariable sites and empirical codon frequencies was selected, and a Maximum Likelihood tree

185 generated using 1,000 bootstrap iterations (Stamatakis, 2014).

186 To investigate the gain and loss of TTL gene families in actiniarians, we used the DOLLOP

187 program from the PHYLIP package version 3.696 (Felsenstein, 1989)

188 (http://evolution.genetics.washington.edu/phylip.html). The species tree and a presence/absence

189 matrix of TTL gene families, previously constructed, were imported into the DOLLOP program. The

190 most parsimonious evolutionary scenario for the gain and loss of TTL gene families was estimated

191 using Dollo’s parsimony law, which assumes genes arise once on the evolutionary tree and can be lost

192 independently in different evolutionary lineages (Farris, 1977).
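The logic of Dollo parsimony for a single gene family can be sketched as follows. This is not the DOLLOP implementation; it is a minimal illustration that assumes a rooted tree written as nested tuples with species names at the tips, places the single gain at the most recent common ancestor of all species carrying the family, and counts one loss for every maximal clade below that node in which the family is absent. The species labels in the usage example are hypothetical.

```python
def leaves(tree):
    """Return the set of tip labels under a node (a str tip or a tuple of children)."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def dollo(tree, present):
    """Infer one gain and the implied losses for a presence/absence pattern.

    Returns (gain_clade, losses), where gain_clade is the frozenset of tips
    under the node where the family arose and losses is a list of frozensets,
    one per maximal clade that lost the family.
    """
    present = set(present) & leaves(tree)
    if not present:
        return None, []
    # Walk down from the root while only one child lineage carries the family.
    node = tree
    while not isinstance(node, str):
        carriers = [c for c in node if leaves(c) & present]
        if len(carriers) != 1:
            break
        node = carriers[0]
    losses = []

    def walk(n):
        if isinstance(n, str):
            return
        for child in n:
            if leaves(child) & present:
                walk(child)
            else:
                losses.append(frozenset(leaves(child)))

    walk(node)
    return frozenset(leaves(node)), losses


# Hypothetical five-species tree; the family is found in species A and C only.
tree = ((("A", "B"), "C"), ("D", "E"))
print(dollo(tree, {"A", "C"}))
```

In this example the family is inferred to arise once in the ancestor of A, B and C and to be lost once, in B.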

193

194 2.2.2 Selection analyses

195 To determine whether different selective pressures have acted on gene families shared across

196 order Actiniaria compared to those restricted to a single actiniarian family we used the gene family

197 distribution and sequence data generated in section 2.2.1. Specifically, lineage-specific and widely

198 distributed TTL gene families were tested for evidence of nucleotide variation consistent with the

199 action of positive or negative selection, by analysing the ratio of non-synonymous to synonymous

200 mutations using CODEML within the PAML package version 4.8 (Yang, 2007). Protein sequences from

201 all TTL gene families were aligned using MAFFT version 7 (Katoh & Standley, 2013). The protein

202 alignments were back translated using Pal2Nal (Suyama, Torrents, & Bork, 2006) to generate codon

203 alignments. Only gene families with alignments that contained at least three sequences were used for

204 downstream selection analyses. Codon alignments were imported into IQ-TREE (Nguyen et al., 2015)

205 to determine the best nucleotide substitution model and to generate Maximum Likelihood

206 phylogenetic trees. Maximum Likelihood models implemented in the CODEML package of PAML

207 (Yang, 2007) were used to assess whether specific TTL gene families were under positive selection.

208 This analysis was performed as described in the study by Jouiaei et al. (2015) on 29 candidate TTL gene

209 families.
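Tests for positive selection with CODEML are commonly evaluated with a likelihood ratio test between nested site models. The text does not state which models were compared, so the sketch below rests on an assumption: a null and an alternative model that differ by two free parameters (as in the widely used M7 versus M8 comparison), for which the chi-square survival function has the closed form exp(-x/2). The log-likelihood values in the usage example are invented for illustration.

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by 2 free parameters.

    Returns (statistic, p_value). For 2 degrees of freedom the chi-square
    survival function is exactly exp(-x / 2), so no external library is needed.
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Invented log-likelihoods for one gene family.
stat, p = likelihood_ratio_test(-1234.5, -1229.5)
print(stat, p)
```

A family would be flagged as a candidate for positive selection only if the alternative model fits significantly better and the estimated ω for the extra site class exceeds one.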

210

211 2.2.3 Sanger sequencing of TTL genes in actiniarians

212 Sanger sequencing was performed to validate multiple lineage-specific TTL gene families.

213 Thirteen TTL genes identified in the transcriptomes for five species (A. tenebrosa, Anthopleura

214 buddemeieri, Aulactinia veratra, Telmatactis sp., and Nemanthus annamensis) were validated

215 (Supplementary Table 7). Primer 3 software (Untergasser et al., 2012) was used to design primers from

216 transcripts generated from the transcriptomes (Supplementary Table 8). cDNA was synthesised using

217 the SensiFAST™ cDNA synthesis kit (Bioline). Toxin genes were amplified using the MyFi™ DNA

218 polymerase mix (Bioline). Amplified fragments were purified using ISOLATE II PCR and Gel kit (Bioline).

219 Amplified fragments were sequenced using BigDye® Terminator version 3.1 (Thermo Fisher).

220 Sequences were then cleaned using an ethanol/EDTA protocol (Surm, Prentis, & Pavasovic, 2015).

221 Sequence chromatograms were visualised in Geneious version 9.1.3 (Kearse et al., 2012) and aligned

222 to the transcript from which the primers were designed and the percentage similarity was calculated.

223

224 2.3 Tissue-specific and ontogenetic expression patterns of TTL genes

225 To investigate whether TTL genes showed tissue-specific expression patterns across

226 functionally distinct tissue types, we undertook an RNA-seq experiment using tentacles, acrorhagi,

227 and mesenteric filaments in A. tenebrosa. We also performed a separate analysis with the model sea

228 anemone N. vectensis using tentacles, nematosomes, and mesenteric filaments to determine if similar

229 patterns occurred in another species. Actinia tenebrosa was selected as multiple TTL proteins have

230 been functionally characterised in this species and it is closely related to Actinia equina, which has also

231 long been used for the discovery of toxin proteins. In fact, many of the toxin genes identified and

232 validated in sea anemone species have been discovered in species from the genus Actinia.

233 Furthermore, A. tenebrosa possesses functional acrorhagi, a novel envenomation structure used in

234 intraspecific combat that is unique to certain species from the Actinioidea superfamily. All individuals

235 used for the tissue-specific and ontogenetic expression patterns of TTL genes were housed in holding

236 tanks under standard aquarium conditions for one week following field collection (van der Burg et al.,

237 2016). The three tissue types considered to contain the highest density of nematocysts were isolated

238 from nine individuals (three replicate pools of three individuals for each tissue type). Total RNA was

239 extracted as described (Prentis & Pavasovic, 2014), with minor modifications. These modifications

240 included homogenisation of tissue in a Tissuelyser II (Qiagen) using a stainless-steel ball bearing

241 (Qiagen) in Trizol®. All samples were assessed for integrity, quality and concentration using a

242 Bioanalyzer 2100 (Agilent) and a Qubit™ Fluorometer (ThermoFisher). Sequencing libraries were

243 prepared using the Illumina TruSeq® Stranded mRNA Library Preparation Kit for 75 bp paired-end

244 chemistry on an Illumina NextSeq 500.

245 Strand-specific raw reads from all nine libraries were quality checked (Q > 20, N < 1%) and

246 trimmed using Trimmomatic (Bolger et al., 2014). Trinity version 2.0.6 was used to assemble trimmed

247 reads using default settings (Grabherr et al., 2011). The assembled transcriptome was assessed for

248 completeness, using BUSCO. Strand-specific raw reads from all nine libraries were mapped to

249 reference transcriptomes to generate FPKM values. The edgeR package (Robinson, McCarthy, &

250 Smyth, 2010) was used to perform differential gene expression analysis among the libraries after a

251 TMM normalisation step to account for differences in total RNA abundance across the samples.

252 Transcripts were considered differentially expressed at a false discovery rate (FDR) of <

253 1e-03 and a fold-change of at least 4. Using the Trinity pipeline, heat maps were generated in R and used to

254 visualise differentially expressed transcripts. Differentially expressed transcripts with similar

255 expression patterns were further partitioned into subclusters by cutting the dendrogram at 50% of its

256 height.
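The thresholds used to call differential expression can be expressed as a simple filter. The sketch below is illustrative only and assumes that a table of log2 fold-changes and FDR values has already been produced (for example by edgeR); the transcript identifiers and the tuple layout are hypothetical. A four-fold change corresponds to an absolute log2 fold-change of 2.

```python
import math

FDR_CUTOFF = 1e-3       # FDR threshold stated in the text
MIN_FOLD_CHANGE = 4.0   # fold-change threshold stated in the text


def is_differentially_expressed(log2_fc, fdr,
                                fdr_cutoff=FDR_CUTOFF,
                                min_fold=MIN_FOLD_CHANGE):
    """True if the transcript passes both the FDR and fold-change thresholds."""
    return fdr < fdr_cutoff and abs(log2_fc) >= math.log2(min_fold)


def filter_transcripts(results):
    """results: iterable of (transcript_id, log2_fc, fdr) tuples."""
    return [t for t, lfc, fdr in results if is_differentially_expressed(lfc, fdr)]


# Hypothetical results table.
table = [("tx1", 2.0, 1e-4), ("tx2", 1.9, 1e-6), ("tx3", -3.0, 5e-4), ("tx4", 3.0, 1e-3)]
print(filter_transcripts(table))
```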

257 Quality control analyses were performed on samples and replicates to validate if any

258 discrepancies or batch effects were present. Pearson correlation was used to test for correlation in

259 gene expression among replicates. The relationship among the sample replicates was also explored

260 using PCA to ensure no batch effect was present (Supplementary Figure 1). ORFfinder

261 (https://www.ncbi.nlm.nih.gov/orffinder) was used to identify open reading frames in the forward–

262 strand, encoding for proteins > 25 amino acid residues in length to produce a predicted proteome.

263 The predicted proteome was used as a query against the Swiss-Prot database. Protein sequences with

264 a significant hit were used to map GO terms using UniProt idmapping. Following annotation, gene

265 ontology (GO) enrichment analysis was performed using GOseq (Young, Wakefield, Smyth, & Oshlack,

266 2010) to determine if specific GO terms were over or underrepresented in the differentially expressed

267 transcripts and subclusters. GO terms were considered significantly enriched or depleted at FDR <

268 0.05. Following GO enrichment analysis, significantly enriched GO terms were visualised using REVIGO

269 with SimRel semantic similarities (Supek, Bošnjak, Škunca, & Šmuc, 2011). TTL candidate genes were

270 identified as previously described.

271 To evaluate whether gene expression patterns of TTL genes vary over the life history of

272 actiniarians, different ontogenetic stages of A. tenebrosa were investigated. This included four size

273 classes of juveniles (pedal disc diameter of 1, 3, 6, and 9 mm), which consisted of pools of three

274 individuals (Angeli, Zara, Turra, & Gorman, 2016; Larson, 2017). Genetic variability may contribute to

275 variation in expression as individuals were not genetically identical. A single RNA-seq library was

276 generated for each size class. Total RNA was extracted as above and sequencing libraries were

277 prepared using the Illumina TruSeq® Stranded mRNA Library Preparation Kit for 75 bp single-end

278 chemistry on an Illumina NextSeq 500. Assembly, annotation, and downstream analysis were

279 performed as previously described.

280 Species-specific differences in tissue and developmental TTL expression were investigated by

281 performing additional differential expression analysis in the model species, N. vectensis (Bioproject

282 PRJEB13676 (Babonis, Martindale, & Ryan, 2016), PRJNA200689 (Schwaiger et al., 2014)). Assembly,

283 annotation, and downstream analysis were performed as previously described; however,

284 strand-specific flags were not included.

285

286 2.3.1 qPCR of TTL genes across tissue-types

287 Quantitative PCR (qPCR) was performed to validate the differential expression of candidate

288 toxins in A. tenebrosa for 10 sequences across tissue types. Total RNA previously used for each

289 RNA-seq library was used to synthesise cDNA using a SensiFAST™ cDNA synthesis kit (Bioline). Primers were

290 designed from the reference transcriptome to amplify regions within candidate TTL transcripts.

291 FastStart Essential DNA Green Master kit (Roche) was used for qPCR and run on the LightCycler® 96

292 System (Roche) to measure specific fluorescence at each cycle and quantify the initial levels of mRNA

293 for each gene in each tissue. All qPCR analyses comprised three technical replicates, with three

294 biological replicates, to validate the expression of TTL genes across tissue types. Negative controls (no

295 cDNA) were also performed for each gene in each sample, and the 18S gene was used as a

296 housekeeping control gene (Reitzel & Tarrant, 2009; Tarrant, Reitzel, Kwok, & Jenny, 2014). Relative

297 quantification analysis was performed using the analysis function of the LightCycler® 96 software using

298 the ΔΔCT method. This method uses the reference gene and provides a basis for comparing levels of

299 target sequences to levels of reference sequences and the final result is expressed as a relative ratio.

300 Significance of results was assessed through ANOVA testing and differences were considered

301 significant with P value < 0.05.
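The ΔΔCT calculation behind the relative ratio can be written out explicitly. The sketch below is a generic version of the method and is not taken from the instrument software; it assumes a doubling of product in every cycle (100% amplification efficiency), and the CT values in the usage example are invented.

```python
def relative_ratio(ct_target_sample, ct_ref_sample,
                   ct_target_calibrator, ct_ref_calibrator):
    """Relative expression of a target gene by the delta-delta-CT method.

    Each tissue is first normalised to the reference gene (delta CT), the
    tissue of interest is then compared with the calibrator tissue
    (delta-delta CT), and the result is converted to a ratio as 2 ** -ddCT.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Invented CT values: target gene versus 18S in two tissues.
print(relative_ratio(22.0, 15.0, 25.0, 15.0))
```

With these values the target reaches threshold three cycles earlier in the tissue of interest than in the calibrator after normalisation, which corresponds to an eight-fold higher relative expression.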

302

303 2.4 Mass Spectrometry

304 2.4.1 Venom extraction

305 Venom was obtained from A. tenebrosa specimens by electrical stimulation (Malpezzi, de

306 Freitas, Muramoto, & Kamiya, 1993) after a starvation period of at least 48 h, and it was then

307 fractioned using reversed-phase HPLC (RP-HPLC) as described previously (Madio et al., 2018, 2017).

308

309 2.4.3 MALDI-TOF

310 Lyophilized RP-HPLC fractions were dissolved in 0.1% (v/v) TFA/water and 0.5 μl was spotted

311 onto a matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) plate with 0.5 μl α-

312 cyano-4-hydroxycinnamic acid (CHCA) as the matrix (10 mg/ml in 60% acetonitrile (ACN)). Spots were

313 analysed using a TOF/TOF 5800 System (AB SCIEX) in linear positive ion mode.

314

315 2.4.4 LC-MS/MS

316 To identify proteins present in the milked venom, we used a bottom-up proteomics approach

317 to analyse the digested RP-HPLC fractions. Reduction and alkylation of cysteine residues in venom

318 proteins and peptides was performed as reported previously (Hale, Butler, Gelfanova, You, &

319 Knierman, 2004). Reduced/alkylated venom was incubated overnight at 37 °C in 10 μl of 40 ng/μl

320 proteomics-grade trypsin (Sigma) in 40 mM NH4HCO3, pH 8. The digested reduced/alkylated samples

321 were then resuspended in a final concentration of 1% formic acid (FA) and centrifuged for 15 min at

322 12,000 g prior to LC-MS/MS. For analysis of RP-HPLC fractions, tryptic peptides were fractionated on

323 an Agilent Zorbax stable-bond C18 column (2.1 mm × 100 mm, 1.8 μm particle size, 300 Å pore size)

324 using a flow rate of 180 μl/min and a gradient of 1–40% solvent B (90% ACN, 0.1% FA) in 0.1% FA over

325 15 min on a Shimadzu Nexera UHPLC coupled with an AB SCIEX 5600 mass spectrometer equipped

326 with a Turbo V ion source heated to 500 °C. MS/MS spectra were acquired at a rate of 20 scans/s, with

327 accumulation time of 0.25 ms, resulting in a cycle time of 2.3 s, and optimised for high resolution.

328 Precursor ions with m/z of 300–1,800, a charge of +2 to +5, and an intensity of at least 120

329 counts/s were selected, with a unit mass precursor ion inclusion window of ± 0.7 Da, and excluding

330 isotopes within ±2 Da for MS/MS. The crude venom digest was analysed as above except using a

331 gradient of 1–40% solvent B in 0.1% FA over 60 min.
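The precursor selection criteria given above amount to three simple conditions, which the following Python sketch makes explicit. It is purely illustrative and does not reproduce the acquisition software: the tuple layout for a precursor ion is an assumption, and the isotope exclusion and inclusion window settings are not modelled.

```python
MZ_RANGE = (300.0, 1800.0)   # m/z window stated in the text
CHARGE_RANGE = (2, 5)        # charge states +2 to +5
MIN_INTENSITY = 120.0        # counts/s


def is_selected(mz, charge, intensity):
    """True if a precursor ion meets the m/z, charge and intensity criteria."""
    return (MZ_RANGE[0] <= mz <= MZ_RANGE[1]
            and CHARGE_RANGE[0] <= charge <= CHARGE_RANGE[1]
            and intensity >= MIN_INTENSITY)


def select_precursors(ions):
    """ions: iterable of (mz, charge, intensity) tuples."""
    return [ion for ion in ions if is_selected(*ion)]


# Invented precursor list.
ions = [(450.2, 2, 500.0), (250.0, 2, 900.0), (900.5, 1, 300.0), (1200.0, 3, 100.0)]
print(select_precursors(ions))
```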

332 Mass spectra were searched against predicted coding sequences (CDSs) from the assembled

333 transcriptome using ProteinPilot v4.5 (AB SCIEX). Searches were run as thorough identification

334 searches, specifying tryptic digestion and the alkylation reagent as appropriate. Biological

335 modifications and amino acid substitutions were allowed in order to maximize the identification of

336 protein sequences from the transcriptome despite the inherent variability of toxins, potential isoform

337 mismatch with the transcriptomic data, and to account for experimental artefacts leading to chemical

338 modifications. We used a stringent detected protein threshold score of 1% FDR as calculated by decoy

339 searches.
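
To illustrate how a 1% FDR threshold can be derived from decoy searches, the sketch below uses the common target-decoy estimate (decoy hits divided by target hits above a score cut-off). This is an assumption made for illustration only: ProteinPilot performs its own FDR analysis, and the function names and example scores here are hypothetical.

```python
def fdr_at_threshold(hits, threshold):
    """Estimate FDR as (decoy hits) / (target hits) at or above `threshold`.

    `hits` is an iterable of (score, is_decoy) pairs.
    """
    targets = sum(1 for score, is_decoy in hits if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def lowest_threshold_for_fdr(hits, max_fdr=0.01):
    """Walk down the observed scores and return the lowest threshold reached
    before the estimated FDR first exceeds `max_fdr` (a conservative choice).
    Returns None if even the top score fails."""
    best = None
    for score in sorted({s for s, _ in hits}, reverse=True):
        if fdr_at_threshold(hits, score) <= max_fdr:
            best = score
        else:
            break
    return best


# Hypothetical scored identifications: (score, is_decoy)
example_hits = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
threshold = lowest_threshold_for_fdr(example_hits, max_fdr=0.01)
```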

340


341 2.4.5 Mass Spectrometry Imaging (MSI)

342 To localise specific toxin peptides to morphological structures we used MSI. Mass

343 spectrometry imaging was guided by published protocols (Caprioli, Farmer, & Gile, 1997) but with

344 sample preparation optimised as recently described (Madio et al., 2018; Mitchell et al., 2017; Undheim

345 et al., 2014b). Briefly, specimens of A. tenebrosa were left in 50% RCL2/ethanol at room temperature

346 overnight, then dehydrated sequentially using 50%, 60%, 70%, 90%, 95% and 100% ethanol (3 x 15

347 min at each concentration), cleared in xylene for 30 min, and embedded in paraffin wax. A whole

348 embedded animal was sectioned transversally at 7 µm thickness. Sections were de-paraffinized by

349 careful washing with xylene, and optically imaged prior to applying CHCA (7 mg/ml in 50% ACN, 0.2%

350 TFA) using a Bruker ImagePrep automated matrix sprayer. FlexControl 3.3 (Bruker) was used to

351 operate an UltraFlex III TOF-TOF mass spectrometer (Bruker) in linear positive mode, with m/z range

352 set to 1,000–20,000. A small laser size was chosen to achieve a spatial resolution of 50 µm, and matrix

353 ion suppression was enabled up to 980 m/z. Individual MSI experiments were performed using

354 FlexImaging 4.0 (Bruker). FlexImaging was used to establish the geometry and location of the section

355 on the slide based upon the optical image, choose the spatial resolution, and call upon FlexControl to

356 acquire individual spectra, accumulating 200 shots per raster point. FlexImaging was subsequently

357 used to visualise the data in 2D ion-intensity maps, producing an averaged spectrum based upon the

358 normalised individual spectra collected during the experiment.

359 Spectra were assigned to regions using probabilistic semantic analyses, as incorporated in

360 ClinProTools 3.0 (Bruker) and SCiLS Lab (SCiLS). The number of groups specified for the analyses was

361 derived from an Akaike information criterion calculation as incorporated in ClinProTools.
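
As a rough illustration of how an Akaike information criterion (AIC) can guide the choice of the number of groups, the sketch below applies the standard definition, AIC = 2k − 2 ln(L), and picks the candidate with the lowest value. The log-likelihoods and parameter counts are hypothetical, and the exact calculation used inside ClinProTools is not described in the text.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Standard Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def best_group_count(candidates):
    """Return the number of groups with the lowest AIC.

    `candidates` maps a group count to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda k: aic(*candidates[k]))


# Hypothetical model fits for 2, 3, and 4 groups (not real data)
fits = {
    2: (-120.0, 4),  # AIC = 248
    3: (-100.0, 6),  # AIC = 212
    4: (-99.0, 8),   # AIC = 214
}
chosen = best_group_count(fits)
```

In this made-up example, adding a fourth group improves the likelihood only slightly, so the penalty for extra parameters makes three groups the preferred choice.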

362

363 3. Results

364 3.1 Comparative analysis of TTL genes across Metazoa

365 Venomous lineages have evolved independently numerous times across the metazoan tree of

366 life (Casewell et al., 2013; Fry et al., 2009; Pisani et al., 2015). Our analysis captures the expansion and

367 diversification of TTL genes in currently available genomes of representative taxa across both

368 venomous and non-venomous lineages (Figure 1A). Comparative genomic analysis across Metazoa

369 reveals multiple TTL genes present in all lineages investigated, including those known to be non-

370 venomous. These non-venomous lineages, however, have fewer TTL genes compared to venomous

371 lineages (Figure 1A).

372 The process of convergent amplification, in which TTL genes and gene families are both

373 expanded, is observed in the majority of known venomous lineages. Overall, CPP taxa have a lower

374 number of known TTL genes (7–8) and TTL gene families (4–7), which is not surprising given their lack

375 of venomous representatives. The first large expansion of TTL genes is observed in the cnidarian

376 E. pallida, which has the fifth largest number of TTL genes (86; cnidarian range 31-86) representing 14

377 gene families (cnidarian range 10–15). This expansion of TTL genes is concordant with evidence that

378 phylum Cnidaria is the earliest known venomous lineage (Jouiaei et al., 2015). We observed a process

379 of convergent amplification of TTL genes in multiple species in Ecdysozoa. The largest expansion

380 occurs in the Arizona bark scorpion Centruroides sculpturatus with 258 TTL genes (Ecdysozoa range 4–

381 258) and 24 TTL gene families (Ecdysozoa range 3–24). In Lophotrochozoa, multiple venomous

382 lineages occur; however, some of the more exhaustively investigated lineages, such as cone snails, are

383 absent in our analysis due to a lack of available whole genome data. Based on available data, we report

384 a range of 7–84 TTL genes found in 4-15 TTL gene families in Lophotrochozoa. In Deuterostomia,

385 venomous lineages have evolved multiple times, but a convergent amplification of TTL genes is

386 restricted to the pit viper Protobothrops mucrosquamatus (88 TTLs across 25 gene families). TTL gene

387 copy number is highly variable in this group ranging from 3 to 88, which are found across 2–25 TTL

388 gene families. The pit viper P. mucrosquamatus, King cobra Ophiophagus hannah, platypus

389 Ornithorhynchus anatinus, and crown-of-thorns starfish Acanthaster planci, are all venomous, but

390 show divergent patterns of TTL gene copy number and distribution, with O. anatinus having only


391 three TTL genes in two known toxin gene families. This indicates that copy number variation can be

392 extensive among independently evolved venomous lineages.

393 To understand the distribution of shared and lineage-specific TTL genes, we undertook a

394 comparative analysis of all characterised metazoan toxins. A total of 69 gene families are reported

395 across Metazoa, but only six are common to all broad metazoan groupings (Figure 1B). These include

396 phospholipase (PLA2), venom Kunitz-type, DNase II, type-B carboxylesterase/lipase, true venom lectin,

397 and snaclec. Much of the TTL gene family diversity observed across Metazoa is lineage-specific (43%;

398 29 of 68 gene families) and associated with independent origins of venomous taxa (Casewell et al.,

399 2013; Fry et al., 2009) (Figure 1B). Cnidaria (5), Ecdysozoa (17), Deuterostomia (5) and Lophotrochozoa

400 (2) all have lineage-specific TTL gene families, while taxa from the CPP grouping have none (Figure 1B).

401 Significantly, few lineage-specific TTL gene families show patterns of expansion, with the exceptions

402 of Cnidaria small cysteine-rich protein (SCRiP) (12 copies found in A. digitifera), and long (4 C-C)

403 scorpion toxin (59 copies found in C. sculpturatus). These findings indicate that although lineage-

404 specific gene families contribute to a significant proportion of the diversity and complexity of the TTL

405 gene complement, their impact on the total TTL gene number in most venomous species is less than

406 convergently recruited TTL gene families. Taken together, venomous lineages appear to evolve in a

407 consistent manner, relying on the convergent recruitment of shared gene families followed by gene

408 duplication, as well as a smaller component driven by the evolution of new toxin families that lack

409 homologs in other venomous species.

410 We investigated the evolution of lineage-specific TTL gene families within Cnidaria, a phylum

411 with a venomous common ancestor. A total of five TTL gene families (PLA2, multicopper oxidase,

412 peptidase M12A, actinoporin, and snaclec) are shared by all cnidarian taxa (Figure 2A). Lineage-

413 specific TTL gene families are found in all cnidarian species except A. digitifera, with

414 three TTL gene families in E. pallida (sea anemone 8, AB hydrolase, and sea anemone structural class

415 9a), two TTL gene families in H. vulgaris (CRISP and DNase II), three TTL gene families in S. pistillata


416 (conopeptide P-like, insulin, and phospholipase B-like), and two TTL gene families in O. faveolata

417 (latrotoxin-like and venom metalloproteinase (M12B)) (Figure 2B). The majority of these lineage-

418 specific TTL gene families are in fact common to other venomous lineages, with only two TTL gene

419 families (sea anemone 8 toxin and sea anemone structural class 9a TTL gene families) restricted to

420 phylum Cnidaria. The most expanded gene family in all cnidarian taxa was PLA2, with the exception of

421 SCRiP in A. digitifera. The most expanded lineage-specific TTL gene family is sea anemone 8 toxin in E.

422 pallida. The comparison of TTL copy number and gene families across Cnidaria is consistent with a

423 process of convergent amplification with limited evidence of TTL gene families contributing to this

424 process.

425

426 3.2 Comparative analysis of TTL genes across Actiniaria

427 In total, 39 TTL gene families are found across the 14 transcriptomes. Ancestral reconstruction

428 analysis suggests 17 TTL gene families are present in the last common actiniarian ancestor (LCAA)

429 (Figure 3A), three of which can be found in all actiniarians (venom Kunitz-type, PLA2, and sea anemone

430 8 toxin). Of the 17 TTL gene families found in the LCAA, sea anemone sodium channel inhibitory toxin

431 (NaTx), sea anemone 8 toxin, sea anemone type 1 potassium channel toxin (KTx), and sea anemone

432 type 5 KTx are all restricted to Actiniaria (Figure 3A). A gain of five TTL gene families and a loss of a

433 single TTL gene family occur in the Actinioidea superfamily following divergence from Metridioidea.

434 The Metridioidea superfamily experienced only a gene family loss following the split from Actinioidea.

435 Species-specific TTL gene family losses occur in all species, while species-specific gains are limited to

436 A. diaphana, Calliactis polypus, and five in Stichodactyla haddoni. These TTL gene families gained at

437 the species level, however, are not true species-specific gains as they are found in other venomous

438 lineages and sea anemone species. Sanger sequencing validated 12 lineage-specific TTL genes in sea

439 anemones with greater than 98.5% similarity to the transcript they were designed from

440 (Supplementary Table 7 and 8).


441 Species from Actinioidea have an increased mean copy number of TTL genes, compared to

442 species from Metridioidea and Edwardsioidea (Figure 3A). Anthopleura buddemeieri and A. tenebrosa

443 have the highest copy number of TTL genes, with 99 and 98 copies, respectively. Additionally,

444 Anemonia sulcata, A. veratra, S. haddoni, Anthopleura dowii, and Megalactis griffithsi have 79, 74, 66,

445 60, and 42 copies, respectively. Average copy number in the Metridioidea superfamily is lower, with

446 the highest copy number in A. diaphana (62), followed by N. annamensis (49), Telmatactis sp. (47), C.

447 polypus (46), and E. pallida (40). In Edwardsioidea, E. carnea and N. vectensis have 50 and 45 copies,

448 respectively. Principal component analysis revealed that the distribution and copy number of TTL

449 genes clustered the species based on superfamily (Supplementary Figure 2).

450 Copy number variation is observed within species from Actinioidea. Specifically, A. diaphana

451 and E. pallida have recently been synonymized (Grajales & Rodríguez, 2014), and show variation in

452 their TTL copy number. These samples were collected from different geographical locations and

453 differences in TTL copy number and diversity may be due to population-level genetic differences.

454 Variation in copy number could also be related to different degrees of transcriptome completeness,

455 with A. diaphana having a more complete transcriptome compared to E. pallida (Supplementary Table

456 5). This pattern was not observed among the four transcriptomes for A. tenebrosa, however, with the

457 blue ecotype having both the highest BUSCO score and the fewest copies of TTL genes. Furthermore,

458 the number of individuals does not appear to affect TTL copy number variation. This is evident with

459 the red ecotype transcriptome (n=2) showing similar TTL copy number with the brown (n=1) and green

460 (n=1) ecotype transcriptomes. Allelic variability has the potential to inflate the TTL gene copy

461 number observed in transcriptomes; while this has been minimised as much as possible, these artefacts

462 may still affect the TTL copy number variation reported. Evidence of genome-wide expansions

463 that could potentially bias the TTL copy numbers reported was also explored. Variations in the copy

464 number of non-toxin genes, in similar datasets to those investigated here, have been previously

465 reported (Smith, Pavasovic, Surm, Phillips, & Prentis, 2018; Surm, Toledo, Prentis, & Pavasovic, 2018),


466 but the expansions in these gene families are not consistent with the expansions we observe in TTL

467 gene families. Results from copy number analysis of the GFP gene family also showed copy variation

468 different from that observed in TTL gene families (Supplementary Table 2.2). Furthermore, we

469 observed no patterns of genome-wide expansion in single-copy orthologs in any of our transcriptomes

470 (Supplementary Table 2.2). Taken together, these results do not support a role for genome-wide

471 expansion in gene families confounding our copy number analysis of TTL genes in order Actiniaria.

472 A systematic approach was used to investigate whether lineage-specific TTLs show divergent

473 selective pressures in comparison to widely distributed TTL genes. Two TTL gene families display

474 patterns of nucleotide variation (dN/dS ratio (ω)) consistent with positive selection (Figure 3A). These

475 TTL gene families are true venom lectin (ω = 5.3753) and ficolin lectin (ω = 3.3567). Seven TTL gene

476 families have codons under positive selection, six of which have multiple sites under positive selection

477 (Figure 3B). Both the sea anemone type 3 (BDS-LIKE) KTx and ficolin lectin families have eight sites

478 under positive selection, while huwentoxin-1 has four sites. Sea anemone 8 toxin, SCRiP, actinoporin,

479 and CREC families all have two sites, while snaclec has one site evolving under positive selection. No

480 difference was observed between the evolutionary pressures on lineage-specific and widespread TTL

481 gene families, with the majority of genes and sites under purifying selection (Supplementary Table 9).
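
The interpretation of ω used above follows the usual convention: ω > 1 is consistent with positive selection, ω < 1 with purifying selection, and ω close to 1 with neutral evolution. The sketch below applies that convention to the two gene-wide values reported in this section. The `tolerance` band around 1 is a hypothetical choice for illustration; formal inference relies on likelihood-based tests rather than a fixed cut-off.

```python
def classify_omega(omega: float, tolerance: float = 0.05) -> str:
    """Label a dN/dS ratio using the conventional interpretation.

    `tolerance` is an illustrative band around 1.0, not a statistical test.
    """
    if omega > 1.0 + tolerance:
        return "positive"
    if omega < 1.0 - tolerance:
        return "purifying"
    return "neutral"


# Gene-wide omega values reported in the text
reported = {"true venom lectin": 5.3753, "ficolin lectin": 3.3567}
labels = {family: classify_omega(w) for family, w in reported.items()}
```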

482

483 3.3 TTLs show marked differences in expression and distribution across tissue types

484 Patterns of TTL gene expression and GO enrichment analysis across tissue types in A.

485 tenebrosa are consistent with the functional specialisation of acrorhagi, mesenteric filaments, and

486 tentacles (see Supplementary Table 6 for transcriptome assembly statistics). In total, 24,453

487 transcripts are differentially expressed across tissue types (Supplementary Figure 3A). Expression

488 patterns are more similar in acrorhagi and tentacle, with mesenteric filaments having the most

489 divergent expression profile. Enriched GO terms related to the functionality of specific tissues included

490 digestion (GO:0007586), polysaccharide catabolic process (GO:0000272) and cellular defense


491 response (GO:0006968) in mesenteric filaments; metalloendopeptidase activity (GO:0004222) and

492 hemolysis in other organism involved in symbiotic interaction (GO:0052331) in acrorhagi; ion channel

493 inhibitor activity (GO:0008200) and voltage-gated potassium channel activity (GO:0005249) in

494 tentacles (Supplementary Table 10). The nematocyst (GO:0042151; cnidarian toxin delivery system),

495 response to stimulus (GO:0050896), and toxin activity (GO:0090729) GO terms are also significantly

496 enriched in all comparisons among tissue types. The overrepresentation of the nematocyst and toxin

497 activity GO terms indicates that the process of envenomation and TTL genes are an important

498 component of the differentially expressed transcripts across these three tissues.

499 In total, 114 TTL transcripts are found in the three tissue types, with 113, 111, and 111 TTL

500 transcripts expressed with an FPKM > 0 in acrorhagi, tentacles, and mesenteric filaments, respectively

501 (Supplementary Table 11). Approximately 68% (78/114) of TTL transcripts are differentially expressed

502 across the three tissue types (Figure 4A). As observed in the total dataset, the expression profiles of

503 TTL transcripts in tentacles and acrorhagi are more similar compared to mesenteric filaments.

504 Differentially expressed TTL transcripts are divided into five subclusters, representing transcripts

505 upregulated in tentacles (subcluster 1), transcripts upregulated in acrorhagi (subcluster 2), transcripts

506 upregulated in acrorhagi and tentacles (subcluster 3), a cluster of transcripts upregulated in

507 mesenteric filaments (subcluster 4), and a cluster of transcripts massively upregulated in acrorhagi

508 (subcluster 5) (Figure 4B). The different subclusters consist of 13, 18, 7, 24, and 16 transcripts for

509 subclusters 1-5, respectively.
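
The counts reported in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above (subcluster sizes and the total of 114 TTL transcripts); the variable names are illustrative.

```python
# Subcluster sizes as reported in the text (subclusters 1-5)
subcluster_sizes = {1: 13, 2: 18, 3: 7, 4: 24, 5: 16}

total_ttl_transcripts = 114

# The subclusters together should account for all differentially expressed transcripts
n_differentially_expressed = sum(subcluster_sizes.values())

# Proportion of TTL transcripts that are differentially expressed
percent_de = 100 * n_differentially_expressed / total_ttl_transcripts
```

The five subclusters sum to the 78 differentially expressed transcripts, which is about 68% of the 114 TTL transcripts.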

510 All lineage-specific TTL gene families found in the reference transcriptome show differential

511 expression across the three tissue types. Expression patterns of multiple lineage-specific TTL gene

512 families are restricted exclusively to acrorhagi (Figure 4C). Copies of acrorhagin 1 and 2, SCRiP, and

513 sea anemone NaTx are upregulated only in acrorhagi. Sea anemone type 3 (BDS-LIKE) KTx and sea

514 anemone 8 toxin are upregulated in acrorhagi and tentacles. Sea anemone type 5 KTx is upregulated

515 only in tentacles. Different members of sea anemone type 1 KTx toxin are upregulated in multiple


516 tissues. Widely distributed and expanded toxin families are also found to have members differentially

517 expressed across multiple tissues, including venom Kunitz-type, and PLA2. In addition, some widely

518 distributed and expanded toxin families also showed patterns of restricted expression, with natterin

519 upregulated exclusively in mesenteric filaments, true venom lectin upregulated exclusively in tentacle,

520 and snaclec and ficolin lectin upregulated exclusively in acrorhagi. Quantitative PCR was performed to

521 validate tissue-specific TTL differential expression analysis (Supplementary Table 12). The housekeeping

522 control gene (18S) shows little variation across tissue types. The 10 genes showed statistically

523 significant differential expression in concordance with tissue-specific RNA-seq results.

524 Similarly, distinct patterns of TTL gene expression are observed across multiple tissue types in

525 N. vectensis (Supplementary Figure 4A). In total, 45 TTL genes are identified in the transcriptome of

526 N. vectensis across three tissue types, mesenteric filaments, tentacles and nematosomes

527 (Supplementary Figure 5A). Of the 45 TTL genes, 18 are significantly differentially expressed, with

528 tentacles and nematosomes showing greater similarity than mesenteric filaments, which have the most

529 divergent expression pattern (Supplementary Table 11). The 18 differentially expressed TTL genes can

530 be divided into four subclusters, representing two and eight transcripts upregulated in mesenteric

531 filaments (subcluster 1 and 3, respectively), three transcripts upregulated in tentacle (subcluster 2),

532 and five transcripts upregulated in nematosomes (subcluster 4).

533 In order to examine whether the differences observed in gene expression translated into

534 proteomic difference in venom profiles among tissues, we examined cross-sections of A. tenebrosa by

535 MALDI MSI. Supporting the differential expression of TTL genes across tissue types, PCA by

536 probabilistic semantic analyses revealed distinct mass profiles correlating with tissue types (Figure 5A

537 and B). Notably, regions such as tentacles, oral disc, mesenteric filaments, and gonads were all

538 recovered as distinct groups, as was the general non-pedal disc epiderm. A peptide with low sequence

539 similarity to the sea anemone structural class 9a family is found in the venom of A. tenebrosa. This

540 peptide, although widely distributed, has a higher concentration in the tentacle region (Figure 5C).


541 Further examination of the regions corresponding to the acrorhagi revealed several unique masses

542 compared to the rest of the body. However, none of these masses were similar to those of acrorhagin

543 1 or 2. Instead, the most intense acrorhagi-associated peak corresponded to a sea anemone type 3

544 (BDS-LIKE) KTx, albeit with a significant extension in loop one of the β-defensin scaffold (Figure 5D and

545 E).

546

547 3.4 TTLs show marked differences in expression across ontogenetic stages

548 We observed ontogenetic differences in expression of TTL gene families (we considered four

549 ontogenetic stages, namely 1, 3, 6, and 9 mm) (see Supplementary Table 6 for transcriptome assembly

550 statistics). In total, 2,227 transcripts are differentially expressed across the different ontogenetic

551 stages (Supplementary Figure 3B). The nematocyst GO term is enriched in the 1 mm ontogenetic stage,

552 supporting the view that envenomation and TTL transcripts contribute to a component of the differentially

553 expressed genes. In total, 103 TTL transcripts are identified in the four ontogenetic stages, with 103,

554 103, 103 and 100 TTLs expressed with an FPKM > 0 in the 1, 3, 6, and 9 mm stages, respectively

555 (Supplementary Table 11). Only 13 TTL transcripts are differentially expressed across ontogeny (Figure

556 4D). TTL expression profiles are most similar in the 3 and 6 mm stages, while the largest stage (9 mm)

557 has the most divergent profile. Differentially expressed TTL transcripts are divided into four

558 subclusters, representing transcripts upregulated in 1 and 3 mm stage (subcluster 1), transcripts

559 upregulated in 1 mm stage (subcluster 2), transcripts upregulated in the 9 mm stage (subcluster 3),

560 and transcripts upregulated in 3 and 6 mm stages (subcluster 4) (Figure 4E). The different subclusters are

561 made up of three, four, three, and three transcripts for subclusters 1–4, respectively (Figure 4F). This

562 indicates that the expression of TTL genes is weakly influenced by ontogeny, where different TTLs are

563 expressed in juveniles of different sizes. Limited differences in the expression patterns of TTL genes

564 are observed across the developmental stages (gastrula, planula, and adult) in N. vectensis

565 (Supplementary Figure 4B and 5B). Of the 63 TTL transcripts identified in N. vectensis, only five are


566 differentially expressed, all of which are upregulated in the adult developmental stage

567 (Supplementary Table 11).

568

569 4. Discussion

570 Comparative analysis of TTL genes

571 Previous comparative genomic studies of venom evolution have been restricted to specific

572 TTL gene families within taxa (Sunagar & Moran, 2015), or have had limited taxonomic representation

573 (Casewell, Huttley, & Wüster, 2012; Casewell et al., 2013; Fry et al., 2009). Here we present a

574 large-scale comparative analysis of TTL gene family distribution across multiple metazoan genomes and

575 show that a process of convergent amplification of TTL genes and gene families occurs in most

576 venomous lineages (Figure 1A). This process of convergent amplification is explained by both gene

577 redundancy, through increased copy number within a given gene family, and increased diversity of

578 TTL gene families. Genetic redundancy is thought to increase the abundance of specific proteins acting

579 against a limited number of molecular targets (Barve & Wagner, 2013; Jackson et al., 2016; Kafri,

580 Springer, & Pilpel, 2009; Moran et al., 2008; Morgenstern & King, 2013; Nicosia et al., 2013; Wang,

581 Yap, Chua, & Khoo, 2008), while the evolution of new gene families leads to the production of proteins

582 with increased molecular target diversity (Casewell et al., 2013; Fry et al., 2009). This supports the

583 idea that venomous animals are characterised by a high proportion of lineage-specific TTL genes

584 (Casewell et al., 2013; Fry et al., 2009, 2003; Habermann, 1972; Olivera et al., 2012; Terlau & Olivera,

585 2004). This is evident in our results with lineage-specific genes contributing to almost half of the total

586 TTL gene families identified across Metazoa. While lineage-specific TTL gene families are common in

587 venomous species, our analysis indicates that they contribute less to the process of convergent

588 amplification as they have lower copy number compared to widespread TTL gene families. The first

589 convergent amplification of TTL gene families was found in phylum Cnidaria, the oldest extant

590 venomous lineage.

591 In line with previous studies, our analysis found multiple TTL gene families shared across all

592 cnidarian species examined (Jaimes-Becerra et al., 2017; Rachamim et al., 2015). These shared TTL

593 gene families have undergone significant and repeated gene duplication events, accounting for the

594 majority of the TTL gene complement identified. While we observe lineage-specific variation of TTL

595 gene complements within cnidarians, this is likely a consequence of gene loss or convergent

596 recruitment, with a limited role of de novo gene formation. Such a pattern supports the hypothesis of

597 convergent recruitment where the repurposing of pre-existing gene families into toxins plays a

598 dominant role in the evolution of venomous lineages. In addition, our data supports the notion that

599 the repeated duplication and evolution of gene families plays an important role in the evolution of

600 venom in cnidarians (Jaimes-Becerra et al., 2017; Jouiaei et al., 2015; Moran et al., 2008; Rachamim

601 et al., 2015).

602 Currently, there is no consensus about whether selective constraints act similarly on both

603 lineage-specific and widespread TTL genes in ancient venomous taxa (Jouiaei et al., 2015; Pineda et

604 al., 2014; Ruder et al., 2013; Sunagar et al., 2013; Undheim et al., 2014a, 2014b). In actiniarians, we

605 demonstrate that lineage-specific TTL genes evolve in a similar way to widespread TTL genes. In fact,

606 purifying selection plays the dominant role in the evolution of both lineage-specific and widespread

607 TTL gene families in this group (Figure 3). This extends the findings of Sunagar and Moran (2015) that

608 TTL genes in ancient animal lineages are more likely to be under purifying selection. In their two-speed

609 mode of evolution hypothesis, it is suggested that toxin-encoding genes in younger venomous lineages

610 are evolving under positive selection to confer an advantage with a shift in their ecological niche.

611 The evolution of venom composition in a number of lineages is thought to be dominated by

612 ecological factors, such as prey type availability (Amazonas et al., 2018; Dowell et al., 2018; Gibbs &

613 Mackessy, 2009; Mackessy, Sixberry, Heyborne, & Fritts, 2006; Sunagar, Morgenstern, Reitzel, &

614 Moran, 2016). In our analysis of sea anemone TTL genes we found that the toxin gene complement is

615 consistent with the relatedness of species, with closely related species sharing more TTL genes than those


616 that share an ecological niche. This suggests that phylogenetic inertia is potentially an important

617 mechanism in the evolution and distribution of TTL genes in sea anemones. Studies investigating the

618 sequence variation of TTL gene families in cnidarians consistently report a similar pattern (Jouiaei et

619 al., 2015; Macrander et al., 2016, 2015; Macrander & Daly, 2016). This is in contrast to what has been

620 observed in other venomous lineages, such as snakes, which show significant differences in their

621 sequence variation and in TTL gene complement within and across lineages (Amazonas et al., 2018;

622 Chippaux, Williams, & White, 1991; Dowell et al., 2018, 2016; Wooldridge et al., 2001). Many genes

623 encoding toxins in snakes show strong evidence of positive selection, where non-synonymous

624 mutations are thought to confer advantages to the venomous organism in their ecological niche (Gibbs

625 & Mackessy, 2009; Gibbs, Sanz, Sovic, & Calvete, 2013). Given the strong evidence of positive selection

626 acting on many genes encoding toxins in snakes (Sunagar & Moran, 2015), a dominant role for ecology

627 driving venom composition is possible. In contrast, most TTL genes in actiniarians display patterns of

628 nucleotide variation consistent with the action of purifying selection and the toxin gene complement

629 is related to phylogeny. This may mean that other molecular mechanisms, such as gene regulation

630 that drives changes in gene expression, are important determinants of venom composition in these

631 species. Interestingly, ecological factors, such as temperature, have been shown to impact expression

632 of toxin genes in some sea anemones (O’Hara, Caldwell, & Bythell, 2018). While TTL gene families are

633 under strong selective constraint in actiniarians, some members are highly expressed in a tissue-

634 specific pattern, highlighting an alternative mechanism for a venomous lineage to generate different

635 venom profiles to meet the organism's ecological requirements (Ames & Macrander, 2016; Fingerhut

636 et al., 2018; Gao et al., 2018; Hu, Bandyopadhyay, Olivera, & Yandell, 2012; Macrander et al., 2016;

637 Modica et al., 2015; Walker et al., 2018).

638

639 Expression differences of TTLs and the production of multiple venoms

640 In sea anemones, venom peptides are restricted primarily to gland cells and nematocytes, a


641 stinging cell type that contains the envenomation machinery (nematocyst) and is widely distributed

642 throughout the cnidarian body (Fautin, 2009; Fautin & Mariscal, 1991; Kass-Simon & Scappaticci, 2002;

643 Moran et al., 2012a). Importantly, nematocysts show significant heterogeneity in their density and

644 morphology across tissue types (Basulto et al., 2006; Ewer & Fox, 1947; Fautin, 2009; Fautin &

645 Mariscal, 1991). Our results demonstrate that venom gene expression and protein localisation are

646 consistent with changes in nematocyte populations across the three tissue types that use venoms for

647 different functions. This data is consistent with recent observations that some venomous animals

648 produce functionally distinct venoms used in predation and defence from two discrete venom glands

649 or even regions of the same venom gland (Dutertre et al., 2014; Fingerhut et al., 2018; Gao et al., 2018;

650 Hu et al., 2012; Modica et al., 2015; Walker et al., 2018). Our data indicates that sea anemone species

651 probably produce at least three distinct venoms across the three tissue types we analysed. This is

652 consistent with previous studies that have revealed dynamic spatiotemporal gene expression of toxins

653 across both development stages and the whole body of actiniarians, with differences observable at

654 single cells (Columbus-Shenkar et al., 2018; Macrander et al., 2016; Moran et al., 2012b; Nicosia et al.,

655 2013; Sebé-Pedrós et al., 2018; Sunagar et al., 2018). Taken together, these results support the

656 evidence of the compartmentalisation of toxins to cells located within and among different tissue

657 types in cnidarians. We hypothesise that differences in regulatory variation drive the observed

658 expression changes that underlie functionally distinct venom profiles among cells within an organism.

659 In venomous taxa, novel morphological (venom gland) and genetic innovations (toxin genes)

660 co-evolve to meet the ecological requirements of an organism (Dutertre et al., 2014; Undheim et al.,

661 2015; Walker et al., 2018). In sea anemones, the functional and ecological roles of the morphological

662 structures used for envenomation modulate the gene expression of toxins. This is evident for

663 enzymatic toxins with roles in protein, lipid, and carbohydrate metabolism, which are expressed in the

664 mesenteric filaments, a morphological structure used for digestion and envenomation. Previous

665 evidence supports this with expression of PLA2 localised to the nematocytes in the mesenteric

666 filaments, suggesting its role in both digestion and envenomation (Fautin & Mariscal, 1991;

667 Schlesinger, Zlotkin, Kramarsky-Winter, & Loya, 2009). Additionally, we found that neurotoxins are

668 principally expressed in the tentacles and acrorhagi of A. tenebrosa. This pattern of neurotoxin

669 expression corresponds well with function as tentacles are used in prey capture and defence. In

670 support of our data, Ate1a, a potassium channel neurotoxin which paralyses potential prey species,

671 was found to be localised to nematocytes in the tentacles of A. tenebrosa (Madio et al., 2018). Previous

672 studies have also identified toxins expressed in the acrorhagi of actiniarians. However, few

673 studies have characterised the function of venoms restricted to this novel morphological structure

674 (Honma et al., 2005). Since acrorhagi are solely used in intraspecific aggression, we hypothesise that

675 the venom cocktail distinct to acrorhagi is specialised to envenomate other sea anemones (Honma et

676 al., 2005). This is plausible given evidence of venom cocktails being prey-specific (Barlow, Pook,

677 Harrison, & Wüster, 2009; Gibbs & Mackessy, 2009). Furthermore, the acrorhagi, which are

678 Actinioidea-specific morphological structures, show evidence of tissue-specific expression of

679 Actinioidea-specific TTLs, specifically acrorhagins 1, 2, and sea anemone type 3 (BDS-LIKE) KTx. We

680 also provide evidence of the protein localisation of sea anemone type 3 (BDS-LIKE) KTx to the

681 acrorhagi. This evidence supports the view that, even within a lineage that shares a common venomous

682 ancestor, novel morphological and genetic innovations co-evolve to meet the ecological requirements

683 of the organism.

684 Whether differences in the combination of toxins in venom are associated with gene

685 expression changes, or divergent selection regimes acting on changes in protein sequence, is

686 fundamental to understanding venom evolution (McLysaght & Hurst, 2016). The prevailing paradigm

687 describes the evolution of venom based on divergent selective pressure on new protein variants

688 generated through convergent amplification of TTL gene families, and driven by ecological pressures

689 (Casewell et al., 2013; Fry et al., 2009; Sunagar & Moran, 2015). Contrary to this expectation, we find

690 that convergent amplification works in concert with changes in gene expression levels to generate

691 multiple distinct venom profiles within a single organism. Specifically, in sea anemones, phylogeny is

692 correlated with the toxin gene complement, and ecological factors help drive changes in the

693 expression of toxin genes to produce functionally distinct venom profiles. Compartmentalisation of

694 toxins across tissue types allows sea anemones to produce venoms with distinct biochemical and

695 pharmacological properties that support the functional roles of the tissue types they are restricted

696 to. Indeed, current evidence supports an additional level of complexity, with venoms localised to

697 discrete cells, or populations of cells within tissue types in sea anemones. Currently, it remains

698 unresolved whether the localisation of functionally distinct venoms to discrete cells is a unique

699 adaptive trait of sea anemones, with further evidence required to determine if this is shared across

700 Cnidaria, or has independently evolved in other venomous lineages. The multiple lines of evidence

701 indicate that the expression of TTL genes can be restricted spatiotemporally to produce functionally

702 different venom cocktails to meet the ecological and life history requirements of the organism.

703

704 Acknowledgments

705 The authors would like to thank the QUT Marine group for their help and advice in caring for the

706 animals. Computational resources and services used in this work were provided by the High

707 Performance Computing and Research Support Group, Queensland University of Technology,

708 Brisbane, Australia. This work was supported by QUT’s PhD Enabling Award, the Brazilian Government

709 (Science Without Borders PhD scholarship to BM), Australian Research Council (DECRA Fellowship

710 DE160101142 to EABU, ARC Linkage Grant LP140100832 to BRH and GFK), and National Health &

711 Medical Research Council (Principal Research Fellowship APP1044414 to GFK).

712

713 Author Contributions

714 JMS, HLS, CAVDB, PJP and AP collected organism samples. JMS, HLS and CAVDB assembled and

715 annotated transcriptomes. Selection and phylogenetic analyses were performed by JMS and PJP.

716 Sequence validation was performed by HLS and JMS. Mass spectrometry, venom extraction, HPLC,

717 MALDI, LC-MS/MS and MSI were performed by BM, EABU, GFK and BRH. All authors read, edited and

718 approved the final manuscript.

719

720 Data accessibility

721 Tissue-specific and ontogenetic RNA-seq data are available at the NCBI Sequence Read Archive

722 under the accession numbers SUB2040667 and SUB2043941, respectively. A description and overview

723 of the project are available under the BioProject accession number PRJNA350366. A description of the

724 validated TTL genes using Sanger sequencing can be found in Supplementary Table 8. Briefly, these

725 validated TTL genes’ GenBank accession numbers are KY176759, KY176760, KY176761, KY176762,

726 KY176763, KY176764, KY176765, KY176766, KY176768, KY176769, KY176770, and KY176771.

727

728

729 Reference list

730

731 Amazonas, D. R., Portes-Junior, J. A., Nishiyama-Jr, M. Y., Nicolau, C. A., Chalkidis, H. M.,

732 Mourão, R. H. V., … Moura-da-Silva, A. M. (2018). Molecular mechanisms underlying

733 intraspecific variation in snake venom. Journal of Proteomics, 181, 60–72.

734 doi.org/10.1016/j.jprot.2018.03.032

735 Ames, C. L., & Macrander, J. (2016). Evidence for an alternative mechanism of toxin production in

736 the box jellyfish Alatina alata. Integrative and Comparative Biology, 56(5), 973–988.

737 doi.org/10.1093/icb/icw113

738 Angeli, A., Zara, F. J., Turra, A., & Gorman, D. (2016). Towards a standard measure of sea anemone

739 size: assessing the accuracy and precision of morphological measures for cantilever-like

740 animals. Marine Ecology, 37(5), 1019–1026. doi.org/10.1111/maec.12315

741 Babonis, L. S., Martindale, M. Q., & Ryan, J. F. (2016). Do novel genes drive morphological novelty?

742 An investigation of the nematosomes in the sea anemone Nematostella vectensis. BMC

743 Evolutionary Biology, 16(1). doi.org/10.1186/s12862-016-0683-3

744 Barlow, A., Pook, C. E., Harrison, R. A., & Wüster, W. (2009). Coevolution of diet and prey-specific

745 venom activity supports the role of selection in snake venom evolution. Proceedings of the

746 Royal Society of London B: Biological Sciences, 276(1666), 2443–2449.

747 doi.org/10.1098/rspb.2009.0048

748 Barve, A., & Wagner, A. (2013). A latent capacity for evolutionary innovation through exaptation in

749 metabolic systems. Nature, 500(7461), 203–206. doi.org/10.1038/nature12301

750 Basulto, A., Pérez, V. M., Noa, Y., Varela, C., Otero, A. J., & Pico, M. C. (2006).

751 Immunohistochemical targeting of sea anemone cytolysins on tentacles, mesenteric filaments

752 and isolated nematocysts of Stichodactyla helianthus. Journal of Experimental Zoology Part

753 A: Comparative Experimental Biology, 305A(3), 253–258. doi.org/10.1002/jez.a.256

754 Baumgarten, S., Simakov, O., Esherick, L. Y., Liew, Y. J., Lehnert, E. M., Michell, C. T., …

755 Voolstra, C. R. (2015). The genome of Aiptasia, a sea anemone model for coral symbiosis.

756 Proceedings of the National Academy of Sciences, 112(38), 11893–11898.

757 doi.org/10.1073/pnas.1513318112

758 Beckmann, A., & Özbek, S. (2012). The nematocyst: a molecular map of the cnidarian stinging

759 organelle. International Journal of Developmental Biology, 56(6-7-8), 577–582.

760 doi.org/10.1387/ijdb.113472ab

761 Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina

762 sequence data. Bioinformatics, 30(15), 2114–2120. doi.org/10.1093/bioinformatics/btu170

763 Caprioli, R. M., Farmer, T. B., & Gile, J. (1997). Molecular imaging of biological samples:

764 localization of peptides and proteins using MALDI-TOF MS. Analytical Chemistry, 69(23),

765 4751–4760. doi.org/10.1021/ac970888i

766 Casewell, N. R., Huttley, G. A., & Wüster, W. (2012). Dynamic evolution of venom proteins in

767 squamate reptiles. Nature Communications, 3, 1066. doi.org/10.1038/ncomms2065

768 Casewell, N. R., Wüster, W., Vonk, F. J., Harrison, R. A., & Fry, B. G. (2013). Complex cocktails:

769 the evolutionary novelty of venoms. Trends in Ecology & Evolution, 28(4), 219–229.

770 doi.org/10.1016/j.tree.2012.10.020

771 Chippaux, J. P., Williams, V., & White, J. (1991). Snake venom variability: methods of study, results

772 and interpretation. Toxicon, 29(11), 1279–1303.

773 Columbus-Shenkar, Y. Y., Sachkova, M. Y., Macrander, J., Fridrich, A., Modepalli, V., Reitzel, A.

774 M., … Moran, Y. (2018). Dynamics of venom composition across a complex life cycle. eLife,

775 7, e35014. doi.org/10.7554/eLife.35014

776 Daly, M. (2016). Functional and genetic diversity of toxins in sea anemones. In P. Gopalakrishnakone

777 & A. Malhotra (Eds.), Evolution of Venomous Animals and Their Toxins (pp. 1–18).

778 Dordrecht: Springer Netherlands. doi.org/10.1007/978-94-007-6727-0_17-1

779 David, C. N., Özbek, S., Adamczyk, P., Meier, S., Pauly, B., Chapman, J., … Holstein, T. W. (2008).

780 Evolution of complex structures: minicollagens shape the cnidarian nematocyst. Trends in

781 Genetics, 24(9), 431–438. doi.org/10.1016/j.tig.2008.07.001

782 Dnyansagar, R., Zimmermann, B., Moran, Y., Praher, D., Sundberg, P., Møller, L. F., & Technau, U.

783 (2018). Dispersal and speciation: the cross Atlantic relationship of two parasitic cnidarians.

784 Molecular Phylogenetics and Evolution, 126, 346–355. doi.org/10.1016/j.ympev.2018.04.035

785 Dowell, N. L., Giorgianni, M. W., Griffin, S., Kassner, V. A., Selegue, J. E., Sanchez, E. E., &

786 Carroll, S. B. (2018). Extremely divergent haplotypes in two toxin gene complexes encode

787 alternative venom types within rattlesnake species. Current Biology, 28(7), 1016-1026.e4.

788 doi.org/10.1016/j.cub.2018.02.031

789 Dowell, N. L., Giorgianni, M. W., Kassner, V. A., Selegue, J. E., Sanchez, E. E., & Carroll, S. B.

790 (2016). The deep origin and recent loss of venom toxin genes in rattlesnakes. Current

791 Biology, 26(18), 2434–2445. doi.org/10.1016/j.cub.2016.07.038

792 Dutertre, S., Jin, A.-H., Vetter, I., Hamilton, B., Sunagar, K., Lavergne, V., … Lewis, R. J. (2014).

793 Evolution of separate predation- and defence-evoked venoms in carnivorous cone snails.

794 Nature Communications, 5, 3521. doi.org/10.1038/ncomms4521

795 Erwin, D. H., Laflamme, M., Tweedt, S. M., Sperling, E. A., Pisani, D., & Peterson, K. J. (2011). The

796 Cambrian conundrum: early divergence and later ecological success in the early history of

797 animals. Science, 334(6059), 1091–1097. doi.org/10.1126/science.1206375

798 Ewer, R. F., & Fox, H. M. (1947). On the functions and mode of action of the nematocysts of Hydra.

799 Proceedings of the Zoological Society of London, 117(2–3), 365–376. doi.org/10.1111/j.1096-

800 3642.1947.tb00524.x

801 Farris, J. S. (1977). Phylogenetic analysis under Dollo’s Law. Systematic Biology, 26(1), 77–88.

802 doi.org/10.1093/sysbio/26.1.77

803 Fautin, D. G. (2009). Structural diversity, systematics, and evolution of cnidae. Toxicon, 54(8), 1054–

804 1064. doi.org/10.1016/j.toxicon.2009.02.024

805 Fautin, D. G., & Mariscal, R. N. (1991). Cnidaria: . In F. Harrison & J. Westfall (Eds.) (Vol.

806 2, pp. 267–358). New York: Wiley-Liss.

807 Felsenstein, J. (1989). PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics, 5(2), 163–

808 166. doi.org/10.1111/j.1096-0031.1989.tb00562.x

809 Fingerhut, L. C. H. W., Strugnell, J. M., Faou, P., Labiaga, Á. R., Zhang, J., & Cooke, I. R. (2018).

810 Shotgun proteomics analysis of saliva and salivary gland tissue from the common octopus

811 Octopus vulgaris. Journal of Proteome Research, 17(11), 3866–3876.

812 doi.org/10.1021/acs.jproteome.8b00525

813 Fry, B. G., Roelants, K., Champagne, D. E., Scheib, H., Tyndall, J. D. A., King, G. F., … Vega, R. C.

814 R. de la. (2009). The toxicogenomic multiverse: convergent recruitment of proteins into

815 animal venoms. Annual Review of Genomics and Human Genetics, 10(1), 483–511.

816 doi.org/10.1146/annurev.genom.9.081307.164356

817 Fry, B. G., Wüster, W., Kini, R. M., Brusic, V., Khan, A., Venkataraman, D., & Rooney, A. P.

818 (2003). Molecular evolution and phylogeny of elapid snake venom three-finger toxins.

819 Journal of Molecular Evolution, 57(1), 110–129. doi.org/10.1007/s00239-003-2461-2

820 Fu, L., Niu, B., Zhu, Z., Wu, S., & Li, W. (2012). CD-HIT: accelerated for clustering the next-

821 generation sequencing data. Bioinformatics, 28(23), 3150–3152.

822 doi.org/10.1093/bioinformatics/bts565

823 Gao, B., Peng, C., Zhu, Y., Sun, Y., Zhao, T., Huang, Y., & Shi, Q. (2018). High throughput

824 identification of novel conotoxins from the vermivorous oak cone snail (Conus quercinus) by

825 transcriptome sequencing. International Journal of Molecular Sciences, 19(12), 3901.

826 doi.org/10.3390/ijms19123901

827 Gibbs, H. L., & Mackessy, S. P. (2009). Functional basis of a molecular adaptation: prey-specific

828 toxic effects of venom from Sistrurus rattlesnakes. Toxicon, 53(6), 672–679.

829 doi.org/10.1016/j.toxicon.2009.01.034

830 Gibbs, H. L., Sanz, L., Sovic, M. G., & Calvete, J. J. (2013). Phylogeny-based comparative analysis

831 of venom proteome variation in a clade of rattlesnakes (Sistrurus sp.). PLOS ONE, 8(6),

832 e67220. doi.org/10.1371/journal.pone.0067220

833 Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., … Regev, A.

834 (2011). Full-length transcriptome assembly from RNA-Seq data without a reference genome.

835 Nature Biotechnology, 29(7), 644–652. doi.org/10.1038/nbt.1883

836 Grajales, A., & Rodríguez, E. (2014). Morphological revision of the genus Aiptasia and the family

837 Aiptasiidae (Cnidaria, Actiniaria, Metridioidea). Zootaxa, 3826(1), 55–100.

838 doi.org/10.11646/zootaxa.3826.1.2

839 Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., … Regev, A.

840 (2013). De novo transcript sequence reconstruction from RNA-seq using the Trinity platform

841 for reference generation and analysis. Nature Protocols, 8(8), 1494–1512.

842 doi.org/10.1038/nprot.2013.084

843 Habermann, E. (1972). Bee and wasp venoms. Science, 177(4046), 314–322.

844 doi.org/10.1126/science.177.4046.314

845 Hale, J. E., Butler, J. P., Gelfanova, V., You, J.-S., & Knierman, M. D. (2004). A simplified

846 procedure for the reduction and alkylation of cysteine residues in proteins prior to proteolytic

847 digestion and mass spectral analysis. Analytical Biochemistry, 333(1), 174–181.

848 doi.org/10.1016/j.ab.2004.04.013

849 Honma, T., Minagawa, S., Nagai, H., Ishida, M., Nagashima, Y., & Shiomi, K. (2005). Novel peptide

850 toxins from acrorhagi, aggressive organs of the sea anemone Actinia equina. Toxicon, 46(7),

851 768–774. doi.org/10.1016/j.toxicon.2005.08.003

852 Honma, T., & Shiomi, K. (2006). Peptide toxins in sea anemones: structural and functional aspects.

853 Marine Biotechnology, 8(1), 1–10. doi.org/10.1007/s10126-005-5093-2

854 Hu, H., Bandyopadhyay, P. K., Olivera, B. M., & Yandell, M. (2012). Elucidation of the molecular

855 envenomation strategy of the cone snail Conus geographus through transcriptome sequencing

856 of its venom duct. BMC Genomics, 13(1), 284. doi.org/10.1186/1471-2164-13-284

857 Jackson, T. N. W., Koludarov, I., Ali, S. A., Dobson, J., Zdenek, C. N., Dashevsky, D., … Fry, B. G.

858 (2016). Rapid radiations and the race to redundancy: an investigation of the evolution of

859 Australian elapid snake venoms. Toxins, 8(11), 309. doi.org/10.3390/toxins8110309

860 Jaimes-Becerra, A., Chung, R., Morandini, A. C., Weston, A. J., Padilla, G., Gacesa, R., … Marques,

861 A. C. (2017). Comparative proteomics reveals recruitment patterns of some protein families

862 in the venoms of Cnidaria. Toxicon, 137, 19–26. doi.org/10.1016/j.toxicon.2017.07.012

863 Jouiaei, M., Sunagar, K., Federman Gross, A., Scheib, H., Alewood, P. F., Moran, Y., & Fry, B. G.

864 (2015). Evolution of an ancient venom: recognition of a novel family of cnidarian toxins and

865 the common evolutionary origin of sodium and potassium neurotoxins in sea anemone.

866 Molecular Biology and Evolution, 32(6), 1598–1610. doi.org/10.1093/molbev/msv050

867 Jungo, F., & Bairoch, A. (2005). Tox-Prot, the toxin protein annotation program of the Swiss-Prot

868 protein knowledgebase. Toxicon, 45(3), 293–301. doi.org/10.1016/j.toxicon.2004.10.018

869 Kafri, R., Springer, M., & Pilpel, Y. (2009). Genetic redundancy: new tricks for old genes. Cell,

870 136(3), 389–392. doi.org/10.1016/j.cell.2009.01.027

871 Kass-Simon, G., & Scappaticci, A. A. (2002). The behavioral and developmental physiology of

872 nematocysts. Canadian Journal of Zoology, 80(10), 1772–1794. doi.org/10.1139/z02-135

873 Katoh, K., & Standley, D. M. (2013). MAFFT multiple sequence alignment software version 7:

874 improvements in performance and usability. Molecular Biology and Evolution, 30(4), 772–

875 780. doi.org/10.1093/molbev/mst010

876 Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., … Drummond, A.

877 (2012). Geneious Basic: an integrated and extendable desktop software platform for the

878 organization and analysis of sequence data. Bioinformatics, 28(12), 1647–1649.

879 doi.org/10.1093/bioinformatics/bts199

880 Larson, P. (2017). Brooding sea anemones (Cnidaria: Anthozoa: Actiniaria): paragons of diversity in

881 mode, morphology, and maternity. Invertebrate Biology, 136(1), 92–112.

882 doi.org/10.1111/ivb.12159

883 Li, B., & Dewey, C. N. (2011). RSEM: accurate transcript quantification from RNA-Seq data with or

884 without a reference genome. BMC Bioinformatics, 12, 323. doi.org/10.1186/1471-2105-12-

885 323

886 Li, L., Stoeckert, C. J., & Roos, D. S. (2003). OrthoMCL: identification of ortholog groups for

887 eukaryotic genomes. Genome Research, 13(9), 2178–2189. doi.org/10.1101/gr.1224503

888 Mackessy, S. P., Sixberry, N. M., Heyborne, W. H., & Fritts, T. (2006). Venom of the brown

889 treesnake, Boiga irregularis: ontogenetic shifts and taxa-specific toxicity. Toxicon, 47(5),

890 537–548. doi.org/10.1016/j.toxicon.2006.01.007

891 Macrander, J., Broe, M., & Daly, M. (2016). Tissue-specific venom composition and differential gene

892 expression in sea anemones. Genome Biology and Evolution, 8(8), 2358–2375.

893 doi.org/10.1093/gbe/evw155

894 Macrander, J., Brugler, M. R., & Daly, M. (2015). A RNA-seq approach to identify putative toxins

895 from acrorhagi in aggressive and non-aggressive Anthopleura elegantissima polyps. BMC

896 Genomics, 16, 221. doi.org/10.1186/s12864-015-1417-4

897 Macrander, J., & Daly, M. (2016). Evolution of the cytolytic pore-forming proteins (Actinoporins) in

898 sea anemones. Toxins, 8(12). doi.org/10.3390/toxins8120368

899 Madio, B., Peigneur, S., Chin, Y. K. Y., Hamilton, B. R., Henriques, S. T., Smith, J. J., … Undheim,

900 E. A. B. (2018). PHAB toxins: a unique family of predatory sea anemone toxins evolving via

901 intra-gene concerted evolution defines a new peptide fold. Cellular and Molecular Life

902 Sciences, 75(24), 4511–4524. doi.org/10.1007/s00018-018-2897-6

903 Madio, B., Undheim, E. A. B., & King, G. F. (2017). Revisiting venom of the sea anemone

904 Stichodactyla haddoni: omics techniques reveal the complete toxin arsenal of a well-studied

905 sea anemone genus. Journal of Proteomics, 166, 83–92. doi.org/10.1016/j.jprot.2017.07.007

906 Malpezzi, E. L., de Freitas, J. C., Muramoto, K., & Kamiya, H. (1993). Characterization of peptides in

907 sea anemone venom collected by a novel procedure. Toxicon, 31(7), 853–864.

908 McLysaght, A., & Hurst, L. D. (2016). Open questions in the study of de novo genes: what, how and

909 why. Nature Reviews Genetics, 17(9), 567–578. doi.org/10.1038/nrg.2016.78

910 Menon, L. R., McIlroy, D., & Brasier, M. D. (2013). Evidence for Cnidaria-like behavior in ca. 560

911 Ma Ediacaran Aspidella. Geology, 41(8), 895–898. doi.org/10.1130/G34424.1

912 Mitchell, M. L., Hamilton, B. R., Madio, B., Morales, R. A. V., Tonkin-Hill, G. Q., Papenfuss, A. T.,

913 … Norton, R. S. (2017). The use of imaging mass spectrometry to study peptide toxin

914 distribution in Australian sea anemones. Australian Journal of Chemistry, 70(11), 1235–1237.

915 doi.org/10.1071/CH17228

916 Modica, M. V., Lombardo, F., Franchini, P., & Oliverio, M. (2015). The venomous cocktail of the

917 vampire snail Colubraria reticulata (, ). BMC Genomics, 16(1), 441.

918 doi.org/10.1186/s12864-015-1648-4

919 Moran, Y., Genikhovich, G., Gordon, D., Wienkoop, S., Zenkert, C., Özbek, S., … Gurevitz, M.

920 (2012a). Neurotoxin localization to ectodermal gland cells uncovers an alternative mechanism

921 of venom delivery in sea anemones. Proceedings of the Royal Society B: Biological Sciences,

922 279(1732), 1351–1358. doi.org/10.1098/rspb.2011.1731

923 Moran, Y., Praher, D., Schlesinger, A., Ayalon, A., Tal, Y., & Technau, U. (2012b). Analysis of

924 soluble protein contents from the nematocysts of a model sea anemone sheds light on venom

925 evolution. Marine Biotechnology, 15(3), 329–339. doi.org/10.1007/s10126-012-9491-y

926 Moran, Y., Weinberger, H., Sullivan, J. C., Reitzel, A. M., Finnerty, J. R., & Gurevitz, M. (2008).

927 Concerted evolution of sea anemone neurotoxin genes is revealed through analysis of the

928 Nematostella vectensis genome. Molecular Biology and Evolution, 25(4), 737–747.

929 doi.org/10.1093/molbev/msn021

930 Morgenstern, D., & King, G. F. (2013). The venom optimization hypothesis revisited. Toxicon, 63,

931 120–128. doi.org/10.1016/j.toxicon.2012.11.022

932 Nguyen, L.-T., Schmidt, H. A., von Haeseler, A., & Minh, B. Q. (2015). IQ-TREE: a fast and

933 effective stochastic algorithm for estimating maximum-likelihood phylogenies. Molecular

934 Biology and Evolution, 32(1), 268–274. doi.org/10.1093/molbev/msu300

935 Nicosia, A., Maggio, T., Mazzola, S., & Cuttitta, A. (2013).

936 Evidence of accelerated evolution and ectodermal-specific expression of presumptive BDS

937 toxin cDNAs from Anemonia viridis. Marine Drugs, 11(11), 4213–4231.

938 doi.org/10.3390/md11114213

939 Norton, R. S. (1991). Structure and structure-function relationships of sea anemone proteins that

940 interact with the sodium channel. Toxicon, 29(9), 1051–1084. doi.org/10.1016/0041-

941 0101(91)90205-6

942 Norton, R. S. (2009). Structures of sea anemone toxins. Toxicon, 54(8), 1075–1088.

943 doi.org/10.1016/j.toxicon.2009.02.035

944 O’Hara, E. P., Caldwell, G. S., & Bythell, J. (2018). Equistatin and equinatoxin gene expression is

945 influenced by environmental temperature in the sea anemone Actinia equina. Toxicon, 153,

946 12–16. doi.org/10.1016/j.toxicon.2018.08.004

947 Oliveira, J. S., Fuentes-Silva, D., & King, G. F. (2012). Development of a rational nomenclature for

948 naming peptide and protein toxins from sea anemones. Toxicon, 60(4), 539–550.

949 doi.org/10.1016/j.toxicon.2012.05.020

950 Olivera, B. M., Watkins, M., Bandyopadhyay, P., Imperial, J. S., de la Cotera, E. P. H., Aguilar, M.

951 B., … Lluisma, A. (2012). Adaptive radiation of venomous marine snail lineages and the

952 accelerated evolution of venom peptide genes. Annals of the New York Academy of Sciences,

953 1267, 61–70. doi.org/10.1111/j.1749-6632.2012.06603.x

954 Özbek, S. (2010). The cnidarian nematocyst: a miniature extracellular matrix within a secretory

955 vesicle. Protoplasma, 248(4), 635–640. doi.org/10.1007/s00709-010-0219-4

956 Park, E., Hwang, D.-S., Lee, J.-S., Song, J.-I., Seo, T.-K., & Won, Y.-J. (2012). Estimation of

957 divergence times in cnidarian evolution based on mitochondrial protein-coding genes and the

958 fossil record. Molecular Phylogenetics and Evolution, 62(1), 329–345.

959 doi.org/10.1016/j.ympev.2011.10.008

960 Petersen, T. N., Brunak, S., Heijne, G. von, & Nielsen, H. (2011). SignalP 4.0: discriminating signal

961 peptides from transmembrane regions. Nature Methods, 8(10), 785–786.

962 doi.org/10.1038/nmeth.1701

963 Pineda, S. S., Sollod, B. L., Wilson, D., Darling, A., Sunagar, K., Undheim, E. A. B., … King, G. F.

964 (2014). Diversification of a single ancestral gene into a successful toxin superfamily in highly

965 venomous Australian funnel-web spiders. BMC Genomics, 15, 177. doi.org/10.1186/1471-

966 2164-15-177

967 Pisani, D., Pett, W., Dohrmann, M., Feuda, R., Rota-Stabelli, O., Philippe, H., … Wörheide, G.

968 (2015). Genomic data do not support comb jellies as the sister group to all other animals.

969 Proceedings of the National Academy of Sciences, 112(50), 15402–15407.

970 doi.org/10.1073/pnas.1518127112

971 Prentis, P. J., & Pavasovic, A. (2014). The Anadara trapezia transcriptome: a resource for molluscan

972 physiological genomics. Marine Genomics, 18 Pt B, 113–115.

973 doi.org/10.1016/j.margen.2014.08.004

974 Prentis, P. J., Pavasovic, A., & Norton, R. S. (2018). Sea anemones: quiet achievers in the field of

975 peptide toxins. Toxins, 10(1), 36. doi.org/10.3390/toxins10010036

976 Rachamim, T., Morgenstern, D., Aharonovich, D., Brekhman, V., Lotan, T., & Sher, D. (2015). The

977 dynamically evolving nematocyst content of an anthozoan, a scyphozoan, and a hydrozoan.

978 Molecular Biology and Evolution, 32(3), 740–753. doi.org/10.1093/molbev/msu335

979 Reitzel, A. M., & Tarrant, A. M. (2009). Nuclear receptor complement of the cnidarian Nematostella

980 vectensis: phylogenetic relationships and developmental expression patterns. BMC

981 Evolutionary Biology, 9(1), 230. doi.org/10.1186/1471-2148-9-230

982 Robinson, M. D., McCarthy, D. J., & Smyth, G. K. (2010). edgeR: a Bioconductor package for

983 differential expression analysis of digital gene expression data. Bioinformatics, 26(1), 139–

984 140. doi.org/10.1093/bioinformatics/btp616

985 Rodríguez, E., Barbeitos, M. S., Brugler, M. R., Crowley, L. M., Grajales, A., Gusmão, L., … Daly,

986 M. (2014). Hidden among sea anemones: the first comprehensive phylogenetic reconstruction

987 of the order Actiniaria (Cnidaria, Anthozoa, Hexacorallia) reveals a novel group of

988 hexacorals. PLOS ONE, 9(5), e96998. doi.org/10.1371/journal.pone.0096998

989 Ruder, T., Sunagar, K., Undheim, E. A. B., Ali, S. A., Wai, T.-C., Low, D. H. W., … Fry, B. G.

990 (2013). Molecular phylogeny and evolution of the proteins encoded by coleoid (cuttlefish,

991 octopus, and squid) posterior venom glands. Journal of Molecular Evolution, 76(4), 192–204.

992 doi.org/10.1007/s00239-013-9552-5

993 Schlesinger, A., Zlotkin, E., Kramarsky-Winter, E., & Loya, Y. (2009). Cnidarian internal stinging

994 mechanism. Proceedings of the Royal Society B: Biological Sciences, 276(1659), 1063–1067.

995 doi.org/10.1098/rspb.2008.1586

996 Schwaiger, M., Schönauer, A., Rendeiro, A. F., Pribitzer, C., Schauer, A., Gilles, A. F., … Technau,

997 U. (2014). Evolutionary conservation of the eumetazoan gene regulatory landscape. Genome

998 Research, 24(4), 639–650. doi.org/10.1101/gr.162529.113

999 Sebé-Pedrós, A., Saudemont, B., Chomsky, E., Plessier, F., Mailhé, M.-P., Renno, J., … Marlow, H.

1000 (2018). Cnidarian cell type diversity and regulation revealed by whole-organism single-cell

1001 RNA-seq. Cell, 173(6), 1520-1534.e20. doi.org/10.1016/j.cell.2018.05.019

1002 Shiomi, K. (2009). Novel peptide toxins recently isolated from sea anemones. Toxicon, 54(8), 1112–

1003 1118. doi.org/10.1016/j.toxicon.2009.02.031

1004 Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V., & Zdobnov, E. M. (2015).

1005 BUSCO: assessing genome assembly and annotation completeness with single-copy

1006 orthologs. Bioinformatics, 31(19), 3210–3212. doi.org/10.1093/bioinformatics/btv351

1007 Smith, H. L., Pavasovic, A., Surm, J. M., Phillips, M. J., & Prentis, P. J. (2018). Evidence for a large

1008 expansion and subfunctionalization of globin genes in sea anemones. Genome Biology and

1009 Evolution, 10(8), 1892–1901. doi.org/10.1093/gbe/evy128

1010 Sorek, M., Schnytzer, Y., Ben-Asher, H. W., Caspi, V. C., Chen, C.-S., Miller, D. J., & Levy, O.

1011 (2018). Setting the pace: host rhythmic behaviour and gene expression patterns in the

1012 facultatively symbiotic cnidarian Aiptasia are determined largely by Symbiodinium.

1013 Microbiome, 6(83). doi.org/10.1186/s40168-018-0465-9

1014 Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large

1015 phylogenies. Bioinformatics, 30(9), 1312–1313. doi.org/10.1093/bioinformatics/btu033

1016 Sunagar, K., Columbus-Shenkar, Y. Y., Fridrich, A., Gutkovich, N., Aharoni, R., & Moran, Y.

1017 (2018). Cell type-specific expression profiling unravels the development and evolution of

1018 stinging cells in sea anemone. BMC Biology, 16(1), 108. doi.org/10.1186/s12915-018-0578-4

1019 Sunagar, K., & Moran, Y. (2015). The rise and fall of an evolutionary innovation: contrasting

1020 strategies of venom evolution in ancient and young animals. PLOS Genetics, 11(10), e1005596.

1021 doi.org/10.1371/journal.pgen.1005596

1022 Sunagar, K., Morgenstern, D., Reitzel, A. M., & Moran, Y. (2016). Ecological venomics: how

1023 genomics, transcriptomics and proteomics can shed new light on the ecology and evolution of

1024 venom. Journal of Proteomics, 135, 62–72. doi.org/10.1016/j.jprot.2015.09.015

1025 Sunagar, K., Undheim, E. A. B., Chan, A. H. C., Koludarov, I., Muñoz-Gómez, S. A., Antunes, A., &

1026 Fry, B. G. (2013). Evolution stings: the origin and diversification of scorpion toxin peptide

1027 scaffolds. Toxins, 5(12), 2456–2487. doi.org/10.3390/toxins5122456

1028 Supek, F., Bošnjak, M., Škunca, N., & Šmuc, T. (2011). Revigo summarizes and visualizes long lists

1029 of gene ontology terms. PLOS ONE, 6(7), e21800. doi.org/10.1371/journal.pone.0021800

Surm, J. M., Prentis, P. J., & Pavasovic, A. (2015). Comparative analysis and distribution of omega-3 lcPUFA biosynthesis genes in marine molluscs. PLoS ONE, 10(8), e0136301. doi.org/10.1371/journal.pone.0136301

Surm, J. M., Toledo, T. M., Prentis, P. J., & Pavasovic, A. (2018). Insights into the phylogenetic and molecular evolutionary histories of Fad and Elovl gene families in Actiniaria. Ecology and Evolution, 8(11), 5323–5335. doi.org/10.1002/ece3.4044

Suyama, M., Torrents, D., & Bork, P. (2006). PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Research, 34(2), 609–612. doi.org/10.1093/nar/gkl315

Tarrant, A. M., Reitzel, A. M., Kwok, C. K., & Jenny, M. J. (2014). Activation of the cnidarian oxidative stress response by ultraviolet radiation, polycyclic aromatic hydrocarbons and crude oil. The Journal of Experimental Biology, 217(9), 1444–1453. doi.org/10.1242/jeb.093690

Terlau, H., & Olivera, B. M. (2004). Conus venoms: a rich source of novel ion channel-targeted peptides. Physiological Reviews, 84(1), 41–68. doi.org/10.1152/physrev.00020.2003

Undheim, E. A. B., Hamilton, B. R., Kurniawan, N. D., Bowlay, G., Cribb, B. W., Merritt, D. J., … Venter, D. J. (2015). Production and packaging of a biological arsenal: evolution of centipede venoms under morphological constraint. Proceedings of the National Academy of Sciences, 112(13), 4026–4031. doi.org/10.1073/pnas.1424068112

Undheim, E. A. B., Jones, A., Clauser, K. R., Holland, J. W., Pineda, S. S., King, G. F., & Fry, B. G. (2014a). Clawing through evolution: toxin diversification and convergence in the ancient lineage Chilopoda (Centipedes). Molecular Biology and Evolution, 31(8), 2124–2148. doi.org/10.1093/molbev/msu162

Undheim, E. A. B., Sunagar, K., Hamilton, B. R., Jones, A., Venter, D. J., Fry, B. G., & King, G. F. (2014b). Multifunctional warheads: diversification of the toxin arsenal of centipedes via novel multidomain transcripts. Journal of Proteomics, 102, 1–10. doi.org/10.1016/j.jprot.2014.02.024

Untergasser, A., Cutcutache, I., Koressaar, T., Ye, J., Faircloth, B. C., Remm, M., & Rozen, S. G. (2012). Primer3-new capabilities and interfaces. Nucleic Acids Research, 40(15), e115. doi.org/10.1093/nar/gks596

van der Burg, C. A., Prentis, P. J., Surm, J. M., & Pavasovic, A. (2016). Insights into the innate immunome of actiniarians using a comparative genomic approach. BMC Genomics, 17, 850. doi.org/10.1186/s12864-016-3204-2

von Reumont, B. M., Undheim, E. A. B., Jauss, R.-T., & Jenner, R. A. (2017). Venomics of remipede crustaceans reveals novel peptide diversity and illuminates the venom’s biological role. Toxins, 9(8), 234. doi.org/10.3390/toxins9080234

Walker, A. A., Mayhew, M. L., Jin, J., Herzig, V., Undheim, E. A. B., Sombke, A., … King, G. F. (2018). The assassin bug Pristhesancus plagipennis produces two distinct venoms in separate gland lumens. Nature Communications, 9(1), 755. doi.org/10.1038/s41467-018-03091-5

Wang, Y., Yap, L. L., Chua, K. L., & Khoo, H. E. (2008). A multigene family of Heteractis magnificalysins (HMgs). Toxicon, 51(8), 1374–1382. doi.org/10.1016/j.toxicon.2008.03.005

Waterhouse, R. M., Seppey, M., Simão, F. A., Manni, M., Ioannidis, P., Klioutchnikov, G., … Zdobnov, E. M. (2018). BUSCO applications from quality assessments to gene prediction and phylogenomics. Molecular Biology and Evolution, 35(3), 543–548. doi.org/10.1093/molbev/msx319

Wooldridge, B. J., Pineda, G., Banuelas-Ornelas, J. J., Dagda, R. K., Gasanov, S. E., Rael, E. D., & Lieb, C. S. (2001). Mojave rattlesnakes (Crotalus scutulatus scutulatus) lacking the acidic subunit DNA sequence lack Mojave toxin in their venom. Comparative Biochemistry and Physiology Part B: Biochemistry and Molecular Biology, 130(2), 169–179.

Yang, Z. (2007). PAML 4: phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution, 24(8), 1586–1591. doi.org/10.1093/molbev/msm088

Young, M. D., Wakefield, M. J., Smyth, G. K., & Oshlack, A. (2010). Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biology, 11(2), R14. doi.org/10.1186/gb-2010-11-2-r14


Figure legends

Figure 1
Distribution and expansion of toxin and toxin-like genes across Metazoa. A) Metazoan phylogeny showing distribution and expansion of TTL genes in representative genomes, including Ctenophora, Porifera and Placozoa (CPP), Cnidaria, Ecdysozoa, Lophotrochozoa, and Deuterostomia (opaque bars represent number of different gene families; coloured bars represent copy number). Abbreviations Ve and Hs refer to species that are considered venomous or hematophagous specialists (specialised venomous subtype), respectively (Fry et al., 2009). B) Venn diagram showing the overlap of toxin gene families across major metazoan groupings. See Supplementary Table 1 for the full list of TTL gene copy numbers.

Figure 2
Comparative analysis of TTL genes within Cnidaria. A) Venn diagram of the distribution of TTL gene families within cnidarians. B) Heat map of the distribution and copy number of TTL gene families within cnidarians.

Figure 3
Comparative analysis and molecular evolution of TTL genes within Actiniaria. A) Maximum likelihood protein tree generated to determine actiniarian phylogeny; all bootstrap support > 95%. TTL gene family gains (green) and losses (red) are shown above and below branches, respectively. Bubble plot shows the distribution and copy number of TTL gene families within actiniarians; gene families with dN/dS > 1 are highlighted with a black circle, dN/dS = 1 with a grey circle, and dN/dS < 1 with a white circle, above the respective gene family. B) Plot of site-specific dN/dS values against amino acid residue positions for TTL gene families within actiniarians.

Figure 4
Toxin expression profile across tissue types and ontogeny in Actinia tenebrosa. A) Heat map of differentially expressed TTL genes (Z-scaled FPKM values) across morphological structures: acrorhagi, mesenteric filaments and tentacle. B) Plot of the subclusters of differentially expressed TTL transcripts. C) Bar plot of the respective subclusters showing copy-number variation of differentially expressed TTLs across tissue types. D) Heat map of differentially expressed TTL genes (Z-scaled FPKM values) across ontogeny: 1, 3, 6, and 9 mm size classes. E) Plot of the subclusters of differentially expressed TTL transcripts. F) Bar plot of the respective subclusters showing copy-number variation of differentially expressed TTLs across tissue types.

Figure 5
Mass spectrometry imaging (MSI) positive mode spectra acquired from a cross-sectioned animal. A) Histological image of the section that was used for MSI experiments (stained with PAS). Tagged regions of interest (ROI) were selected based on biological functions and associated cnidae profile. ROI 01 is related to the actinopharynx, column and mesenterial filament regions; ROI 02 is the acrorhagi; ROI 03 and 04 are regions related to tentacles. B) Slide sprayed with CHCA matrix. C) MSI of the average mass related to a peptide widely distributed with higher concentration in the tentacle region. D) MSI of the average mass related to a peptide with a distribution restricted to acrorhagi. E) Projection of the MSI linear positive mode spectra of ROIs and overall spectra.

[Figure 1. A) Metazoan phylogeny with per-species bars for number of TTL gene families and copy number (x-axis 0–200); species marked Ve (venomous) or Hs (hematophagous specialists); lineages coloured as CPP, Cnidaria, Ecdysozoa, Lophotrochozoa, and Deuterostomia. B) Venn diagram of gene family overlap among the same five lineages.]

[Figure 2. A) Venn diagram of TTL gene families shared among Exaiptasia pallida, Orbicella faveolata, Acropora digitifera, Stylophora pistillata, and Hydra vulgaris. B) Heat map of copy number; y-axis: gene family; x-axis: species.]

[Figure 3. A) Actiniarian phylogeny (Actinioidea, Metridioidea, Edwardsioidea) with gene family gains and losses on branches, and a bubble plot of copy number per TTL gene family with dN/dS category markers (dN/dS > 1, = 1, < 1). B) Selective pressure (dN/dS, y-axis) against residue position (x-axis, 0–500) for the actinoporin, Cnidaria small cysteine-rich protein (SCRiP), ficolin lectin, huwentoxin-1, sea anemone 8 toxin, sea anemone type 3 (BDS-LIKE) KTx, and snaclec families.]

[Figure 4. A, D) Heat maps of Z-scaled FPKM values by gene family. B) Expression profiles (Z-scaled FPKM) of subclusters 1–5 across mesenteric filaments, tentacle, and acrorhagi. C, F) Bar plots of copy number per gene family, coloured by subcluster. E) Expression profiles (Z-scaled FPKM) of subclusters 1–4 across the 1, 3, 6, and 9 mm size classes.]
