<<

bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

1 The genome of the contractile wilhelma and

2 the evolution of metazoan neural signalling pathways

1 1 1 2 3 Warren R. Francis , Michael Eitel , Sergio Vargas , Marcin Adamski , Steven H.D. Haddock 3, Stefan Krebs4, Helmut Blum4, Dirk Erpenbeck1,5,6 Gert W¨orheide1,5,6 1Department of Earth and Environmental Sciences, Paleontology and Geobiology, Ludwig-Maximilians-Universit¨atM¨unchen Richard-Wagner Straße 10, 80333 Munich, Germany 2Research School of Biology, College of Medicine, Biology & Environment, Australian National University, Canberra ACT 0200 Australia 3Monterey Bay Aquarium Research Institute, Landing, CA 95039, USA 4Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, Ludwig-Maximilians-University Munich, Munich, Germany 5GeoBio-Center, Ludwig-Maximilians-Universit¨atM¨unchen, Munich, Germany 6Bavarian State Collection for Paleontology and Geology, Munich, Germany

4 Abstract

5 Porifera are a diverse with species performing important ecological roles in aquatic ecosys-

6 tems, and have become models for multicellularity and early-animal evolution. are the largest

7 in , but previous studies have relied on the only draft demosponge genome of

8 queenslandica. Here we present the 125-megabase draft genome of the contractile laboratory demosponge

9 Tethya wilhelma, sequenced to almost 150x coverage. We explore the genetic repertoire of transporters, re-

10 ceptors, and neurotransmitter metabolism across early-branching metazoans in the context of the evolution

11 of these gene families. Presence of many genes is highly variable across animal groups, with many gene

12 family expansions and losses. Three classes show lineage-specific expansions of GABA-B receptors,

13 far exceeding the gene number in , while ctenophores appear to have secondarily lost most genes

14 in the GABA pathway. Both GABA and glutamate receptors show lineage-specific rearrangements,

15 making it difficult to trace the evolution of these gene families. Gene sets in the examined taxa suggest that

16 nervous systems evolved independently at least twice and either changed function or were lost in sponges.

17 Changes in gene content are consistent with the view that ctenophores and sponges are the earliest-branching

18 metazoan lineages and provide additional support for the proposed ParaHoxozoa .

19 Introduction

20 The presence of neurons is a defining character of , and is symbolic of their alleged superiority over all

21 other on earth. Nonetheless, the four non-bilaterian phyla, Porifera, , and ,

22 are most different from other animals in their sensory systems and are often considered “lower” animals in

23 common parlance. Indeed, animals such as and sponges appear immobile or often unresponsive, chal-

24 lenging early theorists in their ideas of what is and is not an animal. Yet we now know that representatives

25 from all four non-bilaterian phyla demonstrate dynamic responses to outside stimuli.

26

27 Neural evolution has been discussed previously in the context of paleontology (reviewed in [Wray et al.,

28 2015]) and metazoan phylogeny (reviewed in [J´ekely et al., 2015]). Indeed, it has been suggested that many

1 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

29 features of bilaterian neurons and nervous systems represent separate, parallel evolutionary events from a

30 “simple” . A simple nervous system then must arise from proto-neurons [Schierwater et al.,

31 2009], however it is unclear what that might look like.

32

33 Several qualities can be used to define neurons or proto-neurons [Leys, 2015,Nickel, 2010] such as synapses,

34 electrical excitability, membrane potential, or secretory functions, though no single quality (and ultimately

35 gene set) solely defines such cells as neurons. Two non-bilaterian groups, ctenophores and cnidarians, are

36 thought to have true neurons. When considering the remaining two non-bilaterian phyla, sponges and

37 placozoans, many components of neural cells are found without any neuron-like cells having been identi-

38 fied [Srivastava et al., 2010, Riesgo et al., 2014a, Leys, 2015], although synapse-like structures have been

39 identified in placozoan fiber cells that show vesicles close to an osmophile contact [Grell and Benwitz, 1974].

40

41 Comparative analyses revealed a gradient of neural-like qualities indicating that “neuron-or-not” classi-

42 fications are not straightforward. While ctenophores, cnidarians, and bilaterians have true neurons, struc-

43 tural and biochemical differences, [Moroz et al., 2014, Moroz, 2015] led to the proposition that neurons in

44 ctenophores and cnidarians may not be homologous, but rather separate evolutionary outcomes from neural-

45 like precursor cells. Potentially, in the case of independent evolutions, neurons are “easy” to evolve, since it

46 involves co-expression of various pan-metazoan genetic modules in the same cell type. Alternatively, early

47 rudimentary signaling systems may have been energetically costly and not especially useful in pre-

48 , and in such cases, it may have been comparatively easy to lose such genes and with them neuronal-

49 type cells.

50

51 Interpretation of neural evolution requires an accurate metazoan phylogeny, and the phylogenetic relation-

52 ships of early-branching metazoans have been a topic of continued controversy. Some analyses support the

53 traditional phylogenetic position of sponges as to all other metazoans (“Porifera-sister”) [Philippe

54 et al., 2009,Pick et al., 2010,Nosenko et al., 2013,Pisani et al., 2015,Simion et al., 2017] while others suggest

55 that Ctenophora are the sister group to all other animals (“Ctenophora-sister”) [Dunn et al., 2008, Ryan

56 et al., 2013, Whelan et al., 2015], and some analyses also recover the classical view, a Coelenterata clade

57 uniting Cnidaria and Ctenophora [Philippe et al., 2009, Simion et al., 2017]. Importantly, phylogenomic

58 analyses can be prone to systematic artifacts under some circumstances, depending on taxon sampling [Pick

59 et al., 2010,Philippe et al., 2011], gene set [Nosenko et al., 2013], phylogenetic model [Pisani et al., 2015], or

60 use of nucleotides instead of proteins [Jarvis et al., 2014]. Other methods based on presence or absence of

61 the genes themselves have been proposed to provide a sequence-independent inference of phylogeny [Ryan

62 et al., 2010, Ryan et al., 2013, Pisani et al., 2015], relying on the assumption that gene loss is a rare event.

63 However, non-bilaterians have the additional problem that basic knowledge of many aspects of their biology

64 is absent [Dunn et al., 2015], and so the biological context that may separate or unite groups is limited.

65

66 In the context of phylogeny, the branching critically affects whether neurons evolved multiple times

67 or were lost (see schematic in Figure 1). Given the gradient of neural-like qualities, the actual evolutionary

68 scenario may be somewhere in between a simple gain-loss of neurons. While some previous studies have

69 focused on neural evolution in ctenophores [Ryan et al., 2013, Alberstein et al., 2015, Li et al., 2015] or

70 analysing the genomic data from A. queenslandica [Krishnan et al., 2014], these alone do not provide a

71 comprehensive picture of all animals.

72

73 Here we have sequenced the genome of the contractile laboratory demosponge Tethya wilhelma [Sara

74 et al., 2001] and examined the protein repertoire in the context of genes mediating the contraction, and

75 other neural-like functions. Many metabolic genes show unique expansions in different sponge , as well

76 as other phyla, making it challenging to clearly assign functions based on similarity to human proteins. We

77 consider these expansions in the context of phylogenetics, showing that even though sponges lack neurons,

78 signaling pathways have still expanded. This gives support to the hypothesis that early neural-like cells have

79 become neurons multiple times in the history of animals.

2 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

AB

HAS NEURONS

GAIN LOSS CD

Cnidaria

Placozoa

Porifera

Ctenophora

Figure 1: Schematic of neural evolution depending on metazoan phylogeny The presence of neu- rons or neural-like cells in ctenophores, cnidarians, and bilaterians can be viewed differently depending on the phylogeny. Two different metazoan phylogenies based on recent multi-gene phylogenetic analyses are the source of the Porifera-sister (A,B) [Philippe et al., 2009, Pisani et al., 2015, Simion et al., 2017] and Ctenophora-sister (C,D) [Ryan et al., 2013,Whelan et al., 2015] scenarios. Neurons can either have evolved once requiring a secondary loss in sponges, placozoans, or both (A,C), or evolved twice, in ctenophores and in cnidarians/bilaterians (B,D).

80 Results

81 Genome assembly and annotation

82 We generated a total of 61 gigabases of paired-end reads from a whole specimen of T. wilhelma (Figure 2)

83 and all associated . Because of a close association with microbes, some contigs were expected to

84 have derived from bacteria, as many reads have unexpectedly high GC content (Supplemental Figures 1-4).

85 After assembly and filtration of bacterial contigs, the final assembly was 125Mb, similar to A. queenslandica,

86 with a N50 value of 70kb (Supplemental Table 1). Gene annotation was done with a combination of a

87 deeply-sequenced RNAseq library from an adult sponge and ab initio gene predictions. Because of high

88 density of genes, extensive manual curation was often necessary to correct genes of the same strand that

89 were erroneously merged. After correction and filtering of the ab initio predictions, we counted 37,416

90 predicted genes, comparable with the counts in A. queenslandica (40,122) [Fernandez-Valverde et al., 2015]

91 and S. ciliatum (40,504) [Fortunato et al., 2014].

92 General trends in splice variation were similar between T. wilhelma and A. queenslandica (Supplemental

93 Tables 2 and 3), suggesting similar underlying biology or genome structure. One-to-one orthologs from T.

94 wilhelma and A. queenslandica had relatively low identity (Supplemental Figure 5), with the average identity

95 of 57.8%, showing a high genetic diversity within Porifera. The average identity is lower when compared to

3 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Figure 2: Contraction of a normal specimen (A) Maximally expanded state of Tethya wilhelma. (B) The same specimen approximately an hour later in the most contracted state. (C) Cartoon view of most contracted versus expanded state. Scale bar applies to all images. Photos courtesy of Dan B. Mills.

96 S. ciliatum (49.7%), N. vectensis (53.5%) and human (52.0%), which is not surprising given that A. queens-

97 landica and T. wilhelma are both demosponges. Although both genomes are too fragmented to find syntenic

98 chromosomal regions, ordered blocks of genes are still identifiable between T. wilhelma and A. queenslandica

99 (Supplemental Figure 6), though not with S. ciliatum.

100

101 Neurotransmitter metabolism across early-branching metazoans

102 Compared to most other metazoans, sponges have a limited set of behaviors (contraction, closure of osculum

103 or chambers to control flow), yet respond to many signaling molecules present in bilaterians [Ell-

104 wanger and Nickel, 2006, Ellwanger et al., 2007]. Some genes involved in -like neurotransmitter

105 metabolism have been found in sponges [Riesgo et al., 2014a, Krishnan and Schi¨oth,2015], although many

106 display a sister-group relationship to homologs found in other animals and appear to have a complex evo-

107 lutionary history with duplications in sponges and other non-bilaterian animals (Figure 3, Supplemental

108 Figures 7-9), making the prediction of their functions difficult. Thus, declaring presence or absence of any

109 individual gene or genetic module is not correct in the strictest sense, since these proteins are often many-to-

110 many orthologs to human proteins with known functions, and it is not possible to computationally predict

111 which, if any, of the sponge orthologs shares its function with the human protein.

112

113 For instance, biosynthesis of monoamine neurotransmitters (dopamine, serotonin, etc.) requires two

114 enzymes, tryptophan hydroxylase and tyrosine hydroxylase. These two enzymes appear to have arisen in

115 bilaterians from duplications of an ancestral phenylalanine hydroxylase [Cao et al., 2010], though evidence

116 is lacking as to whether this ancestral protein had multiple functions that specialized after duplication (sub-

117 functionalization) or developed new functions (neofunctionalization) post-duplication. The absence of these

118 proteins in non-bilaterians seems to be ancestral; in other words, they had not evolved yet when these groups

119 split and diversified.

120

121 Among other non-bilaterians, some monoamine neurotransmitters are found in cnidarians [Carlberg and

122 Rosengren, 1985], but are mostly absent in ctenophores (or at detection limit) [Moroz et al., 2014]. Indeed,

123 previous studies were unable to find homologs of DOPA decarboxylase (AADC, Supplemental Figure 8),

124 dopamine β-hydroxylase (DBH, Supplemental Figure 7), monoamine oxidase (MAO, Supplemental Fig-

125 ure 9), or tyrosine hydroxylase (TH) in the genome of the ctenophore M. leidyi or any available ctenophore

126 transcriptome, and it was suggested that some of these proteins were absent in sponges as well (see Supple-

127 mentary Tables 17 and 19 in [Ryan et al., 2013]). However, we found orthologs of MAO and homologs of

128 AADC and DBH in several sponges, though it is unclear if they perform the same function as the human

129 proteins. Additionally, homologs of four enzymes, AADC, MAO, DBH, and ABAT, are present in single-

4 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

A (TAT) C Glutamine Glutamic acid Alpha-keto glutarate Citric acid O O O O O O cycle Metazoa H N OH HO OH GLUD HO OH 2 GS OGDC-E1 * NH2 NH2 O 2OG-enzymes SUCLG1 GAD VitC O ABAT O O SSDH OH HO HO HO

NH2 O O Cnidaria Cteno Calcarea Choano Hsm Hexact Filasterea Placozoa Demo Gamma-amino GATP Succinate Succinic acid Bilateria B butyric acid semialdehyde Phenylpyruvate ABAT

COO- 4-hydroxyphenylpyruvate GAD

O - COO HPD GS M 2 Acetoacetate O Phenylalanine HO + Fumarate GATP KYAT catabolites TAT SSDH Phenylalanine Tyrosine Tyramine 4-hydroxyphenylacetate GLUD 2M 2 - - COO - COO PAH COO HGD + + + NH3 NH3 NH3 HO HO HO KYAT 2 3 2 TAT Standard TH amino acids DOPA Dopamine Dihydroxyphenylacetate Homovanillate HPD HO COO- AADC HO HO MeO COO- COO- HGD + + NH3 NH3 HO HO HO HO PAH 2 DBH VitC TH Noradrenaline DOMA VMA AADC 2 Catecholamine OH OH COMT OH HO HO MeO MAO - - neurotransmitters COO COO DBH MMMMM + NH3 HO HO HO PNMT 2 PNMT Catecholamine MAO Adrenaline COMT Enzyme requires O degradation 2 OH HO products NADH is a product Present Homolog Loss + NH2Me VitC is a cofactor HO Absent Absent in transcriptome

Figure 3: Neurotransmitter overview across metazoans Summary schematic of neurotransmitter biosynthesis and degradation pathways across early-branching metazoans. Each of the four non-bilaterian groups is presumed to be monophyletic, although some individual trees of genes or gene families may display alternate topologies. Bold letters refer to the enzymes. Individual gene trees that display the orthology of the clades are found in the supplemental information. Arrows are shown in one direction though many reactions can be reversible. (A) Glutamate and GABA metabolism from the citric acid cycle through the “GABA shunt” pathway. Because ABAT is absent in ctenophores, GLUD and TAT are potentially alternatives to convert α-ketoglutarate to glutamate. GATP can convert GABA to succinate semialdehyde, but this enzyme was only found in some demosponges and . (B) Monoamine metabolism, excluding tryptophan. (C) Table of presence-absence for genes in parts A and B. Presence (green) refers to a 1-to-1 ortholog where orthology is clear from the tree position. Homolog (blue) refers to a sister group position in trees before duplications with different or unknown functions. Secondary loss (red) refers to the gene missing in the clade, but homologs are found in non-metazoan phyla. Numbers inside the boxes indicate copy number specific to that group, M refers to multiple duplications within the group where the copy number is variable among species. Abbreviations for clades are as follows: Cteno, ctenophores; Hsm, homoscleromorphs; Demo, demosponges; Hexact, hexactinellids; Choano, choanoflagellates.

130 celled but not ctenophores, implying a secondary loss of these protein families in this phylum.

131

5 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

132 GABA receptors

133 The neurotransmitter gamma-amino butyric acid (GABA) has been shown to affect contraction in T. wil-

134 helma [Ellwanger et al., 2007] and the freshwater sponge E. muelleri [Elliott and Leys, 2010]. The genome of

135 T. wilhelma contains metabotropic receptors (GABA-B, mGABARs), but not ionotropic GABA receptors

136 (GABA-A, iGABARs). While humans have two mGABARs and the ctenophore M. leidyi has only one, the

137 T. wilhelma genome has nine. Sponges appear to have undergone a large expansion of this protein family

138 (Figure 4), similar to the expansion of glutamate-binding GPCRs previously observed in sponges [Krishnan

139 et al., 2014]. Based on the structure of the binding pocket of human GABAR-B1 [Geng et al., 2013], many

140 differences are observed across the mGABAR protein family, even showing that many residues involved in

141 coordination of GABA are not conserved between the two human proteins or all other animals (Supple-

142 mental Figure 11). Contrary to previous reports [Ramoino et al., 2010], we were unable to find normal

143 mGABARs in the two calcareous sponges S. ciliatum and L. complicata. Instead, in these two species, the

144 best BLAST hits from human GABA-B receptors (the putative mGABARs) had the best reciprocal hits to

145 Insulin-like growth-factor receptors. Structurally, this was due to the normal seven-transmembrane domain

146 being swapped with a C-terminal protein kinase domain (Figure 5), meaning these are not true metabotropic

147 GABA receptors. Similarly, in the filasterean Capsaspora owczarzaki, the N-terminal ligand binding domain

148 is also exchanged with other domains, suggesting as well that these are not true metabotropic GABA recep-

149 tors.

150

151 Glutamate receptors

152 Glutamate is of particular interest as it is a key metabolic intermediate and the main excitatory neurotrans-

153 mitter in animal nervous systems, acting on two types of receptors: the metabotropic glutamate receptors

154 (mGluRs) and the ionotropic ones (iGluRs). Some sponge species possess iGluRs, though these receptors

155 were absent in the transcriptomes of several demosponges [Riesgo et al., 2014a]. We were unable to find

156 iGluRs in the genome of T. wilhelma, in the genome and transcriptomes of any other demosponge, or in the

157 genomes of two choanoflagellates (M. brevicollis and S. rosetta). The top BLAST hits in demosponges have

158 a GPCR domain instead of the ion channel domain, indicating that these are not true iGluRs (Supplemental

159 Figure 13). Because the domain structure in plants is the same as most animal iGluRs, the ligand binding

160 domain was swapped out in demosponges.

161

162 The homoscleromorpha/calcarea clade appears to have an independent expansion of iGluRs (Supplemen-

163 tal Figure 12), though the normal ion transporter domain is switched with a SBP-bac-3 domain (PFAM

164 domain PF00497) compared to all other iGluRs (Supplemental Figure 13). Additionally, ctenophores and

165 placozoans appeared to have dramatic expansions of this protein family as well [Ryan et al., 2013, Moroz

166 et al., 2014,Alberstein et al., 2015], suggesting that a small set of iGluRs was present in the common ancestor

167 of eukaryotes and have diversified multiple times in both plants and animals, while other clades appear to

168 have modified or lost these proteins.

169

170 Vesicular transporters

171 Secretory systems are a common feature of all eukaryotes, as most cells have endoplasmic reticulum to secrete

172 proteins or make membrane proteins. Neurons secrete peptides (conceptually identical to any other protein)

173 or small-molecule neurotransmitters in a paracrine fashion, specifically to other neural cells. Compared to

174 peptides, small-molecule neurotransmitters need to be loaded into vesicles by dedicated transport proteins.

175 Vesicular glutamate transporters (VGluTs, SLC17A6-8) are part of a superfamily of transporters [Sreedha-

176 ran et al., 2010] that carry glutamate, aspartate, and nucleotides. The position of sponge proteins in the tree

177 is inconsistent with a clear role in glutamate transport (Supplemental Figure 14), as several sponge clades

178 and ctenophores occur as sister group to multiple duplications. Transporters in sponges, ctenophores, and

179 choanoflagellates may well act upon glutamate or other amino acids, but this needs to be experimentally

180 investigated.

181

6 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Protostome GABAR-B2 GABAR-B3 (7) Placozoans (2)

Cnidarians (9) Vertebrate GABAR-B1 (4) Filasterea (1)

Ctenophores (2)

Homoscleromorphs (1) * ** Hexactinellids (4)

Demosponges (4)

++ BS>80 BS>90

Figure 4: Metabotropic GABA receptors (GABA-B type) across metazoans Protein tree generated with RAxML. Numbers in parentheses indicate the number of species from that group, so the 46 demosponge mGABARs come from 4 species. Key bootstrap values are summarized as yellow or gray dots, for values of 90 or more, or 80 or more, respectively. Single star indicates sequences annotated as mGABARs in [Krishnan et al., 2014], double plus indicates the clade annotated as “sponge specific expansion” in [Krishnan et al., 2014]. For complete version with protein names and all bootstrap values, see Supplemental Figure 10

182 Similar to glutamate, GABA is loaded into vesicles with the vesicular inhibitory amino acid transporter

183 (VIAAT). Ctenophores, sponges, and placozoans lack one-to-one homologs of VIAAT (Supplemental Fig-

184 ure 15). Several other transporters are thought to transport GABA (ANTL or SLC6 class) and many other

185 amino acids. SLC6-class transporters, which transport diverse amino acids, are found in all non-bilaterian

186 groups, so the function of VIAAT may be redundant.

187 Glycine receptors

188 Glycine is known to affect the contraction of T. wilhelma [Ellwanger and Nickel, 2006]. Some ctenophore

189 iGluRs have been shown to bind glycine [Alberstein et al., 2015] due to the substitution of serine for arginine

190 (S687 in human GluN1), though this appears to be specific only to ctenophores, as essentially all other

191 iGluRs have the conserved serine/threonine at this position. Because no ionotropic glycine receptors could

192 be identified in the T. wilhelma genome (or any other sponges, ctenophores or placozoans), other proteins

193 may be responsible for mediating this effect.

194

7 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Vertebrate mGABARs GABR1_HUMAN Signal peptide ANF_receptor Sushi Recep_L_domain 7-transmembrane TIG Furin−like GABR1_MOUSE Tyrosine kinase Laminin G3 fibronectin3

GABR2_HUMAN Human Receptor tyrosine kinases INSRR_HUMAN

GABR2_MOUSE IGF1R_HUMAN

Triad1_g72.t1__scaffold_1 Placozoans Calcisponge mGABAR-like

scict1.027609.1_0 Sycon ciliatum ML02335a−AUGUSTUS Ctenophores

scict1.029082.1_0

Twilhelma_twi_ss.20824.4_2 Demosponges

lctid9879_0 Leucosolenia complicata Aqu2.1.28011_001

lctid37767_0 aphrocallistes_vastus_comp16141_c0_seq1_1 Hexactinellids

rosella_fibulata_TR14675_c0_g1_i1_0 Capsaspora owczarzaki Filasterean CAOG_05982T0 oscarella_t_h60_42123_comp11388_c0_seq18 Homoscleromorph

0 200 400 600 800 1000 0 200 400 600 800 1000 1200 1400

Figure 5: Domain organization of GABA-B type receptors across metazoans Scale bar displays number of amino acids. Top reciprocal BLAST hits to human for putative mGABARs in calcareous sponges are INSRR and IGF1R, due to high-scoring hits to the tyrosine kinase domain. All mGABARs from demosponges, glass sponges, and the homoscleromorph share the 7- transmembrane domain (green) with mGABARs from other animals, while calcisponge proteins have the same ligand-binding domain (red) but instead have a protein tyrosine kinase domain (purple) at the C- terminus, similar to growth-factor receptors. The filasterean Capsaspora owczarzaki has alternate domains at the N-terminus.

195 Mechanical receptors

196 Some sponges can contract in response to mechanical agitation, as reported for the demosponges E. muel-

197 leri [Elliott and Leys, 2007] and T. wilhelma [Nickel, 2010]. Several diverse protein families appear to

198 be responsible for the sense of touch [Arnad´ottirand´ Chalfie, 2010]. A subgroup of the TRP (transient

199 receptor-potential) channels, TRP-N, thought to mediate mechanosensation was determined to be absent

200 in sponges [Schuler et al., 2015], and we were unable to identify any in either T. wilhelma or S. ciliatum,

201 although other TRP-class channels were found [Ludeman et al., 2014, Schuler et al., 2015]. Because the

202 mechanosensory function of TRP channels may be redundant, we analysed for the presence of PIEZO, a

203 280kDa trimeric protein [Ge et al., 2015] involved in touch sensation in mammals [Coste et al., 2012]. Al-

204 though two homologs were found in vertebrates, we found one copy in all other animals (Supplemental

205 Figure 16) as well as fungi, plants and most other eukaryotic groups, suggesting an ancient and conserved

206 function of this protein.

207

8 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

208 Voltage-gated channels

209 Voltage-gated ion channels are necessary for the propagation of electrical signals down axons and dendrites

210 [Zakon, 2012,Moran et al., 2015], and have specificities for sodium, potassium, or calcium. Previous analyses

211 were unable to clearly identify potassium or sodium channels in sponges [Liebeskind et al., 2011]; only one

212 partial potassium channel was found in the transcriptome of the homoscleromorph Corticium candelabrum

213 [Riesgo et al., 2014a,Li et al., 2015]. We were unable to find any voltage-gated sodium or potassium channels

214 in the genome or transcriptome of any sponge. We then examine voltage-gated hydrogen channels (hvcn1), as

215 these proteins have been found in a number of single-celled eukaryotes [Smith et al., 2011], and are extremely

216 conserved. These channels were found in all sponge groups, although the high protein identity resulted in a

217 poorly-resolved tree (Supplemental Figure 19).

218 Reports of action potentials in hexactinellids [Leys et al., 1999,Leys et al., 2007,Nickel, 2010] showed that

219 sponge action potentials were inhibited by divalent cations [Leys et al., 1999], suggesting a role of calcium

220 channels instead. Because voltage-gated sodium and calcium channels arose from a duplication event [Gur

221 Barzilai et al., 2012], the ion selectivity may be variable within this protein family. Most sponges have only a

222 single CaV-channel (Supplemental Figure 18) and several Hv-channels, and no voltage-gated channel of any

223 kind was found in any glass sponge. However, all glass sponge sequences are from transcriptomes, therefore

224 either the expression level of the true channels is low in glass sponges, or they have independently evolved

225 another mechanism to propagate action potentials.

226

9 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

227 Discussion

228 Gene content variation of metazoa

229 Among the thousands of genes in the genome, we focused on genes that may be mediating contractile be-

230 havior in T. wilhelma, and the interactions of those genes within broader metabolic pathways. Many of the

231 “housekeeping” genes in our study have lineage-specific duplications in at least one animal phylum. Consid-

232 ering the importance of “single-copy” proteins in phylogenetic analyses, as taxon sampling improves, it may

233 be found that very few or no genes are single copy across most or all animal phyla. Many other genes that

234 are critical for neural functioning in bilaterians have independent losses in other animal lineages (Figure 6).

235

Gene family expansions Demo/Hexact Calcarea/Hsm Ctenophores Placozoans Cnidarians mGluRs iGluRs iGluRs iGluRs VGluT mGABARs DBH-like * Kv-channels mGABARs VIAAT MAO/PAOX-like Hv-channels * DBH-like MAO/PAOX-like DBH-like NaV-channels DBH-like

Gene family losses iGluRs mGABARs *# ABAT MAO - VIAAT DBH-like * MAO AADC-like NaV-channels DBH-like NaV-channels Kv-channels AADC-like Kv-channels

Figure 6: Summary of gene expansions and losses Demo/Hexact refers to the clade of demosponges and glass sponges. Calcarea/Hsm refers to the unnamed clade of calcareous sponges and homoscleromorphs. The star indicates that expansion or loss was found in one class but not the other. Number sign indicates the domain rearrangement in calcisponges.

236 Glutamate and GABA receptor evolution

237 There is stark contrast in the relative abundance of mGABARs and iGluRs in sponges and ctenophores.

238 The relative dearth of mGABARs in ctenophores may reflect the apparently absence of amino-butyrate

239 amino-transferase (ABAT) in ctenophores, suggesting that ctenophores use an alternate pathway to produce

240 glutamate or metabolize GABA (Figure 3), rarely use GABA as a neurotransmitter, or simply are missing

241 this pathway. Other aminotransferases such as GLUD or TAT may perform some of the exchange between

242 α-ketoglutarate and glutamate, particularly as ctenophores have two copies of GLUD while most animals

243 have only one. Ctenophores also have multiple (variable) copies of glutamate synthase and three copies of

244 KYAT one of which may serve to balance glutamate metabolism in these animals.

245

246 There are two explanations for the diversity of mGABARs in sponges. Given the high variability of

247 amino acids in the mGABAR binding pocket (Supplemental Figure 11), it is plausible that many of these

248 receptors do not bind GABA at all, and have diversified for other ligands. There is precedent for this as it

249 was shown that the independent expansion of ctenophore iGluRs also included several key mutations to the

250 binding pocket which changed the ligand specificity of these proteins [Alberstein et al., 2015]. For the other

10 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

251 hypothesis, all of the receptors could bind GABA, essentially mediating the same contraction signal, but

252 their kinetics could differ and be influenced by factors such as, for instance, temperature. Because sponges

253 are mostly immobile, they often can be subject to environment variation in terms of light, oxygen, and

254 temperature. The possession of a set of proteins capable of triggering the same response (e.g. contraction)

255 with varying daily or seasonal environmental conditions (e.g. temperature) would be beneficial and may

256 explain the diverse set of receptors observed in sponges. Experimental characterization of these binding

257 domains is necessary and may even show that a combination of these hypotheses explains the diversification

258 of mGABARs in Porifera.

259

260 The apparent absence of true mGABARs in calcareous sponges (the genome of S. ciliatum and transcrip-

261 tome of L. complicata) conflicts with a previous study that identified key proteins in the GABA pathway

262 by immunostaining [Ramoino et al., 2010]. The best mGABARs BLAST hits found in the two calcisponges

263 display a conserved ligand binding domain but the seven-transmembrane domain has been swapped with

264 a tyrosine kinase domain (Figure 5). Structural similarity of the conserved N-terminal domain may result

265 in a false-positive signal in studies using immunostaining with standard antibodies [Ramoino et al., 2010].

266 On the other hand, compared to ctenophores, which apparently lack ABAT, this enzyme was found in both

267 of the calcareous sponges analyzed. Thus it would be surprising if these sponges had no capacity to create

268 or respond to GABA. Since true vertebrate-like mGABARs are found in all other sponge classes, and our

269 study could only examine two calcareous sponges, it could be that mGABAR presence is variable in this

270 class. The genome of S. ciliatum contains 40 proteins annotated as mGluRs [Fortunato et al., 2014], so a

271 third possibility is that even in the absence of true mGABARs, some of these proteins may have evolved

272 affinity for GABA and mediate its signaling in calcareous sponges.

273

274 Although a putative iGluR was identified in the transcriptome of the demosponge fasciculata,

275 this sequence was only a fragment, so the glutamate affinity and domain structure could not be determined.

276 As with the mGABARs, the domain structure is different between the sponge classes. Otherwise, it appears

277 that only calcareous sponges and homoscleromorphs have NMDA/AMPA-like iGluRs. The presence of these

278 proteins in plants and other single-celled eukaryotes suggests that at least iGluRs were present in the com-

279 mon ancestor of all eukaryotes, and their absence in demosponges is likely the product of secondary losses.

280 In the context of contractions of T. wilhelma, the abundance of mGluRs and mGABARs could plausibly

281 work in antagonistic ways via the action of different G-proteins making ionotropic channels not necessary

282 for the modulation of this behavior.

283

284 Variation in neurotransmitter metabolism

285 Many of the oxidative enzymes in the monoamine pathway require molecular oxygen, suggesting an im-

286 portant role of this molecule both the synthesis of the neurotransmitters (with PAH, TH, and DBH) and

287 their inactivation (with MAO). Two catabolic pathways arise from tyrosine (Figure 3) and require oxygen

288 at nearly all steps. It is unclear why intermediate products of one of these two, the catecholamine pathway,

289 became neurotransmitters and the other did not, particularly as hydroxyphenylpyruvate pathway is univer-

290 sally found in animals and catabolic intermediates are likely to be universal.

291

292 MAO was found in most animal groups, but we were unable to find any in placozoans or ctenophores.

293 The topology of the MAO suggests a secondary loss of this protein in these phyla (Sup-

294 plemental Figure 9). Related genes (PAOX, polyamine oxidase) were found in placozoans with several

295 placozoan-specific duplications, and again, potentially one of these may catalyze the oxidation of aromatic

296 amines. The analysis of these proteins also uncovered a clade including sponges, cnidarians, and ,

297 though the function of these proteins cannot be predicted based on searches. In vitro charac-

298 terization of these enzymes may reveal the function to provide evidence as to how these could have been

299 important for metabolism in early animals, and was subsequently replaced or lost in most other metazoan

300 lineages.

301

302 Remarkably, the DBH group has independent expansions in three sponge classes as well as placozoans

11 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

303 and cnidarians (Supplemental Figure 7). No DBH homologs were identified in calcareous sponges or in

304 ctenophores. A putative homolog of this group was found in the choanoflagellate M. brevicollis but not in

305 any other non-metazoan. The alignment and the phylogenetic position of the M. brevicollis protein suggest

306 that it may be a member of the copper-binding oxygenase superfamily, rather than a true homolog of DBH

307 (see Supplemental Alignment).

308

309 The presence of DBH-like and AADC-like enzymes in most animal groups suggests the possibility to

310 make phenylethanolamines (like octopamine or noradrenaline) from tyrosine, and then subsequently inac-

311 tivate them with MAO. All demosponges appear to lack AADC, and ctenophores appear to lack both of

312 these enzymes calling into question a previous report of the detectability of monoamine neurotransmitters

313 in ctenophores [Carlberg and Rosengren, 1985].

314

315 Conserved properties of neurons

316 Neurons are defined by the presence of five key aspects: membrane potential, voltage-gated ion channels,

317 secretory pathways, ligand-gated ion channels, and cell-cell junctions to form synapses. Voltage-gated chan-

318 nels, secretory systems, and ligand-gated ion channels are thoroughly discussed above. Membrane potential

319 is maintained in animal cells by sodium-potassium pumps (ATP-ases), which are a class of cation pumps

320 exclusively found in animals [Stein, 1995, S´aezet al., 2009]. It is thought that such pumps are necessary

321 because animals are the only multicellular group that lacks any kind of cell wall, thus careful control of

322 ionic balance is necessary to resist osmotic stress [Stein, 1995]. For non-bilaterian animals, cell layers were

323 in direct contact with water, so potentially all cells needed this protein to function normally. Therefore

324 having neuron-like functionality is unlikely to rest upon the gain or loss of this gene. The last feature is

325 the presence of cell-cell connections. Many proteins involved in synapse structure or neurotransmission are

326 found in sponges, [Srivastava et al., 2010,Riesgo et al., 2014a,Moran et al., 2015,Leys, 2015] though it is not

327 clear which genes are necessary for neural functioning, or may have evolved independently.

328 Neural evolution and losses

329 Based on recent phylogenies, both Porifera-sister and Ctenophora-sister evolutionary scenarios require either

330 at least one loss of neurons or two independent gains (Figure 1) of this cell-type. The only scenario that

331 allows for a single evolution of neurons and no losses is the “Coelenterata” hypothesis (reviewed in [J´ekely

332 et al., 2015]), which joins cnidarians and ctenophores in a clade. However, many molecular datasets [Dunn

333 et al., 2008,Ryan et al., 2013,Whelan et al., 2015,Pisani et al., 2015] and morphological evidence [Harbison,

334 1985] argue against this scenario (but also see [Philippe et al., 2009] and [Simion et al., 2017]). One other

335 alternative is that placozoans have an unidentified neuron-like cell in a Porifera-sister context, which would

336 therefore allow for a single origin of neural systems in animals and no losses.

337

338 What do the two different scenarios mean for evolution of neuronal cells? Considering the basic properties

339 of neurons related to electrical signaling or secretory pathways, it had been shown before that many of the

340 genes involved are universally found in animals. A single origin and multiple losses implies that the genetic

341 toolkit necessary for all of these functions was present in the same single-celled organism or the same cell

342 type (an hypothetical proto-neuron) of the last common ancestor of crown-group Metazoa, and either that

343 cell type was lost or its functions were split up.

344

345 Sponges and ctenophores both appear to have lost several gene families (Figure 6), though ctenophores

346 nonetheless have neural cells. Thus, the losses of the GABA or monoamine pathways are not critical for the

347 functioning of neural cell types overall. However, voltage-gated potassium and sodium channels are thought

348 to be essential for the propagation of electrical signals down axons and dendrites and have been found in

349 all animal groups except sponges [Moran et al., 2015]. The NaV-channel tree shows a single origin of this

350 protein family (Supplemental Figure 17), and presence of these channels in choanoflagellates suggests they

351 were present in the common ancestor of all animals; the apparent absence in sponges therefore is probably a

352 secondary loss. By comparison, ctenophores have a mostly-unique expansion of Kv-channels relative to the

353 rest of metazoans [Li et al., 2015] and a duplication in NaV channels. Together with the loss of this protein

12 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

354 family in sponges, the gene content argues for a combination of both multiple, independent gains and a loss

355 of neural-type cells and their associated functions across animals.

356

357 Properties of the earliest metazoans are unknown, including life cycle or number of cell types, but it

358 seems parsimonious to conceive that the first obligate multicellular animals did not have anything close to a

359 sophisticated nervous system [Wray et al., 2015]. Yet, the genomic evidence shows that these animals could

360 respond to environmental or paracrine signals, regulate the cell internal ion concentrations and respond to

361 changes in their concentrations, and secrete small molecules that could serve as effectors in unconnected (but

362 proximal) cells. Thus, the earliest animals likely had the capacity to develop nerve cells using the genetic

363 toolkit they possessed, though the number of times this occurred is unclear. This capacity appears to have

364 been lost in sponges with the loss of voltage-gated channels. As we were unable to find putative genes

365 to mediate action potentials in glass sponges, either all of the four transcriptomes were incomplete or the

366 unique action potentials of glass sponges may represent a third case of the evolution of neural-like functions

367 in Metazoa.

368

369 Methods

370 Sequence data

371 Project overview can be found at spongebase.net. Reference data from the demosponge Tethya wilhelma 372 are available at: https://bitbucket.org/molpalmuc/tethya_wilhelma-genome

373

374 Raw genomic reads for T. wilhelma are available on NCBI SRA under accession numbers SRR2163223

375 (genomic reads), SRR2296844 (mate pairs), SRR5369934 (DNA Moleculo), and SRR4255675 (RNAseq).

376

377 Genome assembly

378 Processing and assembly

379 We generated 25Gb of 100bp paired-end Illumina reads of genomic DNA and 35Gb of 125bp Illumina gel-free

380 mate-pair reads. Contigs were assembled with SOAPdenovo2 [Luo et al., 2012] using a kmer of 83bp. We

381 also generated 436Mb of Moleculo synthetic long reads. Because both haplotypes are represented in the

382 Moleculo reads, we merged the Moleculo reads using HaploMerger [Huang et al., 2012]. Contigs and merged

383 Moleculo reads were then scaffolded using the gel-free mate-pairs with SSPACE [Boetzer et al., 2011] and

384 BESST [Sahlin et al., 2014]. The first draft assembly had 7,947 contigs, totaling 145 megabases.

385

386 Removal of low-coverage contigs

387 To examine the completeness of the genome, we generated a plot of kmer coverage against GC percentage for

388 the contigs (Supplemental Figure 1) using custom Python scripts (available at http://github.org/wrf/lavaLampPlot).

389 This revealed 1,040 contigs with a coverage of zero that were carried over from the Moleculo reads and were

390 not assembled (Supplemental Figure 2), accounting for 6 megabases. As these reads likely derived from

391 bacterial contamination in the aquarium water, these contigs were removed, leaving 6,907 contigs totalling

392 138 megabases.

393

394 Separation of bacterial contigs

395 Additionally, the plot revealed many contigs with lower coverage (20x-90x) and high GC content (50-75%)

396 suggesting the presence of bacteria (Supplemental Figure 3). Because many of these contigs were shorter

397 than 10kb, separation of the bacterial contigs was done through several steps. We found 4,858 contigs with

398 mapped RNAseq reads and GC content under 50%, as expected of metazoans. These contigs accounted for

13 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

399 88% of the sponge assembly, or 121 megabases. For the 2,014 contigs with no mapped RNAseq, we used

400 blastn to search the contigs against the A. queenslandica scaffolds and all complete bacterial genomes from

401 Genbank (5,242 sequences). Based on subtraction of bitscores, 62 contigs were identified as sponge and 565

402 were identified as bacterial. For the remaining 1,387 contigs, most of which were under 10kb, we repeated

403 the search with tblastx against A. queenslandica scaffolds and the genomes of Sinorhizobium medicae and

404 Roseobacter litoralis, which were the most similar complete genomes to the two bacterial 16S rRNAs identi-

405 fied in the contigs. After all sorting, 798 putative bacterial contigs accounted for 12.7 megabases and were

406 separated to bring the total to 6,109 sponge contigs. Contigs for the two bacteria were binned by tetranu-

407 cleotide frequency using MetaWatt [Strous et al., 2012] (Supplemental Figure 4).

408

409 Genome coverage and completeness

410 Coverage was estimated two ways: kmer frequency and read mapping. Kmers of 31bp were counted using

411 the Jellyfish kmer counter [Mar¸caisand Kingsford, 2011] and analyzed using custom Python and R scripts 412 (“fastqdumps2histo.py” and “jellyfish gc coverage blob plot v2.R”, available at http://github.org/wrf/ 413 lavaLampPlot). As expected, the kmer distribution showed two peaks, one for kmers at heterozygous 414 positions and one for homozygous positions, whereupon the coverage peak was at 131-fold coverage for ho-

415 mozygous positions. Because of sequencing errors, this method often underestimates coverage, and so to

416 confirm this estimate we then mapped all reads to the genome using Bowtie2 [Langmead and Salzberg, 2012].

417 The sum of mapped reads divided by the total length provided an estimated coverage of 159-fold physical

418 coverage.

419

420 Of the original reads, 185 million (86.5%) mapped back to the assembled sponge contigs. Completeness

421 for gene content was assessed with BUSCO [Sim˜aoet al., 2015], whereupon we found 728 (86%) complete

422 genes and 42 (4.9%) predicted-incomplete genes. Overall, these data suggest that the genome assembly is

423 adequate for downstream analyses.

424

425 Genome annotation

426 Transcriptome versions

427 The transcriptome for T. wilhelma was assembled de novo using Trinity (release r20140717) [Grabherr et al.,

428 2011, Haas et al., 2013]. Default parameters were used, except for strand specific assembly, in silico read

429 normalization, and trimming (–SS lib type RF –normalize reads –trimmomatic). This produced 127,012

430 transcripts with an average length of 913bp. Assembled transcripts were mapped to the genomic assembly

431 using GMAP [Wu and Watanabe, 2005] to produce a GFF file of the transcript mapping. Of these, 114,744

432 transcripts were mapped 166,847 times, allowing for multiple mappings.

433

434 For the genome-guided transcriptome, strand-specific RNAseq reads were mapped against the genome

435 build using Tophat2 v2.0.13 [Kim et al., 2013] using strand-specific mapping (option –library-type fr-

436 firststrand) and otherwise default parameters. Mapped reads were then joined into transcripts using StringTie

437 v1.0.2 [Pertea et al., 2015] with default parameters.

438

439 Additionally, ab initio gene models were predicted using AUGUSTUS [Stanke et al., 2008]. AUGUSTUS

440 was trained on the webAugustus server [Hoff and Stanke, 2013] using the highest expression transcripts for

441 each Trinity component and the assembled contigs. This identified 27,551 putative genes. The majority of

442 these overlapped partially or completely with a predicted gene based on the Trinity mapping or Stringtie

443 genes. However, 3,866 genes (4,321 transcripts) had no overlap with any predicted exon from either the

444 Trinity or StringTie set, and were kept for the final set. Considering the possibility that some of these may

445 be pseudogenes, we aligned these proteins to the SwissProt database with BLASTP [Camacho et al., 2009]. −5 446 Of these, only 759 had reliable hits (E-value < 10 ) to 688 proteins. The annotated functions were diverse,

447 including proteins similar to many receptors and large structural proteins such as fibrillin (potentially any

448 protein with EGF repeats), dynein heavy chain, and titin; because very large proteins may be split across

14 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

449 multiple contigs, the predicted genes may be only fragments of the full gene. Only 42 of the hits were against

450 transposable elements.

451 Filtering of the final gene set

452 Because assembly of transcripts for both StringTie and Trinity relies on overlaps in the genome or RNAseq

453 reads, genes that overlap in the untranslated regions (UTRs) can sometimes be erroneously fused. For

454 StringTie, we developed a custom Python script to separate non-overlapping transcripts belonging to the 455 same “gene” (stringtiesplitgenes.py, available at https://bitbucket.org/wrf/sequences/). Tandem du- 456 plications can lead to RNAseq reads bridging the two tandem copies and result in both copies being called

457 the same gene. The original StringTie set contained 46,572 transcripts for 32,112 genes, while the corrected

458 set contained 33,200 genes and identified 1,088 new non overlapping genes.

459

460 Positional errors in the genome or allelic variations may result in some RNAseq reads not mapping

461 to the genome, so some genes are fragmented in the genome-guided transcriptome but not the de novo

462 assembly. Making use of the protein predictions from TransDecoder, we compared the predicted pro-

463 teins between the two transcriptomes using a custom Python script (transdecodersplitgenes.py, available 464 at https://bitbucket.org/wrf/sequences/). This identified 406 StringTie transcripts that were better 465 modeled by Trinity transcripts.

466

467 Functional gene annotation

468 Many genes of functional importance were examined manually, and the best transcript from StringTie, Trin-

469 ity, or AUGUSTUS was retained for the final gene set. In the GFF and fasta versions of the transcriptomes,

470 names of protein functions were assigned several ways. Target genes that were manually curated and edited,

471 such as those used in all trees, are named by the generic function or the annotated function of the closest

472 human protein. For instance, the dopamine beta-hydroxylase (DBH) homolog in T. wilhelma was manu-

473 ally corrected, and the position in a phylogenetic tree demonstrated that demosponges diverged before the

474 duplication which created DBH and the two DBH-like proteins in humans, thus the T. wilhelma protein

475 is annotated as DBH-like. Secondly, automated ortholog finding pipelines (HaMStR [Ebersberger et al.,

476 2009]) used for phylogeny [Cannon et al., 2016] have identified homologs in T. wilhelma, which have been

477 manually checked based on positions in the phylogenetic trees. Thirdly, single-direction BLAST results were

478 kept as annotations provided that the BLAST hit had a bitscore over 1000, or a bitscore over 300 and the

479 T. wilhelma protein covered at least 75% of the best hit against the human protein dataset from SwissProt.

480 The bitscore and length cutoffs were applied to reduce the number of annotations based on a single domain.

481

482 Analysis of splice variation

483 Using the transcriptome from StringTie, splice variation was assessed using a custom Python script (splice- 484 variantstats.py, available at https://bitbucket.org/wrf/sequences/). In this script, several ambiguous 485 definitions were clarified to define the different splice types. Firstly, single exon genes with no variants are

486 distinguished from single exon genes with variations, that is, a gene with two exons can have a variant with

487 one exon. For loci with only two transcripts, the canonical or main transcript is defined as the one with

488 the higher expression level, as measured by the higher FPKM value reported from StringTie. For loci with

489 three or more transcripts, main or canonical exons are those included in at least two transcripts. A cassette

490 exon must occur in less than 50% of the transcripts for a locus, otherwise such case is defined as a skipped

491 exon. A retained intron is any portion that exactly spans two other exons; for highly expressed transcripts

492 this may include erroneously retained introns due to intermediates in splicing. A summary of the splicing

493 types is displayed in Supplemental Table 2.

494

495 Intron retention was recently reported to be a common mode of alternative splicing in A. queens-

496 landica [Fernandez-Valverde et al., 2015]. We found 3,295 transcripts with 3,565 retained intron events

497 (Supplemental Table 2). We then analyzed the length of the retained introns and found the phase of the

15 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

498 retained piece to be randomly distributed (unlike cassette exons, Supplemental Table 3), suggesting that

499 many of the retained introns result from incomplete splicing rather than functional retention.

500

501 Microsynteny across sponges

502 Putative synteny blocks were identified using a custom Python script (microsynteny.py, available at https: 503 //bitbucket.org/wrf/sequences/). Briefly, the script combines the gene positions on scaffolds for both 504 the query and the reference with BLASTX hits for the query against the reference. If a minimum of three

505 genes in a row on a query scaffold match to different genes on the same reference scaffold, the group is

506 kept. By default, this mandated a gap of no more than five genes before discarding the block, and that the

507 next gene must occur within 30kb. This method was designed to work for highly fragmented genomes with

508 thousands of scaffolds, so the order and direction of the corresponding genes on the reference scaffold do not

509 need to match those of the query scaffold.

510

511 StringTie transcripts for T. wilhelma were aligned against the A. queenslandica v2.0 protein set with

512 BLASTX [Camacho et al., 2009], and positions were taken from the accompanying A. queenslandica v2.0

513 GTF. The same procedure was attempted against the S. ciliatum gene models, though essentially no syn-

514 tenic blocks were detected, indicating either substantial differences in gene content or gene order between

515 demosponges and calcareous sponges.

516

517 Collection of reference data

518 Proteins for Oikopleura dioica [Denoeud et al., 2010] were downloaded from Genoscope. Gene models

519 for Ciona intestinalis [Dehal et al., 2002], Branchiostoma floridae [Putnam et al., 2008], ad-

520 herens [Srivastava et al., 2008], Capitella teleta, Lottia gigantea, Helobdella robusta [Simakov et al., 2013],

521 kowalevskii [Simakov et al., 2015], and Monosiga brevicollis [King et al., 2008] were downloaded

522 from the JGI genome portal. Gene models for Sphaeroforma arctica, Capsaspora owczarzaki [Suga et al.,

523 2013] and Salpingoeca rosetta [Fairclough et al., 2013] were downloaded from the Broad Institute.

524

525 We used genomic data of the cnidarians Nematostella vectensis [Moran et al., 2014], Exaiptasia pall-

526 ida [Baumgarten et al., 2015], and Hydra magnipapillata as well as transcriptomes from 33 other cnidari-

527 ans [Bhattacharya et al., 2016, Zapata et al., 2015, Pratlong et al., 2015, Brinkman et al., 2015, Ponce et al.,

528 2016], mostly corals.

529

530 For demosponges, we used the genome of [Srivastava et al., 2010,Fernandez-

531 Valverde et al., 2015] and transcriptomic data from: phyllophila [Qiu et al., 2015], fici-

532 formis [Riesgo et al., 2014a], crambe [Versluis et al., 2015], varians [Riesgo et al., 2014b], Hal-

533 isarca dujardini [Borisenko et al., 2016], elegans [P´erez-Porro et al., 2013], carteri, Xestospon-

534 gia testutinaria [Ryu et al., 2016], Scopalina sp., and anhelens. We used data from the genome of the

535 Sycon ciliatum [Fortunato et al., 2014] and the transcriptome of Leucosolenia complicata.

536 For hexactinellids (glass sponges), we used transcriptome data from Aphrocallistes vastus [Ludeman et al.,

537 2014], Hyalonema populiferum, Rosella fibulata, and Sympagella nux [Whelan et al., 2015]. For homosclero-

538 morphs, we used two transcriptomes from Oscarella carmela and Corticium candelabrum [Ludeman et al.,

539 2014].

540

541 We used data from the two published draft genomes of ctenophores [Ryan et al., 2013,Moroz et al., 2014],

542 as well as transcriptome data from 11 additional ctenophores: Bathocyroe fosteri, Bathyctena chuni, Beroe

543 abyssicola, Bolinopsis infundibulum, Charistephane fugiens, Dryodora glandiformis, Euplokamis dunlapae,

544 Hormiphora californensis, Lampea lactea, Thalassocalyce inconstans, and Velamen parallelum.

545

546 We used data from the unpublished draft genome of a novel placozoan species, designated H13.

547

16 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

548 Gene trees

549 For protein trees, candidate proteins were identified by reciprocal BLAST alignment using blastp or tblastn.

550 All BLAST searches were done using the NCBI BLAST 2.2.29+ package [Camacho et al., 2009]. Because

551 most functions were described for human, mouse, or fruit fly proteins, these served as the queries for all

552 datasets. Candidate homologs were kept for analysis if they reciprocally aligned by blastp to a query pro-

553 tein, usually human. Alignments for protein sequences were created using MAFFT v7.029b, with L-INS-i

554 parameters for accurate alignments [Katoh and Standley, 2013]. Phylogenetic trees were generated using ei-

555 ther FastTree [Price et al., 2010] with default parameters or RAxML-HPC-PTHREADS v8.1.3 [Stamatakis,

556 2014], using the PROTGAMMALG model for proteins and 100 bootstrap replicates with the “rapid boot-

557 strap” (-f a) algorithm and a random seed of 1234.

558

559 Domain annotation

560 Domains for individual protein trees were annotated with “hmmscan” v3.1b1 from the HHMER pack-

561 age [Eddy, 2011] using the PFAM-A database v27.0 [Finn et al., 2016] as queries. Signal peptides were pre-

562 dicted using the stand-alone version of SignalP v4.1 [Petersen et al., 2011]. Domain structures were visualized 563 using a custom Python script, “pfampipeline.py”, available at https://github.com/wrf/genomeGTFtools .

564

565 Acknowledgments

566 W.R.F would like to thank K. Achim, M. Nickel and J. Musser for helpful discussions. G.W and D.E. would

567 like to thank M. Nickel for providing the initial T. wilhelma speciments to set up the culture in Munich.

568 This work was supported by a LMUexcellent grant (Project MODELSPONGE) to D.E. and G.W. as part

569 of the German Excellence Initiative, partially by research grant 9278 (“Early evolution of multicellular

570 sponges”) from VILLUM FONDEN to G.W., and NIH grant NIGMS-5-R01-GM087198 to S.H.D.H. The

571 authors declare no competing interests.

572 References

573 References

574 [Alberstein et al., 2015] Alberstein, R., Grey, R., Zimmet, A., Simmons, D. K., and Mayer, M. L.

575 (2015). Glycine activated ion channel subunits encoded by ctenophore glutamate receptor genes.

576 Proceedings of the National Academy of Sciences, 112(44):E6048–E6057.

577 [Arnad´ottirand´ Chalfie, 2010] Arnad´ottir,J.´ and Chalfie, M. (2010). Eukaryotic Mechanosensitive

578 Channels. Annual Review of Biophysics, 39(1):111–137.

579 [Baumgarten et al., 2015] Baumgarten, S., Simakov, O., Esherick, L. Y., Liew, Y. J., Lehnert, E. M.,

580 Michell, C. T., Li, Y., Hambleton, E. a., Guse, A., Oates, M. E., Gough, J., Weis, V. M., Aranda,

581 M., Pringle, J. R., and Voolstra, C. R. (2015). The genome of Aiptasia , a sea anemone model for

582 symbiosis. Proceedings of the National Academy of Sciences, page 201513318.

583 [Bhattacharya et al., 2016] Bhattacharya, D., Agrawal, S., Aranda, M., Baumgarten, S., Belcaid, M.,

584 Drake, J. L., Erwin, D., Foret, S., Gates, R. D., Gruber, D. F., Kamel, B., Lesser, M. P., Levy,

585 O., Liew, Y. J., MacManes, M., Mass, T., Medina, M., Mehr, S., Meyer, E., Price, D. C., Putnam,

586 H. M., Qiu, H., Shinzato, C., Shoguchi, E., Stokes, A. J., Tambutt´e,S., Tchernov, D., Voolstra,

587 C. R., Wagner, N., Walker, C. W., Weber, A. P., Weis, V., Zelzion, E., Zoccola, D., and Falkowski,

588 P. G. (2016). Comparative genomics explains the evolutionary success of -forming corals. eLife,

589 5:1–26.

590 [Boetzer et al., 2011] Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D., and Pirovano, W. (2011).

591 Scaffolding pre-assembled contigs using SSPACE. Bioinformatics, 27(4):578–579.

17 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

592 [Borisenko et al., 2016] Borisenko, I., Adamski, M., Ereskovsky, A., and Adamska, M. (2016). Sur-

593 prisingly rich repertoire of Wnt genes in the demosponge Halisarca dujardini. BMC evolutionary

594 biology, 16(1):123.

595 [Brinkman et al., 2015] Brinkman, D. L., Jia, X., Potriquet, J., Kumar, D., Dash, D., Kvaskoff, D.,

596 and Mulvenna, J. (2015). Transcriptome and venom proteome of the box jellyfish Chironex fleckeri.

597 BMC Genomics, 16(1):407.

598 [Camacho et al., 2009] Camacho, C., Coulouris, G., Avagyan, V., Ma, N., Papadopoulos, J., Bealer,

599 K., and Madden, T. L. (2009). BLAST+: architecture and applications. BMC bioinformatics,

600 10:421.

601 [Cannon et al., 2016] Cannon, J. T., Vellutini, B. C., Smith, J., Ronquist, F., Jondelius, U., and

602 Hejnol, A. (2016). is the sister group to . Nature, 530(7588):89–93.

603 [Cao et al., 2010] Cao, J., Shi, F., Liu, X., Huang, G., and Zhou, M. (2010). Phylogenetic analysis

604 and evolution of aromatic amino acid hydroxylase. FEBS Letters, 584(23):4775–4782.

605 [Carlberg and Rosengren, 1985] Carlberg, M. and Rosengren, E. (1985). Biochemical basis for adren-

606 ergic neurotransmission in coelenterates. Journal of Comparative Physiology B, 155(2):251–255.

607 [Coste et al., 2012] Coste, B., Xiao, B., Santos, J. S., Syeda, R., Grandl, J., Spencer, K. S., Kim, S. E.,

608 Schmidt, M., Mathur, J., Dubin, A. E., Montal, M., and Patapoutian, A. (2012). Piezo proteins are

609 pore-forming subunits of mechanically activated channels. Nature, 483(7388):176–181.

610 [Dehal et al., 2002] Dehal, P., Satou, Y., Campbell, R. K., Chapman, J., Degnan, B., De Tomaso, A.,

611 Davidson, B., Di Gregorio, A., Gelpke, M., Goodstein, D. M., Harafuji, N., Hastings, K. E. M., Ho,

612 I., Hotta, K., Huang, W., Kawashima, T., Lemaire, P., Martinez, D., Meinertzhagen, I. a., Necula,

613 S., Nonaka, M., Putnam, N., Rash, S., Saiga, H., Satake, M., Terry, A., Yamada, L., Wang, H.-G.,

614 Awazu, S., Azumi, K., Boore, J., Branno, M., Chin-Bow, S., DeSantis, R., Doyle, S., Francino, P.,

615 Keys, D. N., Haga, S., Hayashi, H., Hino, K., Imai, K. S., Inaba, K., Kano, S., Kobayashi, K.,

616 Kobayashi, M., Lee, B.-I., Makabe, K. W., Manohar, C., Matassi, G., Medina, M., Mochizuki, Y.,

617 Mount, S., Morishita, T., Miura, S., Nakayama, A., Nishizaka, S., Nomoto, H., Ohta, F., Oishi, K.,

618 Rigoutsos, I., Sano, M., Sasaki, A., Sasakura, Y., Shoguchi, E., Shin-i, T., Spagnuolo, A., Stainier,

619 D., Suzuki, M. M., Tassy, O., Takatori, N., Tokuoka, M., Yagi, K., Yoshizaki, F., Wada, S., Zhang,

620 C., Hyatt, P. D., Larimer, F., Detter, C., Doggett, N., Glavina, T., Hawkins, T., Richardson,

621 P., Lucas, S., Kohara, Y., Levine, M., Satoh, N., and Rokhsar, D. S. (2002). The draft genome

622 of Ciona intestinalis: insights into and vertebrate origins. Science (New York, N.Y.),

623 298(5601):2157–2167.

624 [Denoeud et al., 2010] Denoeud, F., Henriet, S., Mungpakdee, S., Aury, J.-M., Da Silva, C.,

625 Brinkmann, H., Mikhaleva, J., Olsen, L. C., Jubin, C., Canestro, C., Bouquet, J.-M., Danks, G.,

626 Poulain, J., Campsteijn, C., Adamski, M., Cross, I., Yadetie, F., Muffato, M., Louis, A., Butcher,

627 S., Tsagkogeorga, G., Konrad, A., Singh, S., Jensen, M. F., Cong, E. H., Eikeseth-Otteraa, H.,

628 Noel, B., Anthouard, V., Porcel, B. M., Kachouri-Lafond, R., Nishino, A., Ugolini, M., Chourrout,

629 P., Nishida, H., Aasland, R., Huzurbazar, S., Westhof, E., Delsuc, F., Lehrach, H., Reinhardt, R.,

630 Weissenbach, J., Roy, S. W., Artiguenave, F., Postlethwait, J. H., Manak, J. R., Thompson, E. M.,

631 Jaillon, O., Du Pasquier, L., Boudinot, P., Liberles, D. a., Volff, J.-N., Philippe, H., Lenhard, B.,

632 Crollius, H. R., Wincker, P., and Chourrout, D. (2010). Plasticity of Animal Genome Architecture

633 Unmasked by Rapid Evolution of a Pelagic . Science, 1381(2010).

634 [Dunn et al., 2008] Dunn, C. W., Hejnol, A., Matus, D. Q., Pang, K., Browne, W. E., Smith, S. a.,

635 Seaver, E., Rouse, G. W., Obst, M., Edgecombe, G. D., Sørensen, M. V., Haddock, S. H. D.,

636 Schmidt-Rhaesa, A., Okusu, A., Kristensen, R. M., Wheeler, W. C., Martindale, M. Q., and Giribet,

637 G. (2008). Broad phylogenomic sampling improves resolution of the animal tree of life. Nature,

638 452(7188):745–9.

639 [Dunn et al., 2015] Dunn, C. W., Leys, S. P., and Haddock, S. H. D. (2015). The hidden biology of

640 sponges and ctenophores. Trends in Ecology & Evolution, pages 1–10.

641 [Ebersberger et al., 2009] Ebersberger, I., Strauss, S., and von Haeseler, A. (2009). HaMStR: profile

642 hidden markov model based search for orthologs in ESTs. BMC evolutionary biology, 9:157.

18 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

643 [Eddy, 2011] Eddy, S. R. (2011). Accelerated profile HMM searches. PLoS Computational Biology,

644 7(10).

645 [Elliott and Leys, 2007] Elliott, G. R. D. and Leys, S. P. (2007). Coordinated contractions effectively

646 expel water from the aquiferous system of a freshwater sponge. Journal of Experimental Biology,

647 210(21):3736–3748.

648 [Elliott and Leys, 2010] Elliott, G. R. D. and Leys, S. P. (2010). Evidence for glutamate, GABA and

649 NO in coordinating behaviour in the sponge, muelleri (Demospongiae, ). The

650 Journal of experimental biology, 213:2310–2321.

651 [Ellwanger et al., 2007] Ellwanger, K., Eich, A., and Nickel, M. (2007). GABA and glutamate specif-

652 ically induce contractions in the sponge Tethya wilhelma. Journal of Comparative Physiology A:

653 Neuroethology, Sensory, Neural, and Behavioral Physiology, 193(1):1–11.

654 [Ellwanger and Nickel, 2006] Ellwanger, K. and Nickel, M. (2006). Neuroactive substances specifically

655 modulate rhythmic body contractions in the nerveless metazoon Tethya wilhelma (Demospongiae,

656 Porifera). Frontiers in , 3:7.

657 [Fairclough et al., 2013] Fairclough, S. R., Chen, Z., Kramer, E., Zeng, Q., Young, S., Robertson,

658 H. M., Begovic, E., Richter, D. J., Russ, C., Westbrook, M. J., Manning, G., Lang, B. F., Haas,

659 B., Nusbaum, C., and King, N. (2013). Premetazoan genome evolution and the regulation of cell

660 differentiation in the choanoflagellate Salpingoeca rosetta. Genome biology, 14(2):R15.

661 [Fernandez-Valverde et al., 2015] Fernandez-Valverde, S. L., Calcino, A. D., and Degnan, B. M. (2015).

662 Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene

663 annotation in the sponge Amphimedon queenslandica. BMC Genomics, 16(1):1–11.

664 [Finn et al., 2016] Finn, R. D., Coggill, P., Eberhardt, R. Y., Eddy, S. R., Mistry, J., Mitchell, A. L.,

665 Potter, S. C., Punta, M., Qureshi, M., Sangrador-Vegas, A., Salazar, G. A., Tate, J., and Bateman,

666 A. (2016). The Pfam protein families database: towards a more sustainable future. Nucleic Acids

667 Research, 44(D1):D279–D285.

668 [Fortunato et al., 2014] Fortunato, S. a. V., Adamski, M., Ramos, O. M., Leininger, S., Liu, J., Ferrier,

669 D. E. K., and Adamska, M. (2014). Calcisponges have a ParaHox gene and dynamic expression of

670 dispersed NK homeobox genes. Nature, 514(7524):620–623.

671 [Ge et al., 2015] Ge, J., Li, W., Zhao, Q., Li, N., Chen, M., Zhi, P., Li, R., Gao, N., Xiao, B., and Yang,

672 M. (2015). Architecture of the mammalian mechanosensitive Piezo1 channel. Nature, 527(5):64–69.

673 [Geng et al., 2013] Geng, Y., Bush, M., Mosyak, L., Wang, F., and Fan, Q. R. (2013). Structural

674 mechanism of ligand activation in human GABA(B) receptor. Nature, 504(7479):254–9.

675 [Grabherr et al., 2011] Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. a.,

676 Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N.,

677 Gnirke, A., Rhind, N., di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., Friedman, N.,

678 and Regev, A. (2011). Full-length transcriptome assembly from RNA-Seq data without a reference

679 genome. Nature biotechnology, 29(7):644–52.

680 [Grell and Benwitz, 1974] Grell, K. G. and Benwitz, G. (1974). Spezifische Verbindungsstrukuren der

681 Faserzellen von Trichoplax adhaerens F.E. Schulze. Z. Naturforsch., 29c:790.

682 [Gur Barzilai et al., 2012] Gur Barzilai, M., Reitzel, A. M., Kraus, J. E. M., Gordon, D., Technau, U.,

683 Gurevitz, M., and Moran, Y. (2012). of Sodium Ion Selectivity in Metazoan

684 Neuronal Signaling. Cell Reports, 2(2):242–248.

685 [Haas et al., 2013] Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden,

686 J., Couger, M. B., Eccles, D., Li, B., Lieber, M., Macmanes, M. D., Ott, M., Orvis, J., Pochet,

687 N., Strozzi, F., Weeks, N., Westerman, R., William, T., Dewey, C. N., Henschel, R., Leduc, R. D.,

688 Friedman, N., and Regev, A. (2013). De novo transcript sequence reconstruction from RNA-seq

689 using the Trinity platform for reference generation and analysis. Nature protocols, 8(8):1494–512.

690 [Harbison, 1985] Harbison, G. R. (1985). On the classification and evolution of the Ctenophora. In

691 Conway Morris, S. C., George, J. D., Gibson, R., and Platt, H. M., editors, The Origin and Rela-

692 tionships of Lower , pages 78–100. Clarendon Press, Oxford.

19 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

693 [Hoff and Stanke, 2013] Hoff, K. J. and Stanke, M. (2013). WebAUGUSTUS–a web service for training

694 AUGUSTUS and predicting genes in eukaryotes. Nucleic Acids Research, 41(W1):W123–W128.

695 [Huang et al., 2012] Huang, S., Chen, Z., Huang, G., Yu, T., Yang, P., Li, J., Fu, Y., Yuan, S., Chen,

696 S., and Xu, A. (2012). HaploMerger: Reconstructing allelic relationships for polymorphic diploid

697 genome assemblies. Genome Research, 22(8):1581–1588.

698 [Jarvis et al., 2014] Jarvis, E. D., Mirarab, S., Aberer, A. J., Li, B., Houde, P., Li, C., Ho, S. Y. W.,

699 Faircloth, B. C., Nabholz, B., Howard, J. T., Suh, A., Weber, C. C., da Fonseca, R. R., Li, J., Zhang,

700 F., Li, H., Zhou, L., Narula, N., Liu, L., Ganapathy, G., Boussau, B., Bayzid, M. S., Zavidovych, V.,

701 Subramanian, S., Gabaldon, T., Capella-Gutierrez, S., Huerta-Cepas, J., Rekepalli, B., Munch, K.,

702 Schierup, M., Lindow, B., Warren, W. C., Ray, D., Green, R. E., Bruford, M. W., Zhan, X., Dixon,

703 A., Li, S., Li, N., Huang, Y., Derryberry, E. P., Bertelsen, M. F., Sheldon, F. H., Brumfield, R. T.,

704 Mello, C. V., Lovell, P. V., Wirthlin, M., Schneider, M. P. C., Prosdocimi, F., Samaniego, J. A.,

705 Velazquez, A. M. V., Alfaro-Nunez, A., Campos, P. F., Petersen, B., Sicheritz-Ponten, T., Pas, A.,

706 Bailey, T., Scofield, P., Bunce, M., Lambert, D. M., Zhou, Q., Perelman, P., Driskell, A. C., Shapiro,

707 B., Xiong, Z., Zeng, Y., Liu, S., Li, Z., Liu, B., Wu, K., Xiao, J., Yinqi, X., Zheng, Q., Zhang, Y.,

708 Yang, H., Wang, J., Smeds, L., Rheindt, F. E., Braun, M., Fjeldsa, J., Orlando, L., Barker, F. K.,

709 Jonsson, K. A., Johnson, W., Koepfli, K.-P., O’Brien, S., Haussler, D., Ryder, O. A., Rahbek, C.,

710 Willerslev, E., Graves, G. R., Glenn, T. C., McCormack, J., Burt, D., Ellegren, H., Alstrom, P.,

711 Edwards, S. V., Stamatakis, A., Mindell, D. P., Cracraft, J., Braun, E. L., Warnow, T., Jun, W.,

712 Gilbert, M. T. P., and Zhang, G. (2014). Whole-genome analyses resolve early branches in the tree

713 of life of modern . Science, 346(6215):1320–1331.

714 [J´ekely et al., 2015] J´ekely, G., Paps, J., and Nielsen, C. (2015). The phylogenetic position of

715 ctenophores and the origin(s) of nervous systems. EvoDevo, 6(1):1.

716 [Katoh and Standley, 2013] Katoh, K. and Standley, D. M. (2013). MAFFT multiple sequence align-

717 ment software version 7: improvements in performance and usability. Molecular biology and evolu-

718 tion, 30(4):772–80.

719 [Kim et al., 2013] Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., and Salzberg, S. L.

720 (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and

721 gene fusions. Genome biology, 14(4):R36.

722 [King et al., 2008] King, N., Westbrook, M. J., Young, S. L., Kuo, A., Abedin, M., Chapman, J.,

723 Fairclough, S., Hellsten, U., Isogai, Y., Letunic, I., Marr, M., Pincus, D., Putnam, N., Rokas, A.,

724 Wright, K. J., Zuzow, R., Dirks, W., Good, M., Goodstein, D., Lemons, D., Li, W., Lyons, J. B.,

725 Morris, A., Nichols, S., Richter, D. J., Salamov, A., Sequencing, J. G. I., Bork, P., Lim, W. a.,

726 Manning, G., Miller, W. T., McGinnis, W., Shapiro, H., Tjian, R., Grigoriev, I. V., and Rokhsar,

727 D. (2008). The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans.

728 Nature, 451(7180):783–8.

729 [Krishnan et al., 2014] Krishnan, A., Dnyansagar, R., Alm´en,M. S., Williams, M. J., Fredriksson,

730 R., Manoj, N., and Schi¨oth,H. B. (2014). The GPCR repertoire in the demosponge Amphimedon

731 queenslandica: insights into the GPCR system at the early divergence of animals. BMC Evolutionary

732 Biology, 14(1):270.

733 [Krishnan and Schi¨oth,2015] Krishnan, A. and Schi¨oth,H. B. (2015). The role of G protein-coupled

734 receptors in the early evolution of neurotransmission and the nervous system. The Journal of

735 experimental biology, 218(Pt 4):562–571.

736 [Langmead and Salzberg, 2012] Langmead, B. and Salzberg, S. L. (2012). Fast gapped-read alignment

737 with Bowtie 2.

738 [Leys, 2015] Leys, S. P. (2015). Elements of a ’nervous system’ in sponges. The Journal of experimental

739 biology, 218(Pt 4):581–91.

740 [Leys et al., 1999] Leys, S. P., Mackie, G. O., and Meech, R. W. (1999). Impulse conduction in a

741 sponge. The Journal of experimental biology, 202 (Pt 9)(June 1997):1139–1150.

742 [Leys et al., 2007] Leys, S. P., Mackie, G. O., and Reiswig, H. M. (2007). The Biology of Glass Sponges.

743 Advances in Marine Biology, 52(06):1–145.

20 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

744 [Li et al., 2015] Li, X., Liu, H., Chu Luo, J., Rhodes, S. a., Trigg, L. M., van Rossum, D. B., Anishkin,

745 A., Diatta, F. H., Sassic, J. K., Simmons, D. K., Kamel, B., Medina, M., Martindale, M. Q., and

746 Jegla, T. (2015). Major diversification of voltage-gated K\n +\n channels occurred in ancestral

747 parahoxozoans. Proceedings of the National Academy of Sciences, page 201422941.

748 [Liebeskind et al., 2011] Liebeskind, B. J., Hillis, D. M., and Zakon, H. H. (2011). Evolution of sodium

749 channels predates the origin of nervous systems in animals. Proceedings of the National Academy of

750 Sciences of the United States of America, 108(22):9154–9159.

751 [Ludeman et al., 2014] Ludeman, D. A., Farrar, N., Riesgo, A., Paps, J., and Leys, S. P. (2014).

752 Evolutionary origins of sensation in metazoans: functional evidence for a new sensory organ in

753 sponges. BMC Evolutionary Biology, 14(3):1–11.

754 [Luo et al., 2012] Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., He, G., Chen, Y., Pan, Q.,

755 Liu, Y., Tang, J., Wu, G., Zhang, H., Shi, Y., Liu, Y., Yu, C., Wang, B., Lu, Y., Han, C., Cheung,

756 D. W., Yiu, S.-M., Peng, S., Xiaoqian, Z., Liu, G., Liao, X., Li, Y., Yang, H., Wang, J., Lam, T.-W.,

757 and Wang, J. (2012). SOAPdenovo2: an empirically improved memory-efficient short-read de novo

758 assembler. GigaScience, 1(1):18.

759 [Mar¸caisand Kingsford, 2011] Mar¸cais,G. and Kingsford, C. (2011). A fast, lock-free approach for

760 efficient parallel counting of occurrences of k-mers. Bioinformatics, 27(6):764–770.

761 [Moran et al., 2015] Moran, Y., Barzilai, M. G., Liebeskind, B. J., and Zakon, H. H. (2015). Evolu-

762 tion of voltage-gated ion channels at the emergence of Metazoa. Journal of Experimental Biology,

763 218:515–525.

764 [Moran et al., 2014] Moran, Y., Fredman, D., Praher, D., Li, X. Z., Wee, L. M., Rentzsch, F., Zamore,

765 P. D., Technau, U., and Seitz, H. (2014). Cnidarian microRNAs frequently regulate targets by

766 . Genome Research, 24(4):651–663.

767 [Moroz, 2015] Moroz, L. L. (2015). Convergent evolution of neural systems in ctenophores. Journal

768 of Experimental Biology, 218:598–611.

769 [Moroz et al., 2014] Moroz, L. L., Kocot, K. M., Citarella, M. R., Dosung, S., Norekian, T. P., Po-

770 volotskaya, I. S., Grigorenko, A. P., Dailey, C., Berezikov, E., Buckley, K. M., Ptitsyn, A., Reshetov,

771 D., Mukherjee, K., Moroz, T. P., Bobkova, Y., Yu, F., Kapitonov, V. V., Jurka, J., Bobkov, Y. V.,

772 Swore, J. J., Girardo, D. O., Fodor, A., Gusev, F., Sanford, R., Bruders, R., Kittler, E., Mills, C. E.,

773 Rast, J. P., Derelle, R., Solovyev, V. V., Kondrashov, F. a., Swalla, B. J., Sweedler, J. V., Rogaev,

774 E. I., Halanych, K. M., and Kohn, A. B. (2014). The ctenophore genome and the evolutionary

775 origins of neural systems. Nature, 17:1–123.

776 [Nickel, 2010] Nickel, M. (2010). Evolutionary emergence of synaptic nervous systems: what can we

777 learn from the non-synaptic, nerveless Porifera? Biology, 129(1):1–16.

778 [Nosenko et al., 2013] Nosenko, T., Schreiber, F., Adamska, M., Adamski, M., Eitel, M., Hammel,

779 J., Maldonado, M., M¨uller,W. E. G., Nickel, M., Schierwater, B., Vacelet, J., Wiens, M., and

780 W¨orheide,G. (2013). Deep metazoan phylogeny: when different genes tell different stories. Molecular

781 phylogenetics and evolution, 67(1):223–33.

782 [P´erez-Porro et al., 2013] P´erez-Porro, a. R., Navarro-G´omez,D., Uriz, M. J., and Giribet, G. (2013).

783 A NGS approach to the encrusting Mediterranean sponge (Porifera, Demospongiae,

784 ): transcriptome sequencing, characterization and overview of the gene expression

785 along three life cycle stages. Molecular ecology resources, 454:494–509.

786 [Pertea et al., 2015] Pertea, M., Pertea, G. M., Antonescu, C. M., Chang, T.-C., Mendell, J. T., and

787 Salzberg, S. L. (2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq

788 reads. Nature Biotechnology, 33(3).

789 [Petersen et al., 2011] Petersen, T. N., Brunak, S., von Heijne, G., and Nielsen, H. (2011). SignalP

790 4.0: discriminating signal peptides from transmembrane regions. Nature methods, 8(10):785–6.

791 [Philippe et al., 2011] Philippe, H., Brinkmann, H., Lavrov, D. V., Littlewood, D. T. J., Manuel,

792 M., W¨orheide,G., and Baurain, D. (2011). Resolving difficult phylogenetic questions: why more

793 sequences are not enough. PLoS biology, 9(3):e1000602.

21 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

794 [Philippe et al., 2009] Philippe, H., Derelle, R., Lopez, P., Pick, K., Borchiellini, C., Boury-Esnault,

795 N., Vacelet, J., Renard, E., Houliston, E., Qu´einnec,E., Da Silva, C., Wincker, P., Le Guyader, H.,

796 Leys, S., Jackson, D. J., Schreiber, F., Erpenbeck, D., Morgenstern, B., W¨orheide,G., and Manuel,

797 M. (2009). Phylogenomics revives traditional views on deep animal relationships. Current biology :

798 CB, 19(8):706–12.

799 [Pick et al., 2010] Pick, K. S., Philippe, H., Schreiber, F., Erpenbeck, D., Jackson, D. J., Wrede, P.,

800 Wiens, M., Ali´e,A., Morgenstern, B., Manuel, M., and W¨orheide,G. (2010). Improved phylogenomic

801 taxon sampling noticeably affects nonbilaterian relationships. Molecular Biology and Evolution,

802 27(9):1983–1987.

803 [Pisani et al., 2015] Pisani, D., Pett, W., Dohrmann, M., Feuda, R., Rota-Stabelli, O., Philippe, H.,

804 Lartillot, N., and W¨orheide,G. (2015). Genomic data do not support comb jellies as the sister group

805 to all other animals. Proceedings of the National Academy of Sciences, 112(50):201518127.

806 [Ponce et al., 2016] Ponce, D., Brinkman, D. L., Potriquet, J., and Mulvenna, J. (2016). Tentacle tran-

807 scriptome and venom proteome of the pacific sea nettle, Chrysaora fuscescens (Cnidaria: Scyphozoa).

808 Toxins, 8(4).

809 [Pratlong et al., 2015] Pratlong, M., Haguenauer, A., Chabrol, O., Klopp, C., Pontarotti, P., and

810 Aurelle, D. (2015). The red coral ( Corallium rubrum ) transcriptome: a new resource for population

811 genetics and local adaptation studies. Molecular Ecology Resources, 15(5):1205–1215.

812 [Price et al., 2010] Price, M. N., Dehal, P. S., and Arkin, A. P. (2010). FastTree 2 - Approximately

813 maximum-likelihood trees for large alignments. PLoS ONE, 5(3).

814 [Putnam et al., 2008] Putnam, N. H., Butts, T., Ferrier, D. E. K., Furlong, R. F., Hellsten, U.,

815 Kawashima, T., Robinson-Rechavi, M., Shoguchi, E., Terry, A., Yu, J.-K., Benito-Guti´errez,E. L.,

816 Dubchak, I., Garcia-Fern`andez,J., Gibson-Brown, J. J., Grigoriev, I. V., Horton, A. C., de Jong,

817 P. J., Jurka, J., Kapitonov, V. V., Kohara, Y., Kuroki, Y., Lindquist, E., Lucas, S., Osoegawa,

818 K., Pennacchio, L. a., Salamov, A. a., Satou, Y., Sauka-Spengler, T., Schmutz, J., Shin-I, T., Toy-

819 oda, A., Bronner-Fraser, M., Fujiyama, A., Holland, L. Z., Holland, P. W. H., Satoh, N., and

820 Rokhsar, D. S. (2008). The amphioxus genome and the evolution of the chordate karyotype. Nature,

821 453(7198):1064–71.

822 [Qiu et al., 2015] Qiu, F., Ding, S., Ou, H., Wang, D., Chen, J., and Miyamoto, M. M. (2015). Tran-

823 scriptome changes during the life cycle of the red sponge, Mycale phyllophila (Porifera, Demospon-

824 giae, Poecilosclerida). Genes, 6(4):1023–1052.

825 [Ramoino et al., 2010] Ramoino, P., Ledda, F. D., Ferrando, S., Gallus, L., Bianchini, P., Diaspro, A.,

826 Fato, M., Tagliafierro, G., and Manconi, R. (2010). Metabotropic ??-aminobutyric acid (GABAB)

827 receptors modulate feeding behavior in the calcisponge Leucandra aspera. Journal of Experimental

828 Zoology Part A: Ecological Genetics and Physiology, 313A(3):132–140.

829 [Riesgo et al., 2014a] Riesgo, A., Farrar, N., Windsor, P. J., Giribet, G., and Leys, S. P. (2014a). The

830 Analysis of Eight Transcriptomes from All Poriferan Classes Reveals Surprising Genetic Complexity

831 in Sponges. Molecular biology and evolution.

832 [Riesgo et al., 2014b] Riesgo, A., Peterson, K., Richardson, C., Heist, T., Strehlow, B., McCauley,

833 M., Cotman, C., Hill, M., and Hill, A. (2014b). Transcriptomic analysis of differential host gene

834 expression upon uptake of symbionts: a case study with and the major bioeroding

835 sponge Cliona varians. BMC genomics, 15(1):376.

836 [Ryan et al., 2010] Ryan, J. F., Pang, K., Mullikin, J. C., Martindale, M. Q., and Baxevanis,

837 A. D. (2010). The homeodomain complement of the ctenophore Mnemiopsis leidyi suggests that

838 Ctenophora and Porifera diverged prior to the ParaHoxozoa. EvoDevo, 1(1):9.

839 [Ryan et al., 2013] Ryan, J. F., Pang, K., Schnitzler, C. E., a. D. Nguyen, A.-d., Moreland, R. T.,

840 Simmons, D. K., Koch, B. J., Francis, W. R., Havlak, P., Smith, S. a., Putnam, N. H., Haddock,

841 S. H. D., Dunn, C. W., Wolfsberg, T. G., Mullikin, J. C., Martindale, M. Q., Baxevanis, A. D.,

842 Comparative, N., and Program, S. (2013). The Genome of the Ctenophore Mnemiopsis leidyi and

843 Its Implications for Cell Type Evolution. Science, 342(6164):1242592–1242592.

22 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

844 [Ryu et al., 2016] Ryu, T., Seridi, L., Moitinho-Silva, L., Oates, M., Liew, Y. J., Mavromatis, C.,

845 Wang, X., Haywood, A., Lafi, F. F., Kupresanin, M., Sougrat, R., Alzahrani, M. A., Giles, E.,

846 Ghosheh, Y., Schunter, C., Baumgarten, S., Berumen, M. L., Gao, X., Aranda, M., Foret, S.,

847 Gough, J., Voolstra, C. R., Hentschel, U., and Ravasi, T. (2016). Hologenome analysis of two

848 marine sponges with different microbiomes. BMC genomics, 17(1):158.

849 [S´aezet al., 2009] S´aez,A. G., Lozano, E., and Zald´ıvar-River´on,A. (2009). Evolutionary history of

850 Na,K-ATPases and their osmoregulatory role. Genetica, 136(3):479–490.

851 [Sahlin et al., 2014] Sahlin, K., Vezzi, F., Nystedt, B., Lundeberg, J., and Arvestad, L. (2014). BESST

852 - Efficient scaffolding of large fragmented assemblies. BMC Bioinformatics, 15(1):281.

853 [Sara et al., 2001] Sara, M., Sara, A., Nickel, M., and Br¨ummer,F. (2001). Three New Species

854 of Tethya (Porifera: Demospongia) from German Aquaria. Stuttgarter Beitr¨agezur Naturkunde,

855 631(15S):1–16.

856 [Schierwater et al., 2009] Schierwater, B., Kolokotronis, S., Eitel, M., and DeSalle, R. (2009). The

857 Diploblast-Bilateria Sister hypothesis. Communicative & Integrative Biology, 2(5):1–3.

858 [Schuler et al., 2015] Schuler, a., Schmitz, G., Reft, A., Ozbek, S., Thurm, U., and Bornberg-Bauer,

859 E. (2015). The rise and fall of TRP-N, an ancient family of mechanogated ion channels, in Metazoa.

860 Genome Biology and Evolution, 7(6):1–27.

861 [Simakov et al., 2015] Simakov, O., Kawashima, T., Marl´etaz,F., Jenkins, J., Koyanagi, R., Mitros,

862 T., Hisata, K., Bredeson, J., Shoguchi, E., Gyoja, F., Yue, J.-X., Chen, Y.-C., Freeman, R. M.,

863 Sasaki, A., Hikosaka-Katayama, T., Sato, A., Fujie, M., Baughman, K. W., Levine, J., Gonzalez,

864 P., Cameron, C., Fritzenwanker, J. H., Pani, A. M., Goto, H., Kanda, M., Arakaki, N., Yamasaki,

865 S., Qu, J., Cree, A., Ding, Y., Dinh, H. H., Dugan, S., Holder, M., Jhangiani, S. N., Kovar, C. L.,

866 Lee, S. L., Lewis, L. R., Morton, D., Nazareth, L. V., Okwuonu, G., Santibanez, J., Chen, R.,

867 Richards, S., Muzny, D. M., Gillis, A., Peshkin, L., Wu, M., Humphreys, T., Su, Y.-H., Putnam,

868 N. H., Schmutz, J., Fujiyama, A., Yu, J.-K., Tagawa, K., Worley, K. C., Gibbs, R. A., Kirschner,

869 M. W., Lowe, C. J., Satoh, N., Rokhsar, D. S., and Gerhart, J. (2015). genomes and

870 origins. Nature, pages 1–19.

871 [Simakov et al., 2013] Simakov, O., Marletaz, F., Cho, S.-J., Edsinger-Gonzales, E., Havlak, P., Hell-

872 sten, U., Kuo, D.-H., Larsson, T., Lv, J., Arendt, D., Savage, R., Osoegawa, K., de Jong, P.,

873 Grimwood, J., Chapman, J. a., Shapiro, H., Aerts, A., Otillar, R. P., Terry, A. Y., Boore, J. L.,

874 Grigoriev, I. V., Lindberg, D. R., Seaver, E. C., Weisblat, D. a., Putnam, N. H., and Rokhsar, D. S.

875 (2013). Insights into bilaterian evolution from three spiralian genomes. Nature, 493(7433):526–31.

876 [Sim˜aoet al., 2015] Sim˜ao,F. A., Waterhouse, R. M., Ioannidis, P., and Kriventseva, E. V. (2015).

877 BUSCO : assessing genome assembly and annotation completeness with single-copy orthologs.

878 Genome analysis, 31(June):9–10.

879 [Simion et al., 2017] Simion, P., Philippe, H., Baurain, D., Jager, M., Richter, D. J., Di Franco, A.,

880 Roure, B., Satoh, N., Qu´einnec, E.,´ Ereskovsky, A., Lap´ebie,P., Corre, E., Delsuc, F., King, N.,

881 W¨orheide,G., and Manuel, M. (2017). A Large and Consistent Phylogenomic Dataset Supports

882 Sponges as the Sister Group to All Other Animals. Current Biology, pages 1–10.

883 [Smith et al., 2011] Smith, S. M. E., Morgan, D., Musset, B., Cherny, V. V., Place, A. R., Hastings,

884 J. W., and DeCoursey, T. E. (2011). Voltage-gated proton channel in a dinoflagellate. Proceedings

885 of the National Academy of Sciences, 108(44):18162–18167.

886 [Sreedharan et al., 2010] Sreedharan, S., Shaik, J. H. A., Olszewski, P. K., Levine, A. S., Schi¨oth,H. B.,

887 and Fredriksson, R. (2010). Glutamate, aspartate and nucleotide transporters in the SLC17 family

888 form four main phylogenetic clusters: evolution and tissue expression. BMC genomics, 11(iii):17.

889 [Srivastava et al., 2008] Srivastava, M., Begovic, E., Chapman, J., Putnam, N. H., Hellsten, U.,

890 Kawashima, T., Kuo, A., Mitros, T., Salamov, A., Carpenter, M. L., Signorovitch, A. Y., Moreno,

891 M. a., Kamm, K., Grimwood, J., Schmutz, J., Shapiro, H., Grigoriev, I. V., Buss, L. W., Schier-

892 water, B., Dellaporta, S. L., and Rokhsar, D. S. (2008). The Trichoplax genome and the nature of

893 placozoans. Nature, 454(7207):955–60.

23 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

894 [Srivastava et al., 2010] Srivastava, M., Simakov, O., Chapman, J., Fahey, B., Gauthier, M. E. a.,

895 Mitros, T., Richards, G. S., Conaco, C., Dacre, M., Hellsten, U., Larroux, C., Putnam, N. H.,

896 Stanke, M., Adamska, M., Darling, A., Degnan, S. M., Oakley, T. H., Plachetzki, D. C., Zhai, Y.,

897 Adamski, M., Calcino, A., Cummins, S. F., Goodstein, D. M., Harris, C., Jackson, D. J., Leys, S. P.,

898 Shu, S., Woodcroft, B. J., Vervoort, M., Kosik, K. S., Manning, G., Degnan, B. M., and Rokhsar,

899 D. S. (2010). The Amphimedon queenslandica genome and the evolution of animal complexity.

900 Nature, 466(7307):720–6.

901 [Stamatakis, 2014] Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and

902 post-analysis of large phylogenies. Bioinformatics, 30(9):1312–1313.

903 [Stanke et al., 2008] Stanke, M., Diekhans, M., Baertsch, R., and Haussler, D. (2008). Using native and

904 syntenically mapped cDNA alignments to improve de novo gene finding. Bioinformatics, 24(5):637–

905 644.

906 [Stein, 1995] Stein, W. D. (1995). The sodium pump in the evolution of animal cells. Philosophical

907 transactions of the Royal Society of London. Series B, Biological sciences, 349(1329):263–9.

908 [Strous et al., 2012] Strous, M., Kraft, B., Bisdorf, R., and Tegetmeyer, H. E. (2012). The bin-

909 ning of metagenomic contigs for microbial physiology of mixed cultures. Frontiers in Microbiology,

910 3(DEC):1–11.

911 [Suga et al., 2013] Suga, H., Chen, Z., de Mendoza, A., Seb´e-Pedr´os,A., Brown, M. W., Kramer, E.,

912 Carr, M., Kerner, P., Vervoort, M., S´anchez-Pons, N., Torruella, G., Derelle, R., Manning, G., Lang,

913 B. F., Russ, C., Haas, B. J., Roger, A. J., Nusbaum, C., and Ruiz-Trillo, I. (2013). The Capsaspora

914 genome reveals a complex unicellular prehistory of animals. Nature communications, 4:2325.

915 [Versluis et al., 2015] Versluis, D., D’Andrea, M. M., Ramiro Garcia, J., Leimena, M. M., Hugenholtz,

916 F., Zhang, J., Ozt¨urk,B.,¨ Nylund, L., Sipkema, D., van Schaik, W., de Vos, W. M., Kleerebezem,

917 M., Smidt, H., and van Passel, M. W. J. (2015). Mining microbial metatranscriptomes for expression

918 of antibiotic resistance genes under natural conditions. Scientific reports, 5(January):11981.

919 [Whelan et al., 2015] Whelan, N. V., Kocot, K. M., Moroz, L. L., and Halanych, K. M. (2015). Error

920 , signal , and the placement of Ctenophora sister to all other animals. Proceedings of the National

921 Academy of Sciences, 112(18):1–6.

922 [Wray et al., 2015] Wray, G. A., Smith, A., Peterson, K., Donoghue, P., Benton, M., Thackray, J.,

923 Budd, G., Marshall, C., Shu, D., Isozaki, Y., Zhang, X., Han, J., Maruyama, S., Walcott, C., Knoll,

924 A., Walter, M., Narbonne, G., Christie-Blick, N., Raymond, P., Cloud, P., Budd, G., Jensen, S.,

925 Briggs, D., Fortey, R., Morris, S. C., Seilacher, A., Bose, P., Pfluger, F., Rasmussen, B., Bengtson,

926 S., Fletcher, I., McNaughton, N., Pecoits, E., Konhauser, K., Aubet, N., Heaman, L., Veroslavsky,

927 G., Stern, R., Gingras, M., Huldtgren, T., Cunningham, J., Yin, C., Stampanoni, M., Marone, F.,

928 Donoghue, P., Bengtson, S., Gaucher, C., Poire, D., Bossi, J., Bettucci, L., Beri, A., Fedonkin,

929 M., Simonetta, A., Ivantsov, A., Sprigg, R., Glaessner, M., Gehling, J., Seilacher, A., Retallack,

930 G., Narbonne, G., Seilacher, A., Grazhdankin, D., Legouta, A., Li, C., Chen, J., Hua, T., Maloof,

931 A., Tang, F., Bengtson, S., Wang, Y., Wang, X., Yin, C., Yin, Z., Grant, S., Grotzinger, J.,

932 Watters, W., Knoll, A., Fedonkin, M., Vickers-Rich, P., Swalla, B., Trusler, P., Hall, M., Xiao, S.,

933 Zhang, Y., Knoll, A., Chen, J.-Y., Oliveri, P., Li, C., Zhou, G., Gao, F., Hagadorn, J., Peterson,

934 K., Davidson, E., Hagadorn, J., Chen, L., Xiao, S., Pang, K., Zhou, C., Yuan, X., Bengtson, S.,

935 Budd, G., Cunningham, J., Margoliash, E., Brown, R., Richardson, M., Boulter, D., Ramshaw,

936 J., Jefferies, R., Runnegar, B., Knoll, A., Carroll, S., Wray, G., Levinton, J., Shapiro, L., Aris-

937 Brosou, S., Yang, Z., Bromham, L., Rambaut, A., Fortey, R., Cooper, A., Penny, D., Welch, J.,

938 Fontanillas, E., Bromham, L., Wheat, C., Wahlberg, N., Schulte, J., Erwin, D., Laflamme, M.,

939 Tweedt, S., Sperling, E., Pisani, D., Peterson, K., Filipski, A., Murillo, O., Freydenzon, A., Tamura,

940 K., Kumar, S., Tamura, K., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar,

941 S., Hug, L., Roger, A., Ho, S., Phillips, M., Sanderson, M., Thorne, J., Kishino, H., Painter, I.,

942 Sanderson, M., Britton, T., Anderson, C., Jacquet, D., Lundqvist, S., Bremer, K., Shaul, S., Graur,

943 D., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar, S., Mello, B., Schrago, C.,

944 Rambaut, A., Bromham, L., Kishino, H., Thorne, J., Bruno, W., Douzery, E., Snell, E., Bapteste,

945 E., Delsuc, F., Philippe, H., Battistuzzi, F., Filipski, A., Hedges, S., Kumar, S., Schwartz, R.,

24 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

946 Mueller, R., Ho, S., Phillips, M., Drummond, A., Cooper, A., Blair, J., Hedges, S., Warnock, R.,

947 Yang, Z., Donoghue, P., Battistuzzi, F., Billing-Ross, P., Murillo, O., Filipski, A., Kumar, S., Ayala,

948 F., Rzhetsky, A., Ayala, F., Peterson, K., Lyons, J., Nowak, K., Takacs, C., Wargo, M., McPeek,

949 M., Cutler, D., Chernikova, D., Motamedi, S., Csuros, M., Koonin, E., Rogozin, I., Richter, D.,

950 King, N., Dunn, C., Giribet, G., Edgecombe, G., Hejnol, A., Emes, R., Grant, S., Burkhardt, P.,

951 Hejnol, A., Martindale, M., Balavoine, G., Adoutte, A., Hirth, F., Miller, D., Ball, E., Northcutt,

952 R., Strausfeld, N., Hirth, F., Tosches, M., Arendt, D., Lyons, T., Reinhard, C., Planavsky, N.,

953 Planavsky, N., Sperling, E., Frieder, C., Raman, A., Girguis, P., Levin, L., Knoll, A., Stanley, S.,

954 Peterson, K., Cotton, J., Gehling, J., Pisani, D., Butterfield, N., Benito-Gutierrez, E., Arendt, D.,

955 Holland, L., Carvalho, J., Escriva, H., Laudet, V., Schubert, M., Shimeld, S., Yu, J.-K., Turner,

956 S., Young, J., Pisani, D., Poling, L., Lyons-Weiler, M., Hedges, S., Otsuka, J., Sugaya, N., Hedges,

957 S., Blair, J., Venturi, M., Shoe, J., and Gu, X. (2015). Molecular clocks and the early evolution

958 of metazoan nervous systems. Philosophical transactions of the Royal Society of London. Series B,

959 Biological sciences, 370(1684):424–431.

960 [Wu and Watanabe, 2005] Wu, T. D. and Watanabe, C. K. (2005). GMAP: a genomic mapping and

961 alignment program for mRNA and EST sequences. Bioinformatics, 21(9):1859–1875.

962 [Zakon, 2012] Zakon, H. H. (2012). Adaptive evolution of voltage-gated sodium channels: the first 800

963 million . Proceedings of the National Academy of Sciences of the United States of America, 109

964 Suppl(Supplement 1):10619–25.

965 [Zapata et al., 2015] Zapata, F., Goetz, F. E., Smith, S. A., Howison, M., Siebert, S., Church, S. H.,

966 Sanders, S. M., Ames, C. L., McFadden, C. S., France, S. C., Daly, M., Collins, A. G., Haddock,

967 S. H. D., Dunn, C. W., and Cartwright, P. (2015). Phylogenomic Analyses Support Traditional

968 Relationships within Cnidaria. Plos One, 10(10):e0139068.

25 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

969 1 Supplemental Figures

Raw reads coverage vs GC 1e+06 100

1e+05 80 Bacteria 10000

Bacteria 60

Haploid Diploid Repeats Repeats 1000 GC% 40 100

20 10

0 1 100 200 300 400 500

Coverage Counts

Supplemental Figure 1: Coverage vs. GC content for reads Lava lamp plot of the unfiltered paired-end reads. Coverage was calculated as the median 31-mer coverage for each read. High-GC reads indicate the presence of bacteria in the raw reads.

26 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

T.wilhelma Moleculo contigs coverage vs GC 100 80 100 60 GC%

40 10 20

0 1 100 200 300 400

Coverage Counts

Supplemental Figure 2: Coverage vs. GC content for genomic scaffolds Heat map of percent GC versus coverage of reads for the all Moleculo reads.

27 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license. 100 80 60 10 %GC 40 20

0 1 50 100 150 200 250 300 Counts Coverage

Supplemental Figure 3: Coverage vs. GC content for genomic scaffolds Scatterplot of percent GC versus coverage of reads for the all scaffolds. The 1,040 contigs with zero coverage are carried over from low-coverage Moleculo reads, and likely derive from amplified contaminating DNA.

28 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

80 Non−Rhizobium 1 Non−Rhizobium 2 Non−Rhodobacter 1 Non−Rhodobacter 2 Non−Rhodobacter 3 70 Non−Rhodobacter 4 Possible Rhodospirales Sponge contigs 60 50 40 30 20

0 50 100 150 200 250 300

Supplemental Figure 4: Separation of contigs derived from bacterial symbionts Sponge scaffolds are in green, while scaffolds assigned to bacteria are identified by blue (Rhizobiales) and pink (Rhodobacter). Low-coverage contigs were removed. Seven bins were identified with MetaWatt to separate the bacterial contigs, though several bins appeared to correspond to the same bacteria.

29 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

100 Twi to A.queenslandica Twi to S.ciliatum Twi to N.vectensis Twi to H.sapiens 80 60 Percent Identity Percent 40 20

0 100 200 300 400 500

Genes ordered by identity with Aqu

Supplemental Figure 5: Percent identity between sponges Calculated protein percent identity between 570 one-to-one orthologs between T. wilhelma and A. queens- landica (blue), S. ciliatum (green), the anemone N. vectensis (purple) and human (red). Average identity between T. wilhelma and A. queenslandica is shown as the dotted black line at 57%.

30 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

513 500 400

750 total blocks 300 3400 Tethya genes in blocks 200 Number of blocks 84 100 53 34 17 14 9 8 7 2 3 1 0 1 1 1 0 1 0 0 0 0

3 6 9 12 15 18 21 Number of genes in block

Supplemental Figure 6: Length of microsyntenic blocks Histogram of number of genes in detected microsyntenic blocks between T. wilhelma and A. queenslandica v2.0 gene models.

31 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Acropora_digitifera_3751 Porites_australiensis_13444 Placozoans Porites_australiensis_51025 Acropora_millepora_12988

Porites_australiensis_11347Porites_australiensis_18616 Acropora_digitifera_14732

Fungia_scutaria_27502

Cnidarians Porites_lobata_21197 Porites_lobata_4120 Porites_lobata_9984 Fungia_scutaria_56771

Porites_lobata_1117 Anthopleura_elegantissima_2927 Triad1_g4243.t2__scaffold_4 Triad1_g4245.t1__scaffold_4 Homoscleromorphs Fungia_scutaria_66260 Hoilungia_braker1_g10332.t1

NVE14232 03_aurelia_rc_finalASM_12217_1_partial Hoilungia_braker1_g10056.t1 Triad1_g4244.t1__scaffold_4 Porites_lobata_6537 Hoilungia_braker1_g10328.t1 Hoilungia_braker1_g10329.t1 atollavanhoeffeni_t_h01_03529_comp2328_c0_seq4 Hoilungia_braker1_g10333.t1 Triad1_g4246.t1__scaffold_4Hoilungia_braker1_g10331.t1

03_aurelia_rc_finalASM_1859_5AIPGENE27905 Fungia_scutaria_26369 AIPGENE2967 Anthopleura_elegantissima_20778 Triad1_g4251.t1__scaffold_4 Triad1_g4252.t1__scaffold_4 83 97 NVE8089 Triad1_g4248.t1__scaffold_4Triad1_g4249.t1__scaffold_4

98 42 88 90 Triad1_g4253.t1__scaffold_4 Dopamine 99 oscarella_t_h60_47698_comp11615_c3_seq1

85 oscarella_t_h60_29495_comp10620_c0_seq16Ccandelabrum_TR58537|c0_g4_i4_3 β-hydroxylase 85 oscarella_t_h60_62676_comp12448_c0_seq1 92

97 Ccandelabrum_TR48586|c0_g1_i1_1_partialHexactinellids Drerio_AAI63055.1__DBHDOPO_MOUSE DOPO_HUMANDOPO_BOVIN 99 Cmilii_XP_007898364.1__DBH 98 65 Csavignyi_SNAP00000093669 77 99 73 54 hyalonema_populiferum_TR13423_c0_g1_i1 92 54 aphrocallistes_vastus_comp22191_c0_seq1 Sakowv30025742m 88 90 aphrocallistes_vastus_comp6648_c0_seq1 53 35 rosella_fibulata_TR4070_c0_g1_i3sympagella_nux_TR9313_c0_g1_i2 Amel_4.5_GB53665-PA 98 81 sympagella_nux_TR13486_c0_g1_i1 NV15929-PA 29 86 99 59 87 38 52 Cgigas_EKC28453 Lotgi1|120180

obirnaseq.109316.2 99 Ocbimv22031264m.p 21 Helro1|64253Capca1|141249 21 39 32 19 22 97 Clionavarians_TR46177|c0_g3_i8 37 20 23 Clionavarians_TR53822|c0_g1_i1_1Clionavarians_TR47267|c5_g6_i1 72 30 twi_ss.14527.1_1 50 twi_ss.22187.1_1 93 99 58 Clionavarians_TR29792|c0_g2_i8 89 Clionavarians_TR47985|c1_g1_i2 96 97 79 twi_ss.25299b.2_1twi_ss.13807.1_0 99 twi_ss.18896.1_1

Slacustris_c104467_g1_i1_3 Xtestutinaria_TR12937|c0_g1_i1_1 54 Slacustris_c103001_g1_i2_5ephydatiamuelleri_23971_comp68610_c0_seq9 ephydatiamuelleri_08575_comp59157_c0_seq1Aqu2.1.27547_001Xtestutinaria_TR1767|c0_g1_i6_0Aqu2.1.27566_001 Aqu2.1.37883_001 80 obirnaseq.102626.1 Slacustris_c104522_g3_i1_3 ephydatiamuelleri_17567_comp66370_c0_seq2 96 Sakowv30039026m Aqu2.1.27548_001

Lotgi1|135247 Ocbimv22029230m.p Lotgi1|135206 Sakowv30035086m

pmar-devo-sra.189769.1_4 MOXD1_DANRE

MOXD1_CHICK MOXD2_MOUSE Csavignyi_SNAP00000111011 DBH-like 2 MOXD1_HUMAN Csavignyi_SNAP00000088972 MOXD2_DANRE MOXD1_MOUSE

H9GUE0_ANOCA__MOXD1 bflornaseq.50705.1_0 Btaurus_DAA30212.1__DBHL2 bflornaseq.24074.1_0 Cmilii_XP_007897496.1__DBHL1 bflornaseq.4601.9_0 bflornaseq.28386.2_1

bflornaseq.34114.1_0 Demosponges

NV15993-PA Xtropicalis_XP_012810196.1__DBHL2 MOX11_DROME DBH-like 1 Amel_4.5_GB40694-PA Prototomes NV10994-PA /hemichordate MOX12_DROME 0.3 Vertebrates Amel_4.5|GB41735-PA-trimmed

Supplemental Figure 7: Dopamine β -hydroxylase homologs across metazoans Tree of DBH and DBH-like proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.

32 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

TYDC1_ORYSJ 83 TYDC2_ARATH 50 TYDC1_ARATH TYDC1_PAPSO 98 99 TYDC5_PAPSO TYDC3_PAPSO Plants TYDC2_PAPSO TYDC2_PETCR TYDC1_PETCR 74TYDC4_PETCR 85 TYDC3_PETCR SARC_01107T0 50 Odioica_GSOIDT00000153001 40 Platygyra_carnosus_4652 lctid68041_0 99 Syconcoactum_contig_34351_2_partial Sponges 75 98 scict1.001978.2_0_partial Syconcoactum_contig_14168_4-partial Capsaspora_KJE95580.1 Ccandelabrum_TR92788|c0_g5_i7_4-c0_g8_i14_2-partial 92 42 oscarellacarmela_comp38470_c0_seq2_2 Hoilungia_stringtie|12206.1|m.15829 Triad1_g1959.t1__scaffold_2 Placozoans 39 Nve_XP_001635455.1 35 Alatina_alata_c38444_g1_i1_0 16 Chironex_fleckeri_TR31060_c0_g1_i1_0 50 NVE18494 Cnidarians AIPGENE1003 AIPGENE10179 Cioin2|262216_partial Cioin2|230165 Csavignyi_FGENESH00000078696 12 Csavignyi_GENEFINDER00000066949 Helro1|186120 81 Capca1|158583 Lotgi1|181541 Lotgi1|201667__AADC-like 15 82 Cgigas_EKC41301-trimmed 60 Ocbimv22022526m.p 43 obirnaseq.78969.4_1 21 Ocbimv22022529m.p Ocbimv22022527m.p Lingula_comp152792_c2_seq3__DDC-like 25 Sakowv30017927m 7 Bf_V2_253_g40219.t2 21 95 Bf_V2_287_g43385.t2 86 Bf_V2_250_g40085.t26 34 Cmilii_XP_007895676.1__AADC Aromatic amino acid Drerio_NP_998507.1__AADC 98 Xtropicalis_XP_012820058.1__AADC 94 DDC_MOUSE Decarboxylase DDC_HUMAN 88 DDC_BOVIN Cgigas_EKC25403-trimmed Ectopleura_larynx_g20429.2_i2_joined-partial Ectopleura_larynx_g36591.2_i1_partial 91 HAEP_T-CDS_v02_46779 98 86 6 HAEP_T-CDS_v02_10472 hydra_sra.4784.1_2 66 HAEP_T-CDS_v02_45038_partial 99 hydra_sra.19042.1_0 82 Hydrozoan AADC HAEP_T-CDS_v02_1229_partial 33 hydra_sra.8276.1_2 Dmel_FBpp0080698 Dmel_FBpp0080697 32 TC013401 97 53 Bmori_XP_004931016.1 33 TC013402 23 Amel_4.5|GB45938-PA Invertebrate 54 71 NV11111-PA E9RJV1_GRYBI AADC-like Amel_4.5|GB45973-PA__AADC-like NV11109-PA 41 TC013480 35 DDC_DROME 82 DDC_MANSE Bmori_NP_001037174.1__AADC Spurpuratus_XP_011664746.1__AADC 93 Sakowv30016396m Helro1|101612 86 Helro1|84539 Helro1|84403 Capca1|119245 75 Lotgi1|139922__HDC-like 96 Invertebrate 95 Ocbimv22019847m.p Amel_4.5|GB55830-PA 85 TC012567 HDC-like Dmel_FBpp0085475 TC030580 59 Dmel_FBpp0085476 44 NV18137-PA 52 Amel_4.5|GB55831-PA Spurpuratus_XP_789367.3__HDC 98 PFL3_pfl_40v0_9_20150316_1g34573.t1 Skowalevskii_NP_001161568.1_HDC 93 pmar16.36248-30514.1_0 Oreochromis_niloticus_XP_005463234.1__HDC 96 A7KBS5_DANRE__HDC Xtropicalis_XP_002939672.3__HDC Homoscleromorphs CHICK_BAP16218.1__HDC 83 DCHS_BOVIN Histidine DCHS_MOUSE 82 Calcarea DCHS_HUMAN Aplysia_XP_012940696.1 Decarboxylase Aplysia_NP_001191536.1__HDC Placozoans 93 Capca1|180248 73 Ocbimv22006132m.p 59 Cgigas_EKC37654__HDC Cnidarians 93 28 LOTGIDRAFT_119964__HDC Dpulex_EFX79676.1 Apisum_XP_008179690.1__HDC Bmori_XP_012551886.1__HDC Echinoderm/hemichordate 99 50 TC010062 70 89 DCHS_DROME Chordates NV12919-PA 99 Amel_4.5|GB47379-PA__HDC-like Vertebrates Bterrestris_XP_003393425.1__HDC 0.5 Plants

Supplemental Figure 8: Aromatic amino acid decarboxylase homologs across metazoans Tree of AADC and histidine decarboxylase (HDC) proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.

33 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Demosponges Tedaniaanhelens_TR22979_c0_g1_i1_4 Crellaelegans_TR35249_c1_g2_i2_2_partial Homoscleromorphs Mycalephyllophila_TR82776_c1_g1_i1_1 Demosponge Calcarea MAO-like MAO

Ctenophores Ocbimv22029554m.p Aqu2.1.28354_001 Cnidarian Placozoans lctid61359_0 MAO Cnidarians NVE15783

Aqu2.1.28355_001 Protostomes Ocbimv22009147m.p scict1.018726.1_0 Stylophora_pistillata_6593 Helro1|185502 Montastraea_cavernosa_96132 Porites_australiensis_49217 43 Chordates Dappu1|300978 Fungia_scutaria_42224 Acropora_digitifera_1151 Vertebrates Acropora_millepora_6158Astreopora_sp_23076

NVE12954AIPGENE19988

NVE15952 Fugu_XP_011602810.1__MAO-like AIPGENE11888 Acropora_millepora_18147 Drerio_XP_686310.4__MAO-like Fungia_scutaria_28738 Plants and other outgroups Monbr1_g1998.t2__scaffold_4 Porites_australiensis_4435 Astreopora_sp_27937

Stylissacarteri_TR109385_c0_g1_i2_3 Montastraea_cavernosa_29604_partial S. arctica 86

C3ZEL9_BRAFL Capca1|136246 85 Acropora_millepora_14580 bflornaseq.4585.1-g19676.t2 84 Bf_V2_196_g33248.t2

96 99 49 Acropora_digitifera_16241100Stylophora_pistillata_8673Favia_sp_43415NVE24608 Capca1|178716 Fungia_scutaria_37446AIPGENE28638 Capca1|202784 Porites_australiensis_8729

98 Capca1|106886 88 28 90 Capca1|132857C3ZDC1_BRAFL Ocbimv22008553m.p-joined 22 C3ZEA0_BRAFL Lotgi1|190244 24 22 68 Fugu_XP_003961620.1__MAOXlaevis_NP_001088354.1__MAOA Drerio_AAH70013.1__MAOChicken_XP_015155953.1__MAOA 60 AOFB_BOVINAOFB_MOUSE ephydatiamuelleri_25711_comp78073_c0_seq1 19 47 77 48AOFB_HUMAN 85 9037 AOFA_MOUSEAOFA_HUMANAOFA_BOVIN Mycalephyllophila_TR82040_c2_g1_i2_2 Cnidarian and 30 84 61 76 79 62 44 50

Halisarcadujardini_HADA01043500.1_5 52 Tedaniaanhelens_TR28067_c0_g1_i11_4 sponge group 84 Monoamine twi_ss.16693.1_1 Stylissacarteri_TR100164_c0_g1_i1_5Clionavarians_TR64330_c0_g1_i1_3 Mycalephyllophila_TR80872_c0_g2_i1_1 60 Oxidase A/B

ephydatiamuelleri_18278_comp66677_c0_seq1

ephydatiamuelleri_24826_comp69421_c0_seq1 SARC_08858T0

twi_ss.16970.1_2 Capsaspora and

AIPGENE14762 Bf_V2_184_g30703.t1 94 Bf_V2_263_g41383.t1 cnidarian group Bf_V2_98_g16122.t1 100 CAOG_00719T0 74 Acropora_millepora_19274 61 100 AIPGENE7844 AIPGENE22275 79 NVE21536 44 13 70 AIPGENE22267 AIPGENE3902 AIPGENE17020 3991 94 62 55 99 59 AIPGENE9768 75 96 SARC_03516T0 hydra_sra.16499.1_0 69 hydra_sra.12474.1_2 S.rosetta_PTSG_06229T0

25

47 Plant Lysine 85 97 Demethylase 100 96 LDL1_ORYSJ LDL2_ARATHLDL1_ARATH Primary amine LDL2_ORYSJ 100

26 35 84 84 FBpp0304939 Oxidase (PAOX) 7 9583 100 Amel_4.5|GB48386-PA39 Bf_V2_225_g37232.t2 18 19 6 KDM1A_MOUSEKDM1A_HUMANAIPGENE19852_partial 39 KDM1A 71 Triad1_g11565.t1__scaffold_47Hoilungia_braker1_g10903.t1 89 68 hydra_sra.30586.1__KDM1-like_partial Amel_4.5|GB45215-PANV18306-PA 66 30 40

58

Dappu1|311731 99 resomia_t_x0_059461_comp74353_c0_seq1

Amel_4.5|GB45939-PA T2M9S7_HYDVU Dappu1|327058 PAOX_MOUSEPAOX_HUMAN NVE17156__KDM1-like_partial Bf_V2_246_g39525.t1-bflornaseq.31519.2-g39527.t2 56 88 F6VLR2_XENTR SMOX_MOUSESMOX_HUMAN Lotgi1|145165 78 97 83E1BRG3_CHICK F1NJV1_CHICK__PAOX-likeV9KLB5_CALMI__SMOX-like 54 F6X020_MONDOKDM1B_HUMANH3AB83_LATCHCgigas_EKC24089 Bf_V2_39_g8825.t2 43 75 hormiphora_t_h20_09573_comp8723_c0_seq1KDM1B_MOUSE 36

ML095310abolinopsis_t_h20_10482_comp9645_c0_seq1 bathocyroe_t_h20_29738_comp16766_c0_seq1 69 NVE13555 bathyctena_t_h30_08640_comp12342_c0_seq1euplokamis_t_h20_09797_comp8288_c0_seq1

99

Monbr1_g8203.t1__scaffold_27 99 KDM1B Dappu1|53901 95

S.rosetta_PTSG_06283T0__PAOX-like Dappu1|301970

78

87 Corticiumcandelabrum_TR56531_c1_g1_i1_1 91

Aqu2.1.40694_001

Aqu2.1.33483_001

NV17776-PA Ctenophore

NV10959-PA FBpp0074952 Amel_4.5|GB50076-PA FBpp0308562 KDM1

NV10961-PA Hoilungia_braker1_g09335.t1 Triad1_g432.t1__scaffold_1 NV10958-PA NV10962-PA

Triad1_g9931.t1__scaffold_18 Triad1_g431.t1__scaffold_1 Hoilungia_braker1_g04836.t1 Amel_4.5|GB50162-PA_partial Hoilungia_braker1_g04834.t1Hoilungia_braker1_g04835.t1

Hoilungia_braker1_g04833.t1

FBpp0074958 TC009166__MAO-like FBpp0076286__SMOX-likePlacozoan Lysine-specific PAOX-like Histone Demethylase (KDM1) 0.8

Supplemental Figure 9: Monoamine oxidase homologs across metazoans Tree of monoamine oxidase (MAO) and related proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Searches for lysine demethylase (KDM1) and primary amine oxidase (PAOX) were not exhaustive, and were added to display the sole positions of ctenophores (only have KDM) and placozoans (only have PAOX) in this protein family. Bootstrap values are 100 unless otherwise shown.

34 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

CAOG_08244T0 41 CAOG_00213T0 CAOG_09932T0 Filasterea CAOG_05982T0 Aqu2.1.43530_001 21 twi_ss.25346.1_2 41 ephydatiamuelleri_20738_comp67567_c0_seq1 ephydatiamuelleri_25644_comp76203_c0_seq1 ephydatiamuelleri_25735_comp78991_c0_seq1 97 ephydatiamuelleri_22338_comp68117_c0_seq2 52 ephydatiamuelleri_06289_comp54437_c0_seq1 54 ephydatiamuelleri_05142_comp51299_c0_seq1 ephydatiamuelleri_20901_comp67615_c1_seq2 Aqu2.1.23001_001 Aqu2.1.22999_001 twi_ss.11316.1_0 69 ephydatiamuelleri_09349_comp60264_c0_seq6 ephydatiamuelleri_10299_comp61291_c0_seq1 Slacustris_c104630_g1_i1 Hexactinellids rosella_fibulata_TR14631_c0_g1_i1_1 aphrocallistes_vastus_comp22152_c0_seq3_3 aphrocallistes_vastus_comp16141_c0_seq1_1 41 rosella_fibulata_TR14675_c0_g1_i1_0 sympagella_nux_TR10361_c0_g1_i1_0 rosella_fibulata_TR4584_c0_g1_i5_2 95 sympagella_nux_TR20329_c0_g1_i1_2_partial aphrocallistes_vastus_comp22385_c1_seq6_1 aphrocallistes_vastus_comp22385_c1_seq2_1 aphrocallistes_vastus_comp22385_c1_seq1_1 aphrocallistes_vastus_comp12269_c0_seq1_2 aphrocallistes_vastus_comp17700_c0_seq1_4 97 rosella_fibulata_TR4083_c0_g1_i1_1 rosella_fibulata_TR4083_c0_g1_i2_1 hyalonema_populiferum_TR19_c0_g1_i1_4 97 rosella_fibulata_TR7683_c0_g1_i1_0 rosella_fibulata_TR7683_c0_g1_i3_0 62 aphrocallistes_vastus_comp11820_c0_seq1_4 rosella_fibulata_TR8723_c0_g2_i1_2 88 75 sympagella_nux_TR21451_c0_g1_i1_0 15 93 rosella_fibulata_TR3980_c0_g1_i1_1 rosella_fibulata_TR3353_c0_g1_i1_0 aphrocallistes_vastus_comp20208_c0_seq3_0 97 hyalonema_populiferum_TR13658_c0_g1_i1_1_partial rosella_fibulata_TR7995_c0_g1_i1_2 sympagella_nux_TR21457_c0_g1_i3_1 86 sympagella_nux_TR21457_c0_g1_i2_0 sympagella_nux_TR21457_c0_g1_i1_0 twi_ss.15124.1_2 ephydatiamuelleri_17836_comp66476_c0_seq2 ephydatiamuelleri_08108_comp58383_c0_seq9 twi_c31705_g2_i8_0 97 twi_c31705_g2_i4_1 twi_ss.30746.2_1 85 twi_ss.29636.1_1 Aqu2.1.28011_001 Demosponges Aqu2.1.38315_001 52 ephydatiamuelleri_22930_comp68288_c0_seq43 Slacustris_c104641_g1_i1 Slacustris_c98123_g1_i3_partial 25 ephydatiamuelleri_16611_comp65944_c0_seq8 Aqu2.1.39153_001 95 ephydatiamuelleri_16976_comp66116_c0_seq1 20 ephydatiamuelleri_16977_comp66116_c0_seq2 38 Slacustris_c103807_g1_i3 16 Aqu2.1.25564_001 Aqu2.1.38717_001 54 Aqu2.1.27571_001 Aqu2.1.27573_001-trimmed Aqu2.1.39154_001 69 45 Aqu2.1.38823_001 97 Aqu2.1.41785_001 70 Aqu2.1.26762_001 ephydatiamuelleri_23301_comp68424_c0_seq32 36 twi_ss.13620.1_2 44 twi_ss.20824.4_2 ephydatiamuelleri_11626_comp62613_c0_seq9 98 ephydatiamuelleri_23659_comp68518_c0_seq1 ephydatiamuelleri_23660_comp68518_c0_seq5 ML02335a-AUGUSTUS hormiphora_t_x0_09819_comp8395_c0_seq2 37 oscarella_t_h60_65871_comp34165_c0_seq1_partial Ctenophores oscarella_t_h60_42123_comp11388_c0_seq18 oscarella_t_h60_52462_comp11760_c0_seq13 93 oscarella_t_h60_64267_comp17249_c0_seq1 Porites_australiensis_13238 Homoscleromorph Alatina_alata_c61081_g1_i1_5__lCt 32 86 Acropora_digitifera_408__lCt Acropora_millepora_2820__lCt Capca1|22448 Capca1|42446 Cnidarians 47 Triad1_g522.t1__scaffold_1 Hoilungia_stringtie|10218.1|m.13450 Triad1_g72.t1__scaffold_1 20 Acropora_digitifera_6641 Acropora_millepora_2277 48 hydra_sra.36426.1_0 76 adi_v1.15790 Porites_australiensis_19582 1 C3Y433_BRAFL-trimmed 25 Capca1|22458 10 Lanatina_comp153988_c0_seq2_1 4 Stylophora_pistillata_2729 Porites_australiensis_23261 80 Stylophora_pistillata_11795 64 Stylophora_pistillata_9171 92 AIPGENE7219__lCt Acropora_millepora_3159 NVE1157 NVE5469__lCt 56 Porites_australiensis_9038__lCt 5 Lanatina_comp149962_c0_seq3_2 40 98 obirnaseq.37923.1_2 94 Capca1|204049 TC007169 71 Amel_4.5|GB53009 Dmel_NP_001285554.1__GABA-B3G GABAR-B3 Porites_australiensis_8820 95 Anthopleura_elegantissima_62189 47 G5ECB2_CAEEL__gbb-2 C3YEC0_BRAFL 98 84 Q1LUN9_DANRE GABR2_HUMAN 65 GABR2_MOUSE 2 Capca1|222380 97 1 Lanatina_comp141557_c0_seq2_3 GABAR-B2 74 Amel_4.5|GB49239 TC014995 95 Dmel_NP_001287456.1__GABA-B2C Hoilungia_stringtie|8751.1|m.11539 5 AIPGENE12245 Hoilungia_stringtie|15673.1|m.20278 36 Triad1_g1739.t1__scaffold_2 97 Hoilungia_stringtie|6504.1|m.8438 Demosponges Triad1_g6059.t1__scaffold_6 Hoilungia_stringtie|6503.1|m.8445 Triad1_g6057.t1__scaffold_6 Oscarella carmela Dmel_NP_001246033.1__GABA-B1C 2 62 Amel_4.5|GB49131 81 TC016191-016192-partial Hexactinellids B3VBI8_CAEEL__gbb-1 43 64 obirnaseq.69248.1_2 93 Lanatina_comp155098_c1_seq4_1 58 Capca1|107055 Ctenophores C3Z4V0_BRAFL GABR1_MOUSE GABAR-B1 GABR1_HUMAN Placozoans F1QAJ3_DANRE F1RDY7_DANRE Hoilungia_stringtie|3833.2|m.4840 Cnidarians 70 Triad1_g7917.t1__scaffold_9 5 Hoilungia_stringtie|3832.1|m.4618 Triad1_g7915.t1__scaffold_9 Protostomes 55 Hoilungia_stringtie|3838.1|m.4714 Hoilungia_stringtie|11079.1|m.14461 61 Triad1_g7871.t1__scaffold_9 Chordates Triad1_g3741.t1__scaffold_3 Placozoans Hoilungia_stringtie|10558.1|m.13818 11 Triad1_g5475.t1__scaffold_5 Vertebrates Stylophora_pistillata_9477 Porites_australiensis_37273 Stylophora_pistillata_9644 Capsaspora (Filasterean outgroup) 0.4

Supplemental Figure 10: mGABA receptors across metazoans Complete version of Figure 4 metabotropic GABA receptor (GABA-B type) protein tree generated with RAxML. Bootstrap values are 100 unless otherwise shown. The majority of deeper nodes were poorly re- solved; branch transparency corresponds to bootstrap support for values under 50, meaning half-transparent.

35 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

246 268 287 367 395 446 GABR1_HUMAN -PGCSSV- -YGSSSPALS FRTHPSA GLFYETEA L I G - WY A GGFQEAPL GABR1_MOUSE -PGCSSV- -YGSSSPALS FRTHPSA GLFYETEA L I G - WY A GGFQEAPL F1QAJ3_DANRE -PGCSSV- -YGSSSPALS FRTHPSA GLFYETEA L I G - WY A GGFQEAPL Dmel_NP_001246033.1__GABA-B1C -AGCSTV- -YGASSPALS FRTHPSA GLFYVVAA F I G - WY E EGYQEAPL Amel_4.5|GB49131 -AGCSTV- -YGASSPALS FRTHPSA GLFYVVAA F I G - WY E EGYQEAPL C3Z4V0_BRAFL -TGCSSV- -YGSSSPALS FRTHPSA GVFYEDMA IIG-WYP PGYPEAPL B3VBI8_CAEEL__gbb-1 -TGCSPV- -YGGSSPALS FRTHPSA GLFYVTEA F I G - WY A GGFPEAPL GABR2_HUMAN GGVCPSV- -FAATTPVLA FRTVPSD GQFDQNMA IPG-WYE GPSKFHGY GABR2_MOUSE GGVCPSV- -FAATTPVLA FRTVPSD GQFDQNMA IPG-WYE GPSKFHGY Q1LUN9_DANRE GGVCPSV- -FAATTPVLA FRTVPSD GQFDENLA IPG-WYQ ETSKFHGF C3YEC0_BRAFL GGTCSSV- -YDVTSPEFS FRTVPSD GLFDERMA LNG-PYS RTNPFHGY Dmel_NP_001287456.1__GABA-B2C GAACTHV- -YADTHPMFT FRVVPSE GNFNEHFA IMA-TYS EYSRFHGY Amel_4.5|GB49239 GAACTHV- -YADTHPMFT FRVVPSE GN F N E TWA IMG-TYT EYSRFHGY Hoilungia_stringtie|15673.1|m.20278 GSGYSSV- -YGATSPSLS LRTIPSD ASFYEKEG L L G - WL S DTYGYSNF Triad1_g1739.t1__scaffold_2 GSGYSSV- -YGATSPSLS LRTIPSD ASFYEKEG F L G - WL S ESYGYSNF Hoilungia_stringtie|6503.1|m.8445 GAGYSSV- -YSATSPVLS FRTIESE ASFYEDTG L I G - WL E QQDIYAAF Triad1_g6057.t1__scaffold_6 GAGYSSV- -YSASSPVLS FRTIESE ASFYEDTG L I G - WL E KRDLYAAF Hoilungia_stringtie|6504.1|m.8438 GADFSTV- -FGSTSPVLS YRTVSSD ANFYEDLG L L G - WL Q QPLGYYPF G5ECB2_CAEEL__gbb-2 GGQCTEV- -YAETHAKFA FRVVPGS VDVDEEMA LPG-YHS P N N TWRG Y AIPGENE12245 GAGYSKC- -YSSTSPQLS YRTIPDD GNFFGQGA LLG-YIE RRDDHSSY Hoilungia_stringtie|10558.1|m.13818 GAQNSRC- -HGSTSTILS FRTALSD AIGQAPVI L VG - WY P TTNTLRAY Triad1_g5475.t1__scaffold_5 GAQNSRS- -HGSTSPILS FRTALSD AIGQAPVI L VG - WY P TTNTLRAY Hoilungia_stringtie|11079.1|m.14461 GPVSSIC- -NAATSATLS YRTALSD VYAYPDMT L L G - WY F YLASVHTW Hoilungia_stringtie|3832.1|m.4618 GPVLSAV- -PSSASMKL MRTAISD LFAYSVKA LPG-FFF PLLGSYTF Hoilungia_stringtie|3833.2|m.4840 GPVFSYI- -HTATSMELT FRTAISD LFAYSPIA MPG-FYY PLYGSYSY Triad1_g7917.t1__scaffold_9 GPVLSSV- -HTATSMQLV FRTSISD LFAYPSKA LPG-FYL PLSGSYSY Hoilungia_stringtie|3838.1|m.4714 GPILSQT- -PSATSLKLA FRTSISD LNAYPGQI L P G - WL Y HLSGTYTY Triad1_g3741.t1__scaffold_3 GPLTTVG- -SGVSSAAL FTTETTD LFADLIPT ITF-GYI PFKYFQTN Capca1|42446 AAANSEL- -YASVSPILS FRTSSPD TICNMPQA MPG-WFD HATNLATQ Capca1|22448 GGGCSIV- -YGCMSPSLS FRTVQPE TASYEDEM F AN - WF D TGKDFGPA Hoilungia_stringtie|10218.1|m.13450 GAACSIV- -YASASPALS FRTYPSE AGFYSPQG IPG-WLG GAIDYASY Triad1_g72.t1__scaffold_1 GAACSVV- -YASASPALS FRIYPSE GGFYSSQG ITG-WLE GAIDFASF Triad1_g522.t1__scaffold_1 GSACSTV- -YASVSPALS FRTRSSE TGFFPSPG GPG-WFT GGKKFAGF Capca1|22458 GGGCSEA- -FSAGSPDLS FRTIRSD YNSYAKEA I P G - WWA GYNPYHTF Lanatina_comp153988_c0_seq2_1 GGGCSVS- -YGAARLKLS WRN I P TM FDGSEQTF Y Y G - WY N GFNDYHPF AIPGENE7219 GPGCSES- -MANANPALG VRTVPPE GMFRETTA LMSGSYR EPNPYASF C3Y433_BRAFL-trimmed GPG-GAV- -FSTYSPALS FKTIQTD ALMAPHQC FNG-ALR KSTVYVPL Hoilungia_stringtie|8751.1|m.11539 GPGCSPV- -YAGSSPALS YRLYPSE LSSYEDIA F MG - WY S GGWK T AG F Dmel_NP_001285554.1__GABA-B3G GSACSEV- -FGSTSPALS YRTVAPD GSFSQELA LHE-SMG AISQYAPQ Amel_4.5|GB53009 GTACSEV- -FGSTAPALS LRTVAPD GSFSPQLA LPS-DTI TYSKFAGQ Lanatina_comp149962_c0_seq3_2 GSSCSDV- -HASSAPALS YRLAAAD GSFSEDIA LVG-DFT TRSAHAPQ NVE5469/23-200 GPACSAV- -YASTASDLS YRTVQPD GMFYEDAA L L D - WAD TANLYVPY twi_ss.15124.1_2/34-214 GADCSIA- -PLSSSPSLT LRVVSSD LSVYPAYA T F A - WY P ALFGQAQF ephydatiamuelleri_17836_comp66476_c0_seq2 GADCSVS- -PLSTSPQLE LRTVASD VNTYPTHA Y F A - WY P SLQDQVPY oscarella_t_h60_42123_comp11388_c0_seq18 AEGCTIV- -FLSSSPSLD YHLLPAE GLFYEEVG F TG - WY D DFQQYTAY oscarella_t_h60_52462_comp11760_c0_seq13 AAGCSTV- -FLSSSPALQ HQINPDE ALLYEPTA F P G - WY H QFTYYAAY oscarella_t_h60_64267_comp17249_c0_seq1 AEGCSVV- -FLSSSPALE HQLLPSE GLFYEETG F VG - WF Q DFDHYTAY Aqu2.1.25564_001 GGGCSLV- -YYSTSPSLN FTLRPSD LNCDPKIS TLG-WYP DDHSIESY Aqu2.1.26762_001 GSGCSVA- -YSSSSVHLR FQIYVSE L N MWE P K A LYG-WYS QPTYVAHM Aqu2.1.41785_001 GSGCSVA- -FSSSSVHLR FQIYASE L N MWE P K A LYG-WYN LSSYVAHM Aqu2.1.39154_001 GAGCSAA- -YSSAAPMLS FRTYPSD LNMYSGKA IPG-WYQ YPIPDALT twi_ss.13620.1_2 GCGCSVA- -YGSAAITLS FRTHPSI INTDPDLA T Y G - WY Q SQTYTAAL twi_ss.20824.4_2 GCGCSVA- -YGTTSTTLS FRANPPS LNSYPNP- AYG-WYA ILDTTAAQ ephydatiamuelleri_23659_comp68518_c0_seq1 GCGCSVA- -YFSASTALS FRTNPSF LNSYSGPA MRD-SFV EASDAGGR ephydatiamuelleri_23660_comp68518_c0_seq5 GCGCSVA- -YFSASIELS FRTNPSL VNSYPEPA MRD-SFV EASDAGGR ephydatiamuelleri_11626_comp62613_c0_seq9 GCGCSVA- -YVDVSSALD FRTSPSD LNSYPGQA TN D - WY N EKSEAAGR ephydatiamuelleri_23301_comp68424_c0_seq32 GSGCSVA- -PEPLEAQPS FQLFPSG LNMYSNIA TNG-DYT QYTYVSAL Aqu2.1.27571_001 GCGCSTA- -AIAGAFVLN FRTLVSF INTYQDIA IPM-WYN TELGLSGV Aqu2.1.27573_001-trimmed GCGCSIS- -YATGADVLT FRTLVSF INTYPDIA IPM-WYN TELGLSGV Aqu2.1.38717_001 GCGCSTA- -FAASTHILN YRALISN INAYPSFS L P M - WY A TELGIAGL Aqu2.1.28011_001 GSGCSLA- -CASSSSELA FQMLPTE LAMYAPMA T Y G - WY T SYLINSEH Aqu2.1.38315_001 GAGCSVA- -CASSSHNLA FQMLPSA LAMYPGHA IYG-WYS RYSTHNIQ twi_ss.29636.1_1 GAGCSVA- -CVSSSNELN FQLLPTE LAMFSKQA TYG-TYE THYDYAPH ephydatiamuelleri_22930_comp68288_c0_seq43 GGGCENA- -CDSSTPEAE FQTLPND LAMYVNDT TLG-WYT SSSQDEET ephydatiamuelleri_16611_comp65944_c0_seq8 GSGCEAA- -SDSSAPELE FQMLPNE LAVYQPDA L RD - WY G DPNRDDPS twi_ss.30746.2_1 GSGCSVA- -HISSSPELS FQLLGSG LAMFLPRA VYP-FYS TSNIASQQ Aqu2.1.38823_001 GGGCPST- -YNPGLMRYS VQVFPSR LNTYQNTA IYD-WYN SEMDGYAF Aqu2.1.39153_001 GAGCSLA- -YASQSSVLN FRTVPSS LNMYPPHA VYG-WYP SESYAAPF ephydatiamuelleri_16977_comp66116_c0_seq2 GCDCPEL- -YDRGTMSLN FRTYPSD LSMNPLNA TLG-WYP FQTSVAPF ephydatiamuelleri_16976_comp66116_c0_seq1 GGGCPAV- -YDTGAISLS IRTYPSD LNMYPLIA TLG-WYP FQTSVAPF twi_c31705_g2_i4_1 GAGCSIA- -HSSSSAVLD LRAILSD INADPVTT L L E - WH V LVDESVYF twi_c31705_g2_i8_0 GAGCPLA- -YSSSSGVLD FRTIPSD LNVCPEYA M Y G - WY D AADEIASF ephydatiamuelleri_08108_comp58383_c0_seq9 APGCTEE- -YGFS--ILG FSMLPSE INANASTT VH A - WN T SITEDTVP Aqu2.1.23001_001/19-196 GGGCSLA- -YFATSPALS FRVSPSE GFFYSNKA LPG-FYG DQFLSPTL twi_ss.11316.1_0/38-229 DGGCSPA- -LSASSPALS FRTIPSE GYFYEDKA L P A - WY T EPHAYITF ephydatiamuelleri_10299_comp61291_c0_seq1 GGGCSSS- -FGSSSPTLT FRTIPSE I WMY E D K A L P G - WY S AVSKYMRF ephydatiamuelleri_09349_comp60264_c0_seq6 DGGCNTV- -FGSSSPSLS YRTIPSE AWMY E D Q A L P G - WY T NMHSYVPF hydra_sra.36426.1_0/39-213 GPPYTEQ- -YTEANEPFE FQTPPSI ALFSVSGA LFE-ILP KCDENLAY hormiphora_t_x0_09819_comp8395_c0_seq2 GPVFTDV- -PTASSSEFS LRLHPVE ANFDTDDA LPS-SLP KEDKAGSL ML02335a-AUGUSTUS GPVFTDV- -PTASSSSLS LRLHPVE ANFDEDDA LPS-SYK IEEKAGSL Aqu2.1.43530_001 GPPCEDS- -YASTTPQLN FRTVPSF VYSQPEYL FVG-NTE NDLSVAAS twi_ss.25346.1_2 GPGCSTSE -YGYNSPIFH YQTARSI AFLSERLA FIGTSYS GPDPRAGT ephydatiamuelleri_25735_comp78991_c0_seq1 GPGCSEA- VSVATTVELT LRTVSTS GFLGGELT FHD-RSV IDLLYGPT ephydatiamuelleri_25644_comp76203_c0_seq1 GLACSDS- VSIATSPSLN PRPISSS AFL-GPVA FHD-RSL SDLAYGPV ephydatiamuelleri_06289_comp54437_c0_seq1 GPLCSEA- VHLASSAALA GIRGSID -FSQHLAQ NA-DIPL VDSDTFNF ephydatiamuelleri_05142_comp51299_c0_seq1 GPRCYNS- VHMCGSPNLG TSTTPCI VNIDGILT FLR-TKL SINKLGNA ephydatiamuelleri_20901_comp67615_c1_seq2 GLSCYDL- AHISSSFILG VGITPSM AFIGSSLA FLY-RKL TTSLRGNL ephydatiamuelleri_22338_comp68117_c0_seq2 GPSCSES- LHMAASPALK FGMAGSA LLANSDLA QVE-ITL A S P P WAN L ephydatiamuelleri_20738_comp67567_c0_seq1 GLPCSTP- ISGSSSPALS YRVASSG G--TEHQC LPD-RDV NDNIYVNS CAOG_00213T0 QALIQSCV -ITVARPSAT FYNGQNQ YPKTQAQV TPH-QYT SCRFGLAD CAOG_05982T0 QALVDSCV --IETGKAVA FGAGSNQ YPMTQAQV TPN-TYT YCRFGSAT CAOG_09932T0 QALVDSCV --IETGKAVA FGAGSNQ YPMTQAQV TPN-TYT YCRFGSAT

Supplemental Figure 11: Binding pocket alignment of mGABARs across metazoans Select residues involved in the binding of GABA are highlighted, and numbers correspond to the position in the human protein GABR1/GABAR-B1, based on the structure of GABR1 [Geng et al., 2013]. Highly conserved residues not thought to be involved in binding are highlighted in blue. The two human proteins are highlighted in pink. Placozoan proteins are highlighted in gray. Sponge proteins are highlighted in blue. The two ctenophore proteins are highlighted in violet.

36 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

A.thaliana_NP_565744.1__GluR5 G5EKN9_SOLLC__SlGLR3.1 86 V.vinifera_F6HIJ5_VITVI V.vinifera_F6HIJ6_VITVI A.thaliana_NP_565743.1__GluR3.5 A.thaliana_NP_001030971.1__GluR3.4 52 G5EKP2_SOLLC__SlGLR3.4 G5EKP0_SOLLC__SlGLR3.2 99 V.vinifera_A5AMA8_VITVI 99 A.thaliana_NP_974686.1__GluR3.2 A.thaliana_NP_028351.2__GluR A.thaliana_NP_190716.3__GluR3.6 G5EKP3_SOLLC__SlGLR3.5 52 V.vinifera_D7SWB7_VITVI 54 A.thaliana_NP_174978.1__GluR3.3 99 V.vinifera_D7UDC6_VITVI 51 G5EKP1_SOLLC__SlGLR3.3 G5EKN1_SOLLC__SlGLR1.1 Plants G5EKN2_SOLLC__SlGLR1.2 V.vinifera_A5AGU5_VITVI V.vinifera_A5BMN8_VITVI V.vinifera_F6HM67_VITVI V.vinifera_F6GXG0_VITVI V.vinifera_A5AUG7_VITVI 62 A.thaliana_NP_187061.1__GluR1.1 A.thaliana_NP_187408.2__GluR1.4 77 A.thaliana_NP_199652.1__GluR1.3 A.thaliana_NP_199651.1__GluR1.2 G5EKN7_SOLLC__SlGLR2.5 G5EKN4_SOLLC__SlGLR2.2 98 G5EKN6_SOLLC__SlGLR2.4 Sponges 91 G5EKN5_SOLLC__SlGLR2.3 78 G5EKN3_SOLLC__SlGLR2.1 A.thaliana_NP_180476.3__GluR2.7 A.thaliana_NP_180474.1__GluR2.9 93 A.thaliana_NP_180475.2__GluR2.8 95 A.thaliana_NP_196682.1__GluR2.5 A.thaliana_NP_196679.1__GluR2.6 O.carmela 98 A.thaliana_NP_180047.1__GluR2.3 A.thaliana_NP_180048.1__GluR2.2 A.thaliana_NP_194899.1__GluR2.4 88 A.thaliana_NP_198062.2__GluR2.1 G5EKN8_SOLLC__SlGLR2.6 L.complicata V.vinifera_A5AIS1_VITVI 80 V.vinifera_A5AD54_VITVI 84 V.vinifera_A5BDG6_VITVI V.vinifera_F6H9F4_VITVI S.ciliatum V.vinifera_F6H9D0_VITVI V.vinifera_F6H9G4_VITVI V.vinifera_A5AU42_VITVI V.vinifera_A5AQR7_VITVI V.vinifera_A5AVQ8_VITVI 85 V.vinifera_F6H9G5_VITVI V.vinifera_F6H9H0_VITVI oscarella_t_h60_63112_comp13096_c0_seq1 90 oscarella_t_h60_08441_comp7318_c0_seq1 91 L.complicata_lctid7089 S.ciliatum_scpid22929_GLUR3.5 scict1.030436.1_0 99 oscarella_t_h60_33991_comp10944_c0_seq1 oscarella_t_h60_03154_comp3263_c0_seq1 91 oscarella_t_h60_50237_comp11700_c0_seq3 oscarella_t_h60_50246_comp11700_c0_seq12 81 S.ciliatum_scpid26529_GLUR3.4 L.complicata_lctid31759 oscarella_t_h60_65012_comp23829_c0_seq1 oscarella_t_h60_06601_comp6249_c0_seq1 42 oscarella_t_h60_65051_comp24134_c0_seq1 78 S.ciliatum_scpid25297_GLUR3.1 L.complicata_lctid34850 81 S.ciliatum_scpid27594_GLUR3.7 L.complicata_lctid7088 69 S.ciliatum_scpid21909_GLUR2.9 S.ciliatum_scpid15641_GLUR2.9 euplokamis_t_h20_19863_comp13082_c0_seq1 Hcal_TR15585_c2_g1_i1_m.37437 Hcal_TR15585_c3_g1_i1_m.37441-TR15585_c0_g1_i1 Pleurobrachia_bachei_AEX15543.1 67 MLRB004413 euplokamis_t_h30_23254_comp14560_c1_seq1-h10_06856 55 MLRB00443-edited 95 Pleurobrachia_bachei_AEX15551.1_trimmed Hcal_TR7392_c0_g1_i1_m.11709 Pleurobrachia_bachei_AEX15547.1 53 Hcal_TR7373_c0_g1_i1_m.11556 99 euplokamis_t_h20_19416_comp12942_c0_seq1 Pleurobrachia_bachei_AEX15548.1-trimmed Ctenophores euplokamis_t_h30_09966_comp8826_c0_seq1 61 Pleurobrachia_bachei_AEX15546.1 MLRB306921 ML30697a-trimmed Pleurobrachia_bachei_AEX15539.1 Hcal_TR18969_c1_g2_i2_m.48365_partial 94 Pleurobrachia_bachei_AEX15541.1 10 Pleurobrachia_bachei_AEX15549.1 91 Pleurobrachia_bachei_AEX15542.1 ML00626a 89 ML032222a ML032221a ML05909a Pleurobrachia_bachei_AEX15550.1 Pleurobrachia_bachei_AEX15544.1 ML111714a 85 ML15636a 78 ML085016a 63 65 Pleurobrachia_bachei_AEX15540.1 36 ML0850-17a-18a-fusion Pleurobrachia_bachei_AEX15545.1 MLRB150054-fixed ML03683a 95 ML027316a-fixed ML07344a-recut 58 ML150012a-MLRB150049 88 ML150010a-trimmed-MLRB150043 99 MLRB064020 C3ZFG6_BRAFL 96 C3YQ18_BRAFL C3YZA0_BRAFL-trimmed 78 C3ZMS5_BRAFL-cutshort C3ZKA8_BRAFL-trimmed 64 adi_v1.00972 85 A7RPU4_NEMVE 83 adi_v1.17049-trimmed 54 adi_v1.11422 Cnidarians AIPGENE1622 90 A7T1G4_NEMVE H13_TR19588_c2_g2_i3_m.37163 g1037.t1__scaffold_1__Triad1-18943 H13_TR30549_c0_g1_i1_m.66150 g1036.t1__scaffold_1__Triad1-18262 57 H13_TR15638_c1_g1_i1_m.29638 g1035.t2__scaffold_1__Triad1-18823 g9262.t1__scaffold_14__Triad1-30612 g9261.t1__scaffold_14__Triad1-30609 94 Triad1_55165 H13_TR9210_c0_g1_i1_m.20124 g10441.t1__scaffold_23__Triad1-32461 H13_TR13015_c0_g2_i3_m.25087-partial Placozoans 99 g10442.t1__scaffold_23__Triad1-61396 g10443.t2__scaffold_23__Triad1-3218 94 H13_TR30216_c5_g1_i3_m.64945 A7SFF8_NEMVE_NMDA-like AIPGENE3629-joined_AIPGENE1829-allele 98 A7SGV8_NEMVE_NMDA-like 68 AIPGENE14101 A7RLA0_NEMVE_NMDA-like 18 A7SL56_NEMVE adi_v1.13302 Ocbimv22034182m.p 99 99 D.melanogaster_NP_730940.1__NMDAR-Z-like Amel_4.5_GB46886 81 X.laevis_NP_001081616.1_NMDA1.2 NMDAR Q6ZM67_DANRE_grin1b E7F101_DANRE_grin1a 46NMDZ1_MOUSE 71 NMDZ1_HUMAN C3YII6_BRAFL 71 obirnaseq.137713.1_1 Mouse_NP_001263284.1_NMDA3A NMD3A_HUMAN NMD3B_MOUSE 99 NMD3B_HUMAN Amel_4.5_GB48097 D.melanogaster_NP_001014714.1__NMDAR-like C3Y993_BRAFL C3Y747_BRAFL 22 96 NMDE2_MOUSE NMDE2_HUMAN 86 NMDE1_HUMAN NMDE1_MOUSE NMDE3_MOUSE NMDE3_HUMAN E9QBK8_DANRE_grin2db_NMDE4-like NMDE4_MOUSE NMDE4_HUMAN oscarella_t_h60_52136_comp11753_c0_seq16 Glycine oscarella_t_h60_15764_comp9193_c0_seq1 67 oscarella_t_h60_42587_comp11410_c0_seq2 H13_braker1_g01988.t1 T.adherens_g388.t5 0 obirnaseq.63134.1_0 13 D.melanogaster_NP_001260049.1_25a 84 99 D.melanogaster_NP_727328.1_8a scict1.009114.1_0 13 obirnaseq.13493.1_0 Acidic ligand 95 obirnaseq.109126.1_2 15 PH13_braker1_g02946.t1 71 g5117.t1__scaffold_5__Triad1-14565 92 g5089.t1__scaffold_5__Triad1-25027 PH13_braker1_g02971.t1 PH13_braker1_g02972.t1 g5085.t1__scaffold_5__Triad1-25025 C3XPD8_BRAFL C3ZRD9_BRAFL Unknown ligand 86 60 C3ZQV0_BRAFL 0 X.tropicalis_NP_001096470.1_DELTA1 98 GRID1_HUMAN GRID1_MOUSE DeltaR GRID2_DANRE GRID2_MOUSE GRID2_HUMAN Ocbimv22018191m.p 36 C3ZMC8_BRAFL 89 C3ZZY6_BRAFL 3 obirnaseq.94822.1_1 60 Amel_4.5_GB53122 80 Amel_4.5_GB49273 Amel_4.5_GB49268 98 Q9VIE2_DROME__Clumsy 69 Amel_4.5_GB49275 D.melanogaster_CAB64942.1 Clumsy Danio_rerio_XP_009290381.1_K5 GRIK5_HUMAN 8 GRIK5_MOUSE Danio_rerio_XP_017206665.1_K4 GRIK4_MOUSE GRIK4_HUMAN 54 C3ZWB5_BRAFL Danio_rerio_XP_009303592.1_K1 81 GRIK1_MOUSE Ctenophores GRIK1_HUMAN Danio_rerio_XP_017206939.1_K3 GRIK3_MOUSE GRIK3_HUMAN Placozoans 99 87 Danio_rerio_XP_017206873.1_K2 KainateR B9V8S1_XENLA_GRIK2 98GRIK2_MOUSE 99GRIK2_HUMAN obirnaseq.32300.1_1 Cnidarians 85 obirnaseq.114940.1_2 79 Amel_4.5_GB40973 D.melanogaster_NP_476855.1 D.melanogaster_NP_001261621.2 Q71E63_DANRE_gria2a Protostomes Q71E62_DANRE_gria2b B9V8R8_XENLA_GRIA2 99GRIA2_HUMAN 99GRIA2_MOUSE Chordates 55 B3DGS8_DANRE_gria1a Q71E64_DANRE_gria2a X.laevis_NP_001153151.1_AMPA1 GRIA1_MOUSE 78GRIA1_HUMAN Vertebrates Q71E61_DANRE_gria3a 99Q71E60_DANRE_gria3b X.laevis_NP_001153154.1_AMPA3 99GRIA3_MOUSE AMPAR 51 GRIA3_HUMAN X.laevis_NP_001153157.1_AMPA4 Plants GRIA4_MOUSE GRIA4_HUMAN 52 Q71E59_DANRE_gria4a 0.5 Q71E58_DANRE_gria4b

Supplemental Figure 12: Phylogenetic tree of ionotropic glutamate receptors across metazoans Protein tree generated with RAxML using the PROTCATWAG model. Bootstrap values are 100 unless otherwise shown. These receptors are not found in the genomes or transcriptomes of demosponges or hexactinellids, so the Sponge clade refers to calcareous sponges and homoscleromorphs. For the three sponges, the blue star indicates sequences derived from a transcriptome. Based on [Alberstein et al., 2015], some receptors are predicted to bind ligands other than glutamate, shown with the green star, red diamond, and black star, for glycine, acidic ligands, and unknown, respectively.37 Four placozoan proteins have substitutions at the conserved acidic residue (D723 in human GluN1), as either GY in ctenophores, or GG/WY in placozoans; the carboxyl of the glutamic/aspartic acid is needed to coordinate the amino group of glutamate, suggesting that these proteins do not bind an α -amino acid. bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

NMD3A_HUMAN Human NMDA-class iGluRs

NMDE1_HUMAN

GRIA2_HUMAN Human AMPA-class iGluRs

euplokamis_t_h30_09966_comp8826_c0_seq1 Ctenophore iGluRs

ML032222a

Pleurobrachia_bachei_AEX15542.1

oscarella_t_h60_42587_comp11410_c0_seq2 Calcisponge/Hsm AMPA-like

scict1.009114.1_0

A.thaliana_NP_187061.1__GluR1.1 Plant iGluRs

S.ciliatum_scpid25297_GLUR3.1 Calcisponge/Hsm iGluR-like

oscarella_t_h60_06601_comp6249_c0_seq1 Signal peptide

L.complicata_lctid34850 ANF_receptor 7-transmembrane bac SBP type 3 Aqu2.1.29334_001 Demosponge mGluR-like Ligand channel NCD3G E.muelleri−Emu1128−0.9−mRNA−1 NMDAR2_C

twi_ss.7125c.5|m.10167

0 250 500 750 1000 15001250

Supplemental Figure 13: Domain organization of ionotropic glutamate receptors across meta- zoans Scale bar displays number of amino acids. Top BLAST hits for human iGluRs in demosponges appear to be metabotropic, due to the presence of a 7-transmembrane domain instead of the ion channel, while the ligand-binding domain is conserved. Ctenophore iGluRs and some calcisponge/homoscleromorph (Hsm) pro- teins have the vertebrate-type domain organization, though the other calcisponge/homoscleromorph proteins (main sponge group in Supplemental Figure 12) have an SBP domain.

38 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

CAOG_01813T0__S17A9-like Alatina_alata_c59016_g1_i1__S17A9 98 hydra_sra.33023.2__s17a9 37 Scopalina_sp_TR21479_c0_g1_i1__s17a9 100 X.testutinaria_TR78129_c0_g1_i1__s17a9 Aqu2.1.12587_001_partial__s17a9 46 Stylissa_carteri_TR73134_c0_g1_i2__s17a9 91 twi_ss.30106.1__s17a9 81 NVE8304 75 NVE22746 Porites_australiensis_3031__S17A9 99 Acropora_millepora_10895__S17A9 Danio_rerio_NP_001002635.1__SLC17A9 50 Xtropicalis_XP_002944466.2__SLC17A9 Nucleotide transporter 73 Gallus_gallus_NP_001006292.1__SLC17A9 87 S17A9_HUMAN 75 S17A9_MOUSE Branchiostoma_belcheri_XP_019645208.1__SLC17A9 FBpp0077146__S17A9 82 Amel_4.5|GB46143-PA__S17A9 (SLC17A9) HELRODRAFT_76708__SLC17A9 40 49 Capca1|155164__S17A9 66 Capca1|155158 62 Lingula_comp143954_c0_seq4__S17A9 79 obirnaseq.110733.2__S17A9 98 Cgigas_EKC18089__S17A9 83 LOTGIDRAFT_112031__SLC17A9 CAOG_02811T0 obirnaseq.94311.1 6 Hoilungia_braker1_g07337.t1 Triad1_g69.t1__scaffold_1 Hoilungia_braker1_g07338.t2 scict1.031446.1 lctid25328 lctid59401 scict1.029742.13 scict1.015158.1 100 36 bathyctena_t_h10_20781_comp13657_c0_seq4 beroeabyssicola_t_h20_12199_comp12599_c0_seq1 65 ML011726a 99 bolinopsis_t_h20_00340_comp262_c0_seq1 91 euplokamis_t_h20_11645_comp9428_c0_seq1 beroeabyssicola_t_h20_15083_comp14311_c0_seq2 thalassocalyceinconstans_t_h10_05529_comp6090_c0_seq1 78 bathocyroe_t_h20_21180_comp15532_c0_seq1 45 31 Hcal_TR27180_c1_g1_i1 41 dryodora_t_h20_15255_comp12806_c0_seq1 72 38 ML21903a bolinopsis_t_h20_19456_comp13734_c0_seq1 42 velamen_t_h20_17746_comp13342_c0_seq1 Monbr1_g5507.t1__scaffold_14 99 Srosetta_PTSG_01814T0 Halisarca_dujardini_HADA01062594.1 19 Halisarca_dujardini_HADA01062593.1 sympagella_nux_TR6482_c0_g1_i1_partial 94 aphrocallistes_vastus_comp19335_c0_seq1 Scopalina_sp_TR32724_c0_g1_i1 68 Scopalina_sp_TR32724_c0_g2_i2 95 72 twi_ss.14577.1 76 Tedania_anhelens_TR19189_c0_g1_i10 88 Petrosia_ficiformis_TR2695_c2_g1_i1 Aqu2.1.28662_001 97 X.testutinaria_TR52323_c0_g1_i1 69 X.testutinaria_TR24582_c0_g1_i4 Aqu2.1.29226_001 13 Aqu2.1.28663_001 scict1.023385.2 lctid55483 Corticium_candelabrum_TR84939_c2_g3_i1_partial 75 65 Oscarella_carmela_comp9729_c0_seq1 84 Oscarella_carmela_comp31204_c0_seq3 78 Oscarella_carmela_comp37189_c0_seq1 91 Oscarella_carmela_comp40500_c0_seq9 Hydractinia_symbiolongicarpus_c17024_g1_i4 Alatina_alata_g24140.1_i2 96 AIPGENE1813 NVE15609__VGLUT-like Hydractinia_symbiolongicarpus_g7761.1_i1 Alatina_alata_g35508.1_i1 92 Acropora_millepora_17115 AIPGENE1891 97 NVE15610__VGLUT-like NVE21028__VGLUT-like 14 95 NVE16311__VGLUT-like Cnidarian VGLUT-like 35 Alatina_alata_c52723_g1_i1 Hydractinia_symbiolongicarpus_c16442_g1_i1 78 hydra_sra.17150.1 95 Alatina_alata_g57195.1_i1 hydra_sra.36087.1 Hydractinia_symbiolongicarpus_c19851_g1_i1 Acropora_millepora_10896 Porites_australiensis_15186 19 NVE20928__VGLUT-like 88 AIPGENE24249 Triad1_g5316.t1__scaffold_5 Hoilungia_braker1_g11731.t1 98Hoilungia_braker1_g10108.t1 40 Alatina_alata_c54481_g1_i1 97 Hydractinia_symbiolongicarpus_c23495_g1_i1 96 hydra_sra.10353.1 21 22 Alatina_alata_g56947.1_i1 77 Alatina_alata_g51821.1_i3 34 hydra_sra.10649.1 Hydractinia_symbiolongicarpus_c14056_g1_i1 22 NVE17787__VGLUT Acropora_millepora_12796 52 Porites_australiensis_46055 82 NVE8595__VGLUT 50 AIPGENE9842 33 Helro1|65417 FBpp0083297 94 TC010895 58 NV17997-PA 50 96 Amel_4.5|GB53933-PA 52 Lingula_comp156026_c0_seq2 61 Cgigas_EKC30442 64 Lotgi1|126078 65 Lotgi1|102784 76 Lotgi1|161078 bflornaseq.42024.2 bflornaseq.42037.1 bflornaseq.29980.1-29981.1-joined EKC38935 20 Cgigas_EKC18280 Lingula_comp149033_c2_seq2 Lingula_comp152169_c0_seq4 89 Capca1|208926 Lingula_comp149080_c0_seq7 59 Cgigas_EKC34298 97 Lotgi1|233350 Bf_V2_327_g44834.t1__bflornaseq.13338.1 10 23 Spurpuratus_XP_786480.3__VGLUT Lotgi1|128007__VGLUT 94 obirnaseq.87250.1 86 Cgigas_EKC24439__VGLUT 82 Lingula_comp143616_c3_seq1__VGLUT 75 35 Capca1|177109__VGLUT 55 TC008459__VGLUT Q9VQC0_DROME__VGLUT 81 Amel_4.5|GB54867-PA__VGLUT 53 NV15959-PA__VGLUT VGLU3_DANRE Gallus_gallus_NP_001305958.1__VGLUT3 99 VGLU3_MOUSE VGLU3_HUMAN Glutamate transporter VGLU1_XENTR Danio_rerio_NP_001092225.1__VGLUT1 78 VGLU1_HUMAN 83 VGLU1_MOUSE VGL2A_DANRE VGLUT 81VGL2B_DANRE 29Gallus_gallus_NP_001161855.1__VGLUT2 91VGLU2_MOUSE 98VGLU2_HUMAN Spurpuratus_XP_780445.3 Spurpuratus_XP_795625.3 (SLC17A6,7,8) 72 Spurpuratus_XP_783585.2 94 Spurpuratus_XP_782868.2 82 Spurpuratus_XP_785484.3 Danio_rerio_NP_001070195.1__sialin 47 Salmo_salar_NP_001167306.1__sialin S17A5_HUMAN 99 S17A5_MOUSE 99 Alligator_mississippiensis_XP_006269813.1 99Gallus_NP_001026257.1 98 Gallus_gallus_XP_015140231.1__sialin Sialin (SLC17A5) Callorhinchus_milii_XP_007906946.1 Danio_rerio_XP_005159826.1 91 Alligator_mississippiensis_XP_006268054.2 68 Pelodiscus_sinensis_XP_014424898.1 E1BDD2_BOVIN__SLC17A4 99 S17A4_MOUSE 95S17A4_HUMAN NPT3_BOVIN NPT3_MOUSE 56 NPT3_HUMAN SLC17A4 89 NPT4_HUMAN 94 NPT1_MOUSE NPT1_HUMAN Capca1|183805 15 40 Helro1|186317 27 Lingula_comp149120_c0_seq6 Lotgi1|197570 27 80 Lotgi1|133858 35 obirnaseq.111983.1 32 Cgigas_EKC24609 7 Alatina_alata_c52956_g1_i1 99 Alatina_alata_c52359_g1_i1 46 Alatina_alata_g38981.1_i1 50 Hydractinia_symbiolongicarpus_c22518_g1_i1 Demosponges 53 NVE10158__sialin-like AIPGENE16892 Porites_australiensis_22512 74 Acropora_millepora_10266 Homoscleromorphs NVE12518__sialin-like 95 72 AIPGENE10305 71 NVE9481__sialin-like 80 AIPGENE16935 TC005982 Calcarea 7 NV11798-PA TC006631 TC006632 Amel_4.5|GB41670-PA 64 83 NV10357-PA Hexactinellids NV15849-PA 97 NV17612-PA NV17613-PA 61 TC006625 NV11203-PA Amel_4.5|GB42969-PA Ctenophores Amel_4.5|GB51651-PA 57 NV18394-PA 2 87 Amel_4.5|GB51650-PA 91 FBpp0086261 Placozoans FBpp0309074 Amel_4.5|GB43225-PA NV15041-PA Hoilungia_braker1_g05533.t1 Capca1|113983 Cnidarians EKC40068 Lotgi1|107883 87 91 Lotgi1|155133 10 99 Lotgi1|161420 Protostomes 67 Lotgi1|107712 27 Capca1|93612 Helro1|108787 Helro1|71683 31 93 obirnaseq.69929.1 Echinoderm/hemichordate 81 Cgigas_EKC35063 Cgigas_EKC35062 bflornaseq.39778.2 Capca1|209782 Chordates 25 Lotgi1|113326 97 90 Cgigas_EKC25794 63 Lotgi1|167964 93 obirnaseq.30932.1 obirnaseq.22900.1 Vertebrates 81 obirnaseq.22904.3 53 obirnaseq.22889.1 obirnaseq.30926.2 obirnaseq.30928.1 5330 obirnaseq.22894.5 24 obirnaseq.30927.1 Filastereans obirnaseq.123251.2 3765 obirnaseq.123253.1 36 obirnaseq.30934.1 Choanoflagellates 47 obirnaseq.30937.1 0.6

Supplemental Figure 14: Vesicular glutamate transporter homologs across metazoans Tree of VGLUT (SLC17A6-8) proteins across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.

39 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

SC6A1_HUMAN SC6A1_MOUSE 32 Srosetta_PTSG_08575T0 Hoilungia_braker1_g07132.t1 91 94 Triad1_g3650.t1__scaffold_3 Srosetta_PTSG_00672T0 Spurpuratus_XP_788574.3__S38AB 80 29 S38AB_DANRE S38AB_HUMAN 44 S38AB_MOUSE TC015432__S38 Na-coupled neutral AA transporter NV10096-PA__S38 31 Lingula_comp153547_c1_seq2_1__S38 65 EKC22480__S38 99 Lotgi1|174956__S38 (SLC38) CAOG_10085T0 lampea_t_h30_15888_comp12890_c0_seq2__cg4 Hcal_TR16507_c0_g1_i4__cg4 97 80 bolinopsis_t_h20_07298_comp7564_c0_seq1__cg4 45 thalassocalyceinconstans_t_h10_14237_comp12108_c0_seq1__cg4 57 beroeabyssicola_t_h01_04633_comp4052_c0_seq1__cg4 99 76 dryodora_t_h20_15361_comp12868_c0_seq1__cg4 Hoilungia_braker1_g00642.t1 Triad1_g3043.t2__scaffold_3 82 Cgigas_EKC37816 60 AIPGENE11622 Porites_australiensis_5139 Unknown VIAAT-like Montastraea_cavernosa_84647 37Madracis_auretenra_42167 scict1.009980.1 88 lctid63806 Alatina_alata_c58688_g1_i1 hydra_sra.11928.1 93 resomia_t_x0_062726_comp75224_c0_seq1 19 Alatina_alata_c52262_g1_i1 hydra_sra.11232.1 hydra_sra.11218.1 34 Alatina_alata_c49879_g1_i1 Alatina_alata_c53411_g1_i1 98 hydra_sra.20297.1 56 hydra_sra.20302.1 94 AIPGENE25242 Acropora_millepora_11455 Porites_australiensis_9197 AIPGENE7072 26 NVE17671 Porites_australiensis_38448 Cnidarian-specific VIAAT-like 98 Acropora_millepora_17942 resomia_t_x0_049257_comp70251_c0_seq1 94 98 resomia_t_x0_027597_comp54692_c0_seq1 98 Alatina_alata_g42698.1_i1 51 Alatina_alata_c51053_g1_i1 94 Acropora_millepora_8356 56 Porites_australiensis_37320 99 Acropora_millepora_12688 Porites_australiensis_49386 59 NVE17670 43 AIPGENE27144 NVE5099 NVE22971 79 NVE15339 85 AIPGENE28594 83 Acropora_millepora_16417 74 Porites_australiensis_36679 Alatina_alata_c44674_g1_i1 99 NVE24524 81 hydra_sra.24414.1 32 AIPGENE4721 99 NVE8024 Alatina_alata_g47107.1_i1 99 hydra_sra.1815.1 29 Alatina_alata_g24024.1_i1 AIPGENE26117 NVE8637 76 AIPGENE29146 NVE25975 24 hydra_sra.28166.1 Alatina_alata_g44676.1_i1 PFL3_pfl_40v0_9_20150316_1g2763.t1 Spurpuratus_XP_795408.3 obirnaseq.30198.1 95 Helro1|183498 68 81 Capca1|225531 39 TC009185 FBpp0086703 95 Amel_4.5|GB40541-PA NV15949-PA Spurpuratus_XP_791315.3__VIAAT 31 bflornaseq.38036.1 GABA transporter VIAAT 85 Gallus_gallus_XP_417347.3__VIAAT VIAAT_XENTR 54Danio_rerio_NP_001074170.1__VIAAT 49VIAAT_MOUSE VIAAT_HUMAN (SLC32A1) oscarella_t_h60_07389_comp6812_c0_seq2 beroeabyssicola_t_h20_14227_comp13818_c0_seq1 bathocyroe_t_h20_04478_comp6860_c0_seq1 Hcal_TR19444_c0_g1_i2 5232thalassocalyceinconstans_t_h20_14855_comp14523_c0_seq1 24 velamen_t_h20_22074_comp14838_c0_seq2 62bolinopsis_t_h20_25026_comp15044_c0_seq3 49ML073035a Monbr1_g6159.t1 Srosetta_PTSG_07915T0 Lotgi1|157294__ANTL1 41 28 99 Lingula_comp152251_c0_seq2__ANTL1 bflornaseq.11495.1 74 oscarella_t_h60_33621_comp10920_c0_seq1 Similar to 58 twi_ss.6165.1 25 98 Petrosia_ficiformis_TR11201_c0_g1_i1 99 X.testutinaria_TR68852_c1_g1_i1 38 93 Aqu2.1.23868_001 Aromatic and neutral Triad1_g5051.t1__scaffold_5 Hoilungia_braker1_g03002.t1 63 Porites_australiensis_6407__ANTL1 62 Acropora_millepora_13042__ANTL1 AA transporter 80 NVE21968__ANTL1 98 Exaiptasia_pallida_KXJ22393.1__ANTL1 euplokamis_t_h30_07022_comp6902_c0_seq1 90 charistephanefugiens_t_h10_09118_comp9861_c0_seq1 (ANTL) Hcal_TR17487_c0_g1_i1 88 dryodora_t_h20_09505_comp9226_c0_seq1 bathocyroe_t_h20_17157_comp14441_c0_seq1 5842 thalassocalyceinconstans_t_h20_24470_comp18878_c0_seq3 97 25ML104321a 81velamen_t_h20_12670_comp11072_c0_seq1 83bolinopsis_t_h20_08753_comp8580_c0_seq1 CAOG_00255T0 Monbr1_g2633.t1 34 scict1.017867.1 lctid70109_0 lctid78596_0 99 scict1.018556.1 96 scict1.015001.1 90 scict1.015001.2 lctid12311_0 lctid80334_0 aphrocallistes_vastus_comp20147_c1_seq1 rosella_fibulata_TR11879_c0_g1_i1 sympagella_nux_TR7911_c0_g2_i1 87 aphrocallistes_vastus_comp6524_c0_seq1 sympagella_nux_TR20508_c0_g1_i4 rosella_fibulata_TR13163_c0_g1_i1 78 Scopalina_sp_TR28065_c0_g3_i1 44 twi_ss.8957.1_1 56 Tedania_anhelens_TR6077_c3_g2_i2 Aqu2.1.31258_001 Demosponges 63 Aqu2.1.31260_001 Petrosia_ficiformis_TR11578_c0_g1_i1 33 25 X.testutinaria_TR48957_c0_g1_i8 82 X.testutinaria_TR65433_c1_g3_i10 Homoscleromorphs oscarella_t_h60_10767_comp8125_c0_seq6 Triad1_g5318.t2__scaffold_5 Triad1_g5319.t1__scaffold_5 Calcarea euplokamis_t_h20_13527_comp10429_c0_seq1__S36A 39 bathyctena_t_h30_00646_comp699_c0_seq1__S36A dryodora_t_h20_27957_comp16799_c0_seq3__S36A bathocyroe_t_h20_09052_comp10596_c0_seq3__S36A Hexactinellids 41 95 76 thalassocalyceinconstans_t_h10_13472_comp11685_c1_seq1__S36A 68 velamen_t_h20_29161_comp16666_c0_seq1__S36A 96 bolinopsis_t_h20_13495_comp11280_c1_seq1__S36A 79 ML085720a__S36A Acropora_millepora_4745__S36A Ctenophores Porites_australiensis_21159__S36A 22 89 NVE5526__S36A NVE5525__S36A 93 AIPGENE23310__S36A Placozoans 68 Alatina_alata_c55134_g1_i1__S36A hydra_sra.16612.1__S36A 45 hydra_sra.16610.1__S36A Cnidarians Alatina_alata_c43290_g1_i1__S36A 98 hydra_sra.11681.1__S36A 21 Alatina_alata_c41537_g1_i1__S36A bflornaseq.11538.1-g25112.t1 Protostomes 40 Spurpuratus_XP_003723741.1__S36A Lingula_comp153549_c1_seq2_1__S36A 63 Lotgi1|218879__S36A Echinoderm/hemichordate 15 89 NV16821-PA__S36A 97 FBpp0292523__S36A 71 NV18592-PA__S36A 71 Amel_4.5|GB51487-PA__S36A Chordates 40 NV11499-PA__S36A S36A4_MOUSE S36A4_HUMAN Vertebrates S36A3_HUMAN S36A3_MOUSE S36A1_HUMAN H-coupled AA transporter 97 S36A1_MOUSE 46 S36A2_MOUSE Filastereans S36A2_HUMAN (SLC36) Choanoflagellates 0.6

Supplemental Figure 15: Vesicular inhibitory amino acid transporter homologs across meta- zoans Tree of VIAAT (SLC32A1) proteins and related transpoters across all metazoan groups, generated with RAxML using the PROTGAMMALG model. Bootstrap values are 100 unless otherwise shown.

40 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Spongospora_subterranea_A0A0H5R9G2

20 Ostreococcus_lucimarinus_A4RY52_OSTLU 70 M2XB31_GALSU Red 38 Klebsormidium_flaccidum_A0A0U9I7C4_KLEFL

Oryza_brachyantha_J3L063_ORYBR

Beta_vulgaris_A0A0J8B722_BETVU

K4CUY8_SOLLC

36 F6HQ45_VITVI 49 Citrus_sinensis_A0A067H648_CITSI Plants 99 Eutrema_salsugineum_V4KNG8_EUTSA

PIEZO_ARATH

D.discoideum_Y2801_DICDI 92 Rozella_allomycis_EPZ31590.1

Monbr1_g1806.t2 99 Salpingoeca_rosetta_PTSG_01300T0 Choanoflagellates

20 Monbr1_g2005.t1 euplokamis_t_h30_28970_comp15884_c0_seq4 Ctenophores bathyctena_t_x0_34756_comp23030_c0_seq1_0

98 beroeabyssicola_t_h20_27141_comp17803_c0_seq1 55 Pbachei_3461157_partial 68

hormiphora_t_h10_11700-TR2190_c1_g3_i2 46 bolinopsis_t_h20_24908_comp15026_c0_seq11

ML018021-AUGUSTUS

Aqu2.1.38329_001

ephydatiamuelleri_12638-joined

twi_ss.19116.1-19102.1 oscarella_t_h60_60372_comp11964_c0_seq14 Sponges scict1.026728.1_scict1.029409.2_0

L.complicata_lctid5362

T.adherens_g4404.t1

Hoilungia_stringtie_3695.2_m.4488 99 Placozoans

resomia_t_x0_097703_comp79502_c0_seq1

hydra_sra.24978.1_2

NVE3870 AIPGENE4679-13595-4700_partial Cnidarians adi_v1.23438-05905-05906-05907_partial

Acropora_millepora_552

Porites_australiensis_8982 99 Montastraea_cavernosa_16922

Brafl_PIEZO_partial

O.dioica_2913001-5897001-5899001 57 94 F1NVW5_CHICK 39 PIEZ2_MOUSE

PIEZ2_HUMAN

E1BX07_CHICK 48 Vertebrates PIEZ1_HUMAN

PIEZ1_MOUSE

Spur_390358264-joined

obirnaseq.123544.1_0 42 97 98 C.gigas_EKC42879-EKC42880

Capca1_219762 Transcriptome 81 Protostomes L.anatina_comp156528_c0_seq1

D0VWN8_CAEEL

Partial protein 94 PIEZO_DROME

A0A087ZN43_APIME 99 Joined protein TC013391 0.4

Supplemental Figure 16: Phylogenetic tree of Piezo homologs across metazoans Protein tree generated with RAxML using the PROTCATWAG model and 100 bootstraps. Yellow circle indicates the sequence was compete when joined with other genes or partial sequences, red partial circle indicates the sequence is incomplete in the genome or transcriptome. A blue star indicates that the sequence derived from a transcriptome, so copy number cannot be determined with certainty. Bootstrap values are 100 unless otherwise shown.

41 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Monbr1_g1391.t1__scaffold_3 Srosetta_PTSG_00336T0 Choanoflagellates Hcal_TR17996_c0_g1_i1 beroeabyssicola_t_h20_12271_comp12645_c0_seq2 bolinopsis_t_h20_09329_comp8981_c0_seq1 velamen_t_h20_28621_comp16542_c0_seq1 77 ML17059a__SCN-like euplokamis_t_h20_12092_comp9658_c0_seq3 98 Hcal_TR16070_c2_g2_i1 Ctenophores bathocyroe_t_h20_22759_comp15829_c0_seq2 88 74 thalassocalyceinconstans_t_h10_09040_comp8843_c0_seq1 98 bolinopsis_t_h20_18283_comp13341_c0_seq1 velamen_t_h20_34229_comp17681_c0_seq1 64 ML358826a__SCN-like Hoilungia_braker1_g08315.t1 Triad1|54699_g3484.t1 85 Hoilungia_braker1_g10012.t1 Triad1_g3491.t2_Triad1-23340 Placozoans Hoilungia_braker1_g08313.t1 Triad1_g3486.t1 Hoilungia_braker1_g10010.t1 Triad1_g3490.t1 100 Hoilungia_braker1_g05943.t1 Hoilungia_braker1_g08314.t1 99 Triad1_g3487.t1 PFL3_pfl_40v0_9_20150316_1g3063.t1 91 Spurpuratus_XP_793384.3 FBpp0300666 81 Amel_4.5|GB47190-PA 92 TC008776 Protostome 99 Lingula_comp154129_c1_seq10 92 Capca1|134859 Lotgi1|118343 Na 2 Cgigas_EKC21550 V 99 Halocynthia_roretzi_BAA95896.1 49 Brafl1|75071 Brafl1|249620_bflornaseq.37403.1 bflornaseq.37394.1-37393.1-Brafl1-143759 bflornaseq.37392.1-37391.1 84 Halocynthia_roretzi_BAA04133.1 Pmarinus_GENSCAN00000002143 50 Pmarinus_GENSCAN00000032390 F6XEC0_XENTR__scn4a 43 SCN4A_MOUSE 86 SCN4A_HUMAN 60 K9J7R3_XENTR__scn5a SCN5A_HUMAN 99 SCN5A_MOUSE 99 SCNBA_HUMAN 98 SCNBA_MOUSE 99 SCNAA_HUMAN SCNAA_MOUSE F6XFJ2_XENTR__scn8a SCN8A_RAT SCN8A_HUMAN Vertebrate 87 K9J7Z6_XENTR__scn2a F6UXH2_XENTR__scn1a 81 79 81 F6TZ24_XENTR__scn3a NaV1 SCN3A_RAT 25 SCN3A_HUMAN 73 SCN7A_HUMAN 44 SCN9A_HUMAN SCN9A_MOUSE 34SCN2A_HUMAN SCN2A_RAT 92 SCN1A_HUMAN SCN1A_MOUSE SCNA_DROME__para TC004749__SCN 48 Amel_4.5|GB42728-PA 93 Protostome NV14617-PA__SCN Lingula_comp153551_c1_seq30 81 Cgigas_EKC22630__SCN NaV1 Lotgi1|107523 Capca1|210954 Helro1|89291 93 Helro1|119038 99 Helro1|109965 Helro1|64539 Rhopilema_esculentum_TR59849_c0_g1_i2 hydra_sra.25523.1 Cnidarian 99 Craspedacusta_sowerbyi_TR39962_c1_g1_i1 Corallium_rubrum_TR22337_c0_g1_i3 94 AIPGENE4043 Stylophora_pistillata_5656 NaV Group 2 Porites_australiensis_12936 45 81 Acropora_millepora_9921 Nemve1|171660_NVE7195 AIPGENE22631 Nemve1|88319_NVE6936 Anthozoan 22 Porites_australiensis_8690 Stylophora_pistillata_17883 Ctenophores Corallium_rubrum_TR32421_c1_g1_i1 NaV Group 67 AIPGENE9486 98 99 Placozoans NVE7348__SCN-like Stylophora_pistillata_13842 Cnidarians Acropora_millepora_904 98 Porites_australiensis_6111 Cnidarian Protostomes Alatina_alata_c37058_g1_i1 70 Rhopilema_esculentum_TR83407_c2_g1_i2 Echinoderm/hemichordate Cyanea_capillata_AAA75572.1 NaV Group 1 Craspedacusta_sowerbyi_TR78689_c5_g1_i1 Chordates 96 resomia_t_x0_058327_comp73969_c0_seq1 94 Hydractinia_symbiolongicarpus_c23789_g1_i2 Vertebrates 77 Polyorchis_penicillatus_AAC38974.1 93 hydra_sra.26839.1 Choanoflagellates hydra_sra.39059.1 0.6

Supplemental Figure 17: Phylogenetic tree of voltage-gated sodium channel alpha subunits across metazoans Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap values are 100 unless otherwise shown.

42 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

scict1.007366.1 lctid45949 scict1.016080.1 lctid3586 Sponges Aqu2.1.42586_001 Twilhelma_c34057_g1_i2_twi_ss.12332.4 59 Tedania_anhelens_TR33204_c2_g1_i1 64 Scopalina_sp_TR28310_c4_g1_i2 100 Srosetta_PTSG_09464T0 Monbr1_g2466.t1__scaffold_5 euplokamis_t_h30_14855_comp11501_c0_seq1 Hcal_TR25961_c0_g1_i1__CaC-like bathocyroe_t_h20_30616_comp16858_c2_seq1 100 ML190424a__CaC-like 99velamen_t_h20_35925_comp17943_c0_seq2 Ctenophores 55 bolinopsis_t_h20_17550_comp13069_c0_seq1 Triad1_g1763.t1__scaffold_2 Hoilungia_braker1_g05522.t1 92 CAC1E_HUMAN CAC1B_HUMAN 95 CAC1A_MOUSE N/P/Q-type Ca CAC1A_HUMAN V 99 bflornaseq.27973.1 Amel_GB18730-PA 99 NV11619-PA 32 TC011227 85 TC011226 CAC1A_DROME__cacophony 99 Helro1|73530 Cacophony Helro1|128993 69 Helro1|119050 Lingula_comp155476_c1_seq27 71 82 Cgigas_EKC27184 hydra_sra.3016.1 Acropora_millepora_1337 56 Porites_australiensis_47371 NVE1263__CaA-like 98 hydra_sra.19987.4 Rhopilema_esculentum_TR102482_c2_g1_i1 90 AIPGENE18550 NVE18768__CaA-like Porites_australiensis_44917 Acropora_millepora_1120 Alatina_alata_c56718_g1_i3 hydra_sra.37904.1 NVE4667__CaA-like AIPGENE13431 Acropora_millepora_894 Oscarella_carmela_comp39348_c0_seq21 Hoilungia_braker1_g06121.t1 Triad1_g1163.t2__scaffold_1 Lingula_comp156742_c0_seq2 56 Cgigas_EKC35362 98 Lotgi1|51275 TC004715 CAC1D_DROME__Ca-alpha1D bflornaseq.3496.1_bflornaseq.3497.1 94 CAC1S_HUMAN CAC1F_HUMAN CAC1D_HUMAN 96 CAC1C_HUMAN L-type Ca 99 V CAC1C_MOUSE hydra_sra.20483.3 Alatina_alata_c60357_g1_i1 Chironex_fleckeri_TR37575_c1_g1_i8__CAC1C 75 Chrysaora_fuscescens_TR20395_c0_g1_i2__CAC1C Rhopilema_esculentum_TR83197_c1_g2_i1__CAC1C Corallium_rubrum_TR89372_c0_g1_i1__CAC1C 95 NVE6254__CaC-like Stylophora_pistillata_AAD11470.1 Porites_australiensis_45764 60 Acropora_millepora_1095 Srosetta_PTSG_03773T0 Hoilungia_braker1_g01453.t1 Triad1_g2446.t1__scaffold_2 CAC1I_HUMAN 100 CAC1H_MOUSE T-type CaV 58 CAC1H_HUMAN CAC1G_HUMAN bflornaseq.16516.1 Demosponges 100 NV13530-PA__CAC1G-like FBpp0088415__CAC1G-like 96 82 Homoscleromorphs TC005355__CAC1G-like 98 Capca1|89566__CAC1G-like Calcarea 54 Helro1|66349__CAC1G-like 58 Helro1|170765__CAC1G-like 39 44 Helro1|148954__CAC1G-like 81 Ctenophores Lingula_comp156482_c1_seq2__CAC1G-like Placozoans 98 EKC37236__CAC1G-like Lymnaea_stagnalis_AAO83843.2 71 Cnidarians Lotgi1|220094__CAC1G-like Chironex_fleckeri_TR32013_c0_g1_i1__CAC1G Protostomes Corallium_rubrum_TR78644_c0_g1_i1__CAC1G 99 Porites_australiensis_46716__CAC1G Chordates AIPGENE17378__CAC1G-like 48 99 NVE5017__CaG-like Vertebrates Rhopilema_esculentum_TR92782_c0_g1_i2__CAC1G Acropora_millepora_3959_partial 0.7 Choanoflagellates NVE7616__CaG-like

Supplemental Figure 18: Phylogenetic tree of voltage-gated calcium channels across metazoans Protein tree generated with RAxML using the PROTGAMMALG model and 100 bootstraps. Bootstrap values are 100 unless otherwise shown.

43 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Spurpuratus_XP_003727478.1

obirnaseq.13988.1 Anthozoan Group 1

Hoilungia_braker1_g02370.t1 Protostomes Spurpuratus_NP_001119779.1

HVCN1_CIOIN

Limulus_polyphemus_XP_013780203.1

Triad1_g3120.t1 Lingula_comp143296_c0_seq1 Cgigas_EKC42142obirnaseq.60928.1

Stylophora_pistillata_1911 Callorhinchus_milii_XP_007900448.1

Acropora_digitifera_10161 Acropora_millepora_20758 Madracis_auretenra_9126 Fungia_scutaria_68297 Porites_australiensis_31663 Danio_rerio_NP_001002346.1 Montastraea_cavernosa_102151 Xenopus_tropicalis_NP_001011262.1 Astreopora_sp_21046 Lotgi1|214105

NVE9761 Bf_V2_32_g6646.t1 Corticium_candelabrum_TR78892_c4_g2_i1_partialscpid86080_partiallctid39095

Gorgonia_ventalina_29581 scict1.001824.1lctid82010 Vertebrate lctid73916 HVCN1_MOUSE Gallus_gallus_NP_001025834.1HVCN1_HUMAN Corallium_rubrum_TR23690_c0_g1_i2scict1.015543.1 Calcisponges HV Channel scict1.015451.2lctid79331 0.918 scict1.004256.2 0.251 lctid66549 Hoilungia_braker1_g02371.t1 0.878 0.109

0.894 0.851 Triad1_g3122.t2 1 0.988 0.737 0.949 0.886 Placozoans Hoilungia_braker1_g02369.t2 0.912 0.862 Triad1_g3119.t1

0.805 0.96 Hydrozoans Chironex_fleckeri_TR36069_c0_g1_i1 0.96 Plants Chlorella_variabilis_EFN53563.1 Rhopilema_esculentum_TR71317_c0_g2_i1hydra_sra.26490.1 0.679 0.925 Polysphondylium_pallidum_EFA75681.1 0.758 Sakowv30029403m Karlodinium_veneficum_AEQ59286.1

0.999 Sakowv30036085m 1 Coccolithus_braarudii_ADM25825.1 Phaeodactylum_tricornutum_XP_002180795.1 Thalassiosira_pseudonana_XP_002293360.1

1

bathyctena_t_h30_45826_comp149487_c0_seq1

euplokamis_t_h10_08856_comp7263_c0_seq1 AIPGENE7391 Anthopleura_elegantissima_65824 twi_ss.15839.1 thalassocalyceinconstans_t_h10_22914_comp14651_c9_seq1 Crella_elegans_TR40839_c1_g2_i1 Montastraea_cavernosa_114866

Madracis_auretenra_16803Porites_australiensis_15483Anthozoan Group 2 Tedania_anhelens_TR26390_c0_g1_i2 Fungia_scutaria_905

Crambe_crambe_TR20663_c0_g2_i1 ML024915a Stylophora_pistillata_2919 Scopalina_sp_TR32798_c0_g1_i4 Stylissa_carteri_TR109532_c1_g2_i2Stylissa_carteri_TR77178_c0_g1_i1 velamen_t_h20_12523_comp11005_c0_seq1Hcal_TR20584_c0_g1_i1

Scopalina_sp_TR12640_c1_g1_i2

X.testutinaria_TR69758_c2_g3_i2Crella_elegans_TR35316_c0_g1_i3 Mycale_phyllophila_TR97962_c0_g1_i1

bathocyroe_t_h20_10680_comp11581_c0_seq1 Petrosia_ficiformis_TR21872_c0_g1_i1 bolinopsis_t_h20_31421_comp23663_c0_seq1 Tedania_anhelens_TR18381_c0_g1_i2

Halisarca_dujardini_HADA01057378.1

Crambe_crambe_TR30718_c2_g1_i5 Demosponges Ctenophores

beroeabyssicola_t_h20_14301_comp13867_c1_seq1

Halisarca_dujardini_HADA01036606.1

0.3

Supplemental Figure 19: Phylogenetic tree of voltage-gated proton channels across metazoans Protein tree generated with FastTree.

44 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

970 2 Supplemental Tables

Supplemental Table 1: Summary statistics of the Tethya wilhelma genome. Holobiont genomic includes the sponge and all associated bacteria. Feature Type Count Assembly size (Mb) Sponge only 126.0 Total gaps (Mb) Sponge only 1.348 Estimated kmer coverage Sponge only 131x Estimated mapping coverage Sponge only 159.3x Number of contigs All 6,907 Number of contigs Sponge only 6,109 Number of contigs Bacterial 789 GC % Sponge only 39.98% Contig N50 (kb) All 70.7 Contig N50 (kb) Sponge only 73.4 Contig N50 (kb) Bacterial 48.2 Genome size (bp) Mitochondrion 19,754 Estimated mapping coverage Mitochondrion 669.6x GC % Mitochondrion 34.43% Paired-end reads Holobiont genomic 100bp 259,518,468 Total paired-end bases (Gb) Holobiont genomic 25.951 Reads aligning back to genome All contigs 214,103,768 Mate-pair reads Holobiont genomic 125bp 280,837,536 Total mate-pair bases (Gb) Holobiont genomic 35.104 Moleculo (TruSeq) long reads Holobiont genomic 125,150 Total Moleculo bases (Mb) Holobiont genomic 436.7 Paired-end RNA-seq reads dUTP Stranded 201,451,574 Paired-end RNA-seq bases (Gb) dUTP Stranded 25.181 Trinity De novo transcripts 127,012 RNA-seq mapping fraction All contigs 68.3% StringTie Genome guided transcripts 46,572

45 bioRxiv preprint doi: https://doi.org/10.1101/120998; this version posted March 28, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC 4.0 International license.

Supplemental Table 2: Summary of splice variation for the genome-guided transcriptome for T. wilhelma (Twi) and transcript set v2.0 for A. queenslandica (Aqu). Splice Type Twi transcripts Twi Aqu transcripts Aqu events/exons events/exons Cassette exons 4779 9089 3591 5535 Canonical splicing 5049 - 2602 - Skipped exons 3868 8329 1022 721 Alternative splice acceptor - 3747 - 638 Intron retention 3295 3565 3437 3400 Alternative splice donor - 3264 - 521 Alternative N-terminus 1964 - 1965 - Alternative C-terminus 1788 - 1968 - Intronic start 471 - 571 - Intronic end 285 - 592 - Non-canonical 73 - 135 - Single exon with variants 246 - 13 - No variants 12088 - 24027 - Single exon and no variants 15421 - 12400 -

Supplemental Table 3: Skipped exon and retained intron frame, for T. wilhelma (Twi) and A. queenslandica (Aqu). Skipped exons tend to have lengths as multiples of three. Feature Position 1 Position 2 Position 3 Twi Skipped exons 6904 5179 5331 Twi Retained introns 1202 1204 1159 Aqu Skipped exons 2671 1914 1972 Aqu Retained introns 1244 1092 1101

46