
bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Comparative genomics reveals evolutionary drivers of sessile life and left-right shell asymmetry in bivalves

1, 2 #, Fan Mao 1, 2 #, Xiao 1, 2 #, Haiyan 3 #, Zhiming 1, 2 #, Fei 4, Jun 1, 2, Lili Wang 3, Yuanyan Xiong 5, Mengqiu Chen 5, Yongbo 6, Yuewen Deng 7, Quan 8, Lvping Zhang 1, 2, Wenguang 1, 2, Xuming Li 3, Haitao Ma 1, 2, Yuehuan Zhang 1, 2, Xiyu 3, Min Liu 3, Hongkun Zheng 3 *, Nai-Kei Wong 1 *, Ziniu Yu 1, 2 *

1 CAS Key Laboratory of Tropical Marine Bio-resources and Ecology and Guangdong Provincial Key Laboratory of Applied Marine Biology, Innovation Academy of South China Sea Ecology and Environmental Engineering, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China;
2 Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou 511458, China;
3 Biomarker Technologies Corporation, Beijing 101301, China;
4 Key Laboratory of Experimental Marine Biology, Center for Ocean Mega-Science, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China;
5 State Key Laboratory of Biocontrol, College of Life Sciences, Sun Yat-sen University, Guangzhou 510275, China;
6 Key Laboratory of Aquatic Germplasm Resources, College of Biological and Environmental Sciences, Zhejiang Wanli University, Ningbo 315100, China;
7 College of Fisheries, Guangdong Ocean University, Zhanjiang 524088, China;
8 Key Laboratory of Applied Chemistry, College of Environmental and Chemical Engineering, Yanshan University, Qinhuangdao 066044, China.

# These authors contributed equally to this work.
* Corresponding authors
E-mail: [email protected] (Yu Z), [email protected] (Wong N), [email protected] (Zheng H)

The updated email and affiliation of Nai-Kei Wong: [email protected], Department of Pharmacology, Medical College, Shantou 515041, China

Running title: Yang Z et al. / Genomic drivers of bivalve sessility and shell asymmetry.
Total word counts (from “Introduction” to “Conclusions” or “Materials and methods”): 5770
Total figures: 4
Total tables: 0
Total references: 120
References from 2014: 31
Total supplementary figures: 13
Total supplementary tables: 15
Total supplementary files: 2


Abstract

Bivalves are species-rich mollusks with prominent protective roles in coastal ecosystems. Across these ancient lineages, colony-founding larvae anchor themselves either by byssus production or by cemented attachment. The latter mode of sessile life is strongly molded by left-right shell asymmetry during larval development of Ostreoida oysters such as Crassostrea hongkongensis. Here, we sequenced the genome of C. hongkongensis at high resolution and compared it to reference bivalve genomes to unveil genomic determinants driving cemented attachment and shell asymmetry. Importantly, loss of the homeobox gene antennapedia (Antp) and broad expansion of lineage-specific extracellular gene families are implicated in a shift from byssal to cemented attachment in bivalves. Evidence from comparative transcriptomics shows that the left-right asymmetrical C. hongkongensis plausibly diverged from the symmetrical Pinctada fucata in expression profiles marked by elevated activities of orthologous transcription factors and lineage-specific shell-related gene families including tyrosinases, which may cooperatively govern asymmetrical shell formation in Ostreoida oysters.

KEYWORDS: Comparative genomics, Ostreoida oysters, attachment, shell asymmetry, bivalves

Introduction

Bivalves belong to the ancient lineages of mollusks, comprising nearly 9,600 species that thrive in aquatic environments, with notable economic and ecological importance [1]. As bilaterian organisms, they rely nutritionally on filtering phytoplankton, and primarily follow a life cycle that transitions from free-swimming larvae to attached juveniles, culminating in sessile life [2,3]. Among filter-feeding bivalves, oysters of the superfamily Ostreoidea serve as crucial guardians of marine ecosystems by forming reefs that clean the water and sustain biodiversity [4,5]. Due to climate change and coastal degradation, however, bivalves face profound challenges from warming waters and ocean acidification, which destabilize habitats, raise infection risks and dampen the capacity of bivalves to acquire carbonate for shell formation [6-8].

To cope with diverse ecosystems, a variety of sessile strategies have emerged in bivalves during evolution, among which two modes of sessile life prevail. Characteristically, the majority of bivalves, including Mytilidae (mussel), Pectinidae (scallop), and Pteriidae (pearl oyster), secrete adhesive byssal threads to stabilize themselves against marine turbulence [9-13]. In contrast, Ostreoida oysters have evolved a highly sophisticated machinery of cemented attachment by producing organic-inorganic hybrid adhesive substances in place of byssus, which allows them to permanently fuse the left shell with rock surfaces or shells of other individuals in intertidal zones [14]. Compared with byssus, cemented attachment is superior in physical adhesion and mechanical tension, enabling oysters to efficiently create and thrive in large reef communities [2]. Developmentally, as a salient feature of their exoskeleton, shell formation processes in bivalves are strongly molded by their preferences for sessile life [15]. Quite distinctively, byssally attached bivalve species tend to possess a bilaterally symmetrical shell, whereas cement-attached oysters present a high degree of phenotypic variability and morphological asymmetry characteristic of their radically distinct left-right (L/R) shells [15]. Nevertheless, the molecular mechanisms driving these extraordinary innovations in bivalve evolution remain enigmatic, particularly in genomic contexts.

The Hong Kong oyster (Crassostrea hongkongensis, first described as Crassostrea rivularis by Gould, 1861) is an economically valuable aquacultural species endemic to the South China coastline [16]. As an ideal model for studying shell asymmetry, C. hongkongensis larvae follow a typical developmental cycle of cemented attachment and asymmetrical differentiation of the L/R shells. To elucidate the genetic basis underpinning the evolution of bivalve sessile life and asymmetry of shell formation, we sequenced and analyzed the complete genome of C. hongkongensis and performed comparative genomic analysis along with several other bivalve species, including two congeneric Ostreoida oysters, Crassostrea gigas and Crassostrea virginica [12,17-21]. In addition, we monitored transcriptomic changes of C. hongkongensis embryos during the critical window of larval attachment, and compared asymmetry-related gene expression patterns in the L/R mantles of adult C. hongkongensis and the byssus-producing pearl oyster (Pinctada fucata). Our comparative genomic data and associated functional assays reveal extensive molecular adaptations across the oyster genome that support the evolutionary switch from byssal to cemented attachment and divergence from symmetrical shell in Ostreoida oysters.

Results

Genome sequencing, annotation and Hi-C, phylogenomics and evolutionary rate

Genome sequencing and assembly are inherently challenging for many marine invertebrates such as mollusks, , and platyhelminths due to their remarkable genetic heterozygosity (or polymorphisms) [17,18,21,22]. Based on k-mer analysis, the genome size of a single wild-stock Hong Kong oyster (C. hongkongensis) individual was estimated to be 695 Mb with 1.2% heterozygosity (Figure S1), which is broadly comparable to that of C. gigas (1.3%) [17]. To circumvent limitations of short-read next-generation sequencing in assembling highly polymorphic genomes, we opted for PacBio sequencing in combination with Illumina sequencing as the dominant mode of genome sequencing in our study. We first generated 23.25 Gb of raw PacBio reads and 147.25 Gb of Illumina reads, equivalent to 31.9-fold and 201.8-fold genome coverage, respectively (Table S1 and S2). Following stepwise optimization of assembly algorithms, these reads were assembled into a 729.6 Mb genome with a contig N50 of 314.1 kb and a scaffold N50 of 500.4 kb, with the longest contig spanning 2.37 Mb (Table S3). The contig N50 of the oyster genome is at least one order of magnitude larger than those of published bivalve genomes (Table S4), demonstrating the superiority of long-read sequencing technologies in coping with high polymorphism in genome assembly. However, the assembled genome size turned out to be slightly larger than that estimated by k-mer analysis. Such discrepancy may reflect sequence preferences of Illumina reads. The high integrity and quality of the assembly were evidenced by a productive mapping of 97.57% of sequencing reads and a low single-nucleotide error rate (Table S5 and S6). Moreover, Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis confirmed a high degree of completeness (92.84%) for the assembled genome (Table S7), which is comparable to that of other published bivalve genomes (Table S4).
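The coverage and contiguity figures quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes that fold coverage was computed against the 729.6 Mb assembly (an inference from the reported numbers, not a statement from the methods), and the function names and the toy contig lengths are hypothetical.

```python
def fold_coverage(total_bases, genome_size):
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths):
    """Smallest length L such that sequences of length >= L
    together cover at least half of the total assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no lengths given")


# Read yields from the text; the 729.6 Mb denominator is an assumption.
assembly_size = 729.6e6
pacbio_depth = round(fold_coverage(23.25e9, assembly_size), 1)
illumina_depth = round(fold_coverage(147.25e9, assembly_size), 1)

# Toy contig lengths (hypothetical), only to show how N50 behaves.
toy_n50 = n50([80, 70, 50, 40, 30, 20])
```

Under this assumption the two depths round to the 31.9-fold and 201.8-fold values reported in the text. A real N50 would be computed from the contig or scaffold lengths of the assembly FASTA rather than from a toy list.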

To assemble the oyster genome to chromosomal level, we generated ~44.4 million valid Hi-C interaction pairs with over 50-fold coverage (Table S8). Using LACHESIS with the Hi-C data, 690.39 Mb of genome sequence was anchored into 10 pseudo-chromosomes, covering 94.66% of the assembled genome (Figure 1A, Figure S2 & Table S9). Of this, 648.56 Mb of genome sequence was reoriented and anchored into chromosomes, constituting 93.94% of the total anchored sequence (Table S9). Moreover, high consistency between the Hi-C based pseudo-chromosomes and the genetic map of one congeneric species, C. gigas, was confirmed (p = 0.978-0.996, Figure S3), indicating high reliability of the chromosomal genome assembly. Overall, by leveraging PacBio and Hi-C enhanced Illumina sequencing, a high-quality, chromosome-anchored genome was obtained, thus providing a robust framework for subsequent exploration of oyster biology and evolution of bivalves.

For gene annotation, we predicted 30,021 protein-coding genes in the genome by integrating results from ab initio prediction, homology-based searches with reference genomes and RNA-seq (Table S10), with an estimated BUSCO completeness of 91.09% (Table S11). Of these, more than 97.97% of the predicted genes (28,329 genes) were annotated in the public databases (Table S12). The gene number here resembles that in a closely related species, C. gigas (28,027) [17]. In addition, transposable elements (TEs) constitute 46.2% of the C. hongkongensis genome, among which the prevailing TE is the class II Helitron (12.4%, 90.4 Mb) (Table S13). Phylogenetic analysis showed that the three Ostreoida oyster species (C. hongkongensis, C. gigas, C. virginica) clustered together (Figure 1B), and that Ostreoida oyster speciation took root around 92.1 million years ago (Mya), in agreement with evidence from mitochondrial genomes [23]. Within bivalves, Ostreoida oysters are closest to the Pteriidae oyster Pinctada fucata, and their point of divergence was estimated to be 357.5 Mya (Figure 1B). These results corroborate the hypothesis that a common ancestor of primitive Ostreoida and Pteriidae oysters existed prior to the Permian-Triassic extinction event, whereas speciation of modern Ostreoida oysters began at the end of the Cretaceous-Paleogene extinction event [24-27]. Consistently, comparative genomic synteny shows high genomic collinearity among the three Ostreoida oyster genomes except for large intra-chromosomal inversions, but substantial inter-chromosomal translocations and rearrangements occur between chromosomes of Ostreoida oysters and Pinctada fucata (Figure S4), which is in agreement with their phylogenetic relationship and duration of divergence.

Homeobox gene cluster

Radical changes toward a sessile life require evolutionary innovations in anatomical organization. In contrast to byssus-producing bivalves [12,28], Ostreoida oysters do not possess a byssal gland or secrete byssus during their lifetime [29,30], though a vestigial foot transiently appears at the pediveliger stage and degenerates following attachment and metamorphosis (Figure 2B). Developmentally, the homeobox (Hox) genes are known for their crucial roles in regulating body-plan development and organogenetic transitions in metazoans [31-34]. In view of this, we compared the clustering of Hox genes in byssus-producing and byssus-null bivalve species. A salient feature in byssal bivalves including Pinctada fucata, Mizuhopecten yessoensis, Chlamys farreri, Mytilus galloprovincialis, Bathymodiolus platifrons, and Ruditapes philippinarum is an intact Hox and para-Hox gene cluster (Figure 2A and Figure S5). In contrast, a disrupted Hox gene cluster reportedly exists in the C. gigas oyster genome [17], probably in part due to fragmented genome assembly in C. gigas, whereas a coherent Hox gene cluster is configured linearly in one single locus in both C. hongkongensis and C. virginica. Intriguingly, one of the key Hox members, antennapedia (Antp), is lost in all three Ostreoida oysters (Figure 2A), thereby implicating the Antp gene as an essential driver of byssus formation. Sequence alignment reveals that the Antp gene possesses a conserved homeobox domain in bivalves (Figure S6).


As evidenced in expression profiles of three representative byssus-producing bivalve species, Antp and its orthologues are predominantly expressed in the byssal gland (Figure 2C). Due to unavailability of molecular tools like CRISPR/Cas9 or TALEN for manipulating bivalve genomes, genetic ablation of the Antp gene is not yet feasible in the pearl oyster for phenotypic appraisal of its function. However, histological evidence suggests that the byssal gland is one of the appendage organs capable of secreting thin extended byssal threads, which in their mature form are observable as byssus outside the organism (Figure 2D and Figure 2E). Given that regenerative ability varies among individuals, we assessed Antp function in this phenotypic trait. Remarkably, mRNA expression levels of Antp are significantly correlated with the number of regenerated byssus in the pearl oyster (n = 24, R² = 0.36, p = 0.0012; Figure 2F). Taken together, our evidence strongly implicates Antp as a transcriptional regulator central to byssal secretion in P. fucata. Further, the loss of the Antp gene seems to be associated with a physical loss of the byssal gland in oysters. From an evolutionary perspective, Antp seems to play a critical role in appendage diversification in arthropods, which has previously been evidenced by its involvement in leg formation in the crustacean Daphnia [35], and repression of abdominal limbs in the spider Achaearanea tepidariorum [36]. In addition, ectopic expression of the Hox transcription factor Antp reportedly induced expression of the silk protein sericin-1 as a biopolymer in the silkworm Bombyx mori [37,38]. Collectively, these findings support a conserved function of Antp in secretory appendages in two distinct lineages, mollusks and arthropods.
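For readers who wish to see how a coefficient of determination such as the one reported for Antp expression versus byssus number is obtained, the following minimal sketch computes Pearson's r, R², and the associated t statistic for a simple linear relationship. The data are hypothetical toy values, not measurements from this study, and the function names are illustrative; converting t into a p-value would additionally require a t-distribution function from a statistics library, which is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r != 0, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical toy data: relative Antp mRNA level and regenerated byssus count.
antp_expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
byssus_count = [2, 1, 4, 3, 6, 5]

r = pearson_r(antp_expression, byssus_count)
r_squared = r ** 2
t = t_statistic(r, len(antp_expression))
```

For a simple linear regression with one predictor, R² equals the square of Pearson's r, which is why a single correlation routine suffices here.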

Gene expansion and oyster attachment

In place of byssal attachment, Ostreoida oysters adopt an ingeniously cost-effective way of sessile life, namely, cemented attachment [29,30]. Such an adhesive mechanism is characterized by extraordinary mechanical strength and superior flexibility needed to resist powerful tidal scour and absorb surge energy [39]. Cemented attachment allows oysters to efficiently anchor and thrive in marine environments, and ultimately supports the genesis and health of oyster reefs. Nevertheless, the molecular mechanisms underlying oyster adhesive production have remained enigmatic. Taking into account that cemented attachment is an innovation unique to Ostreoida oysters, we first investigated which gene families are expanded as a common event in the three Ostreoida species. Our results show that in C. gigas, C. hongkongensis, and C. virginica, there are 58, 172, and 321 expanded species-specific gene families, respectively, which can be further reduced to 32 expanded core gene families in Ostreoida oyster genomes (Figure 3A and Figure S7 & Table S14).
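One plausible reading of the reduction from per-species expanded families to a shared core set is a set intersection across the three species. The sketch below illustrates that reading only; it is an assumption about the procedure, and the family identifiers and function name are hypothetical placeholders rather than entries from Table S14.

```python
def core_expanded_families(expanded_by_species):
    """Return the gene families expanded in every listed species."""
    family_sets = [set(families) for families in expanded_by_species.values()]
    if not family_sets:
        return set()
    return set.intersection(*family_sets)


# Hypothetical family identifiers, for illustration only.
expanded = {
    "C. gigas": {"FAM_A", "FAM_B", "FAM_C"},
    "C. hongkongensis": {"FAM_A", "FAM_B", "FAM_D", "FAM_E"},
    "C. virginica": {"FAM_A", "FAM_B", "FAM_E", "FAM_F"},
}

core = core_expanded_families(expanded)
```

With the real expansion calls as input, the size of the returned set would correspond to the number of core families shared by all three genomes.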

To elucidate how expansion of these core gene families facilitates cemented attachment, we determined the correlations between their expression levels and specific developmental stages (Figure 3B). Developmentally, attachment is an intricate secretion process involving a broad spectrum of chemical reactions and proteins, notably extracellular enzymes or matrices [40]. It is thus unsurprising to identify a small conductance calcium-activated potassium channel (SK channel) gene family and 9 extracellular gene families at work in this process, which show high correlations with the pediveliger and spat stages corresponding to larval initiation of attachment. SK channels are widely expressed calcium-activated potassium channels in neurons [41,42], with crucial roles in regulating dendritic excitability, synaptic transmission, and synaptic plasticity [43,44]. Increased expression of expanded SK channels may therefore aid free-swimming larvae in sensing external environments in search of an appropriate attachment site. On the other hand, the functions of the extracellular gene families are closely related to key processes of shell attachment, including matrix secretion (epidermal growth factor (EGF), EGF3, laminin EGF, Apec) and matrix modification (Cu-oxidase, Cu-oxidase2, Cu-oxidase3, and astacin), among others (Figure 3B). Indeed, many adhesive proteins contain specific protein-binding domains [45], such as EGF-like domains in the slug mucus proteins (e.g. Sm40 and Sm85) [46] and sea star footprint proteins (e.g. Sf1) [47], raising the possibility that EGF family expansion in C. hongkongensis is functionally linked to cemented attachment. Additionally, physico-chemical properties of many adhesive proteins arise in part from post-translational modifications, which ultimately support their adhesive functions [45]. Protein oxidation in marine bio-adhesives indeed contributes to enhanced crosslinking between shell disks and substrates during attachment [48,49]. A notable gene expansion in the copper oxidase family is likely to contribute to stabilization of extracellular matrices in the form of crosslinking between the oyster shell and external substrates. The copper-based enzyme lysyl oxidase is known to be essential for cross-linking and strengthening fibers in connective tissues via collagen oxidation [50]. Concomitantly, copper ion, as part of oxidative enzymes, is a mandatory cofactor for oxidase activity, which creates cross-linking sites from common amino acids to enhance cemented attachment [51-53]. In further transcriptomic analysis, we found evidence that the 9 extracellular gene families were starkly upregulated during the larvae-spat transformation stage of development (Figure S8), corroborating their functional importance in attachment formation.

L-DOPA induced attachment

During larvae-spat transformation, embryonic oysters execute an intrinsic program of developmental changes, in which cemented attachment is tightly coupled to metamorphosis [54]. In this context, we set out to distinguish molecular determinants of cemented attachment from those of metamorphosis by treating larvae at the veliger stage with two pharmacologic agents: L-3,4-dihydroxyphenylalanine (L-DOPA) and norepinephrine (NE). The former simultaneously promoted normal attachment and metamorphosis, whereas the latter induced metamorphosis but not attachment (Figure 3C and Figure S9) [54]. Based on this, gene expression induced by L-DOPA rather than NE was hypothesized to be a driver for the initiation of attachment in C. gigas. We accordingly scrutinized 24 transcriptomes following pharmacological challenges at two time points within the temporal span of oyster attachment. Our results show that the expression of 1225 genes was specifically altered by treatment with L-DOPA rather than NE (Figure 3C), consistent with the former's role as an attachment signal.
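The logic of isolating attachment-related genes, namely those altered by L-DOPA but not by NE, can be sketched as a set difference between two lists of differentially expressed genes. The example below is a minimal illustration under stated assumptions: the cutoffs (absolute log2 fold change of at least 1, adjusted p below 0.05), the gene names and the values are all hypothetical and are not taken from the study.

```python
def call_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Genes passing both cutoffs; results maps gene -> (log2FC, adjusted p)."""
    return {
        gene
        for gene, (lfc, padj) in results.items()
        if abs(lfc) >= lfc_cutoff and padj < padj_cutoff
    }


def treatment_specific(results_a, results_b, **cutoffs):
    """Genes called in treatment A but not in treatment B."""
    return call_degs(results_a, **cutoffs) - call_degs(results_b, **cutoffs)


# Hypothetical treatment-versus-control results.
ldopa_vs_control = {
    "gene1": (2.3, 0.001),
    "gene2": (1.5, 0.010),
    "gene3": (0.2, 0.800),
    "gene4": (-1.8, 0.020),
}
ne_vs_control = {
    "gene1": (0.1, 0.900),
    "gene2": (1.7, 0.005),
    "gene3": (0.3, 0.700),
    "gene4": (-0.4, 0.300),
}

ldopa_only = treatment_specific(ldopa_vs_control, ne_vs_control)
```

In this toy case gene2 responds to both agents and is therefore excluded, mirroring the idea that genes shared with NE are more likely to reflect metamorphosis than attachment.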

Remarkably, expression of several neurotransmitter receptors (including metabotropic glutamate receptor and neuropeptide Y receptor) was starkly increased, consistent with the assumption that neuromuscular coordination is mandatory for guiding embryos to settle in suitable niches and initiate attachment (Figure S10) [55,56]. Moreover, genes of metal ion channels or binding proteins were significantly enriched, with notable examples like organic cation transporter protein, transient receptor potential cation channel (ZIP12) and voltage-dependent calcium channel (Ca2+-ATPase), which is consistent with the well-documented stimulatory roles of selective cations in oyster larval settling [57]. Notably, potassium voltage-gated channel activity was proven to be vital for oyster larval attachment, since its inhibitor tetraethylammonium can effectively block this developmental process [58]. Typically, attachment initiates in oyster larvae with the aid of fibrous adhesive proteins and other bioorganic substances including mucopolysaccharides and phospholipids [2,59]. As a consequence, extensive extracellular matrix and adhesion proteins including collagen, cadherin, fibrocystin, and hemicentin would increase in response to L-DOPA stimulation, presumably paving the way for larval attachment [60].

To identify the crucial molecular determinants governing this process, we performed weighted gene co-expression network analysis (WGCNA) to construct a co-expression gene network functionally associated with L-DOPA induced attachment, in which 15 modules were identified (Figure S11). Among them, the MEpink module is the most correlated with L-DOPA induced attachment (p < 0.01) and contains 139 genes (topological overlap > 0.3). Intriguingly, within this module, the hub gene forming the most connections in the network was found to be zinc transporter ZIP12 (Figure 3D), which is a pivotal regulator of zinc flux. As a co-factor essential to a wide spectrum of proteins such as matrix metalloproteinases, zinc plays vital regulatory roles in enzymatic catalysis and macromolecular stability [61]. High abundance of zinc is also a salient feature in aragonite- or calcite-rich shells in certain mollusks [62]. Meanwhile, among the gene families that specifically expanded in Ostreoida oysters, astacin is a cell-secreted or plasma membrane-associated protease that possesses zinc binding activity and takes part in proteolytic processing of extracellular proteins [63]. Its expression was markedly elevated both during larvae-spat transformation and in the larval response to L-DOPA treatment (Figure S8g & S10d). As predicted, chelation of zinc potently retarded oyster larval attachment (Figure S12), providing additional hints that initial creation of matrix structures for cemented attachment requires zinc and associated protein activities. Accordingly, based on genomic results on extracellular gene family expansion and transcriptomic profiles for the attachment stage, we conceived a conceptual model to delineate the mechanistic determinants and processes at work in the cemented attachment strategy of oyster larvae (Figure 3E). We postulate that attachment formation results from an intricate coordination of at least three types of fundamental activities, namely: larval sensing of habitable surfaces, matrix/ion secretion, and matrix modification to mobilize adhesive processes.
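The notion of a hub gene within a co-expression module can be illustrated with a simplified connectivity calculation. The sketch below is not the WGCNA R package and omits the topological overlap step; it assumes an unsigned network in which adjacency is the absolute Pearson correlation raised to a soft-thresholding power (beta = 6 is used here as an assumed value), and it ranks genes by summed adjacency. Gene names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def connectivity(expr, beta=6):
    """Summed soft-thresholded adjacency |r|**beta of each gene to all others."""
    genes = sorted(expr)
    return {
        g: sum(abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g)
        for g in genes
    }


def hub_gene(expr, beta=6):
    """Gene with the highest connectivity in the module."""
    conn = connectivity(expr, beta)
    return max(conn, key=conn.get)


# Hypothetical module of three genes measured across six samples.
module_expr = {
    "gene_h": [1, 2, 3, 4, 5, 6],
    "gene_x": [1, 3, 2, 5, 4, 6],
    "gene_y": [2, 1, 4, 3, 6, 5],
}

conn = connectivity(module_expr)
hub = hub_gene(module_expr)
```

In this toy module gene_h correlates well with both other genes, while gene_x and gene_y correlate only weakly with each other, so gene_h emerges as the hub. Raising correlations to a power suppresses weak links, which is the design choice that makes hubs stand out.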

Asymmetry in left-right shell formation

Symmetry is an elegant guiding principle for the implementation of body plans [64]. Across the class Bivalvia, the majority of bivalves display a perfect or near-perfect conformity to bilaterally symmetrical shells [15,65]. In contrast, Ostreoida oysters appear unorthodox in adopting morphological asymmetry in their shell formation due to functional differentiation of the left-right (L/R) shells (Figure S13). The left shell is visibly much thicker and more convex than its right counterpart, which makes it apt for attaching to rocky surfaces or neighboring oysters within a reef community. On the other hand, the right shell is capable of physical displacement and hermetic closure to regulate water intake and ward off predation (Figure 4A). Moreover, structural variance in shell asymmetry is also amply reflected by a greater proportion of prismatic layer in the right shell (Figure S14), which is responsible for controlling initiation of calcite crystal formation and growth [66,67]. Although asymmetry of body forms has been traditionally stereotyped as a defect that may jeopardize survival of an organism [68], the example of Ostreoida oysters clearly defies this rule. We reason that such an intriguing differentiation of asymmetrical shells could confer unexpected benefits such as improved population fitness in an otherwise intrinsically harsh coastal environment. With the advent of the left shell and its versatile attachment machinery, oysters can easily economize resources or secure their foothold on rocks or peers' shells within an oyster reef via cemented attachment [69]. This strategy permits oysters to lower their thresholds for founding and expanding productive colonies in demanding physical habitats, literally through stacking of individuals at high densities, without sacrificing resistance to environmental challenges such as tidal turbulence.

To further elucidate the molecular basis of left-right asymmetry, comparative transcriptomics was carried out to quantify the gene expression profiles in L/R mantles of C. hongkongensis and the pearl oyster, the mantle being the key organ controlling shell formation [70,71]. As expected, 188 asymmetry-related differentially expressed genes (DEGs) between the L/R mantles were identified in C. hongkongensis, whereas only 53 asymmetry-related DEGs were found in the pearl oyster (Figure 4B), which reflects a radical genetic divergence underpinning shell asymmetry. Next, to test the hypothesis that lineage-specific divergence of orthologues contributes to symmetry breakage, 10,050 orthologues were paired between the two species (Table S15). Our results indicate that a few but crucial asymmetry-related orthologues are specifically expressed in C. hongkongensis (Figure 4C), including the homeobox genes paired-like homeodomain transcription factor 2 (Pitx2) and homeobox B4a (Hox-B4a), and regulatory factor X6 (RFX6). Notably, Pitx2 is a central regulator orchestrating the Nodal cascade, which is responsible not only for directing L/R axis formation in mammals [72], but also for shell coiling and L/R asymmetry in some mollusks such as the snail [73]. Another gene of interest is RFX6, recognized for its fundamental importance in guiding pancreatic islet development and insulin production in mammals [74]. While the insulin-related peptide gene is known to be a critical driver of oyster growth [75], this new evidence points to novel roles of RFX6-insulin signaling in maintaining shell asymmetry in oysters. As predicted, asymmetry-related expression of Pitx2 and RFX6 in L/R mantles was confirmed by real-time qPCR in three Ostreoida lineages with asymmetrical shells, whereas such gene expression patterns were absent in three symmetrical bivalves: pearl oyster, scallop and mussel (Figure 4D).

374 However, it should be noted that the majority of asymmetry-related genes in C.

375 hongkongensis have no orthologues in the pearl oyster. For example, tyrosinases constitute one of the

376 key gene families involved in steering shell formation and pigmentation by means of

377 oxidation and cross-linking of o-diphenols [76,77]. Phylogenetic analysis reveals that more

378 than half of the tyrosinase genes (55%) cluster in several lineage-restricted clades, suggesting

379 rapid and independent expansion of this gene family in bivalves (Figure 4E). Remarkably,

380 several high-abundance members of the tyrosinase family seem to be strongly associated with

381 L/R asymmetry and were expressed preferentially in the right mantles of C. hongkongensis,

382 whereas no obvious variance was noted between L/R mantles in the pearl oyster (Figure 4F).

383 Therefore, it seems logical to infer that rapid expansion and divergent expression of

384 the tyrosinase family contribute importantly to the emergence and neofunctionalization of

385 asymmetrical shell formation in Ostreoida lineages. Lastly, in determining when precisely

386 expression of these asymmetry-related genes begins in oyster embryogenesis, we found that

387 71.1% of these genes start expression at the spat stage (Figure S15), implying that a complete

388 asymmetrical pattern becomes established in the juveniles only after metamorphosis.

389

390 Conclusion

391 Ostreoida oysters have evolved remarkable innovations for streamlining their bodyplans,

392 which are enabled by novel cemented attachment and an allied gene machinery diverging

393 from L/R symmetry. These evolutionary breakthroughs poise oysters as highly successful

394 reef builders and ecological guardians integral to marine ecosystems spanning the globe. To

395 reveal the genomic changes driving these evolutionary innovations, we sequenced the

396 complete genome of C. hongkongensis, obtained active transcriptomes developmentally

397 critical to the attachment window, and made comparisons with other bivalve genomes. The

398 homeobox gene Antp of the Hox cluster, found to be lost in Ostreoida oysters, is evidently a

399 pivotal regulator of byssal secretion and expression of byssal proteins in P. fucata, and

400 potentially a critical gene governing the radical switch from byssal to cemented attachment.

401 Furthermore, extensive extracellular gene families were expanded in the Ostreoida lineages

402 specifically, presumably contributing to the operationalization of cemented attachment.

403 Ion-binding genes were significantly enriched in L-DOPA induced attachment in oyster, with

404 zinc-binding genes being a prominent network that coordinates extracellular matrix

405 modification and initiates adhesion. Moreover, Ostreoida divergence from shell symmetry is

406 probably under the joint control of a suite of transcriptionally identified asymmetry-related

407 DEGs of the L/R mantles, notably the transcription factors Pitx2 and RFX6, as well as

408 an expanded lineage-specific family of tyrosinases. Thus, on the basis of genomic determinants

409 and coordinated gene networks as revealed in this study, we have advanced a detailed picture

410 of how shell asymmetry is switched on and driven in bivalves such as Ostreoida oysters. In

411 order to provide insights into bivalve biology and disease in contexts of climate change or

412 biological conservation, further investigation on the attachment-governing genes may be

413 warranted.

414

415 Materials and methods

416 Illumina sequencing

417 Genomic DNA was extracted by using the DNeasy Blood & Tissue Kit (Cat. no. 69582, Qiagen,

418 Germany) from a single two-year-old individual of C. hongkongensis. Two types of paired-end

419 libraries (220 bp and 500 bp) and six types of long-insert mate-pair libraries (3 kb, 4 kb, 5 kb,

420 8 kb, 10 kb, and 15 kb) were constructed by using Illumina’s paired-end and mate-pair kits,

421 according to the manufacturer’s instructions. Libraries were sequenced on an Illumina HiSeq

422 2500 platform. For raw reads, sequencing adaptors were removed. Contaminated reads (such

423 as chloroplast, mitochondrial, bacterial, and viral sequences) were screened out by alignment

424 against the NCBI-NR database by using BWA v0.7.13 [78] with default parameters.

425 FastUniq v1.1 [79] was used to remove duplicated read pairs. Low-quality reads were filtered

426 out, according to the following criteria: 1) reads with ≥10% unidentified nucleotides (N); 2)

427 reads with >10 nucleotides aligned to an adapter, allowing ≤10% mismatches; 3) reads

428 with >50% bases with Phred quality <5.
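
For illustration only, the three low-quality criteria listed above can be written as a short Python sketch. The function name and its arguments are hypothetical, and the sketch assumes that the adapter overlap length and mismatch fraction have already been computed by an upstream adapter aligner; it is not the pipeline used in this study.

```python
def passes_quality_filters(seq, quals, adapter_overlap=0, adapter_mismatch_frac=0.0):
    """Return True if a read survives the three low-quality criteria.

    seq                   -- read sequence (string)
    quals                 -- per-base Phred scores (list of ints, same length as seq)
    adapter_overlap       -- nucleotides aligned to an adapter (assumed precomputed)
    adapter_mismatch_frac -- mismatch fraction of that adapter alignment
    """
    n = len(seq)
    if n == 0 or len(quals) != n:
        return False
    # Criterion 1: discard reads with >=10% unidentified nucleotides (N).
    if seq.upper().count("N") * 10 >= n:
        return False
    # Criterion 2: discard reads with >10 nt aligned to an adapter
    # when the alignment has <=10% mismatches.
    if adapter_overlap > 10 and adapter_mismatch_frac <= 0.10:
        return False
    # Criterion 3: discard reads in which >50% of bases have Phred quality <5.
    low_quality_bases = sum(1 for q in quals if q < 5)
    if low_quality_bases * 2 > n:
        return False
    return True
```

Integer comparisons are used for the 10% and 50% thresholds so that reads lying exactly on a boundary are classified without floating-point rounding.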

429 PacBio sequencing

430 Genomic DNA was sheared by a g-TUBE device (Cat. no. 520079, Covaris, MA) with 20 kb

431 settings. Sheared DNA was then purified and concentrated with AMPure XP beads (Cat. no.

432 10136224, Beckman Coulter, CA) and further used for single-molecule real-time (SMRT) bell

433 preparation according to the manufacturer’s protocol (Pacific Biosciences, CA), and 20 kb

434 template preparation by using BluePippin size selection (Sage Science). Size-selected and

435 isolated SMRT bell fractions were purified with AMPure XP beads. Finally, these purified

436 SMRT bells were used for primer and polymerase (P6) binding, according to the manufacturer’s

437 binding calculator (Pacific Biosciences). Single-molecule sequencing was performed on a

438 PacBio RS-II platform with C4 chemistry. Only PacBio subreads no shorter than 500 bp were

439 included for performing oyster genome assembly.

440 Genome size estimation

441 About 34 Gb (52×) of corrected Illumina reads from the 180 bp and 500 bp libraries were selected to

442 perform genome size estimation. The oyster genome size was estimated based on the formula:

443 Genome size = Kmer number/Peak depth.
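
The formula above can be sketched in Python as follows. The function names are hypothetical, the k-mer length is not stated in the text and is therefore left as a parameter, and the numbers in the usage note are invented solely to show the arithmetic.

```python
def total_kmers(read_lengths, k):
    """Count the k-mers contributed by a set of reads.

    A read of length L contributes L - k + 1 k-mers (none if L < k).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(max(length - k + 1, 0) for length in read_lengths)


def estimate_genome_size(kmer_number, peak_depth):
    """Genome size = Kmer number / Peak depth."""
    if peak_depth <= 0:
        raise ValueError("peak_depth must be positive")
    return kmer_number / peak_depth
```

With made-up inputs, `estimate_genome_size(30_000_000_000, 50)` returns 600000000.0, i.e. a 0.6 Gb genome for 3 × 10^10 k-mers and a peak depth of 50.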

444 De novo genome assembly of Illumina data

445 Clean Illumina reads were assembled de novo into longer contigs by using ALLPATHS-LG [80]

446 with default parameters. Adjacent contigs were linked into scaffolds by leveraging mate-pair

447 information with SSPACE v2.3 [81], while gaps were filled by using GapCloser v1.12 [81]

448 implemented in the SOAPdenovo2 package [82].

449 De novo genome assembly of PacBio data

450 Canu+LoRDEC+WTDBG

451 We used an error correction module of Canu v1.5 [83] to select longer subreads with the

452 settings ‘genomeSize = 3,500,000,000’ and ‘corOutCoverage = 80’, detect raw subreads

453 overlapping through a highly sensitive overlapper MHAP v2.12 (‘corMhapSensitivity =

454 low/normal/high’), and complete an error correction through a falcon_sense method

455 (‘correctedErrorRate = 0.025’). Subsequently, output subreads of Canu were further corrected

456 by LoRDEC v0.6 [84] with the parameters ‘-k 19 -s 3’. Based on these two rounds of

457 error-corrected subreads, we generated a draft assembly by using WTDBG 1.1.006

458 (https://github.com/ruanjue/wtdbg) with the command ‘wtdbg -i pbreads.fasta -t 64 -H -k 21

459 -S 1.02 -e 3 -o wtdbg’.

460 Hybrid genome assembly

461 Contigs produced by ALLPATHS-LG were optimized with the aid of contigs of PacBio

462 assembly by using quickmerge with the parameters ‘-hco 5.0 -c 1.5 -l 100000 -ml 5000’.

463 Optimized contigs were linked to scaffolds by leveraging Illumina mate-pair information by

464 using SSPACE and gaps were filled by using PBjelly v2.

465 Evaluation of oyster assembly

466 To appraise the genome quality, we first mapped Illumina reads to the oyster assembly by

467 using Burrows-Wheeler Alignment (BWA) tool. Next, completeness of genomes was verified

468 by mapping 248 highly conserved eukaryotic genes and 908 benchmarking universal

469 single-copy orthologues in metazoa to the genomes by using CEGMA v2.5 [85] and BUSCO

470 v3.0.2b [86], respectively.

471 Hi-C sequencing and assembly

472 Sequencing

473 According to the Hi-C procedure [87], nuclear DNA from muscles of oyster individuals was

474 cross-linked, then digested with a restriction enzyme, leaving pairs of distally located but

475 physically interacting DNA molecules attached to one another. The sticky ends of these

476 digested fragments were biotinylated and then ligated to each other to form chimeric

477 circles. Biotinylated circles, as chimeras of physically associated DNA molecules from the

478 original cross-linking, were enriched, sheared and sequenced [88]. After adaptor removal and

479 filtering out low-quality reads, Hi-C reads were aligned to our assembled genome to evaluate

480 the ratios of mapped reads, distribution of insert fragments, sequencing coverage and number

481 of valid interaction pairs. Uniquely mapped reads spanning two digested fragments that are

482 distally located but physically associated DNA molecules are defined as valid interaction

483 pairs.

484 Assembly

485 Scaffolds of PacBio+Illumina assembly were reduced to fragments with a length of 300 kb,

486 which were then re-assembled by using the LACHESIS software [88] based on Hi-C data.

487 Regions that failed to be restored to the original assembly or contained an average Hi-C data

488 coverage of less than 0.5% were considered assembly errors, and were broken into smaller

489 scaffolds. Consistency in assembly of Hi-C data based pseudo-chromosomes was assessed by

490 comparisons with a genetic map of Crassostrea gigas [89] by using the ALLMAPS

491 software [90].
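
The reduction of scaffolds to 300 kb fragments described above can be pictured with the following minimal Python sketch. The function name is hypothetical, and the handling of the final, shorter fragment is an assumption, since the text does not specify it.

```python
def fragment_scaffold(name, length, fragment_size=300_000):
    """Split one scaffold into consecutive, non-overlapping windows.

    Returns a list of (name, start, end) tuples in 0-based, half-open
    coordinates. The last window may be shorter than fragment_size
    (an assumption; the source does not say how remainders were treated).
    """
    if length < 0 or fragment_size <= 0:
        raise ValueError("length must be >= 0 and fragment_size must be > 0")
    return [
        (name, start, min(start + fragment_size, length))
        for start in range(0, length, fragment_size)
    ]
```

For a hypothetical 650 kb scaffold, the function yields two full 300 kb windows followed by one 50 kb remainder.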

492 Genome annotation

493 Repetitive sequence prediction

494 Repeat composition of the assemblies was estimated by building a repeat library employing

495 the de novo prediction programs LTR-FINDER [91], MITE-Hunter [92], RepeatScout [93]

496 and PILER-DF [94]. The database was classified by using PASTEClassifier [95] and then

497 combined with the Repbase database [96] to create a final repeat library. Repeat sequences in

498 the oyster genome were identified and classified by using the RepeatMasker program [97]. LTR

499 sequences were assigned to the same family if their 5’-LTR sequences

500 shared at least 80% identity over at least 80% of their lengths.
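
As a hedged illustration, the 80%/80% family criterion can be expressed as a small Python predicate. The function name and inputs are hypothetical, and the sketch assumes that "80% of their lengths" means the aligned region covers at least 80% of each of the two 5’-LTR sequences.

```python
def same_ltr_family(identity, aligned_length, length_a, length_b,
                    min_identity=0.80, min_coverage=0.80):
    """Apply an 80/80 rule to a pairwise alignment of two 5'-LTR sequences.

    identity       -- fraction of identical positions in the aligned region (0-1)
    aligned_length -- number of aligned bases
    length_a/b     -- full lengths of the two LTR sequences
    """
    if length_a <= 0 or length_b <= 0:
        raise ValueError("sequence lengths must be positive")
    coverage_a = aligned_length / length_a
    coverage_b = aligned_length / length_b
    # Both sequences must be covered, so the shorter coverage decides.
    return identity >= min_identity and min(coverage_a, coverage_b) >= min_coverage
```

Requiring the coverage threshold on both sequences prevents a short fragment from being grouped with a much longer element on the basis of a local match alone.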

501 Protein-coding gene prediction

502 Protein-coding genes were predicted based on de novo and protein homology approaches. The

503 algorithms Genscan [98], Augustus [99], GlimmerHMM [100], GeneID [101] and SNAP [102]

504 were used for de novo gene prediction. Alignment of homologous peptides from C. gigas, C.

505 virginica, Lottia gigantea, and Danio rerio to our assemblies was performed to identify

506 homologous genes with the aid of GeMoMa [103]. Consensus gene models were generated by

507 integrating the de novo predictions and protein alignments using EVidenceModeler (EVM)

508 [104].

509 Functional annotation of protein-coding genes

510 Annotation of the predicted genes was performed by blasting their sequences against a

511 number of nucleotide and protein sequence databases, including COG [105], KEGG [106],

512 NCBI-NR and Swiss-Prot [107], with an E-value cutoff of 1e-5. Gene ontology (GO) terms for each

513 gene were assigned by using Blast2GO [108] based on NCBI databases.

514 Evolution of oysters

515 Protein sequences of Haliotis discus hannai [109], Lottia gigantea (GCF_000327385.1),

516 Aplysia californica (GCF_000002075.1), Biomphalaria glabrata (GCF_000457365.1),

517 Crassostrea gigas (GCF_000297895.1), Crassostrea virginica (GCF_002022765.2), Pinctada

518 fucata (https://marinegenomics.oist.jp), Chlamys farreri (CfBase), Bathymodiolus platifrons

519 (GCA_002080005.1), Modiolus philippinarum (GCA_002080025.1), Octopus bimaculoides

520 (GCF_001194135.1), and Homo sapiens (GCF_000001405.26) were retrieved for analysis.

521 Proteomes of the aforementioned twelve species and that of C. hongkongensis, comprising a

522 total of 295,905 protein sequences, were clustered into 38,939 orthologue groups by using

523 OrthoMCL v3.1 [110] based on an all-to-all BLASTP strategy with an E-value of 1e-5 and by

524 using the Markov Cluster (MCL) algorithm with the default inflation parameter (1.5).

525 Based on clustering results, C. hongkongensis-specific gene families were determined and

526 annotated. To infer phylogenetic relationships, we extracted 387 single-copy gene families

527 from all thirteen species to perform multiple alignments of proteins for each family with

528 MUSCLE v3.8.31 [111]. All of the alignments were combined into one supergene to construct

529 a phylogenetic tree by using RAxML v8.2.12 [112] with 1000 rapid bootstrap analyses,

530 followed by a search of the best-scoring ML tree in a single run. Finally, divergence times

531 were estimated by using MCMCTree from the PAML package [113] in conjunction with a

532 molecular clock model. Several reference-calibrated time points obtained from TimeTree

533 database (http://timetree.org/) were used to date divergence times of interest. Expansion and

534 contraction of OrthoMCL-derived homologue clusters were determined by CAFE v2.1 [114]

535 calculations on the basis of changes in gene family size with respect to phylogeny and species

536 divergence time. In addition, we obtained domain-based expanded gene families of three

537 Crassostrea species, according to previous works by Albertin et al. (2015) [115].
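
The selection of single-copy gene families from the clustering output can be sketched as below. This is a minimal illustration under stated assumptions: the input layout (a dictionary mapping each family to its species-tagged members), the function name, and the toy identifiers are all hypothetical and do not reproduce the OrthoMCL output format.

```python
from collections import Counter


def single_copy_families(groups, species):
    """Return the families with exactly one gene in every listed species.

    groups  -- dict: family_id -> list of (species_name, gene_id) tuples
    species -- iterable of all species that must be represented
    """
    required = set(species)
    selected = []
    for family_id, members in groups.items():
        copies = Counter(sp for sp, _gene in members)
        # Keep the family only if every species is present exactly once
        # and no species outside the required set contributes a gene.
        if set(copies) == required and all(n == 1 for n in copies.values()):
            selected.append(family_id)
    return sorted(selected)
```

Families passing this filter are the ones whose per-family alignments could then be concatenated into a supergene, as described in the text.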

538 Syntenic analysis

539 All-to-all BLASTP analyses of protein sequences were performed between C. hongkongensis,

540 C. gigas, C. virginica, and P. fucata with an E-value threshold set at 1e-5. Syntenic regions

541 within and between species were identified by using MCScan based on BLASTP results. A

542 syntenic region was considered valid, if it contained a minimum of 10 collinear genes and a

543 maximum of 25 gaps (genes) between two adjacent collinear genes.
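
The validity rule for a syntenic region can be illustrated with the following Python sketch. The function name and input representation (gene-order indices of the collinear genes along one chromosome) are hypothetical, and for brevity the check is shown for a single genome; in practice the same test would be applied to the gene order in each of the two genomes being compared.

```python
def is_valid_syntenic_block(anchor_positions, min_genes=10, max_gap=25):
    """Check a candidate syntenic block against the two stated thresholds.

    anchor_positions -- gene-order indices (integers) of the collinear genes
    min_genes        -- minimum number of collinear genes in the block
    max_gap          -- maximum number of intervening genes allowed between
                        two adjacent collinear genes
    """
    if len(anchor_positions) < min_genes:
        return False
    ordered = sorted(anchor_positions)
    for left, right in zip(ordered, ordered[1:]):
        intervening = right - left - 1
        if intervening > max_gap:
            return False
    return True
```

For example, ten collinear genes spaced ten positions apart pass the check, whereas a block containing a stretch of 26 intervening genes between two adjacent anchors does not.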

544 Homeobox gene analysis

545 Structures of homeobox genes in oyster were determined by using the GeMoMa v1.4.2

546 software [116] with default parameters based on available homeobox gene models.

547 Predictions were handled by applying a GeMoMa annotation filter (GAF) with default

548 parameters except for evidence percentage filter (e = 0.1). These were then manually verified

549 to achieve a single high-confidence transcript prediction per locus. Exact annotations of each

550 homeobox gene were completed with the aid of phylogenetic relationships.

551 Transcriptomic analysis

552 Embryos at different developmental stages during oyster embryogenesis including zygote, 2-4

553 cells, blastula, morula, gastrula, trochophore, D-, veliger, pediveliger and spat were

554 collected for RNA isolation. Similarly, RNA extraction was done with various tissues

555 including hemocytes, muscles, gill, labial palp, hepatopancreas, gonads and mantles. To

556 compare asymmetry-related mantle gene expression in C. hongkongensis and P. fucata,

557 their L/R mantles were collected. For both left and right mantles, unilateral tissues from five

558 individuals were pooled as one sample, and each of the L/R mantle groups contained at least

559 three replicates. Total RNA was isolated by using the Trizol reagent (Cat. no. 15596026,

560 Invitrogen, CA), followed by treatment with RNase-free DNase I (Cat. no. M6101, Promega,

561 WI), according to the manufacturers’ instructions. RNA quality was then checked by using an

562 Agilent 2100 Bioanalyzer. Illumina RNA-Seq libraries were prepared and sequenced on a

563 HiSeq 2500 system with a PE150 strategy following the manufacturer’s instructions (Illumina,

564 CA). After raw reads were trimmed on the basis of quality scores with the trimming program

565 Btrim, clean reads were aligned to the oyster genome assembly by using TopHat v2.1.1 [117]

566 and then assembled by using Cufflinks v2.1.1 [118]. Differential expression of genes in the

567 various tissues was evaluated by using Cuffdiff [118].

568 WGCNA and co-expression network analysis

569 Weighted correlation network analysis (WGCNA) [119] was applied to construct a weighted

570 gene co-expression network of genes having a high correlation with cemented attachment.

571 The top 10,000 differential genes exhibiting transcriptional changes in response to L-DOPA

572 treatment were selected for WGCNA, wherein the modules showed high correlation with

573 cemented attachment. We estimated the weight for each pair of genes forming intersections

574 within these modules and analyzed differentially expressed genes relevant to cemented

575 attachment by using DESeq2. Cytoscape [120] was used to delineate the co-expression

576 network of significant gene pairs with weight >0.3.
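
The final filtering step, in which only gene pairs with weight >0.3 are retained for network drawing, can be sketched in Python as follows. The function names, the tuple layout, and the gene identifiers in the usage note are hypothetical; the tab-delimited table is simply one format that a network viewer can import, not necessarily the one used in this study.

```python
def filter_edges(edges, min_weight=0.3):
    """Keep gene pairs whose co-expression weight is strictly above min_weight.

    edges -- iterable of (gene_a, gene_b, weight) tuples
    """
    return [(a, b, w) for a, b, w in edges if w > min_weight]


def to_edge_table(edges):
    """Render edges as a tab-delimited table with a header row."""
    lines = ["source\ttarget\tweight"]
    lines += [f"{a}\t{b}\t{w}" for a, b, w in edges]
    return "\n".join(lines)
```

Given the toy edges `("g1", "g2", 0.31)`, `("g1", "g3", 0.3)` and `("g2", "g3", 0.05)`, only the first pair is retained, because the threshold is strict.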

577 Byssal regeneration

578 Functional relationships between antennapedia (Antp) mRNA expression levels and

579 phenotypic traits of byssal threads in adult pearl oysters (Pinctada fucata) were explored.

580 Briefly, 50-100 pearl oysters (2 years old) were collected and maintained in aerated

581 laboratory tanks. The byssal mass, comprising the byssal stem and existing old threads of pearl

582 oysters, was excised. Then, individual pearl oysters were placed in beakers (one oyster per

583 beaker) to allow identification of subsequent regrowth of nascent thread mass. Particular care

584 was taken in removing old threads and attachment discs from the shells. Preliminary

585 experiments indicated that removal of the threads did not affect subsequent thread formation.

586 Byssal thread formation was estimated as the number of threads/oyster observed 24 h later.

587 Subsequently, the corresponding byssal gland of each pearl oyster was collected for

588 RNA extraction by using TRIzol reagent, according to the manufacturer’s instructions.

589 Purified RNA samples were diluted to 1 µg/µL and pooled to perform cDNA synthesis by

590 utilizing the PrimeScript first strand cDNA synthesis kit (Cat. no. 6110A, Takara Bio, Japan),

591 following the manufacturer’s protocol. Real-time qPCR analysis was performed to determine

592 Antp mRNA expression with gene-specific primers (Table S16).

593 Pharmacological treatment

594 Chemical compounds were obtained from Sigma-Aldrich, unless otherwise specified.

595 Working solutions were freshly prepared in deionized (DI) water approximately 1 h before in

596 vivo experiments, which were conducted in large beakers to allow observation of oyster

597 attachment and metamorphosis. Groups of oyster larvae at the pediveliger stage were placed

598 in three beakers containing 50 mL sea water (at a density of 20 larvae/mL). There were three

599 groups in total: an unstimulated control, an L-3,4-dihydroxyphenylalanine (L-DOPA)

600 treatment and a norepinephrine (NE) treatment. Oyster larvae were challenged (6 h and 24 h)

601 with different concentrations of NE (10⁻⁴, 10⁻⁵, 10⁻⁶ M) or L-DOPA (10⁻⁵, 10⁻⁶, 10⁻⁷ M).

602 Previous studies have shown that this concentration range is sufficiently potent for inducing a

603 larval response [121,122].

604 In addition, oyster larvae were collected following various treatment durations (6 h and

605 24 h) for RNA-seq and transcriptomic analysis to determine any temporally driven differences

606 between the L-DOPA treatment group (10⁻⁵ M) and unstimulated control. By a similar design,

607 oyster larvae were exposed to NE (10⁻⁵ M) for 6 h and 24 h, and their transcriptomic profiles

608 were examined in relation to oyster metamorphosis.

609

610 Data availability

611 The C. hongkongensis genome studied in this Hong Kong oyster genome project has been

612 deposited at the NCBI under the BioProject number PRJNA592306 at

613 https://www.ncbi.nlm.nih.gov/bioproject/PRJNA592306. Hi-C data have been deposited as

614 SRR10583824 at https://www.ncbi.nlm.nih.gov/sra/SRR10583824. RNA-seq data of various

615 transcriptomes have been deposited as PRJNA588628 at

616 https://www.ncbi.nlm.nih.gov/bioproject/PRJNA588628.

617

618 CRediT author statement

619 Yang Zhang: Conceptualization, Methodology, Validation, Investigation, Data Curation,

620 Writing - Original Draft, Writing - Review & Editing, Visualization, Supervision, Project

621 administration, Funding acquisition. Fan Mao: Methodology, Validation, Investigation, Data

622 Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Funding

623 acquisition. Shu Xiao: Methodology, Validation, Resources, Funding acquisition. Haiyan Yu:

624 Methodology, Formal analysis, Investigation, Data Curation. Zhiming Xiang: Methodology,

625 Validation, Data Curation, Funding acquisition. Fei Xu: Formal analysis, Validation, Data

626 Curation. Jun Li: Validation, Resources. Lili Wang: Formal analysis. Yuanyan Xiong:

627 Formal analysis. Mengqiu Chen: Formal analysis. Yongbo Bao: Formal analysis. Yuewen

628 Deng: Validation. Quan Huo: Validation. Lvping Zhang: Validation. Wenguang Liu:

629 Validation. Xuming Li: Formal analysis. Haitao Ma: Formal analysis. Yuehuan Zhang:

630 Resources. Xiyu Mu: Formal analysis. Min Liu: Formal analysis. Hongkun Zheng:

631 Conceptualization, Formal analysis, Data Curation, Project administration. Nai-Kei Wong:

632 Writing - Review & Editing, Visualization. Ziniu Yu: Conceptualization, Writing - Review &

633 Editing, Visualization, Supervision, Project administration, Funding acquisition.

634

635 Competing interest

636 We declare that none of the authors have competing financial or non-financial

637 interests.

638

639 Acknowledgments

640 We are deeply grateful to our lab members and collaborators, who have provided us with able

641 assistance or valuable advice at all stages of this study. We acknowledge grant support from

642 Key Special Project for Introduced Talents Team of Southern Marine Science and

643 Engineering Guangdong Laboratory (Guangzhou) (GML2019ZD0407), Key Deployment

644 Project of Centre for Ocean Mega-Research of Science, Chinese Academy of Sciences

645 (COMS2019Q11), the National Science Foundation of China (No. 32073002, 31902404), the

646 China Agricultural Research System (No. CARS-49), the Science and Technology Program of

647 Guangzhou, China (No. 201804020073), Natural Science Foundation of Guangdong Province

648 (2020A1515011533), the Program of the Pearl River Young Talents of Science and Technology

649 in Guangzhou of China (201806010003), Institution of South China Sea Ecology and

650 Environmental Engineering, Chinese Academy of Sciences (ISEE2018PY01, ISEE2018PY03,

651 ISEE2018ZD01), and Science and Technology Planning Project of Guangdong Province,

652 China (2017B030314052, 201707010177).

653

654 ORCID

655 0000-0002-0789-4938 (Yang Zhang)

656 0000-0001-6899-5591 (Fan Mao)

657 0000-0002-7276-3213 (Shu Xiao)

658 0000-0001-9709-0417 (Haiyan Yu)

659 0000-0003-1428-2910 (Zhiming Xiang)

660 0000-0002-9426-3615 (Hongkun Zheng)

661 0000-0003-1303-3170 (Nai-Kei Wong)

662 0000-0002-1049-4345 (Ziniu Yu)

663

664 References

665 [1] Appeltans W, Ahyong ST, Anderson G, Angel MV, Artois T, Bailly N, et al. The magnitude of

666 global marine species diversity. Curr Biol 2012;22:2189-202.

667 [2] Tibabuzo Perdomo AM, Alberts EM, Taylor SD, Sherman DM, CP, Wilker JJ. Changes in

668 cementation of reef building oysters transitioning from larvae to adults. ACS Appl Mater Interfaces

669 2018;10:14248-53.

670 [3] Cranfield HJ. Observations on the behaviour of the pediveliger of Ostrea edulis during attachment

671 and cementing. Mar Biol 1973;22:203-9.

672 [4] Dame RF, Zingmark RG, Haskin E. Oyster reefs as processors of estuarine materials. J Exp Mar

673 Biol and Ecol 1984;83:239-47.

674 [5] Grabowski JH, Peterson CH. Restoring oyster reefs to recover ecosystem services. Theor Ecol

675 Series 2007;4:281-98.

676 [6] Parker LM, Ross PM, O'Connor WA, Portner HO, Scanes E, Wright JM. Predicting the response of

677 molluscs to the impact of ocean acidification. Biology (Basel) 2013;2:651-92.

678 [7] Kroeker KJ, Kordas RL, Crim R, Hendriks IE, Ramajo L, Singh GS, et al. Impacts of ocean

679 acidification on marine organisms: quantifying sensitivities and interaction with warming. Glob

680 Biol 2013;19:1884-96.

681 [8] Gazeau F, Parker LM, Comeau S, Gattuso JP, O'Connor WA, Martin S, et al. Impacts of ocean

682 acidification on marine shelled molluscs. Mar Biol 2013;160:2207-45.

683 [9] Pujol JP. Formation of the Byssus in the Common Mussel (Mytilus edulis L.). Nature

684 1967;214:204-5.

685 [10] Priemel T, Degtyar E, Dean MN, Harrington MJ. Rapid self-assembly of complex biomolecular

686 architectures during mussel byssus biofabrication. Nat Commun 2017;8:14539.

687 [11] Harrington MJ, Masic A, Holten-Andersen N, Waite JH, Fratzl P. Iron-clad fibers: a metal-based

688 biological strategy for hard flexible coatings. Science 2010;328:216-20.

689 [12] Li Y, Sun X, Hu X, Xun X, Zhang J, Guo X, et al. Scallop genome reveals molecular adaptations

690 to semi-sessile life and neurotoxins. Nat Commun 2017;8:1721.

691 [13] Li S, Liu C, A, L, Zhang R. Influencing mechanism of ocean acidification on byssus

692 performance in the pearl oyster Pinctada fucata. Environ Sci Technol 2017;51:7696-706.

693 [14] Burkett JR, Hight LM, Kenny P, Wilker JJ. Oysters produce an organic-inorganic adhesive for

694 intertidal reef construction. J Am Chem Soc 2010;132:12531-3.

695 [15] Stanley SM. Relation of Shell Form to Life Habits of the Bivalvia (Mollusca). Geological Society

696 of America Memoir; 1970, 125:296 p.

697 [16] Guo XM, Ford SE, Zhang FS. Molluscan aquaculture in China. J Shellfish Res 1999;18:19-31.

698 [17] Zhang G, X, Guo X, Li L, Luo R, Xu F, et al. The oyster genome reveals stress adaptation

699 and complexity of shell formation. Nature 2012;490:49-54.

700 [18] Wang S, Zhang JB, WQ, Li J, Xun XG, Sun Y, et al. Scallop genome provides insights into

701 evolution of bilaterian karyotype and development. Nat Ecol & Evol 2017;1(5):120.

702 [19] Du XD, Fan GY, Jiao Y, Zhang H, Guo XM, Huang RL, et al. The pearl oyster Pinctada fucata

703 martensii genome and multi-omic analyses provide insights into biomineralization. Gigascience 2017;

704 6(8):1-12.

705 [20] X, H, Huo Z, Ding J, Li Z, Yan L, et al. Clam genome sequence clarifies the molecular

706 basis of its benthic adaptation and extraordinary shell color diversity. iScience 2019;19:1225-37.

707 [21] Simakov O, Marletaz F, Cho SJ, Edsinger-Gonzales E, Havlak P, Hellsten U, et al. Insights into

708 bilaterian evolution from three spiralian genomes. Nature 2013;493:526-31.

709 [22] Sea Urchin Genome Sequencing C, Sodergren E, Weinstock GM, Davidson EH, Cameron RA,

710 Gibbs RA, et al. The genome of the sea urchin Strongylocentrotus purpuratus. Science

711 2006;314:941-52.

712 [23] Ren J, Liu X, Jiang F, Guo X, Liu B. Unusual conservation of mitochondrial gene order in

713 Crassostrea oysters: evidence for recent speciation in Asia. BMC Evol Biol 2010;10:394.

714 [24] Barnosky AD, Matzke N, Tomiya S, Wogan GO, Swartz B, Quental TB, et al. Has the earth's sixth

715 mass extinction already arrived? Nature 2011;471:51-7.

716 [25] Pimm SL, Jenkins CN, Abell R, Brooks TM, Gittleman JL, Joppa LN, et al. The biodiversity of

717 species and their rates of extinction, distribution, and protection. Science 2014;344:1246752.

718 [26] Schulte P, Alegret L, Arenillas I, Arz JA, Barton PJ, Bown PR, et al. The chicxulub asteroid impact

719 and mass extinction at the Cretaceous-Paleogene boundary. Science 2010;327:1214-8.

720 [27] Baumiller TK, Salamon MA, Gorzelak P, Mooi R, Messing CG, Gahn FJ. Post-Paleozoic crinoid

721 radiation in response to benthic predation preceded the Mesozoic marine revolution. P Natl Acad Sci

722 USA 2010;107:5893-6.

723 [28] Sigurdsson JB, Titman CW, Davies PA. The dispersal of young post-larval bivalve molluscs by

724 byssus threads. Nature 1976;262:386-7.

725 [29] Hopkins AE. Attachment of larvae of the Olympia oyster, Ostrea lurida, to plane surfaces.

726 Ecology 1935;16:82-7.

727 [30] Nelson TC. The attachment of oyster Larvae. Biol Bull 1924;46:143-51.

728 [31] Garcia-Fernandez J. The genesis and evolution of homeobox gene clusters. Nat Rev Genet

729 2005;6:881-92.

730 [32] Lemons D, McGinnis W. Genomic evolution of Hox gene clusters. Science 2006;313:1918-22.

731 [33] Biscotti MA, Canapa A, Forconi M, Barucca M. Hox and ParaHox genes: a review on molluscs.

732 Genesis 2014;52:935-45.

733 [34] Frobius AC, Funch P. Rotiferan Hox genes give new insights into the evolution of metazoan

734 bodyplans. Nat Commun 2017;8:9.

735 [35] Shiga Y, Yasumoto R, Yamagata H, Hayashi S. Evolving role of Antennapedia protein in

736 limb patterning. Development 2002;129:3555-61.

737 [36] Khadjeh S, Turetzek N, Pechmann M, Schwager EE, Wimmer EA, Damen WGM, et al. Divergent

738 role of the Hox gene Antennapedia in is responsible for the convergent evolution of abdominal

739 limb repression. P Natl Acad Sci USA 2012;109:4921-6.

740 [37] Kimoto M, Tsubota T, Uchino K, Sezutsu H, Takiya S. Hox transcription factor Antp regulates

741 sericin-1 gene expression in the terminal differentiated silk gland of Bombyx mori. Dev Biol

742 2014;386:64-71.

743 [38] Li JY, Ye LP, JQ, Song J, ZY, Yun KC, et al. Comparative proteomic analysis of the

744 silkworm middle silk gland reveals the importance of ribosome biogenesis in silk protein production. J

745 Proteomics 2015;126:109-20.

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

746 [39] Metzler RA, Rist R, Alberts E, Kenny P, Wilker JJ. Composition and Structure of oyster adhesive 747 reveals heterogeneous materials properties in a biological composite. Adv Funct Mater 748 2016;26:6814-21. 749 [40] Guerette PA, Hoon S, Seow Y, Raida M, Masic A, Wong FT, et al. Accelerating the design of 750 biomimetic materials by integrating RNA-seq with proteomics and materials science. Nat Biotechnol 751 2013;31:908-15. 752 [41] Kohler M, Hirschberg B, Bond CT, Kinzie JM, Marrion NV, Maylie J, et al. Small-conductance, 753 calcium-activated potassium channels from mammalian brain. Science 1996;273:1709-14. 754 [42] Hirschberg B, Maylie J, Adelman JP, Marrion NV. Gating properties of single SK channels in 755 hippocampal CA1 pyramidal neurons. Biophys J 1999;77:1905-13. 756 [43] Faber ES, Sah P. Functions of SK channels in central neurons. Clin Exp Pharmacol Physiol 757 2007;34:1077-83. 758 [44] H, Shepard PD. SK Ca2+-activated K+ channel ligands alter the firing pattern of 759 dopamine-containing neurons in vivo. Neuroscience 2006;140:623-33. 760 [45] Hennebert E, Maldonado B, Ladurner P, Flammang P, Santos R. Experimental strategies for the 761 identification and characterization of adhesive proteins in : a review. Interface Focus 762 2015;5:20140064. 763 [46] Li D, Graham LD. Epidermal secretions of terrestrial flatworms and slugs: Lehmannia valentiana 764 mucus contains matrilin-like proteins. Comp Biochem Physiol B Biochem Mol Biol 2007;148:231-44. 765 [47] Hennebert E, Wattiez R, Demeuldre M, Ladurner P, Hwang DS, Waite JH, et al. Sea star tenacity 766 mediated by a protein that fragments, then aggregates. P Natl Acad Sci USA 2014;111:6317-22. 767 [48] Papov VV, Diamond TV, Biemann K, Waite JH. Hydroxyarginine-containing polyphenolic 768 proteins in the adhesive plaques of the marine mussel Mytilus edulis. J Biol Chem 1995;270:20183-92. 769 [49] Lee BP, Messersmith PB, Israelachvili JN, Waite JH. Mussel-Inspired Adhesives and Coatings. 
770 Annu Rev Mater Res 2011;41:99-132. 771 [50] Rucker RB, Kosonen T, Clegg MS, Mitchell AE, Rucker BR, Uriu-Hare JY, et al. Copper, lysyl 772 oxidase, and extracellular matrix protein cross-linking. Am J Clin Nutr 1998;67:996S-1002S. 773 [51] Walker G. A study of the cement apparatus of the cypris larva of the barnacle Balanus balanoides. 774 Mar Biol 1971;9:205-12. 775 [52] Senkbeil T, Mohamed T, Simon R, Batchelor D, Di Fino A, Aldred N, et al. In vivo and in situ 776 synchrotron radiation-based mu-XRF reveals elemental distributions during the early attachment phase 777 of barnacle larvae and juvenile barnacles. Anal Bioanal Chem 2016;408:1487-96. 778 [53] Patrick F. Biological and Biomimetic Adhesives: Challenges and Opportunities. In: Smith 779 AM editor. Multiple metal-based cross-links: protein oxidation and metal coordination in a 780 biological glue. Cambridge : Royal Society of Chemistry; 2013; p. 3-15. 781 [54] Coon SL, Fitt WK, Bonar DB. Competence and delay of metamorphosis in the Pacific oyster 782 Crassostrea gigas. Mar Biol 1990;106:379-87. 783 [55] Bonar DB, Coon SL, Walch M, Weiner RM, Fitt W. Control of oyster settlement and 784 metamorphosis by endogenous and exogenous chemical cues. B Mar Sci 1990;46:484-98. 785 [56] Morse DE. Neurotransmitter-mimetic inducers of larval settlement and metamorphosis. B Mar Sci 786 1985;37:697-706. 787 [57] Nell JA, Holliday JE. Effects of potassium and copper on the settling rate of sydney rock oyster 788 (Saccostrea commercialis) Larvae. Aquaculture 1986;58:263-7. 789 [58] Wang J, CL, Xu CL, Yu WC, Li Z, Li YC, et al. Voltage-gated potassium ion channel may play

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

790 a major role in the settlement of Pacific oyster (Crassostrea gigas) larvae. Aquaculture 791 2015;442:48-50. 792 [59] Alberts EM, Taylor SD, Edwards SL, Sherman DM, Huang CP, Kenny P, et al. Structural and 793 compositional characterization of the adhesive produced by reef building oysters. ACS Appl Mater 794 Interfaces 2015;7:8533-8. 795 [60] Foulon V, Boudry P, Artigaud S, Guerard F, Hellio C. In Silico snalysis of Pacific oyster 796 (Crassostrea gigas) transcriptome over developmental stages reveals candidate genes for larval 797 settlement. Int J Mol Sci 2019;20(1):197. 798 [61] Zhang T, Liu J, Fellner M, Zhang C, Sui D, Hu J. Crystal structures of a ZIP zinc transporter 799 reveal a binuclear metal center in the transport pathway. Sci Adv 2017; 3(8):e1700344. 800 [62] Du Y, F, Zhu L. Biosorption of divalent Pb, Cd and Zn on aragonite and calcite mollusk 801 shells. Environ Pollut 2011;159:1763-8. 802 [63] Bond JS, Beynon RJ. The astacin family of metalloendopeptidases. Protein Sci 1995;4:1247-61. 803 [64] Sadeghi H, Allard P, Prince F, Labelle H. Symmetry and limb dominance in able-bodied gait: a 804 review. Gait Posture 2000;12:34-45. 805 [65] Weiss IM, Schonitzer V. The distribution of chitin in larval shells of the bivalve mollusk Mytilus 806 galloprovincialis. J Struct Biol 2006;153:264-77. 807 [66] Marin F, Le Roy N, Marie B. The formation and mineralization of mollusk shell. Front Biosci 808 (Schol Ed) 2012;4:1099-125. 809 [67] Marin F, Luquet G, Marie B, Medakovic D. Molluscan shell proteins: primary structure, origin, 810 and evolution. Curr Top Dev Biol 2008;80:209-76. 811 [68] Splitt MP, Burn J, Goodship J. Defects in the determination of left-right asymmetry. J Med Genet 812 1996;33:498-503. 813 [69] Savazzi E. Adaptational strategies of bivalves living as infaunal secondary soft bottom dwellers. 814 Neus Jahrb Geol P-A 1982;164:229-44. 815 [70] Wilbur KM, Saleuddin ASM. Shell Formation In: Saleuddin ASM, Wilbur KM, editors. The 816 mollusca. 
London New York: Academic Press; 1983, p. 235-87. 817 [71] Joubert C, Piquemal D, Marie B, Manchon L, Pierrat F, Zanella-Cleon I, et al. Transcriptome and 818 proteome analysis of Pinctada margaritifera calcifying mantle and shell: focus on biomineralization. 819 BMC Genomics 2010;11:613. 820 [72] Yoshioka H, Meno C, Koshiba K, Sugihara M, Itoh H, Ishimaru Y, et al. Pitx2, a bicoid-type 821 homeobox gene, is involved in a lefty-signaling pathway in determination of left-right asymmetry. Cell 822 1998;94:299-305. 823 [73] Grande C, Patel NH. Nodal signalling is involved in left-right asymmetry in snails. Nature 824 2009;457:1007-11. 825 [74] Smith SB, HQ, Taleb N, Kishimoto NY, Scheel DW, Y, et al. Rfx6 directs islet formation 826 and insulin production in mice and humans. Nature 2010;463:775-80. 827 [75] Hamano K, Awaji M, Usuki H. cDNA structure of an insulin-related peptide in the Pacific oyster 828 and seasonal changes in the gene expression. J Endocrinol 2005;187:55-67. 829 [76] Nagai K, Yano M, Morimoto K, Miyamoto H. Tyrosinase localization in mollusc shells. Comp 830 Biochem Physiol B Biochem Mol Biol 2007;146:207-14. 831 [77] Aguilera F, McDougall C, Degnan BM. Evolution of the tyrosinase gene family in bivalve 832 molluscs: independent expansion of the mantle gene repertoire. Acta Biomater 2014;10:3855-65. 833 [78] Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform.

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

834 Bioinformatics 2009;25:1754-60. 835 [79] Xu H, Luo X, Qian J, X, Song J, Qian G, et al. FastUniq: a fast de novo duplicates removal 836 tool for paired short reads. PLoS One 2012;7:e52249. 837 [80] Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, et al. High-quality draft 838 assemblies of mammalian genomes from massively parallel sequence data. P Natl Acad Sci USA 839 2011;108:1513-8. 840 [81] Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding pre-assembled contigs using 841 SSPACE. Bioinformatics 2011;27:578-9. 842 [82] Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. Erratum: SOAPdenovo2: an empirically 843 improved memory-efficient short-read de novo assembler. Gigascience 2015;4:30. 844 [83] Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and 845 accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 846 2017;27:722-36. 847 [84] Salmela L, Rivals E. LoRDEC: accurate and efficient long read error correction. Bioinformatics 848 2014;30:3506-14. 849 [85] Parra G, Bradnam K, Korf I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic 850 genomes. Bioinformatics 2007;23:1061-7. 851 [86] Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing 852 genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 853 2015;31:3210-2. 854 [87] Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, et al. 855 Comprehensive mapping of long-range interactions reveals folding principles of the human genome. 856 Science 2009;326:289-93. 857 [88] Burton JN, Adey A, Patwardhan RP, Qiu R, Kitzman JO, Shendure J. Chromosome-scale 858 scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol 859 2013;31:1119-25. 860 [89] Li C, Wang J, Song K, J, Xu F, Li L, et al. 
Construction of a high-density genetic map and 861 fine QTL mapping for growth and nutritional traits of Crassostrea gigas. BMC Genomics 2018;19. 862 [90] Tang H, Zhang X, Miao C, Zhang J, Ming R, Schnable JC, et al. ALLMAPS: robust scaffold 863 ordering based on multiple maps. Genome Biol 2015;16. 864 [91] Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR 865 retrotransposons. Nucleic Acids Res 2007;35:W265-8. 866 [92] Han Y, Wessler SR. MITE-Hunter: a program for discovering miniature inverted-repeat 867 transposable elements from genomic sequences. Nucleic Acids Res 2010;38:e199. 868 [93] Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. 869 Bioinformatics 2005;21 Suppl 1:i351-8. 870 [94] Edgar RC, Myers EW. PILER: identification and classification of genomic repeats. Bioinformatics 871 2005;21 Suppl 1:i152-8. 872 [95] Wicker T, Sabot F, -Van A, Bennetzen JL, Capy P, Chalhoub B, et al. A unified classification 873 system for eukaryotic transposable elements. Nat Rev Genet 2007;8:973-82. 874 [96] Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic 875 genomes. Mob DNA 2015;6:11. 876 [97] Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic 877 sequences. Curr Protoc Bioinformatics 2009;Chapter 4:Unit 4 10.

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

878 [98] Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol 879 1997;268:78-94. 880 [99] Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. 881 Bioinformatics 2003;19 Suppl 2:ii215-25. 882 [100] Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open source ab initio 883 eukaryotic gene-finders. Bioinformatics 2004;20:2878-9. 884 [101] Blanco E, Parra G, Guigo R. Using geneid to identify genes. Curr Protoc Bioinformatics 885 2007;Chapter 4:Unit 4 3. 886 [102] Korf I. Gene finding in novel genomes. BMC Bioinformatics 2004;5:59. 887 [103] Keilwagen J, Wenk M, Erickson JL, Schattat MH, Grau J, Hartung F. Using intron position 888 conservation for homology-based gene prediction. Nucleic Acids Res 2016;44:e89. 889 [104] Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, et al. Automated eukaryotic gene 890 structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. 891 Genome Biol 2008;9:R7. 892 [105] Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG 893 database: an updated version includes eukaryotes. BMC Bioinformatics 2003;4:41. 894 [106] Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 895 2000;28:27-30. 896 [107] Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, et al. The 897 SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 898 2003;31:365-70. 899 [108] Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool 900 for annotation, visualization and analysis in functional genomics research. Bioinformatics 901 2005;21:3674-6. 902 [109] Nam BH, Kwak W, Kim YO, Kim DG, Kong HJ, Kim WJ, et al. Genome sequence of pacific 903 abalone (Haliotis discus hannai): the first draft genome in family Haliotidae. Gigascience 2017;6:1-8. 904 [110] Li L, Stoeckert CJ, Jr., Roos DS. 
OrthoMCL: identification of ortholog groups for eukaryotic 905 genomes. Genome Res 2003;13:2178-89. 906 [111] Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. 907 Nucleic Acids Res 2004;32:1792-7. 908 [112] Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large 909 phylogenies. Bioinformatics 2014;30:1312-3. 910 [113] Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 911 2007;24:1586-91. 912 [114] De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool for the study of gene 913 family evolution. Bioinformatics 2006;22:1269-71. 914 [115] Albertin CB, Simakov O, Mitros T, Wang ZY, Pungor JR, Edsinger-Gonzales E, et al. The 915 octopus genome and the evolution of cephalopod neural and morphological novelties. Nature 916 2015;524:220-4. 917 [116] Keilwagen J, Hartung F, Grau J. GeMoMa: Homology-Based Gene Prediction Utilizing Intron 918 Position Conservation and RNA-seq Data. Methods Mol Biol 2019;1962:161-77. 919 [117] Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. 920 Bioinformatics 2009;25:1105-11. 921 [118] Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and

bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

922 transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 923 2012;7:562-78. 924 [119] Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. 925 BMC Bioinformatics 2008;9:559. 926 [120] Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software 927 environment for integrated models of biomolecular interaction networks. Genome Res 928 2003;13:2498-504. 929 [121] Coon SL, Bonar DB, Weiner RM. Chemical Production of Cultchless Oyster Spat Using 930 Epinephrine and Norepinephrine. Aquaculture 1986;58:255-62. 931 [122] Coon SL, Bonar DB, Weiner RM. Induction of settlement and metamorphosis of the Pacific 932 oyster, Crassostrea gigas (Thunberg), by L-Dopa and catecholamines. J Exp Mar Biol and Ecol 933 1985;94:211-21. 934

Figure legends

Figure 1 The genome landscape and phylogenetic analysis of the oyster Crassostrea hongkongensis.

A. Circos plot highlighting genome characteristics across 10 chromosomes on a megabase (Mb) scale. GC content, global heterozygosity, gene density, and repeat coverage are shown from the outer to the inner circle, computed in non-overlapping 1 Mb windows. B. Gene family expansion/contraction and divergence times across 12 representative mollusk species. A total of 87 gene families are expanded in the Hong Kong oyster, C. hongkongensis. The human genome was set as the outgroup. The three Ostreoida oyster species (Crassostrea hongkongensis, Crassostrea gigas, and Crassostrea virginica) cluster together. Gene family expansion and contraction are indicated by plus and minus signs, respectively.
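The windowed summary used for the tracks in panel A can be illustrated with a minimal sketch. It assumes a chromosome sequence held as a plain string and computes GC content only; the function name `gc_by_window` and the toy sequence are hypothetical and are not taken from the authors' pipeline.

```python
def gc_by_window(seq: str, window: int = 1_000_000) -> list[float]:
    """Return the GC fraction of each non-overlapping window of `seq`."""
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        # Count only unambiguous bases so runs of N do not dilute the estimate.
        acgt = sum(chunk.count(base) for base in "ACGT")
        gc = chunk.count("G") + chunk.count("C")
        fractions.append(gc / acgt if acgt else float("nan"))
    return fractions


# Toy example with a 4-bp window in place of 1 Mb.
print(gc_by_window("GGCCATATGCNN", window=4))  # -> [1.0, 0.0, 1.0]
```

Heterozygosity, gene density, and repeat coverage could be binned in the same way by replacing the per-window statistic.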

Figure 2 Loss of the homeobox gene antennapedia (Antp) is implicated in an adaptive shift from byssal attachment to cemented attachment.

A. Comparison of homeobox (Hox) cluster organization in bivalves with two distinct attachment styles, byssal attachment and cemented attachment. Unlike the disputed Hox gene cluster in the C. gigas genome, the Hox gene cluster is arranged linearly in both C. hongkongensis and C. virginica. Notably, Antp is lost in all three Ostreoida oysters. B. Overview of key body-plan organization in Pinctada fucata and C. hongkongensis. P. fucata possesses a byssal gland and byssus, whereas adult Ostreoida oysters have lost both. C. Tissue distribution of Antp orthologues in three byssally attached bivalves, P. fucata, Mytilus galloprovincialis, and Mizuhopecten yessoensis. BG, byssal gland; DG, stomach. Antp mRNA abundance is displayed as a percentage, and expression in BG accounted for more than 50%. D. Morphology of newly regenerated byssus 48 h after excision of the original byssus. Scale bar: 2 mm. E. Anatomy of the byssal gland. Vertical cross-section of the byssal gland displaying ciliated walls (Cl), lamina propria (LP), and byssal remnants (BY) within a chamber. Scale bar: 50 μm. F. Correlation between Antp mRNA abundance and the number of regenerated byssus threads in P. fucata. Antp mRNA level in the byssal gland was determined by real-time qPCR, and newly regenerated byssus threads were counted 48 h after excision of the original byssus. Pearson's correlation coefficients and p-values were calculated with two-tailed tests at a 95% confidence level.
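The statistic behind panel F can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis script: the helper names `pearson_r` and `t_statistic` are hypothetical, and the input values are made up rather than taken from the study. The two-tailed p-value would be obtained by comparing the t statistic with a Student's t distribution on n - 2 degrees of freedom, a step omitted here to keep the example free of external libraries.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: r = 0 (undefined when |r| = 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Made-up values for illustration only; not data from the study.
log10_antp = [-4.0, -3.5, -3.0, -2.5, -2.0]
byssus_count = [1, 3, 4, 6, 9]

r = pearson_r(log10_antp, byssus_count)
print("r =", round(r, 3), "R^2 =", round(r * r, 3))
print("t =", round(t_statistic(r, len(log10_antp)), 2))
```

The squared coefficient printed here corresponds to the R² value reported on a scatter plot of this kind.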

Figure 3 Molecular basis of attachment initiation in Ostreoida oysters.

A. Venn diagram showing gene family expansions shared among three Ostreoida oyster species, C. hongkongensis, C. virginica, and C. gigas; 32 core gene family expansions were identified. B. Heatmap of the correlation between expression levels of the 32 core gene families and developmental stages of C. hongkongensis larvae. Gene families whose transcriptional activation correlates strongly with attachment are shown in red. C. Pharmacological responses of oyster veliger larvae during attachment initiation and metamorphosis. L-3,4-dihydroxyphenylalanine (L-DOPA) stimulated both larval attachment and metamorphosis, whereas norepinephrine (NE) induced metamorphosis without attachment. The Venn diagram shows genes specifically induced by L-DOPA or NE; genes induced specifically by L-DOPA may participate in attachment initiation. D. Coordinated gene network built around the zinc transporter ZIP12, the hub with the highest degree of gene connections in weighted correlation network analysis (WGCNA). Red and green dots indicate up-regulated and down-regulated genes, respectively. E. Schematic diagram conceptualizing the molecular basis for initiation of larval attachment in oysters. Square boxes indicate oyster-specific expanded gene families involved in larval attachment (p < 0.001); the blue fill is scaled by correlation values at the spat stage. Ellipses indicate genes specifically induced after L-DOPA treatment; the red fill is scaled by log2(FC). FC, fold change.
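The Venn partition in panel A amounts to set arithmetic over the expanded gene families of the three species. The sketch below is a hypothetical illustration: the function name `venn3_regions` is an assumption, and although the family names are borrowed from the heatmap labels, their assignment to species here is invented and does not reproduce the real expansion lists.

```python
def venn3_regions(a: set, b: set, c: set) -> dict:
    """Split three sets into the seven exclusive regions of a Venn diagram."""
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_b_only": (a & b) - c,
        "a_c_only": (a & c) - b,
        "b_c_only": (b & c) - a,
        "core": a & b & c,
    }


# Invented memberships for illustration; not the study's expansion lists.
ch = {"Astacin", "SK_channel", "EGF", "Tetraspannin"}  # C. hongkongensis
cg = {"Astacin", "SK_channel", "EGF", "DEAD"}          # C. gigas
cv = {"Astacin", "SK_channel", "Laminin_EGF"}          # C. virginica

regions = venn3_regions(ch, cg, cv)
print(sorted(regions["core"]))  # -> ['Astacin', 'SK_channel']
```

In this scheme the "core" region plays the role of the 32 shared expansions described in the legend.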

Figure 4 Left-right asymmetry of shell formation in Ostreoida oysters.

A. Comparison of the left/right shell weight ratio and morphology between C. hongkongensis and P. fucata. B. Volcano plots showing genes differentially expressed between the left and right mantles, filtered by |log2(FC)| ≥ 1 and p-value < 0.05. C. Expression profile of 1:1 orthologues in the left/right (L/R) mantle of C. hongkongensis and P. fucata. A total of 10,491 orthologues were paired, and only a few asymmetrical orthologues were specifically expressed in the Hong Kong oyster. The x- and y-axes indicate logFC of the R/L mantle expression ratio in C. hongkongensis and P. fucata, respectively. D. Expression patterns of two pivotal transcription factors in left and right mantles across five bivalves. Ch, C. hongkongensis; Cg, C. gigas; Cv, C. virginica; My, M. yessoensis; Pf, P. fucata. E. Dendrogram of known tyrosinases from five mollusks, constructed by the maximum likelihood (ML) method. Bivalve and molluscan TyrA orthologous groups are indicated by curvatures and annotated as A1-A3. Specific tyrosinase orthologous groups are marked with a colored background and annotated with the species name. Species are represented with different shapes: triangle, Mizuhopecten yessoensis; circle, ; pentagon, Pinctada fucata. F. Expression patterns of tyrosinase families in left and right mantles of two bivalve species, as determined by FPKM. Total FPKM of different types of orthologous genes is displayed in cumulative histograms. Different members of the tyrosinase family are presented with different colors.
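The filter stated for panel B, |log2(FC)| ≥ 1 with p-value < 0.05, can be written as a small classifier. This is a sketch under assumptions: the function name `classify_gene` and the example rows are hypothetical, and the sign convention (positive log2(FC) meaning higher expression in the right mantle) is assumed for illustration, since the legend specifies an R/L ratio only for panel C.

```python
def classify_gene(log2_fc: float, p_value: float,
                  fc_cutoff: float = 1.0, p_cutoff: float = 0.05) -> str:
    """Label a gene using the thresholds stated in the legend."""
    if p_value >= p_cutoff or abs(log2_fc) < fc_cutoff:
        return "not significant"
    # Assumption: log2(FC) is right/left, so positive values are right-biased.
    return "right preference" if log2_fc > 0 else "left preference"


# Made-up rows (gene, log2 FC, p-value) for illustration only.
rows = [
    ("geneA", 2.3, 0.001),
    ("geneB", -1.0, 0.04),
    ("geneC", 0.8, 0.0001),
    ("geneD", -3.1, 0.20),
]
for name, fc, p in rows:
    print(name, classify_gene(fc, p))
```

A gene must pass both thresholds to be called, so a large fold change with a weak p-value (geneD) and a strong p-value with a small fold change (geneC) are both left unlabelled.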

[Figure 1 image. a: Circos plot of chromosomes chr1 to chr10 with tracks for GC content, heterozygosity, gene density, and repeat coverage. b: Time-calibrated species tree (axis: million years ago, MYA) annotated with gene family expansion/contraction counts for Gastropoda (Haliotis discus hannai, Lottia gigantea, Aplysia californica, Biomphalaria glabrata), Bivalvia (Crassostrea hongkongensis, Crassostrea gigas, Crassostrea virginica, Pinctada fucata, Chlamys farreri, Bathymodiolus platifrons, Modiolus philippinarum), Cephalopoda (Octopus bimaculoides), and the outgroup Homo sapiens.]

[Figure 2 image. A: Ancestral Hox cluster (Hox1, Hox2, Hox3, Hox4, Hox5, Lox5, Antp, Lox4, Lox2, Post2, Post1) compared across species with byssal attachment (M. philippinarum, B. platifrons, M. galloprovincialis, C. farreri, M. yessoensis, P. fucata) and cemented attachment (C. gigas, C. hongkongensis, C. virginica). B: Anatomical labels: palps, stomach, foot, byssus, byssal gland, heart, adductor muscle, gills, mantle. C: Relative Antp mRNA abundance (%) by tissue (BG, mantle, muscle, DG, gill, foot) in P. fucata, M. galloprovincialis, and M. yessoensis. D, E: Micrographs. F: Regenerated byssus number versus log10(Antp mRNA); R² = 0.3584, p = 0.0012.]

[Figure 3 image. A: Venn diagram of expanded gene families in C. hongkongensis, C. virginica, and C. gigas (32 shared). B: Heatmap (color scale -1 to 1) of developmental stages (zygote, 2-4 cells, morula, blastula, gastrula, trochophore, D-larva, veliger, pediveliger, spat) against Pfam families: PF01039.21_Carboxyl_trans, PF16589.4_BRCT_2, PF02838.14_Glyco_hydro_20b, PF00875.17_DNA_photolyase, PF00270.28_DEAD, PF04117.11_Mpv17_PMP22, PF03782.16_AMOP, PF00929.23_RNase_T, PF01564.16_Spermine_synth, PF01400.23_Astacin, PF03530.13_SK_channel, PF03067.14_LPMO_10, PF13574.5_Reprolysin_2, PF03645.12_Tctex-1, PF01557.17_FAA_hydrolase, PF09772.8_Tmem26, PF02793.21_HRM, PF00335.19_Tetraspannin, PF00008.26_EGF, PF06119.13_NIDO, PF00053.23_Laminin_EGF, PF12947.6_EGF_3, PF00957.20_Synaptobrevin, PF07534.15_TLD, PF00582.25_Usp, PF04505.11_CD225, PF06701.12_MIB_HERC2, PF16977.4_ApeC, PF00643.23_zf-B_box, PF07731.13_Cu-oxidase_2, PF07732.14_Cu-oxidase_3, PF00394.21_Cu-oxidase. C: Attachment and metamorphosis responses of pediveliger larvae and spat to L-DOPA and NE, with a Venn diagram of induced genes. D: Gene network centered on ZIP12; labelled nodes include NDCBE, MGLUR, IFT, NBCn, ALDH18A1, Beta-catenin, EFCB6, UBX, Prominin, and PKN2. E: Schematic of attachment initiation in the oyster larva (sensing, matrix/ion secretion, matrix modification); scales: correlation 0.5 to 1.0, log2(FC) 1.5 to 3.5.]

[Figure 4 image. A: Left and right shells of Crassostrea hongkongensis and Pinctada fucata, with the ratio of L/R shell weight (p < 0.0001). B: Volcano plots for C. hongkongensis and P. fucata; x-axis log2(FC), y-axis -log10(p-value); left- and right-preference genes highlighted. C: Scatter plot of R/L expression of orthologues, Crassostrea hongkongensis (x-axis) versus Pinctada fucata (y-axis); labelled genes include Pitx2, EVX1, SLC2A1, Mab21, GHR, OPR, Rfx6, NacreinF1, and NPYR. D: Left/right expression of Pitx2 and RFX6 in Ch, Cg, Cv, My, and Pf. E: Tyrosinase dendrogram with groups TyrA1, TyrA2, and TyrA3 and lineage-specific expansions (Ostreidae, Pinctada fucata, Mizuhopecten yessoensis). F: Stacked FPKM of tyrosinase genes in left (L) and right (R) mantles of Crassostrea hongkongensis and Pinctada fucata.]