bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
1 Comparative genomics reveals evolutionary drivers of sessile life and
2 left-right shell asymmetry in bivalves
3
4 Yang Zhang 1, 2 # , Fan Mao 1, 2 # , Shu Xiao 1, 2 # , Haiyan Yu 3 # , Zhiming Xiang 1, 2 # , Fei Xu 4, Jun
5 Li 1, 2, Lili Wang 3, Yuanyan Xiong 5, Mengqiu Chen 5, Yongbo Bao 6, Yuewen Deng 7, Quan Huo 8,
6 Lvping Zhang 1, 2, Wenguang Liu 1, 2, Xuming Li 3, Haitao Ma 1, 2, Yuehuan Zhang 1, 2, Xiyu Mu 3,
7 Min Liu 3, Hongkun Zheng 3 * , Nai-Kei Wong 1* , Ziniu Yu 1, 2 *
8
9 1 CAS Key Laboratory of Tropical Marine Bio-resources and Ecology and Guangdong Provincial
10 Key Laboratory of Applied Marine Biology, Innovation Academy of South China Sea Ecology and
11 Environmental Engineering, South China Sea Institute of Oceanology, Chinese Academy of
12 Sciences, Guangzhou 510301, China;
13 2 Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Guangzhou
14 511458, China;
15 3 Biomarker Technologies Corporation, Beijing 101301, China;
16 4 Key Laboratory of Experimental Marine Biology, Center for Mega-Science, Institute of
17 Oceanology, Chinese Academy of Sciences, Qingdao 266071, China;
18 5 State Key Laboratory of Biocontrol, College of Life Sciences, Sun Yat-sen University,
19 Guangzhou 510275, China;
20 6 Zhejiang Key Laboratory of Aquatic Germplasm Resources, College of Biological and
21 Environmental Sciences, Zhejiang Wanli University, Ningbo 315100, China;
22 7 College of Fisheries, Guangdong Ocean University, Zhanjiang 524088, China;
23 8 Hebei Key Laboratory of Applied Chemistry, College of Environmental and Chemical
24 Engineering, Yanshan University, Qinhuangdao 066044, China.
25
26 # These authors contributed equally to this work.
27 * Corresponding authors
28 E-mail: [email protected] (Yu Z), [email protected] (Wong N),
29 [email protected] (Zheng H)
The updated email and affiliation of Nai-Kei Wong: [email protected], Department of Pharmacology, Shantou University Medical College, Shantou 515041, China
30
31
32 Running title: Yang Z et al. / Genomic drivers of bivalve sessility and shell asymmetry.
33
34 Total word counts (from “Introduction” to “Conclusions” or “Materials and methods”): 5770
35 Total figures: 4
36 Total tables: 0
37 Total references: 120
38 References from 2014: 31
39 Total supplementary figures: 13
40 Total supplementary tables: 15
41 Total supplementary files: 2
60 Abstract
61 Bivalves are species-rich mollusks with prominent protective roles in coastal ecosystems.
62 Across these ancient lineages, colony-founding larvae anchor themselves either by byssus
63 production or by cemented attachment. The latter mode of sessile life is strongly molded by
64 left-right shell asymmetry during larval development of Ostreoida oysters such as
65 Crassostrea hongkongensis. Here, we sequenced the genome of C. hongkongensis in high
66 resolution and compared it to reference bivalve genomes to unveil genomic determinants
67 driving cemented attachment and shell asymmetry. Importantly, loss of the homeobox gene
68 antennapedia (Antp) and broad expansion of lineage-specific extracellular gene families are
69 implicated in a shift from byssal to cemented attachment in bivalves. Evidence from
70 comparative transcriptomics shows that the left-right asymmetrical C. hongkongensis
71 plausibly diverged from the symmetrical Pinctada fucata in expression profiles marked by
72 elevated activities of orthologous transcription factors and lineage-specific shell-related gene
73 families including tyrosinases, which may cooperatively govern asymmetrical shell formation
74 in Ostreoida oysters.
75
76
77 KEYWORDS: Comparative genomics, Ostreoida oysters, attachment, shell asymmetry,
78 bivalves
90 Introduction
91 Bivalves belong to the ancient lineages of Mollusca comprising nearly 9,600 species that
92 thrive in aquatic environments, with notable economic and ecological importance [1]. As
93 bilaterian organisms, they rely nutritionally on filtering phytoplankton, and primarily follow a
94 life cycle that transitions from free-swimming larvae to attached juveniles, culminating in
95 sessile life [2, 3]. Among filter-feeding bivalves, oysters of the superfamily Ostreoidea serve
96 as crucial guardians of marine ecosystems by forming oyster reefs that clean up water and
97 sustain biodiversity [4,5]. Due to climate change and coastal degradation, however, bivalves
98 face profound challenges from warming waters and ocean acidification, which destabilize
99 habitats, raise infection risks and diminish the capacity of bivalves to acquire carbonate for shell
100 formation [6-8].
101 To cope with diverse ecosystems, a variety of sessile strategies has emerged in bivalves
102 during evolution, among which two modes of sessile life prevail. Characteristically, the majority
103 of bivalves, including Mytilidae (mussels), Pectinidae (scallops), and Pteriidae (pearl oysters),
104 secrete adhesive byssal threads to stabilize themselves against marine turbulence [9-13]. In
105 contrast, Ostreoida oysters have evolved a highly sophisticated machinery of cemented
106 attachment through producing organic-inorganic hybrid adhesive substances in place of
107 byssus, which allows them to permanently fuse the left shell with rock surfaces or shells of
108 other individuals in intertidal zones [14]. Compared with byssus, cemented attachment
109 exhibits superiority in physical adhesion and mechanical tension, enabling oysters to
110 efficiently create and thrive in large reef communities [2]. Developmentally, as a salient
111 feature of their exoskeleton, shell formation processes in bivalves are strongly molded by
112 their preferences for sessile life [15]. Quite distinctively, byssally attached bivalve species
113 tend to possess a bilaterally symmetrical shell, whereas cement-attached oysters present a
114 high degree of phenotypic variability and morphological asymmetry characteristic of their
115 radically distinct left-right (L/R) shells [15]. Nevertheless, the molecular mechanisms driving
116 these extraordinary innovations in bivalve evolution remain enigmatic, particularly in
117 genomic contexts.
118 The Hong Kong oyster (Crassostrea hongkongensis, first described as Crassostrea rivularis
119 by Gould, 1861) is an economically valuable aquacultural species endemic to the South China
120 coastline [16]. As an ideal model for studying shell asymmetry, C. hongkongensis larvae
121 follow a typical developmental cycle of cemented attachment and asymmetrical
122 differentiation of the L/R shells. In order to elucidate the genetic basis underpinning the
123 evolution of bivalve sessile life and asymmetry of shell formation, we sequenced and
124 analyzed the complete genome of C. hongkongensis and performed comparative genomic
125 analysis along with several other bivalve species, including two congeneric Ostreoida oysters,
126 Crassostrea gigas, and Crassostrea virginica [12,17-21]. In addition, we monitored
127 transcriptomic changes of C. hongkongensis embryos during the critical window of larval
128 attachment, and compared any asymmetry-related gene expression patterns in the L/R mantles
129 of adult C. hongkongensis and byssus-producing pearl oyster (Pinctada fucata). Our
130 comparative genomic data and associated functional assays reveal extensive molecular
131 adaptations across the oyster genome that support the evolutionary switch from byssal to
132 cemented attachment and divergence from symmetrical shell in Ostreoida oysters.
133
134 Results
135 Genome sequencing, annotation and Hi-C, phylogenomics and evolutionary rate
136 Efforts on genome sequencing and assembly are inherently challenging for many marine
137 invertebrates such as mollusks, annelids, and platyhelminths due to their remarkable genetic
138 heterozygosity (or polymorphisms) [17,18,21,22]. Based on k-mer analysis, the genome size
139 of a single wild-stock Hong Kong oyster (C. hongkongensis) individual was estimated to be
140 695 Mb with 1.2% heterozygosity (Figure S1), which is broadly comparable to that of the
141 Pacific oyster (1.3%) [17]. To circumvent the limitations of short-read next-generation
142 sequencing in assembling highly polymorphic genomes, PacBio long-read sequencing combined with
143 Illumina sequencing was adopted as the primary mode of genome sequencing in our
144 study. We first generated 23.25 Gb of raw PacBio reads and 147.25 Gb of Illumina reads,
145 equivalent to 31.9-fold and 201.8-fold genome coverage, respectively (Table S1 and
146 S2). Following stepwise optimization of assembly algorithms, these reads were assembled
147 into a 729.6 Mb genome with a contig N50 of 314.1 kb and a scaffold N50 size of 500.4 kb,
148 with the longest contig spanning 2.37 Mb (Table S3). The contig N50 of the oyster genome is
149 at least one order of magnitude larger than those of published bivalve genomes
150 (Table S4), demonstrating the superiority of long-read sequencing technologies in coping
151 with high polymorphism in genome assembly of marine invertebrates. However, the
152 assembled genome (729.6 Mb) turned out to be slightly larger than the k-mer estimate (695 Mb).
153 Such a discrepancy may reflect sequence preferences of Illumina reads. The high integrity and
154 quality of the assembly were evidenced by successful mapping of 97.57% of sequencing
155 reads and a low single-nucleotide error rate (Tables S5 and S6). Moreover, Benchmarking
156 Universal Single-Copy Orthologs (BUSCO) analysis confirmed a high degree of
157 completeness (92.84%) for the assembled genome (Table S7), which is comparable to
158 that of other published bivalve genomes (Table S4).
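The coverage and N50 figures quoted in this paragraph follow from simple definitions, which can be sketched in Python. This is a minimal illustration and not the authors' pipeline: the function names and the toy contig lengths are hypothetical, and dividing by the assembled size of 729.6 Mb (rather than the k-mer estimate) is an assumption that appears to reproduce the reported fold coverage.

```python
def fold_coverage(total_bases_gb: float, genome_size_mb: float) -> float:
    """Sequenced bases divided by genome size (both expressed in Mb)."""
    return total_bases_gb * 1000.0 / genome_size_mb


def n50(lengths: list[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one contig length")
    half = sum(lengths) / 2.0
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0  # not reached for non-empty input


ASSEMBLY_MB = 729.6  # assembled genome size reported in the text

pacbio_cov = fold_coverage(23.25, ASSEMBLY_MB)     # PacBio: 23.25 Gb
illumina_cov = fold_coverage(147.25, ASSEMBLY_MB)  # Illumina: 147.25 Gb
print(f"PacBio {pacbio_cov:.1f}x, Illumina {illumina_cov:.1f}x")

# Toy contig lengths (hypothetical, for illustration only)
toy_contigs = [100, 80, 60, 40, 20]
print("toy N50:", n50(toy_contigs))
```

In the toy example the two longest contigs (100 + 80) already exceed half of the 300 total bases, so the N50 is 80; the same rule applied to the real contig set would yield the contig N50 reported in Table S3.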
159 To assemble the oyster genome to the chromosomal level, we generated ~44.4
160 million valid Hi-C interaction pairs with over 50-fold coverage (Table S8). Using these Hi-C
161 data and LACHESIS, 690.39 Mb of genome sequence was anchored onto 10 pseudo-chromosomes,
162 covering 94.66% of the assembled genome (Figure 1A, Figure
163 S2 & Table S9). Of this, 648.56 Mb was oriented and anchored
164 within chromosomes, constituting 93.94% of the total anchored sequence (Table S9).
165 Moreover, high consistency between the Hi-C-based pseudo-chromosomes and the genetic map
166 of a congeneric species, C. gigas, was confirmed (p = 0.978-0.996, Figure S3), indicating
167 high reliability of the chromosome-level assembly. Overall, by combining PacBio, Illumina and
168 Hi-C sequencing, a high-quality, chromosome-anchored
169 genome was obtained, providing a robust framework for subsequent exploration of oyster
170 biology and the evolution of bivalves.
171 For gene annotation, we predicted 30,021 protein-coding genes in the genome by
172 integrating results from ab initio prediction, homology-based searches with reference
173 genomes and RNA-seq (Table S10), with an estimated BUSCO completeness of 91.09%
174 (Table S11). Of these, more than 97.97% of the predicted genes (28,329 genes) were
175 annotated in public databases (Table S12). The gene number is similar to that of a
176 closely related species, C. gigas (28,027) [17]. In addition, transposable elements (TEs) constitute
177 46.2% of the C. hongkongensis genome, among which the most abundant is the class II Helitron
178 (12.4%, 90.4 Mb) (Table S13). Phylogenetic analysis showed that three Ostreoida oyster
179 species (C. hongkongensis, C. gigas, C. virginica) clustered together (Figure 1B), and that
180 Ostreoida oyster speciation took root around 92.1 million years ago (Mya), in agreement with
181 evidence from mitochondrial genomes [23]. Within bivalves, Ostreoida oysters are closest to
182 the Pteriidae oyster Pinctada fucata, and their point of divergence was estimated to be 357.5
183 Mya (Figure 1B). These results corroborate the hypothesis that a common ancestor of
184 primitive Ostreoida and Pteriidae oysters existed prior to the Permian-Triassic extinction
185 event, whereas speciation of modern Ostreoida oysters began at the end of the
186 Cretaceous-Paleogene extinction event [24-27]. Consistently, comparative synteny analysis
187 shows high collinearity among the three Ostreoida oyster genomes apart from large
188 intra-chromosomal inversions, whereas substantial inter-chromosomal translocations and
189 rearrangements are seen between the chromosomes of Ostreoida oysters and Pinctada fucata
190 (Figure S4), in agreement with their phylogenetic relationship and time since
191 divergence.
192 Homeobox gene cluster
193 Radical changes toward a sessile life require evolutionary innovations in anatomical
194 organization. In contrast to byssus-producing bivalves [12,28], Ostreoida oysters do not
195 possess a byssal gland or secrete byssus during their lifetime [29,30], though a vestigial foot
196 transiently appears at the veliger stage and degenerates following attachment and
197 metamorphosis (Figure 2B). Developmentally, the homeobox (Hox) genes are known for their
198 crucial roles in regulating body-plan development and organogenetic transitions in metazoans
199 [31-34]. In view of this, we compared the clustering of Hox genes in byssus-producing and
200 byssus-null bivalve species. A salient feature in byssal bivalves including Pinctada fucata,
201 Mizuhopecten yessoensis, Chlamys farreri, Mytilus galloprovincialis, Bathymodiolus
202 platifrons, and Modiolus philippinarum is an intact Hox and para-Hox gene cluster (Figure
203 2A and Figure S5). In contrast, a disrupted Hox gene cluster has been reported in the C. gigas
204 genome [17], probably owing in part to fragmented genome assembly, whereas a coherent Hox
205 gene cluster is arranged linearly at a single locus in both C. hongkongensis and C. virginica.
206 Intriguingly, one of the key Hox members, antennapedia (Antp), is lost in
207 all three Ostreoida oysters (Figure 2A), thereby implicating the Antp gene as an essential driver
208 of byssus formation. Sequence alignment reveals that the Antp gene possesses a conserved
209 homeobox domain in bivalves (Figure S6).
210 As evidenced in expression profiles of three representative byssus-producing bivalve
211 species, Antp and its orthologues are predominantly expressed in the byssal gland (Figure
212 2C). Because molecular tools such as CRISPR/Cas9 or TALEN are not yet available for manipulating
213 bivalve genomes, genetic ablation of the Antp gene is not yet feasible in the pearl oyster for
214 phenotypic appraisal of its function. However, histological evidence suggests that the byssal
215 gland is one of the appendage organs capable of secreting thin, extended byssal threads, which in their
216 mature form are observable as byssus outside the organism (Figure 2D and Figure 2E). Because
217 regenerative ability varies among individuals, we assessed Antp function with respect to
218 this phenotypic trait. Remarkably, mRNA expression levels of Antp are significantly correlated with
219 the number of regenerated byssal threads in the pearl oyster (n = 24, R² = 0.36, p = 0.0012; Figure
220 2F). Taken together, our evidence strongly implicates Antp as a transcriptional regulator
221 central to byssal secretion in P. fucata. Further, the loss of the Antp gene seems to be
222 associated with the physical loss of the byssal gland in oysters. From an evolutionary perspective, Antp
223 seems to play a critical role in appendage diversification in arthropods, as previously
224 evidenced by its involvement in leg formation in the crustacean Daphnia [35] and
225 repression of abdominal limbs in the spider Achaearanea tepidariorum [36]. In addition,
226 ectopic expression of the Hox transcription factor Antp reportedly induced expression of the silk
227 protein sericin-1, a biopolymer, in the silkworm Bombyx mori [37,38]. Collectively, these
228 findings support a conserved function of Antp in secretory appendages in two distinct lineages,
229 mollusks and arthropods.
230
231 Gene expansion and oyster attachment
232 In place of byssal attachment, Ostreoida oysters adopt an ingeniously cost-effective way of
233 sessile life, namely, cemented attachment [29,30]. This adhesive mechanism is characterized
234 by extraordinary mechanical strength and superior flexibility needed to resist powerful tidal
235 scour and absorb surge energy [39]. Cemented attachment allows oysters to efficiently anchor
236 and thrive in marine environments, and ultimately supports the genesis and health of oyster
237 reefs. Nevertheless, the molecular mechanisms underlying oyster adhesive production have
238 remained enigmatic. Given that cemented attachment is an innovation unique
239 to Ostreoida oysters, we first ventured to investigate which gene families are expanded as a
240 common event in three Ostreoida species. Our results show that in C. gigas, C.
241 hongkongensis, and C. virginica, there are 58, 172, and 321 expanded species-specific gene
242 families, respectively, which can be further reduced to 32 expanded core gene families in
243 Ostreoida oyster genomes (Figure 3A and Figure S7 & Table S14).
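One way to read the reduction from the per-species lists to 32 core families is as a set intersection: a family counts as core only if it is expanded in all three genomes. The Python sketch below illustrates that idea under this assumption; the family identifiers are made-up placeholders, `core_families` is a hypothetical helper, and the actual criteria behind Table S14 may differ.

```python
from functools import reduce


def core_families(expanded_by_species: dict[str, set[str]]) -> set[str]:
    """Return families expanded in every species (plain set intersection)."""
    if not expanded_by_species:
        return set()
    return reduce(set.intersection, expanded_by_species.values())


# Hypothetical family IDs; the real lists hold 58, 172 and 321 families.
expanded = {
    "C. gigas": {"FAM_A", "FAM_B", "FAM_C"},
    "C. hongkongensis": {"FAM_A", "FAM_B", "FAM_D", "FAM_E"},
    "C. virginica": {"FAM_A", "FAM_B", "FAM_E", "FAM_F"},
}

core = core_families(expanded)
print(sorted(core))
```

Here only FAM_A and FAM_B survive, because FAM_E is expanded in two species but not in all three.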
244 To elucidate how expansion of these core gene families facilitates cemented attachment,
245 we determined the correlations between their expression levels and specific developmental
246 stages (Figure 3B). Developmentally, attachment is an intricate secretion process involving a
247 broad spectrum of chemical reactions and proteins, notably extracellular enzymes or matrices
248 [40]. It is thus unsurprising to identify a small conductance calcium-activated potassium
249 channel (SK channel) gene family and 9 extracellular gene families at work in this process,
250 which show high correlations in the pediveliger and spat stages corresponding to larval
251 initiation of attachment. SK channels are widely expressed calcium-activated potassium
252 channels in neurons [41,42], with crucial roles in regulating dendritic excitability, synaptic
253 transmission, and synaptic plasticity [43,44]. Interestingly, increased expression of expanded
254 SK channels may aid free-swimming larvae in sensing external environments in search of an
255 appropriate attachment site. On the other hand, the functions of the extracellular gene families are
256 closely related to key processes of shell attachment, including matrix secretion
257 (epidermal growth factor (EGF), EGF3, laminin EGF, Apec) and matrix
258 modification (Cu-oxidase, Cu-oxidase2, Cu-oxidase3, and astacin), among others (Figure 3B).
259 Indeed, many adhesive proteins contain specific protein-binding domains [45], such as
260 EGF-like domains in the slug mucus proteins (e.g. Sm40 and Sm85) [46] and sea star
261 footprint proteins (e.g. Sf1) [47], raising the possibility that EGF family expansion in C.
262 hongkongensis is functionally linked to cemented attachment. Additionally, physico-chemical
263 properties of many adhesive proteins arise in part from post-translational modifications,
264 which ultimately support their adhesive functions [45]. Protein oxidation in marine
265 bio-adhesives indeed contributes to enhanced crosslinking between shell disks and substrates
266 during attachment [48,49]. A notable gene expansion in the copper oxidase family is likely to
267 contribute to stabilization of extracellular matrices through crosslinking between the
268 oyster shell and external substrates. Copper-based enzyme lysyl oxidase is known to be
269 essential for cross-linking and strengthening fibers in animal connective tissues via collagen
270 oxidation [50]. Concomitantly, copper ion, as part of oxidative enzymes, is a mandatory
271 cofactor for oxidase activity, which creates cross-linking sites from common amino acids, to
272 enhance cemented attachment [51-53]. In further transcriptomic analysis, we found
273 evidence that the 9 extracellular gene families were markedly upregulated during the larvae-spat
274 transformation (Figure S8), corroborating their functional
275 importance in attachment formation.
276
277 L-DOPA induced attachment
278 During larvae-spat transformation, embryonic oysters execute an intrinsic program of
279 developmental changes, in which cemented attachment is tightly coupled to metamorphosis
280 [54]. In this context, we set out to distinguish the molecular determinants of cemented attachment
281 from those of metamorphosis by treating veliger-stage larvae with two pharmacologic agents:
282 L-3,4-dihydroxyphenylalanine (L-DOPA) and norepinephrine (NE). The
283 former simultaneously promoted normal attachment and metamorphosis, whereas the latter
284 induced metamorphosis but not attachment (Figure 3C and Figure S9) [54]. Based on
285 this, gene expression induced by L-DOPA rather than NE was hypothesized to be a driver for
286 the initiation of attachment in C. gigas. We accordingly scrutinized 24 transcriptomes
287 following pharmacological challenges at two time points within the temporal span of oyster
288 attachment. Our results show that the expression of 1,225 genes was specifically altered by
289 L-DOPA treatment but not by NE (Figure 3C), consistent with the former’s essential role as
290 an attachment signal.
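The logic of this screen can be sketched as a set difference: genes that respond to L-DOPA but not to NE are candidate attachment-specific genes, whereas genes responding to both are more likely tied to metamorphosis. The Python sketch below assumes each treatment has already been reduced to a set of differentially expressed gene IDs; the gene names and the helper `partition_responses` are hypothetical and do not reproduce the authors' pipeline.

```python
def partition_responses(ldopa_degs: set[str], ne_degs: set[str]) -> dict[str, set[str]]:
    """Split differentially expressed genes by which treatment altered them."""
    return {
        "attachment_candidates": ldopa_degs - ne_degs,  # L-DOPA only
        "shared": ldopa_degs & ne_degs,                 # both treatments
        "ne_only": ne_degs - ldopa_degs,                # NE only
    }


# Placeholder gene IDs for illustration only
ldopa = {"gene1", "gene2", "gene3", "gene4"}
ne = {"gene3", "gene4", "gene5"}

groups = partition_responses(ldopa, ne)
for name, genes in groups.items():
    print(name, sorted(genes))
```

Under this reading, the 1,225 genes reported above would correspond to the `attachment_candidates` group.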
291 Remarkably, several neurotransmitter receptors (including metabotropic glutamate
292 receptor and neuropeptide Y receptor) were markedly upregulated, consistent with the assumption
293 that neuromuscular coordination is mandatory for guiding embryos to settle in suitable niches
294 and initiate attachment (Figure S10) [55,56]. Moreover, genes of metal ion channels or
295 binding proteins were significantly enriched, with notable examples like organic cation
296 transporter protein, transient receptor potential cation channel (ZIP12) and
297 voltage-dependent calcium channel (Ca2+-ATPase), which is intuitively consistent with the
298 well-documented stimulatory roles of selective cations in oyster larval settling [57]. Of
299 note, voltage-gated potassium channel activity has been shown to be vital for oyster larval
300 attachment, since its inhibitor tetraethylammonium can effectively block this developmental
301 process [58]. Typically, attachment initiates in oyster larvae with the aid of fibrous adhesive
302 proteins and other bioorganic substances including mucopolysaccharides and phospholipids
303 [2,59]. Accordingly, numerous extracellular matrix and adhesion proteins, including
304 collagen, cadherin, fibrocystin, and hemicentin, increased in response to L-DOPA
305 stimulation, presumably paving the way for larval attachment [60].
306 To identify the crucial molecular determinants governing this process, we performed
307 weighted gene co-expression network analysis (WGCNA) to construct a gene network functionally associated with
308 L-DOPA-induced attachment, in which 15 modules were identified (Figure
309 S11). Among them, the MEpink module is the most strongly correlated with L-DOPA-induced
310 attachment (p < 0.01) and contains 139 genes (topological overlap > 0.3). Intriguingly,
311 within this module, a hub forming the most connections in the network was found to be zinc
312 transporter ZIP12 (Figure 3D), which is a pivotal regulator of zinc flux. As a co-factor
313 essential to a wide spectrum of proteins such as matrix metalloproteinases, zinc plays vital
314 regulatory roles in enzymatic catalysis and macromolecular stability [61]. High abundance of
315 zinc is also a salient feature in aragonite- or calcite-rich shells in certain mollusks [62].
316 Meanwhile, among the gene families that specifically expanded in Ostreoida oysters, astacin
317 is a cell-secreted or plasma membrane-associated protease that possesses zinc binding activity
318 and takes part in proteolytic processing of extracellular proteins [63]. Its expression was
319 markedly elevated both during larvae-spat transformation and in the larval response to L-DOPA
320 treatment (Figure S8g & S10d). As predicted, chelation of zinc potently retarded oyster larval
321 attachment (Figure S12), providing additional evidence that the initial creation of matrix structures
322 for cemented attachment requires zinc and associated protein activities. Accordingly, based on
323 genomic results on extracellular gene family expansion and transcriptomic profiles for the
324 attachment stage, we propose a conceptual model to delineate the mechanistic determinants
325 and processes at work in the cemented attachment strategy of oyster larvae (Figure 3E). We
326 postulate that attachment formation results from an intricate coordination of at
327 least three types of fundamental activities, namely: larval sensing of habitable surfaces,
328 matrix/ion secretion, and matrix modification to mobilize adhesive processes.
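To illustrate what being a hub of a co-expression module means, the sketch below ranks genes by intramodular connectivity, taken here as the summed weight of their edges after discarding gene pairs whose topological overlap does not exceed 0.3 (the cutoff quoted for the MEpink module). It is a simplified Python stand-in rather than the WGCNA R package; weighting edges instead of merely counting them is an assumption, and the gene labels and overlap values are invented for illustration.

```python
def hub_gene(tom: dict[tuple[str, str], float], cutoff: float = 0.3) -> tuple[str, float]:
    """Return the gene with the largest summed edge weight above `cutoff`.

    `tom` maps unordered gene pairs to topological-overlap values.
    """
    connectivity: dict[str, float] = {}
    for (a, b), weight in tom.items():
        if weight > cutoff:
            connectivity[a] = connectivity.get(a, 0.0) + weight
            connectivity[b] = connectivity.get(b, 0.0) + weight
    if not connectivity:
        raise ValueError("no edge passes the cutoff")
    best = max(connectivity, key=connectivity.get)
    return best, connectivity[best]


# Invented overlap values between placeholder genes
toy_tom = {
    ("geneA", "geneB"): 0.45,
    ("geneA", "geneC"): 0.40,
    ("geneA", "geneD"): 0.35,
    ("geneB", "geneC"): 0.32,
    ("geneC", "geneD"): 0.10,  # below the cutoff, ignored
}

gene, score = hub_gene(toy_tom)
print(gene, round(score, 2))
```

In this toy network geneA plays the role that ZIP12 plays in Figure 3D: it holds the most, and the strongest, retained connections.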
329
330 Asymmetry in left-right shell formation
331 Symmetry is an elegant guiding principle for the implementation of body plans [64]. Across
332 the class Bivalvia, the majority of bivalves display perfect or near-perfect conformity to
333 bilaterally symmetrical shells [15,65]. In contrast, Ostreoida oysters may appear unorthodox
334 in adopting morphological asymmetry in their shell formation due to functional differentiation
335 of the left-right (L/R) shells (Figure S13). The left shell is visibly much thicker and more
336 convex than its right counterpart, which is apt for attaching to rocky surfaces or neighboring
337 oysters within a reef community. On the other hand, the right shell is capable of physical
338 displacement and hermetic lockdown to regulate water intake and ward off predation (Figure
339 4A). Moreover, structural variance in shell asymmetry is also amply reflected by a greater
340 proportion of prismatic layer in the right shell (Figure S14), which is responsible for
341 controlling initiation of calcite crystal formation and growth [66,67]. Although asymmetry of
342 body forms has been traditionally stereotyped as defects that may jeopardize survival of an
343 organism [68], the example of Ostreoida oysters clearly defies this rule. We reason that such
344 an intriguing differentiation of asymmetrical shells could confer unexpected benefits such as
345 improved population fitness in an otherwise intrinsically harsh coastal environment. With the
346 advent of the left shell and its versatile attachment machinery, oysters can easily economize
347 resources or secure their foothold on rocks or peers’ shells within an oyster reef via cemented
348 attachment [69]. This strategy permits oysters to lower their thresholds for founding and
349 expanding productive colonies in demanding physical habitats, literally through stacking of
350 individuals at high densities, without sacrificing resistance to environmental challenges such
351 as tidal turbulences.
352 To further elucidate the molecular basis of left-right asymmetry, comparative
353 transcriptomics was carried out to quantify gene expression profiles in the L/R mantles of C.
354 hongkongensis and the pearl oyster, the mantle being the key organ controlling shell formation [70,71].
355 As expected, 188 asymmetry-related differentially expressed genes (DEGs) between the L/R
356 mantles were identified in C. hongkongensis, whereas only 53 asymmetry-related DEGs were
357 found in the pearl oyster (Figure 4B), which reflects a radical genetic divergence
358 underpinning shell asymmetry. Next, to test the hypothesis that lineage-specific divergence of
359 orthologues contributes to symmetry breakage, 10,050 orthologues were paired between
360 the two species (Table S15). Our results indicate that a few crucial asymmetry-related
361 orthologues are specifically expressed in C. hongkongensis (Figure 4C), including the homeobox
362 genes paired-like homeodomain transcription factor 2 (Pitx2) and homeobox B4a (Hox-B4a),
363 and regulatory factor X6 (RFX6). Notably, Pitx2 is a central regulator in the Nodal
364 cascade, which is responsible not only for directing L/R axis formation in mammals [72], but
365 also for shell coiling and L/R asymmetry in some mollusks such as snails [73]. Another gene
366 of interest is RFX6, recognized for its fundamental importance in guiding pancreatic islet
367 development and insulin production in mammals [74]. Given that the insulin-related peptide gene is
368 known to be a critical driver of oyster growth [75], this new evidence points to novel
369 roles of RFX6-insulin signaling in maintaining shell asymmetry in oysters. As predicted,
370 asymmetry-related expression of Pitx2 and RFX6 in L/R mantles was confirmed by real-time
371 qPCR in three Ostreoida lineages with asymmetrical shells, whereas such gene expression
372 patterns were absent in three symmetrical bivalves, pearl oyster, scallop and mussel (Figure
373 4D).
374 However, it should be noted that the majority of asymmetry-related genes in C.
375 hongkongensis have no orthologues in the pearl oyster. For example, tyrosinases are one of the
376 key gene families involved in steering shell formation and pigmentation by means of
377 oxidation and cross-linking of o-diphenols [76,77]. Phylogenetic analysis reveals that more
378 than half of the tyrosinase genes (55%) clustered in several lineage-restricted clades, suggesting
379 rapid and independent expansion of this gene family in bivalves (Figure 4E). Remarkably,
380 several high-abundance members of the tyrosinase family seem to be strongly associated with
381 L/R asymmetry and were expressed preferentially in the right mantles of C. hongkongensis,
382 whereas no obvious variance was noted between L/R mantles in the pearl oyster (Figure 4F).
383 Therefore, it seems logical to infer that rapid expansion and divergent expression of the
384 tyrosinase family contribute importantly to the emergence and neofunctionalization of
385 asymmetrical shell formation in Ostreoida lineages. Lastly, in determining when precisely
386 expression of these asymmetry-related genes begins during oyster embryogenesis, we found that
387 71.1% of these genes start expression at the spat stage (Figure S15), implying that a complete
388 asymmetrical pattern becomes established in the juveniles only after metamorphosis.
389
390 Conclusion
391 Ostreoida oysters have evolved remarkable innovations for streamlining their body plans,
392 which are enabled by novel cemented attachment and an allied gene machinery diverging
393 from L/R symmetry. These evolutionary breakthroughs position oysters as highly successful
394 reef builders and ecological guardians integral to marine ecosystems spanning the globe. To
395 reveal the genomic changes driving these evolutionary innovations, we sequenced the
396 complete genome of C. hongkongensis, obtained active transcriptomes developmentally
397 critical to the attachment window, and made comparisons with other bivalve genomes. The
398 homeobox gene Antp of the Hox cluster, found to be lost in Ostreoida oysters, is evidently a
399 pivotal regulator of byssal secretion and expression of byssal proteins in P. fucata, and
400 potentially a critical gene governing the radical switch from byssal to cemented attachment.
401 Furthermore, extensive extracellular gene families were expanded in the Ostreoida lineages
402 specifically, presumably contributing to the operationalization of cemented attachment.
403 Ion-binding genes were significantly enriched in L-DOPA induced attachment in oyster, with
404 zinc-binding genes being a prominent network that coordinates extracellular matrix
405 modification and initiates adhesion. Moreover, Ostreoida divergence from shell symmetry is
406 probably under the joint control of a suite of transcriptionally identified asymmetry-related
407 DEGs of the L/R mantles, notably the transcription factors Pitx2 and RFX6, as well as an
408 expanded lineage-specific family of tyrosinases. Thus, on the basis of genomic determinants
409 and coordinated gene networks as revealed in this study, we have advanced a detailed picture
410 of how shell asymmetry is switched on and driven in bivalves such as Ostreoida oysters. To
411 provide insights into bivalve biology and disease in the contexts of climate change or
412 biological conservation, further investigation of the attachment-governing genes may be
413 warranted.
414
415 Materials and methods
416 Illumina sequencing
417 Genomic DNA was extracted by using the DNeasy Blood & Tissue Kit (Cat. no. 69582, Qiagen,
418 Germany) from a single two-year-old individual of C. hongkongensis. Two types of paired-end
419 libraries (220 bp and 500 bp) and six types of long-insert mate-pair libraries (3 kb, 4 kb, 5 kb,
420 8 kb, 10 kb, and 15 kb) were constructed by using Illumina’s paired-end and mate-pair kits,
421 according to the manufacturer’s instructions. Libraries were sequenced on an Illumina HiSeq
422 2500 platform. Sequencing adaptors were first removed from the raw reads. Contaminating reads (such
423 as chloroplast, mitochondrial, bacterial, and viral sequences) were screened out by alignment
424 against the NCBI-NR database by using BWA v0.7.13 [78] with default parameters.
425 FastUniq v1.1 [79] was used to remove duplicated read pairs. Low-quality reads were filtered
426 out according to the following criteria: 1) reads with ≥10% unidentified nucleotides (N); 2)
427 reads with >10 nucleotides aligned to an adapter, allowing ≤10% mismatches; 3) reads
428 with >50% of bases having a Phred quality <5.
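The low-quality read criteria above can be illustrated with a minimal Python sketch. It covers only criteria 1 and 3; criterion 2 (adapter alignment) is omitted because it depends on the adapter sequences and aligner used. The function names and the Phred+33 encoding are assumptions made for illustration and are not taken from the authors' pipeline.

```python
def passes_quality_filter(seq, quals, max_n_frac=0.10, low_q=5, max_low_frac=0.50):
    """Return True if a read survives criteria 1 and 3 described above.

    seq   -- read sequence as a string
    quals -- per-base Phred scores as a list of ints (same length as seq)
    """
    if not seq or len(seq) != len(quals):
        return False
    # Criterion 1: discard reads with >=10% unidentified nucleotides (N).
    n_frac = seq.upper().count("N") / len(seq)
    if n_frac >= max_n_frac:
        return False
    # Criterion 3: discard reads with >50% of bases below Phred 5.
    low_frac = sum(q < low_q for q in quals) / len(quals)
    if low_frac > max_low_frac:
        return False
    return True


def phred33_to_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]
```

In a paired-end setting one would typically drop both mates when either fails, but the text does not specify this, so the sketch works on single reads only.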
429 PacBio sequencing
430 Genomic DNA was sheared by a g-TUBE device (Cat. no. 520079, Covaris, MA) with 20 kb
431 settings. Sheared DNA was then purified and concentrated with AMPure XP beads (Cat. no.
432 10136224, Beckman Coulter, CA) and further used for single-molecule real-time (SMRT) bell
433 preparation according to the manufacturer’s protocol (Pacific Biosciences, CA), and 20 kb
434 template preparation by using BluePippin size selection (Sage Science). Size-selected and
435 isolated SMRT bell fractions were purified with AMPure XP beads. Finally, these purified
436 SMRT bells were used for primer and polymerase (P6) binding, according to the manufacturer’s
437 binding calculator (Pacific Biosciences). Single-molecule sequencing was performed on a
438 PacBio RS-II platform with C4 chemistry. Only PacBio subreads no shorter than 500 bp were
439 included in the oyster genome assembly.
440 Genome size estimation
441 About 34 Gb (52×) of corrected Illumina reads from the 180 bp and 500 bp libraries were selected to
442 perform genome size estimation. The oyster genome size was estimated based on the formula:
443 Genome size = K-mer number/Peak depth.
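As a worked illustration of this formula, the following Python sketch estimates genome size from a k-mer depth histogram. The histogram format (depth mapped to the number of distinct k-mers at that depth) and the low-depth cutoff used to exclude likely sequencing-error k-mers are assumptions for illustration; the text does not state how these were handled.

```python
def estimate_genome_size(kmer_histogram, min_depth=5):
    """Estimate genome size as (total k-mer number) / (peak depth).

    kmer_histogram -- dict mapping depth -> number of distinct k-mers at that depth
    min_depth      -- depths below this are treated as sequencing-error k-mers
                      (an assumed cutoff, not taken from the paper)
    Returns (estimated_size, peak_depth).
    """
    usable = {d: c for d, c in kmer_histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Peak depth: the depth at which the most distinct k-mers are observed.
    peak_depth = max(usable, key=usable.get)
    # K-mer number: total k-mer occurrences (depth x count, summed over depths).
    kmer_number = sum(d * c for d, c in usable.items())
    return kmer_number / peak_depth, peak_depth
```

For example, a toy histogram with 50, 100 and 50 distinct k-mers at depths 48, 50 and 52 gives 10,000 k-mer occurrences and a peak depth of 50, hence an estimate of 200.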
444 De novo genome assembly of Illumina data
445 Clean Illumina reads were assembled de novo into longer contigs by using ALLPATHS-LG [80]
446 with default parameters. Adjacent contigs were linked into scaffolds by leveraging mate-pair
447 information with SSPACE v2.3 [81], while gaps were filled by using GapCloser v1.12 [81]
448 implemented in the SOAPdenovo2 package [82].
449 De novo genome assembly of PacBio data
450 Canu+LoRDEC+WTDBG
451 We used the error correction module of Canu v1.5 [83] to select longer subreads with the
452 settings ‘genomeSize = 3,500,000,000’ and ‘corOutCoverage = 80’, detect overlaps among raw subreads
453 through the highly sensitive overlapper MHAP v2.12 (‘corMhapSensitivity =
454 low/normal/high’), and complete error correction through the falcon_sense method
455 (‘correctedErrorRate = 0.025’). Subsequently, output subreads of Canu were further corrected
456 by LoRDEC v0.6 [84] with the parameters ‘-k 19 -s 3’. Based on these two rounds of
457 error-corrected subreads, we generated a draft assembly by using WTDBG 1.1.006
458 (https://github.com/ruanjue/wtdbg) with the command ‘wtdbg -i pbreads.fasta -t 64 -H -k 21
459 -S 1.02 -e 3 -o wtdbg’.
460 Hybrid genome assembly
461 Contigs produced by ALLPATHS-LG were optimized with the aid of contigs from the PacBio
462 assembly by using quickmerge with the parameters ‘-hco 5.0 -c 1.5 -l 100000 -ml 5000’.
463 Optimized contigs were linked into scaffolds by leveraging Illumina mate-pair information
464 with SSPACE, and gaps were filled by using PBjelly v2.
465 Evaluation of oyster assembly
466 To appraise genome quality, we first mapped Illumina reads to the oyster assembly by
467 using the Burrows-Wheeler Aligner (BWA). Next, completeness of the genome was verified
468 by mapping 248 highly conserved eukaryotic genes and 908 benchmarking universal
469 single-copy orthologues in Metazoa to the genome by using CEGMA v2.5 [85] and BUSCO
470 v3.0.2b [86], respectively.
471 Hi-C sequencing and assembly
472 Sequencing
473 According to the Hi-C procedure [87], nuclear DNA from muscles of oyster individuals was
474 cross-linked and then digested with a restriction enzyme, leaving pairs of distally located but
475 physically associated DNA molecules attached to one another. The sticky ends of these
476 digested fragments were biotinylated and then ligated to each other to form chimeric
477 circles. Biotinylated circles, as chimeras of physically associated DNA molecules from the
478 original cross-linking, were enriched, sheared and sequenced [88]. After adaptor removal and
479 filtering out low-quality reads, Hi-C reads were aligned to our assembled genome to evaluate
480 the ratios of mapped reads, distribution of insert fragments, sequencing coverage and number
481 of valid interaction pairs. Uniquely mapped read pairs spanning two digested fragments that are
482 distally located but physically associated were defined as valid interaction
483 pairs.
484 Assembly
485 Scaffolds of PacBio+Illumina assembly were reduced to fragments with a length of 300 kb,
486 which were then re-assembled by using the LACHESIS software [88] based on Hi-C data.
487 Regions that failed to be restored to the original assembly or contained an average Hi-C data
488 coverage of less than 0.5% were considered assembly errors, and were broken into smaller
489 scaffolds. Consistency of the Hi-C-based pseudo-chromosome assembly was assessed by
490 comparison with a genetic map of Crassostrea gigas [89] by using
491 ALLMAPS [90].
492 Genome annotation
493 Repetitive sequence prediction
494 Repeat composition of the assemblies was estimated by building a repeat library employing
495 the de novo prediction programs LTR-FINDER [91], MITE-Hunter [92], RepeatScout [93]
496 and PILER-DF [94]. The resulting library was classified by using PASTEClassifier [95] and then
497 combined with the Repbase database [96] to create a final repeat library. Repeat sequences in
498 the oyster genome were identified and classified by using the RepeatMasker program [97]. The
499 criterion for LTR family classification was that 5’-LTR sequences of the same family
500 share at least 80% identity over at least 80% of their lengths.
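The 80%/80% criterion for LTR families can be sketched in Python as follows. The input format (pre-computed pairwise alignment summaries), the use of the longer sequence as the coverage denominator, and the single-linkage grouping are all assumptions made for illustration; the text specifies only the two thresholds.

```python
def same_ltr_family(identity, aligned_len, len_a, len_b,
                    min_identity=0.80, min_cover=0.80):
    """Apply the 80%/80% rule to one pairwise 5'-LTR alignment.

    identity    -- fraction of identical positions within the aligned region
    aligned_len -- length of the aligned region (bp)
    len_a/len_b -- full lengths of the two LTR sequences (bp)
    """
    # Coverage is measured against the longer sequence, so that both
    # sequences are covered to at least min_cover (an assumed reading).
    cover = aligned_len / max(len_a, len_b)
    return identity >= min_identity and cover >= min_cover


def group_families(n_elements, pairwise_hits):
    """Single-linkage grouping of elements 0..n-1 with a union-find structure.

    pairwise_hits -- iterable of (i, j, identity, aligned_len, len_i, len_j)
    """
    parent = list(range(n_elements))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, identity, aligned_len, len_i, len_j in pairwise_hits:
        if same_ltr_family(identity, aligned_len, len_i, len_j):
            parent[find(i)] = find(j)

    families = {}
    for x in range(n_elements):
        families.setdefault(find(x), []).append(x)
    return sorted(families.values())
```

With single linkage, two elements can end up in the same family through an intermediate element even if they do not meet the thresholds directly; a stricter grouping would need a different clustering rule.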
501 Protein-coding gene prediction
502 Protein-coding genes were predicted based on de novo and protein homology approaches. The
503 algorithms Genscan [98], Augustus [99], GlimmerHMM [100], GeneID [101] and SNAP [102]
504 were used for de novo gene prediction. Alignment of homologous peptides from C. gigas, C.
505 virginica, Lottia gigantea, and Danio rerio to our assemblies was performed to identify
506 homologous genes with the aid of GeMoMa [103]. Consensus gene models were generated by
507 integrating the de novo predictions and protein alignments using EVidenceModeler (EVM)
508 [104].
509 Functional annotation of protein-coding genes
510 Annotation of the predicted genes was performed by BLAST searches of their sequences against a
511 number of nucleotide and protein sequence databases, including COG [105], KEGG [106],
512 NCBI-NR and Swiss-Prot [107], with an E-value cutoff of 1e-5. Gene ontology (GO) terms for each
513 gene were assigned by using Blast2GO [108] based on NCBI databases.
514 Evolution of oysters
515 Protein sequences of Haliotis discus hannai [109], Lottia gigantea (GCF_000327385.1),
516 Aplysia californica (GCF_000002075.1), Biomphalaria glabrata (GCF_000457365.1),
517 Crassostrea gigas (GCF_000297895.1), Crassostrea virginica (GCF_002022765.2), Pinctada
518 fucata (https://marinegenomics.oist.jp), Chlamys farreri (CfBase), Bathymodiolus platifrons
519 (GCA_002080005.1), Modiolus philippinarum (GCA_002080025.1), Octopus bimaculoides
520 (GCF_001194135.1), and Homo sapiens (GCF_000001405.26) were retrieved for analysis.
521 Proteomes of the aforementioned twelve species and that of C. hongkongensis, comprising a
522 total of 295,905 protein sequences, were clustered into 38,939 orthologue groups by using
523 OrthoMCL v3.1 [110] based on an all-to-all BLASTP strategy with an E-value of 1e-5 and by
524 using the Markov Cluster (MCL) algorithm with the default inflation parameter (1.5).
525 Based on clustering results, C. hongkongensis-specific gene families were determined and
526 annotated. To infer phylogenetic relationships, we extracted 387 single-copy gene families
527 from all thirteen species to perform multiple alignments of proteins for each family with
528 MUSCLE v3.8.31 [111]. All of the alignments were combined into one supergene to construct
529 a phylogenetic tree by using RAxML v8.2.12 [112] with 1000 rapid bootstrap analyses,
530 followed by a search of the best-scoring ML tree in a single run. Finally, divergence times
531 were estimated by using MCMCTree from the PAML package [113] in conjunction with a
532 molecular clock model. Several reference-calibrated time points obtained from TimeTree
533 database (http://timetree.org/) were used to date divergence times of interest. Expansion and
534 contraction of OrthoMCL-derived homologue clusters were determined by CAFE v2.1 [114]
535 calculations on the basis of changes in gene family size with respect to phylogeny and species
536 divergence time. In addition, we obtained domain-based expanded gene families of three
537 Crassostrea species, following the approach of Albertin et al. (2015) [115].
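The step in which the single-copy family alignments are combined into one supergene can be sketched as a simple concatenation. This is a minimal Python illustration, assuming each family alignment is already available as a mapping from species name to aligned sequence; the data layout and function name are hypothetical and do not reflect the authors' scripts.

```python
def concatenate_alignments(alignments, species):
    """Concatenate per-family protein alignments into one supermatrix.

    alignments -- list of dicts, each mapping species name -> aligned sequence
                  (all sequences within one dict must have equal length)
    species    -- ordered list of species expected in every alignment
    """
    supermatrix = {sp: [] for sp in species}
    for idx, aln in enumerate(alignments):
        # Every row of one alignment must have the same number of columns.
        lengths = {len(aln[sp]) for sp in species}
        if len(lengths) != 1:
            raise ValueError(f"alignment {idx} has unequal row lengths")
        for sp in species:
            supermatrix[sp].append(aln[sp])
    return {sp: "".join(parts) for sp, parts in supermatrix.items()}
```

Because only single-copy families present in all thirteen species are used, every species contributes exactly one row per family, and a missing species raises a KeyError rather than being silently padded.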
538 Syntenic analysis
539 All-to-all BLASTP analyses of protein sequences were performed between C. hongkongensis,
540 C. gigas, C. virginica, and P. fucata with an E-value threshold set at 1e-5. Syntenic regions
541 within and between species were identified by using MCScan based on BLASTP results. A
542 syntenic region was considered valid if it contained a minimum of 10 collinear genes and a
543 maximum of 25 gaps (genes) between two adjacent collinear genes.
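The two validity thresholds for a syntenic region can be expressed as a small check. The Python sketch below is illustrative only and is not MCScan itself: it assumes a candidate block is given as a list of collinear gene pairs identified by their gene-order indices on the two chromosomes, and it applies the gap limit on both genomes, which is an assumed interpretation.

```python
def is_valid_syntenic_block(anchors, min_genes=10, max_gap=25):
    """Check one candidate block against the two thresholds above.

    anchors -- list of (index_a, index_b) gene-order positions of collinear
               gene pairs on the two chromosomes, sorted by index_a
    """
    # Threshold 1: at least min_genes collinear gene pairs.
    if len(anchors) < min_genes:
        return False
    # Threshold 2: no more than max_gap intervening genes between
    # adjacent collinear genes, checked on both genomes.
    for (a1, b1), (a2, b2) in zip(anchors, anchors[1:]):
        gap_a = abs(a2 - a1) - 1
        gap_b = abs(b2 - b1) - 1
        if gap_a > max_gap or gap_b > max_gap:
            return False
    return True
```

Using absolute differences lets the same check handle inverted blocks, where gene order runs in opposite directions on the two chromosomes.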
544 Homeobox gene analysis
545 Structures of homeobox genes in oyster were determined by using the GeMoMa v1.4.2
546 software [116] with default parameters based on available homeobox gene models.
547 Predictions were handled by applying a GeMoMa annotation filter (GAF) with default
548 parameters except for evidence percentage filter (e = 0.1). These were then manually verified
549 to achieve a single high-confidence transcript prediction per locus. Exact annotations of each
550 homeobox gene were completed with the aid of phylogenetic relationships.
551 Transcriptomic analysis
552 Embryos at different developmental stages during oyster embryogenesis including zygote, 2-4
553 cells, blastula, morula, gastrula, trochophore, D-larva, veliger, pediveliger and spat were
554 collected for RNA isolation. Similarly, RNA extraction was done with various tissues
555 including hemocytes, muscles, gill, labial palp, hepatopancreas, gonads and mantles. To
556 compare asymmetry-related mantle gene expression in C. hongkongensis and P. fucata,
557 their L/R mantles were collected. For both left and right mantles, unilateral tissues from five
558 individuals were pooled as one sample, and each of the L/R mantle groups contained at least
559 three replicates. Total RNA was isolated by using the Trizol reagent (Cat. no. 15596026,
560 Invitrogen, CA), followed by treatment with RNase-free DNase I (Cat. no. M6101, Promega,
561 WI), according to the manufacturers’ instructions. RNA quality was then checked by using an
562 Agilent 2100 Bioanalyzer. Illumina RNA-Seq libraries were prepared and sequenced in a
563 HiSeq 2500 system by a PE150 strategy following the manufacturer’s instructions (Illumina,
564 CA). After raw reads were trimmed based on quality scores with the quality trimming program
565 Btrim, clean reads were aligned to the oyster genome assembly by using TopHat v2.1.1 [117]
566 and then assembled by using Cufflinks v2.1.1 [118]. Differential expression of genes in the
567 various tissues was evaluated by using Cuffdiff [118].
568 WGCNA and co-expression network analysis
569 Weighted correlation network analysis (WGCNA) [119] was applied to construct a weighted
570 gene co-expression network of genes having a high correlation with cemented attachment.
571 The top 10,000 differentially expressed genes responding to L-DOPA
572 treatment were selected for WGCNA, and modules showing high correlation with
573 cemented attachment were identified. We estimated the weight for each pair of genes forming intersections
574 within these modules and analyzed differentially expressed genes relevant to cemented
575 attachment by using DESeq2. Cytoscape [120] was used to delineate the co-expression
576 network of significant gene pairs with weight >0.3.
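The idea of keeping only gene pairs whose network weight exceeds 0.3 can be sketched as below. This Python example is a deliberate simplification and is not the WGCNA package: it uses the soft-thresholded absolute Pearson correlation as the edge weight, whereas WGCNA commonly exports weights derived from topological overlap. The soft-threshold power (beta = 6) and the input layout are assumptions, since the text does not report them.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def weighted_edges(expression, beta=6, min_weight=0.3):
    """Return (gene1, gene2, weight) for pairs with weight > min_weight.

    expression -- dict mapping gene name -> expression values across samples
    beta       -- soft-threshold power (assumed value, not from the paper)
    """
    genes = sorted(expression)
    edges = []
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            # Unsigned network: weight = |correlation| ** beta.
            weight = abs(pearson(expression[g1], expression[g2])) ** beta
            if weight > min_weight:
                edges.append((g1, g2, weight))
    return edges
```

The resulting edge list has the three columns (source, target, weight) that a network viewer such as Cytoscape can import as a simple table.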
577 Byssal regeneration
578 Functional relationships between antennapedia (Antp) mRNA expression levels and
579 phenotypic traits of byssal threads in adult pearl oysters (Pinctada fucata) were explored.
580 Briefly, 50-100 pearl oysters (2 years old) were collected and maintained in aerated
581 laboratory tanks. The byssal mass, comprising the byssal stem and existing old threads of pearl
582 oysters, was excised. Then, individual pearl oysters were placed in beakers (one oyster per
583 beaker) to allow identification of subsequent regrowth of nascent thread mass. Particular care
584 was taken in removing old threads and attachment discs from the shells. Preliminary
585 experiments indicated that removal of the threads did not affect subsequent thread formation.
586 Byssal thread formation was estimated as the number of threads/oyster observed 24 h later.
587 Subsequently, the corresponding byssal gland of each pearl oyster was collected for
588 RNA extraction by using TRIzol reagent, according to the manufacturer’s instructions.
589 Purified RNA samples were diluted to 1 µg/µL and pooled to perform cDNA synthesis by
590 utilizing the PrimeScript first strand cDNA synthesis kit (Cat. no. 6110A, Takara Bio, Japan),
591 following the manufacturer’s protocol. Real-time qPCR analysis was performed to determine
592 Antp mRNA expression with gene-specific primers (Table S16).
593 Pharmacological treatment
594 Chemical compounds were obtained from Sigma-Aldrich, unless otherwise specified.
595 Working solutions were freshly prepared in deionized (DI) water approximately 1 h before in
596 vivo experiments, which were conducted in large beakers to allow observation of oyster
597 attachment and metamorphosis. Groups of oyster larvae at the pediveliger stage were placed
598 in three beakers containing 50 mL sea water (at a density of 20 larvae/mL). There were three
599 groups in total: an unstimulated control, an L-3,4-dihydroxyphenylalanine (L-DOPA)
600 treatment and a norepinephrine (NE) treatment. Oyster larvae were challenged (6 h and 24 h)
601 with different concentrations of NE (10⁻⁴, 10⁻⁵, 10⁻⁶ M) or L-DOPA (10⁻⁵, 10⁻⁶, 10⁻⁷ M).
602 Previous studies have shown that this concentration range is sufficiently potent for inducing a
603 larval response [121,122].
604 In addition, oyster larvae were collected following various treatment durations (6 h and
605 24 h) for RNA-seq and transcriptomic analysis to determine any temporally driven differences
606 between the L-DOPA treatment group (10⁻⁵ M) and unstimulated control. By a similar design,
607 oyster larvae were exposed to NE (10⁻⁵ M) for 6 h and 24 h, and their transcriptomic profiles
608 were examined in relation to oyster metamorphosis.
609
610 Data availability
611 The C. hongkongensis genome studied in this Hong Kong oyster genome project has been
612 deposited at the NCBI under the BioProject number PRJNA592306 at
613 https://www.ncbi.nlm.nih.gov/bioproject/PRJNA592306. Hi-C data have been deposited as
614 SRR10583824 at https://www.ncbi.nlm.nih.gov/sra/SRR10583824. RNA-seq data of various
615 transcriptomes have been deposited as PRJNA588628 at
616 https://www.ncbi.nlm.nih.gov/bioproject/PRJNA588628.
617
618 CRediT author statement
619 Yang Zhang: Conceptualization, Methodology, Validation, Investigation, Data Curation,
620 Writing - Original Draft, Writing - Review & Editing, Visualization, Supervision, Project
621 administration, Funding acquisition. Fan Mao: Methodology, Validation, Investigation, Data
622 Curation, Writing - Original Draft, Writing - Review & Editing, Visualization, Funding
623 acquisition. Shu Xiao: Methodology, Validation, Resources, Funding acquisition. Haiyan Yu:
624 Methodology, Formal analysis, Investigation, Data Curation. Zhiming Xiang: Methodology,
625 Validation, Data Curation, Funding acquisition. Fei Xu: Formal analysis, Validation, Data
626 Curation. Jun Li: Validation, Resources. Lili Wang: Formal analysis. Yuanyan Xiong:
627 Formal analysis. Mengqiu Chen: Formal analysis. Yongbo Bao: Formal analysis. Yuewen
628 Deng: Validation. Quan Huo: Validation. Lvping Zhang: Validation. Wenguang Liu:
629 Validation. Xuming Li: Formal analysis. Haitao Ma: Formal analysis. Yuehuan Zhang:
630 Resources. Xiyu Mu: Formal analysis. Min Liu: Formal analysis. Hongkun Zheng:
631 Conceptualization, Formal analysis, Data Curation, Project administration. Nai-Kei Wong:
632 Writing - Review & Editing, Visualization. Ziniu Yu: Conceptualization, Writing - Review &
633 Editing, Visualization, Supervision, Project administration, Funding acquisition.
634
635 Competing interest
636 We declare that none of the authors have competing financial or non-financial
637 interests.
638
639 Acknowledgments
640 We are deeply grateful to our lab members and collaborators, who have provided us with able
641 assistance or valuable advice at all stages of this study. We acknowledge grant support from
642 Key Special Project for Introduced Talents Team of Southern Marine Science and
643 Engineering Guangdong Laboratory (Guangzhou) (GML2019ZD0407), Key Deployment
644 Project of Centre for Ocean Mega-Research of Science, Chinese Academy of Science
645 (COMS2019Q11), the National Science Foundation of China (No. 32073002, 31902404), the
646 China Agricultural Research System (No. CARS-49), the Science and Technology Program of
647 Guangzhou, China (No.201804020073), Natural Science Foundation of Guangdong Province
648 (2020A1515011533), the Program of the Pearl River Young Talents of Science and Technology
649 in Guangzhou of China (201806010003), Institution of South China Sea Ecology and
650 Environmental Engineering, Chinese Academy of Sciences (ISEE2018PY01, ISEE2018PY03,
651 ISEE2018ZD01), and Science and Technology Planning Project of Guangdong Province,
652 China (2017B030314052, 201707010177).
653
654 ORCID
655 0000-0002-0789-4938 (Yang Zhang)
656 0000-0001-6899-5591 (Fan Mao)
657 0000-0002-7276-3213 (Shu Xiao)
658 0000-0001-9709-0417 (Haiyan Yu)
659 0000-0003-1428-2910 (Zhiming Xiang)
660 0000-0002-9426-3615 (Hongkun Zheng)
661 0000-0003-1303-3170 (Nai-Kei Wong)
662 0000-0002-1049-4345 (Ziniu Yu)
663
664 References
665 [1] Appeltans W, Ahyong ST, Anderson G, Angel MV, Artois T, Bailly N, et al. The magnitude of
666 global marine species diversity. Curr Biol 2012;22:2189-202.
667 [2] Tibabuzo Perdomo AM, Alberts EM, Taylor SD, Sherman DM, Huang CP, Wilker JJ. Changes in
668 cementation of reef building oysters transitioning from larvae to adults. ACS Appl Mater Interfaces
669 2018;10:14248-53.
670 [3] Cranfield HJ. Observations on the behaviour of the pediveliger of Ostrea edulis during attachment
671 and cementing. Mar Biol 1973;22:203-9.
672 [4] Dame RF, Zingmark RG, Haskin E. Oyster reefs as processors of estuarine materials. J Exp Mar
673 Biol and Ecol 1984;83:239-47.
674 [5] Grabowski JH, Peterson CH. Restoring oyster reefs to recover ecosystem services. Theor Ecol
675 Series 2007;4:281-98.
676 [6] Parker LM, Ross PM, O'Connor WA, Portner HO, Scanes E, Wright JM. Predicting the response of
677 molluscs to the impact of ocean acidification. Biology (Basel) 2013;2:651-92.
678 [7] Kroeker KJ, Kordas RL, Crim R, Hendriks IE, Ramajo L, Singh GS, et al. Impacts of ocean
679 acidification on marine organisms: quantifying sensitivities and interaction with warming. Glob Chang
680 Biol 2013;19:1884-96.
681 [8] Gazeau F, Parker LM, Comeau S, Gattuso JP, O'Connor WA, Martin S, et al. Impacts of ocean
682 acidification on marine shelled molluscs. Mar Biol 2013;160:2207-45.
683 [9] Pujol JP. Formation of the Byssus in the Common Mussel (Mytilus edulis L.). Nature
684 1967;214:204-5.
685 [10] Priemel T, Degtyar E, Dean MN, Harrington MJ. Rapid self-assembly of complex biomolecular
686 architectures during mussel byssus biofabrication. Nat Commun 2017;8:14539.
687 [11] Harrington MJ, Masic A, Holten-Andersen N, Waite JH, Fratzl P. Iron-clad fibers: a metal-based
688 biological strategy for hard flexible coatings. Science 2010;328:216-20.
689 [12] Li Y, Sun X, Hu X, Xun X, Zhang J, Guo X, et al. Scallop genome reveals molecular adaptations
690 to semi-sessile life and neurotoxins. Nat Commun 2017;8:1721.
691 [13] Li S, Liu C, Zhan A, Xie L, Zhang R. Influencing mechanism of ocean acidification on byssus
692 performance in the pearl oyster Pinctada fucata. Environ Sci Technol 2017;51:7696-706.
693 [14] Burkett JR, Hight LM, Kenny P, Wilker JJ. Oysters produce an organic-inorganic adhesive for
694 intertidal reef construction. J Am Chem Soc 2010;132:12531-3.
695 [15] Stanley SM. Relation of Shell Form to Life Habits of the Bivalvia (Mollusca). Geological Society
696 of America Memoir; 1970, 125:296 p.
697 [16] Guo XM, Ford SE, Zhang FS. Molluscan aquaculture in China. J Shellfish Res 1999;18:19-31.
698 [17] Zhang G, Fang X, Guo X, Li L, Luo R, Xu F, et al. The oyster genome reveals stress adaptation
699 and complexity of shell formation. Nature 2012;490:49-54.
700 [18] Wang S, Zhang JB, Jiao WQ, Li J, Xun XG, Sun Y, et al. Scallop genome provides insights into
701 evolution of bilaterian karyotype and development. Nat Ecol & Evol 2017;1(5):120.
702 [19] Du XD, Fan GY, Jiao Y, Zhang H, Guo XM, Huang RL, et al. The pearl oyster Pinctada fucata
703 martensii genome and multi-omic analyses provide insights into biomineralization. Gigascience 2017;
704 6(8):1-12.
705 [20] Yan X, Nie H, Huo Z, Ding J, Li Z, Yan L, et al. Clam genome sequence clarifies the molecular
706 basis of its benthic adaptation and extraordinary shell color diversity. iScience 2019;19:1225-37.
707 [21] Simakov O, Marletaz F, Cho SJ, Edsinger-Gonzales E, Havlak P, Hellsten U, et al. Insights into
708 bilaterian evolution from three spiralian genomes. Nature 2013;493:526-31.
709 [22] Sea Urchin Genome Sequencing C, Sodergren E, Weinstock GM, Davidson EH, Cameron RA,
710 Gibbs RA, et al. The genome of the sea urchin Strongylocentrotus purpuratus. Science
711 2006;314:941-52.
712 [23] Ren J, Liu X, Jiang F, Guo X, Liu B. Unusual conservation of mitochondrial gene order in
713 Crassostrea oysters: evidence for recent speciation in Asia. BMC Evol Biol 2010;10:394.
714 [24] Barnosky AD, Matzke N, Tomiya S, Wogan GO, Swartz B, Quental TB, et al. Has the earth's sixth
715 mass extinction already arrived? Nature 2011;471:51-7.
716 [25] Pimm SL, Jenkins CN, Abell R, Brooks TM, Gittleman JL, Joppa LN, et al. The biodiversity of
717 species and their rates of extinction, distribution, and protection. Science 2014;344:1246752.
718 [26] Schulte P, Alegret L, Arenillas I, Arz JA, Barton PJ, Bown PR, et al. The chicxulub asteroid impact
719 and mass extinction at the Cretaceous-Paleogene boundary. Science 2010;327:1214-8.
720 [27] Baumiller TK, Salamon MA, Gorzelak P, Mooi R, Messing CG, Gahn FJ. Post-Paleozoic crinoid
721 radiation in response to benthic predation preceded the Mesozoic marine revolution. P Natl Acad Sci
722 USA 2010;107:5893-6.
723 [28] Sigurdsson JB, Titman CW, Davies PA. The dispersal of young post-larval bivalve molluscs by
724 byssus threads. Nature 1976;262:386-7.
725 [29] Hopkins AE. Attachment of larvae of the Olympia oyster, Ostrea lurida, to plane surfaces.
726 Ecology 1935;16:82-7.
727 [30] Nelson TC. The attachment of oyster Larvae. Biol Bull 1924;46:143-51.
728 [31] Garcia-Fernandez J. The genesis and evolution of homeobox gene clusters. Nat Rev Genet
729 2005;6:881-92.
730 [32] Lemons D, McGinnis W. Genomic evolution of Hox gene clusters. Science 2006;313:1918-22.
731 [33] Biscotti MA, Canapa A, Forconi M, Barucca M. Hox and ParaHox genes: a review on molluscs.
732 Genesis 2014;52:935-45.
733 [34] Frobius AC, Funch P. Rotiferan Hox genes give new insights into the evolution of metazoan
734 bodyplans. Nat Commun 2017;8:9.
735 [35] Shiga Y, Yasumoto R, Yamagata H, Hayashi S. Evolving role of Antennapedia protein in arthropod
736 limb patterning. Development 2002;129:3555-61.
737 [36] Khadjeh S, Turetzek N, Pechmann M, Schwager EE, Wimmer EA, Damen WGM, et al. Divergent
738 role of the Hox gene Antennapedia in spiders is responsible for the convergent evolution of abdominal
739 limb repression. P Natl Acad Sci USA 2012;109:4921-6.
740 [37] Kimoto M, Tsubota T, Uchino K, Sezutsu H, Takiya S. Hox transcription factor Antp regulates
741 sericin-1 gene expression in the terminal differentiated silk gland of Bombyx mori. Dev Biol
742 2014;386:64-71.
743 [38] Li JY, Ye LP, Che JQ, Song J, You ZY, Yun KC, et al. Comparative proteomic analysis of the
744 silkworm middle silk gland reveals the importance of ribosome biogenesis in silk protein production. J
745 Proteomics 2015;126:109-20.
bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
746 [39] Metzler RA, Rist R, Alberts E, Kenny P, Wilker JJ. Composition and Structure of oyster adhesive 747 reveals heterogeneous materials properties in a biological composite. Adv Funct Mater 748 2016;26:6814-21. 749 [40] Guerette PA, Hoon S, Seow Y, Raida M, Masic A, Wong FT, et al. Accelerating the design of 750 biomimetic materials by integrating RNA-seq with proteomics and materials science. Nat Biotechnol 751 2013;31:908-15. 752 [41] Kohler M, Hirschberg B, Bond CT, Kinzie JM, Marrion NV, Maylie J, et al. Small-conductance, 753 calcium-activated potassium channels from mammalian brain. Science 1996;273:1709-14. 754 [42] Hirschberg B, Maylie J, Adelman JP, Marrion NV. Gating properties of single SK channels in 755 hippocampal CA1 pyramidal neurons. Biophys J 1999;77:1905-13. 756 [43] Faber ES, Sah P. Functions of SK channels in central neurons. Clin Exp Pharmacol Physiol 757 2007;34:1077-83. 758 [44] Ji H, Shepard PD. SK Ca2+-activated K+ channel ligands alter the firing pattern of 759 dopamine-containing neurons in vivo. Neuroscience 2006;140:623-33. 760 [45] Hennebert E, Maldonado B, Ladurner P, Flammang P, Santos R. Experimental strategies for the 761 identification and characterization of adhesive proteins in animals: a review. Interface Focus 762 2015;5:20140064. 763 [46] Li D, Graham LD. Epidermal secretions of terrestrial flatworms and slugs: Lehmannia valentiana 764 mucus contains matrilin-like proteins. Comp Biochem Physiol B Biochem Mol Biol 2007;148:231-44. 765 [47] Hennebert E, Wattiez R, Demeuldre M, Ladurner P, Hwang DS, Waite JH, et al. Sea star tenacity 766 mediated by a protein that fragments, then aggregates. P Natl Acad Sci USA 2014;111:6317-22. 767 [48] Papov VV, Diamond TV, Biemann K, Waite JH. Hydroxyarginine-containing polyphenolic 768 proteins in the adhesive plaques of the marine mussel Mytilus edulis. J Biol Chem 1995;270:20183-92. 769 [49] Lee BP, Messersmith PB, Israelachvili JN, Waite JH. Mussel-Inspired Adhesives and Coatings. 
770 Annu Rev Mater Res 2011;41:99-132. 771 [50] Rucker RB, Kosonen T, Clegg MS, Mitchell AE, Rucker BR, Uriu-Hare JY, et al. Copper, lysyl 772 oxidase, and extracellular matrix protein cross-linking. Am J Clin Nutr 1998;67:996S-1002S. 773 [51] Walker G. A study of the cement apparatus of the cypris larva of the barnacle Balanus balanoides. 774 Mar Biol 1971;9:205-12. 775 [52] Senkbeil T, Mohamed T, Simon R, Batchelor D, Di Fino A, Aldred N, et al. In vivo and in situ 776 synchrotron radiation-based mu-XRF reveals elemental distributions during the early attachment phase 777 of barnacle larvae and juvenile barnacles. Anal Bioanal Chem 2016;408:1487-96. 778 [53] Patrick F. Biological and Biomimetic Adhesives: Challenges and Opportunities. In: Smith 779 AM editor. Multiple metal-based cross-links: protein oxidation and metal coordination in a 780 biological glue. Cambridge : Royal Society of Chemistry; 2013; p. 3-15. 781 [54] Coon SL, Fitt WK, Bonar DB. Competence and delay of metamorphosis in the Pacific oyster 782 Crassostrea gigas. Mar Biol 1990;106:379-87. 783 [55] Bonar DB, Coon SL, Walch M, Weiner RM, Fitt W. Control of oyster settlement and 784 metamorphosis by endogenous and exogenous chemical cues. B Mar Sci 1990;46:484-98. 785 [56] Morse DE. Neurotransmitter-mimetic inducers of larval settlement and metamorphosis. B Mar Sci 786 1985;37:697-706. 787 [57] Nell JA, Holliday JE. Effects of potassium and copper on the settling rate of sydney rock oyster 788 (Saccostrea commercialis) Larvae. Aquaculture 1986;58:263-7. 789 [58] Wang J, Wu CL, Xu CL, Yu WC, Li Z, Li YC, et al. Voltage-gated potassium ion channel may play
bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
790 a major role in the settlement of Pacific oyster (Crassostrea gigas) larvae. Aquaculture 791 2015;442:48-50. 792 [59] Alberts EM, Taylor SD, Edwards SL, Sherman DM, Huang CP, Kenny P, et al. Structural and 793 compositional characterization of the adhesive produced by reef building oysters. ACS Appl Mater 794 Interfaces 2015;7:8533-8. 795 [60] Foulon V, Boudry P, Artigaud S, Guerard F, Hellio C. In Silico snalysis of Pacific oyster 796 (Crassostrea gigas) transcriptome over developmental stages reveals candidate genes for larval 797 settlement. Int J Mol Sci 2019;20(1):197. 798 [61] Zhang T, Liu J, Fellner M, Zhang C, Sui D, Hu J. Crystal structures of a ZIP zinc transporter 799 reveal a binuclear metal center in the transport pathway. Sci Adv 2017; 3(8):e1700344. 800 [62] Du Y, Lian F, Zhu L. Biosorption of divalent Pb, Cd and Zn on aragonite and calcite mollusk 801 shells. Environ Pollut 2011;159:1763-8. 802 [63] Bond JS, Beynon RJ. The astacin family of metalloendopeptidases. Protein Sci 1995;4:1247-61. 803 [64] Sadeghi H, Allard P, Prince F, Labelle H. Symmetry and limb dominance in able-bodied gait: a 804 review. Gait Posture 2000;12:34-45. 805 [65] Weiss IM, Schonitzer V. The distribution of chitin in larval shells of the bivalve mollusk Mytilus 806 galloprovincialis. J Struct Biol 2006;153:264-77. 807 [66] Marin F, Le Roy N, Marie B. The formation and mineralization of mollusk shell. Front Biosci 808 (Schol Ed) 2012;4:1099-125. 809 [67] Marin F, Luquet G, Marie B, Medakovic D. Molluscan shell proteins: primary structure, origin, 810 and evolution. Curr Top Dev Biol 2008;80:209-76. 811 [68] Splitt MP, Burn J, Goodship J. Defects in the determination of left-right asymmetry. J Med Genet 812 1996;33:498-503. 813 [69] Savazzi E. Adaptational strategies of bivalves living as infaunal secondary soft bottom dwellers. 814 Neus Jahrb Geol P-A 1982;164:229-44. 815 [70] Wilbur KM, Saleuddin ASM. Shell Formation In: Saleuddin ASM, Wilbur KM, editors. 
The 816 mollusca. London New York: Academic Press; 1983, p. 235-87. 817 [71] Joubert C, Piquemal D, Marie B, Manchon L, Pierrat F, Zanella-Cleon I, et al. Transcriptome and 818 proteome analysis of Pinctada margaritifera calcifying mantle and shell: focus on biomineralization. 819 BMC Genomics 2010;11:613. 820 [72] Yoshioka H, Meno C, Koshiba K, Sugihara M, Itoh H, Ishimaru Y, et al. Pitx2, a bicoid-type 821 homeobox gene, is involved in a lefty-signaling pathway in determination of left-right asymmetry. Cell 822 1998;94:299-305. 823 [73] Grande C, Patel NH. Nodal signalling is involved in left-right asymmetry in snails. Nature 824 2009;457:1007-11. 825 [74] Smith SB, Qu HQ, Taleb N, Kishimoto NY, Scheel DW, Lu Y, et al. Rfx6 directs islet formation 826 and insulin production in mice and humans. Nature 2010;463:775-80. 827 [75] Hamano K, Awaji M, Usuki H. cDNA structure of an insulin-related peptide in the Pacific oyster 828 and seasonal changes in the gene expression. J Endocrinol 2005;187:55-67. 829 [76] Nagai K, Yano M, Morimoto K, Miyamoto H. Tyrosinase localization in mollusc shells. Comp 830 Biochem Physiol B Biochem Mol Biol 2007;146:207-14. 831 [77] Aguilera F, McDougall C, Degnan BM. Evolution of the tyrosinase gene family in bivalve 832 molluscs: independent expansion of the mantle gene repertoire. Acta Biomater 2014;10:3855-65. 833 [78] Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform.
bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
834 Bioinformatics 2009;25:1754-60. 835 [79] Xu H, Luo X, Qian J, Pang X, Song J, Qian G, et al. FastUniq: a fast de novo duplicates removal 836 tool for paired short reads. PLoS One 2012;7:e52249. 837 [80] Gnerre S, Maccallum I, Przybylski D, Ribeiro FJ, Burton JN, Walker BJ, et al. High-quality draft 838 assemblies of mammalian genomes from massively parallel sequence data. P Natl Acad Sci USA 839 2011;108:1513-8. 840 [81] Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding pre-assembled contigs using 841 SSPACE. Bioinformatics 2011;27:578-9. 842 [82] Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. Erratum: SOAPdenovo2: an empirically 843 improved memory-efficient short-read de novo assembler. Gigascience 2015;4:30. 844 [83] Koren S, Walenz BP, Berlin K, Miller JR, Bergman NH, Phillippy AM. Canu: scalable and 845 accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res 846 2017;27:722-36. 847 [84] Salmela L, Rivals E. LoRDEC: accurate and efficient long read error correction. Bioinformatics 848 2014;30:3506-14. 849 [85] Parra G, Bradnam K, Korf I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic 850 genomes. Bioinformatics 2007;23:1061-7. 851 [86] Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing 852 genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 853 2015;31:3210-2. 854 [87] Lieberman-Aiden E, van Berkum NL, Williams L, Imakaev M, Ragoczy T, Telling A, et al. 855 Comprehensive mapping of long-range interactions reveals folding principles of the human genome. 856 Science 2009;326:289-93. 857 [88] Burton JN, Adey A, Patwardhan RP, Qiu R, Kitzman JO, Shendure J. Chromosome-scale 858 scaffolding of de novo genome assemblies based on chromatin interactions. Nat Biotechnol 859 2013;31:1119-25. 860 [89] Li C, Wang J, Song K, Meng J, Xu F, Li L, et al. 
Construction of a high-density genetic map and 861 fine QTL mapping for growth and nutritional traits of Crassostrea gigas. BMC Genomics 2018;19. 862 [90] Tang H, Zhang X, Miao C, Zhang J, Ming R, Schnable JC, et al. ALLMAPS: robust scaffold 863 ordering based on multiple maps. Genome Biol 2015;16. 864 [91] Xu Z, Wang H. LTR_FINDER: an efficient tool for the prediction of full-length LTR 865 retrotransposons. Nucleic Acids Res 2007;35:W265-8. 866 [92] Han Y, Wessler SR. MITE-Hunter: a program for discovering miniature inverted-repeat 867 transposable elements from genomic sequences. Nucleic Acids Res 2010;38:e199. 868 [93] Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. 869 Bioinformatics 2005;21 Suppl 1:i351-8. 870 [94] Edgar RC, Myers EW. PILER: identification and classification of genomic repeats. Bioinformatics 871 2005;21 Suppl 1:i152-8. 872 [95] Wicker T, Sabot F, Hua-Van A, Bennetzen JL, Capy P, Chalhoub B, et al. A unified classification 873 system for eukaryotic transposable elements. Nat Rev Genet 2007;8:973-82. 874 [96] Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic 875 genomes. Mob DNA 2015;6:11. 876 [97] Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic 877 sequences. Curr Protoc Bioinformatics 2009;Chapter 4:Unit 4 10.
bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
878 [98] Burge C, Karlin S. Prediction of complete gene structures in human genomic DNA. J Mol Biol 879 1997;268:78-94. 880 [99] Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. 881 Bioinformatics 2003;19 Suppl 2:ii215-25. 882 [100] Majoros WH, Pertea M, Salzberg SL. TigrScan and GlimmerHMM: two open source ab initio 883 eukaryotic gene-finders. Bioinformatics 2004;20:2878-9. 884 [101] Blanco E, Parra G, Guigo R. Using geneid to identify genes. Curr Protoc Bioinformatics 885 2007;Chapter 4:Unit 4 3. 886 [102] Korf I. Gene finding in novel genomes. BMC Bioinformatics 2004;5:59. 887 [103] Keilwagen J, Wenk M, Erickson JL, Schattat MH, Grau J, Hartung F. Using intron position 888 conservation for homology-based gene prediction. Nucleic Acids Res 2016;44:e89. 889 [104] Haas BJ, Salzberg SL, Zhu W, Pertea M, Allen JE, Orvis J, et al. Automated eukaryotic gene 890 structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. 891 Genome Biol 2008;9:R7. 892 [105] Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG 893 database: an updated version includes eukaryotes. BMC Bioinformatics 2003;4:41. 894 [106] Kanehisa M, Goto S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res 895 2000;28:27-30. 896 [107] Boeckmann B, Bairoch A, Apweiler R, Blatter MC, Estreicher A, Gasteiger E, et al. The 897 SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res 898 2003;31:365-70. 899 [108] Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool 900 for annotation, visualization and analysis in functional genomics research. Bioinformatics 901 2005;21:3674-6. 902 [109] Nam BH, Kwak W, Kim YO, Kim DG, Kong HJ, Kim WJ, et al. Genome sequence of pacific 903 abalone (Haliotis discus hannai): the first draft genome in family Haliotidae. Gigascience 2017;6:1-8. 904 [110] Li L, Stoeckert CJ, Jr., Roos DS. 
OrthoMCL: identification of ortholog groups for eukaryotic 905 genomes. Genome Res 2003;13:2178-89. 906 [111] Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. 907 Nucleic Acids Res 2004;32:1792-7. 908 [112] Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large 909 phylogenies. Bioinformatics 2014;30:1312-3. 910 [113] Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol 911 2007;24:1586-91. 912 [114] De Bie T, Cristianini N, Demuth JP, Hahn MW. CAFE: a computational tool for the study of gene 913 family evolution. Bioinformatics 2006;22:1269-71. 914 [115] Albertin CB, Simakov O, Mitros T, Wang ZY, Pungor JR, Edsinger-Gonzales E, et al. The 915 octopus genome and the evolution of cephalopod neural and morphological novelties. Nature 916 2015;524:220-4. 917 [116] Keilwagen J, Hartung F, Grau J. GeMoMa: Homology-Based Gene Prediction Utilizing Intron 918 Position Conservation and RNA-seq Data. Methods Mol Biol 2019;1962:161-77. 919 [117] Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. 920 Bioinformatics 2009;25:1105-11. 921 [118] Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley DR, et al. Differential gene and
bioRxiv preprint doi: https://doi.org/10.1101/2021.03.18.435778; this version posted March 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
922 transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc 923 2012;7:562-78. 924 [119] Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. 925 BMC Bioinformatics 2008;9:559. 926 [120] Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software 927 environment for integrated models of biomolecular interaction networks. Genome Res 928 2003;13:2498-504. 929 [121] Coon SL, Bonar DB, Weiner RM. Chemical Production of Cultchless Oyster Spat Using 930 Epinephrine and Norepinephrine. Aquaculture 1986;58:255-62. 931 [122] Coon SL, Bonar DB, Weiner RM. Induction of settlement and metamorphosis of the Pacific 932 oyster, Crassostrea gigas (Thunberg), by L-Dopa and catecholamines. J Exp Mar Biol and Ecol 933 1985;94:211-21. 934
Figure legends
Figure 1 The genome landscape and phylogenetic analysis of the oyster Crassostrea hongkongensis.
A. Circos plot highlighting genome characteristics across the 10 chromosomes on a megabase (Mb) scale. GC content, global heterozygosity, gene density, and repeat coverage are shown from the outer to the inner circle, calculated in non-overlapping 1 Mb windows. B. Gene family expansion/contraction and divergence times across 12 representative mollusk species. A total of 87 gene families are expanded in the Hong Kong oyster, C. hongkongensis. The human genome was set as the outgroup. The three Ostreoida oyster species (Crassostrea hongkongensis, Crassostrea gigas, and Crassostrea virginica) cluster together. Gene family expansion and contraction are indicated by plus and minus signs, respectively.
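The windowed tracks described for panel A can be illustrated with a short sketch. The Python function below computes GC content in non-overlapping windows. The function name, the toy sequence, and the choice to ignore non-ACGT bases are assumptions made for illustration and are not taken from the authors' pipeline.

```python
def gc_content_windows(seq, window=1_000_000):
    """Return the GC fraction of each non-overlapping window of `seq`.

    Assumption: bases other than A, C, G, T (e.g. N) are ignored, and a
    window without any A/C/G/T base yields NaN.
    """
    if window <= 0:
        raise ValueError("window must be a positive integer")
    fractions = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window].upper()
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        fractions.append(gc / acgt if acgt else float("nan"))
    return fractions


# Toy example with a 4-bp window instead of 1 Mb.
print(gc_content_windows("GGCCAATT", window=4))
```

With a real assembly, the same function would be applied to each chromosome sequence with the default 1 Mb window, and the final, shorter window would simply cover the remaining bases.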

Figure 2 Loss of the homeobox gene antennapedia (Antp) is implicated in an adaptive shift from byssal attachment to cemented attachment.
A. Comparison of homeobox (Hox) cluster organization in bivalves with two distinct attachment styles, byssal attachment and cemented attachment. Unlike the disputed Hox gene cluster in the C. gigas genome, the Hox gene cluster is arranged linearly in both C. hongkongensis and C. virginica. Notably, Antp is lost in all three Ostreoida oysters. B. Overview of key body-plan organization in Pinctada fucata and C. hongkongensis. P. fucata possesses a byssal gland and byssus, whereas adult Ostreoida oysters have lost both. C. Tissue distribution of Antp orthologues in three byssally attached bivalves, P. fucata, Mytilus galloprovincialis, and Mizuhopecten yessoensis. BG, byssal gland; DG, stomach. Antp mRNA abundance is displayed as a percentage; expression in the BG accounted for more than 50%. D. Morphology of newly regenerated byssus 48 h after excision of the original byssus. Scale bar: 2 mm. E. Anatomy of the byssal gland in cross-section. Vertical cross-section of the byssal gland showing ciliated walls (Cl), lamina propria (LP), and byssal remnants (BY) within a chamber. Scale bar: 50 μm. F. Correlation between Antp mRNA abundance and the number of regenerated byssus threads in P. fucata. The Antp mRNA level in the byssal gland was determined by real-time qPCR, and newly regenerated byssus threads were counted 48 h after excision of the original byssus. Pearson's correlation coefficients and p-values were calculated with two-tailed tests at a 95% confidence level.
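The correlation analysis in panel F can be sketched as follows. This minimal Python example computes Pearson's r, R², and the t statistic (n - 2 degrees of freedom) from which a two-tailed p-value would be obtained. The data values are hypothetical, and the log10 transform of the mRNA level is an assumption based on the axis label of the figure. The final p-value lookup is left to a statistics package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r = 0; compare against a t distribution
    with n - 2 degrees of freedom to obtain a two-tailed p-value."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical measurements: relative Antp mRNA level and thread counts.
antp_mrna = [0.0002, 0.0005, 0.001, 0.003, 0.008]
byssus_threads = [2, 3, 3, 6, 7]

log_mrna = [math.log10(v) for v in antp_mrna]
r = pearson_r(log_mrna, byssus_threads)
print(f"r = {r:.3f}, R^2 = {r * r:.3f}, t = {t_statistic(r, len(log_mrna)):.3f}")
```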

Figure 3 Molecular basis of attachment initiation in Ostreoida oysters.
A. Venn diagram showing gene family expansions shared among three Ostreoida oyster species, C. hongkongensis, C. virginica, and C. gigas; 32 core gene family expansions were identified. B. Heatmap illustrating the correlation between expression levels of the 32 core gene families and developmental stages of C. hongkongensis larvae. High correlation of transcriptionally activated gene families with attachment is shown in red. C. Pharmacological responses of oyster veliger larvae during attachment initiation and metamorphosis. L-3,4-dihydroxyphenylalanine (L-DOPA) stimulated both larval attachment and metamorphosis, whereas norepinephrine (NE) induced only metamorphosis without attachment. The Venn diagram shows the genes specifically induced by L-DOPA or NE; genes induced specifically by L-DOPA may participate in attachment initiation. D. Coordinated gene network built around the zinc transporter ZIP12, the hub with the highest degree of gene connections in the weighted correlation network analysis (WGCNA). Red and green dots indicate up-regulated and down-regulated genes, respectively. E. Schematic diagram of the molecular basis for initiation of larval attachment in oysters. Square boxes indicate oyster-specific expanded gene families involved in larval attachment (p < 0.001); their fill color (blue) is scaled to the correlation value at the spat stage. Ellipses indicate genes specifically induced by L-DOPA treatment; their fill color (red) is scaled to log2(FC). FC, fold change.

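The set logic behind the two Venn diagrams (panels A and C) can be sketched in a few lines of Python. The function names are hypothetical, and the set memberships below are toy placeholders chosen for illustration. They do not reproduce the actual family or gene lists.

```python
def core_families(expanded_by_species):
    """Families expanded in every species, i.e. the centre of the Venn diagram."""
    sets = [set(families) for families in expanded_by_species.values()]
    return set.intersection(*sets) if sets else set()


def specifically_induced(treatment_genes, other_genes):
    """Genes induced by one treatment but not by the other."""
    return set(treatment_genes) - set(other_genes)


# Toy expanded-family sets (placeholder memberships).
expanded = {
    "C. hongkongensis": {"Astacin", "SK_channel", "Cu-oxidase", "family_A"},
    "C. gigas": {"Astacin", "SK_channel", "Cu-oxidase", "family_B"},
    "C. virginica": {"Astacin", "SK_channel", "Cu-oxidase", "family_C"},
}
print(sorted(core_families(expanded)))

# Toy induced-gene sets for the two treatments (placeholder gene names).
ldopa_induced = {"gene1", "gene2", "gene3"}
ne_induced = {"gene3", "gene4"}
print(sorted(specifically_induced(ldopa_induced, ne_induced)))
```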
Figure 4 Left-right asymmetry of shell formation in Ostreoida oysters.
A. Comparison of the left/right shell weight ratio and shell morphology between C. hongkongensis and P. fucata. B. Volcano plot showing genes differentially expressed between the left and right mantle, filtered by |log2(FC)| ≥ 1 and p-value < 0.05. C. Expression profile of 1:1 orthologues in the left (L) and right (R) mantle of C. hongkongensis and P. fucata. A total of 10,491 orthologues were paired, and only a few asymmetrically expressed orthologues were specific to the Hong Kong oyster. The x- and y-axes indicate the logFC of the R/L mantle expression ratio in C. hongkongensis and P. fucata, respectively. D. Expression patterns of two pivotal transcription factors in the left and right mantles of five bivalves. Ch, C. hongkongensis; Cg, C. gigas; Cv, C. virginica; My, M. yessoensis; Pf, P. fucata. E. Dendrogram of known tyrosinases from five mollusks, constructed by the maximum likelihood (ML) method. Bivalve and molluscan TyrA orthologous groups are indicated by curved lines and annotated as A1-A3. Specific tyrosinase orthologous groups are marked with a colored background and annotated with the species name. Species are represented by different shapes: triangle, Mizuhopecten yessoensis; circle, Ostreidae; pentagon, Pinctada fucata. F. Expression patterns of tyrosinase families in the left and right mantles of two bivalve species, as determined by FPKM. The total FPKM of each type of orthologous gene is displayed in cumulative histograms. Different tyrosinase members are shown in different colors.
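The filter stated for panel B, |log2(FC)| ≥ 1 with p-value < 0.05, can be written as a small classifier. In this sketch the fold change is assumed to be log2(right/left), so positive values indicate right-mantle preference. That sign convention, the function name, and the example values are assumptions for illustration.

```python
def classify_gene(log2_fc, p_value, fc_cutoff=1.0, p_cutoff=0.05):
    """Label a gene as 'left', 'right', or 'unbiased'.

    Assumption: log2_fc is log2(right / left), so a positive value
    means higher expression in the right mantle.
    """
    if p_value < p_cutoff and abs(log2_fc) >= fc_cutoff:
        return "right" if log2_fc > 0 else "left"
    return "unbiased"


# Hypothetical (gene, log2 fold change, p-value) records.
records = [
    ("geneA", 2.4, 0.001),
    ("geneB", -1.0, 0.020),
    ("geneC", 0.6, 0.001),
    ("geneD", 3.1, 0.200),
]
for name, log2_fc, p_value in records:
    print(name, classify_gene(log2_fc, p_value))
```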

[Figure 1 image. Panel A: Circos plot of chr1 to chr10 with tracks for GC content, heterozygosity, gene density, and repeat coverage. Panel B: phylogeny of Gastropoda, Bivalvia, and Cephalopoda species with Homo sapiens as outgroup, annotated with gene family expansion/contraction counts (+87/-45 for Crassostrea hongkongensis); x-axis: million years ago (MYA).]
[Figure 2 image. Panel A: Hox cluster maps (Hox1, Hox2, Hox3, Hox4, Hox5, Lox5, Antp, Lox4, Lox2, Post2, Post1) for the ancestral cluster, byssally attached bivalves (M. philippinarum, B. platifrons, M. galloprovincialis, C. farreri, M. yessoensis, P. fucata), and cemented bivalves (C. gigas, C. hongkongensis, C. virginica). Panel B: body plans. Panel C: relative Antp mRNA abundance (%) by tissue (BG, mantle, muscle, DG, gill, foot). Panels D and E: micrographs. Panel F: regenerated byssus number versus log10(Antp mRNA), R² = 0.3584, p = 0.0012.]
[Figure 3 image. Panel A: Venn diagram of expanded gene families in C. virginica, C. hongkongensis, and C. gigas (32 shared). Panel B: heatmap of the 32 core Pfam families across developmental stages from zygote to spat, correlation scale -1 to 1. Panel C: attachment and metamorphosis responses to L-DOPA and NE, with Venn diagram of induced genes. Panel D: gene network centered on ZIP12. Panel E: schematic of sensing, matrix/ion secretion, and matrix modification in the oyster larva, with scales for correlation (0.5 to 1.0) and log2(FC) (1.5 to 3.5).]
[Figure 4 image. Panel A: ratio of L/R shell weight in Crassostrea hongkongensis and Pinctada fucata (p < 0.0001). Panel B: volcano plots; x-axis: log2(FC); y-axis: -log10(p-value); left-preference and right-preference genes highlighted. Panel C: scatter plot of orthologue expression ratios, Crassostrea hongkongensis (x-axis) versus Pinctada fucata (y-axis). Panel D: Pitx2 and RFX6 expression (%) in left and right mantles. Panel E: tyrosinase dendrogram with groups TyrA1 to TyrA3 and lineage expansions (Ostreidae, Pinctada fucata, Mizuhopecten yessoensis). Panel F: cumulative FPKM histograms for left (L) and right (R) mantles.]