The European Zoological Journal

ISSN: (Print) 2475-0263 (Online) Journal homepage: http://www.tandfonline.com/loi/tizo21

Characteristics and evolution of satellite DNA sequences in bivalve mollusks

E. Šatović, T. Vojvoda Zeljko & M. Plohl

To cite this article: E. Šatović, T. Vojvoda Zeljko & M. Plohl (2018) Characteristics and evolution of satellite DNA sequences in bivalve mollusks, The European Zoological Journal, 85:1, 95-104 To link to this article: https://doi.org/10.1080/24750263.2018.1443164

© 2018 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.

Published online: 20 Mar 2018.

Submit your article to this journal

View related articles

View Crossmark data

Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=tizo21 The European Zoological Journal, 2018, 95–104 Vol. 85, No. 1, https://doi.org/10.1080/24750263.2018.1443164

Characteristics and evolution of satellite DNA sequences in bivalve mollusks

E. ŠATOVIĆ, T. VOJVODA ZELJKO, & M. PLOHL

Division of Molecular Biology, Ruđer Bošković Institute, Zagreb, Croatia

(Received 20 December 2017; accepted 13 February 2018)

Abstract Mollusks of the class have attracted attention because of the extraordinarily important roles they play in marine ecosystems and in aquaculture. Data obtained from genetic studies performed on these species are accumulating rapidly, particularly in recent years when several genomic and transcriptomic studies have been carried out, or are in progress. Despite this, knowledge concerning satellite DNAs, tandemly repeated non-coding genomic sequences important for comprehending genomic architecture and function as a whole, is fragmentary and limited to a relatively small number of mollusk species. Here, we present an overview of the studied satellite DNAs and their characteristics in bivalve mollusks, and discuss the implications of these results. In addition to the general features common for these sequences, bivalve satellite DNAs show some distinct specificities which may be intriguing for the broad scientific community involved in unravelling repetitive genome components. The most striking are low genomic contribution, diversity of sequence families, extremely old ancestry, links with mobile elements, and unusual methylation patterns. Although current results were obtained in classical studies on individual species and their satellite DNA families, it can be postulated that they defined fundamental characteristics of these sequences in bivalve species generally, and will be further explored in detail by future satellitome and other high-throughput studies.

Keywords: Bivalves, mollusks, satellite DNA, repetitive DNA sequences, evolution

Introduction processes connected to metabolism and growth The class Bivalvia consists of 9200 species found in (Saavedra & Bachère 2006). Previous research on marine and freshwater habitats throughout the these organisms was largely the result of interest in world. They play an important role in the ecosystems specific gene functions, while recent studies have they inhabit and serve as support to many mollusk been moving towards genome-wide analyses. and other communities, participating in the transfer Genome sequencing projects have so far been per- of minerals and organic matter to benthic habitats. formed only for the pearl oyster Pinctada fucata As a result of filtration during the feeding process, (Gould, 1850) (Takeuchi et al. 2012), the pacific they can accumulate various xenobiotic substances, oyster Crassostrea gigas (Thunberg, 1793) (Zhang which make them also organisms of choice for envir- et al. 2012) and the filter-feeder mussel Mytilus gal- onmental monitoring (Gosling 2003). In accordance loprovincialis Lamarck, 1819 (Murgarella et al. with their extreme ecological and commercial impor- 2016), and sequenced and assembled genomes of tance in marine ecosystems and aquaculture, interest other bivalve species can be anticipated in the near in genetic research on these organisms is growing future. steadily, and has increased particularly in recent Eukaryotic genomes contain a significant portion years. Linkage maps, transcriptomics and proteo- of non-coding DNA represented by a large amount mics studies have been carried out to reveal the of different classes of repetitive sequences, found genetic and molecular basis of traits of interest to either interspersed or organised in tandem (López- the bivalve farming industry, mainly disease suscept- Flores & Garrido-Ramos 2012; Biscotti et al. 2015). ibility, tolerance to environmental stress, and Satellite DNAs (SatDNAs) are the most abundant

Correspondence: M. Plohl, Ruđer Bošković Institute, Bijenička 54, 10000 Zagreb, Croatia. Tel: +385 1 4561 083. Fax: +385 1 4561 177. Email: [email protected]

© 2018 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Published online 20 Mar 2018 96 E. Šatović et al. class of tandem repeats, usually arranged as long as a strategy in detecting repetitive genomic frag- arrays associated with heterochromatin. Recent ments, including satDNAs (e.g. Biscotti et al. 2007). results have assigned significant functions to this Difficulties in sequencing and assembly of arrays extremely diverse group of sequences. Briefly, they composed of highly similar tandem repeats have are involved in modelling the genome landscape in resulted in the fact that satDNA sequences are ser- both structural and functional aspects, and their iously underrepresented in genome sequencing out- evolution contributes to raising reproductive barriers puts. With respect to bivalve species, genomic in speciation (reviewed by Garrido-Ramos 2017). projects are no exception, and have not yielded com- SatDNA sequences are usually AT rich, and their prehensive information regarding satDNAs. abundance in genomes shows a wide range of varia- Predictions are that 202 Mb (36% of the genome) tion, with values from less than 0.5% to more than of Crassostrea gigas genome is enriched in repetitive 50%. Due to the phenomenon of concerted evolu- sequences (Zhang et al. 2012), but over 62% of the tion, sequence variability among monomers within a detected repeats could not be assigned to known satDNA family is low, usually 2–3% (reviewed by categories. In the case of the oyster Pinctada fucata Plohl et al. 2012; Garrido-Ramos 2017). Although (Takeuchi et al. 2012), about 7.9% of the genome is many unrelated satellites can populate genomes and occupied by tandem repetitive elements, among genomic satDNA profiles evolve rapidly, their DNA which 3.7% represents microsatellite sequences. In sequences can remain conserved over long evolu- Mytilus galloprovincialis (Murgarella et al. 2016)it tionary periods (reviewed by Plohl et al. 2008, was found that repetitive elements make up 36.13% 2012). Among diverse families, satDNA monomer of the assembly (more than 80% being unclassified), length can vary significantly, but 150 to 210-bp-long with the focus being mostly on transposable ele- monomers predominate. They are considered to be ments (TEs). evolutionarily favoured (Heslop-Harrison & Recent methodological advancements in next-gen- Schwarzacher 2013), as they correspond to the eration sequencing (NGS) and data processing nucleosomal length (Henikoff et al. 2001). SatDNA opened up the possibility of obtaining detailed monomers sometimes contain about 30 bp long insight into the repetitive composition (repeatome), sequence segments that exhibit reduced variability particularly satDNAs (satellitome) of a genome with respect to the rest of the monomer sequence. (Ruiz-Ruano et al. 2016; Pita et al. 2017; Silva Such conserved segments can facilitate recombina- et al. 2017). However, until newly arising methods tion between satDNA monomers and sequence start to be more intensively employed, conventional homogenisation (Kuhn et al. 2009) or have some methods remain the main source of information other functional roles, for example, in DNA–protein about tandemly repeated sequences in bivalve interactions. One example is the CENP-B protein, genomes. which facilitates centromere formation and plays an So far, 48 species belonging to all main bivalve important role in the assembly of centromere-speci- clades have been investigated, yielding 26 different fic structures upon recognising and binding the satDNAs, with Mytilidae, Ostreidae and Veneridae CENP-B box within the higher primate alpha-satel- being the most explored families so far (Figure 1). lite sequence (Masumoto et al. 1989). Satellite The accumulated results point to some peculiarities DNAs have also shown strong relationships with in bivalve mollusk satDNA structure and organisa- mobile elements, in addition to being located in tion, specifically a relatively low genomic content, close proximity to each other in certain genomic and extreme evolutionary ancestry and dispersal of regions. For example, tandem repeats are detected some of them, as well as linkage with certain types of as structural components of some mobile elements mobile elements. In order to summarise the results or satellite repeats can be derived from parts of obtained until now, we present an overview of the mobile elements (reviewed by Meštrović et al. 2015). most prominent observations on content, features, The commonly used method employed in the genomic localisation and species distribution of detection of satellite sequences is digestion of geno- satDNAs in bivalve mollusk species. Their main mic DNA with restriction endonucleases, followed characteristics can also be found in Table I. by electrophoretic separation. A characteristic ladder pattern is the result of variability in the restriction site within satellite monomers organised in an array. SatDNAs in Pectinidae Alternatively, construction of partial genomic The Antarctic scallop Adamussium colbecki (Smith, libraries followed by colony-lift hybridisation with 1902) was found to contain a tandemly repeated fragmented total genomic DNA is sometimes used sequence, detected upon restriction with the Satellite DNA in bivalves 97

Figure 1. Bivalve species explored from the aspect of satDNAs, their classification and age estimation of the main clades (taken from Bieler et al. 2014), together with satDNA families reported in these species. endonuclease BglII (Canapa et al. 2000). The mono- DNAs have monomer lengths of about 40 bp (families mer length of this satellite DNA, named pACS, is 2, 3 and 4), while another three have monomer lengths ~170 bp. Its AT composition ranges from 59.2 to of 156 bp (family 5), 178 bp (family 1) and 326 bp 73%, with the presence of adenine and thymine (family 6). Family 1 represents about 0.13% of the stretches in the monomer sequence. The degree of genome. Family 5 monomers are characterised by har- similarity between monomers varies between 57 and bouring successive repeats of one base, in particular G, 95%, and this satellite constitutes about 0.2% of the or GG, AA, TT, CC and AGGG motifs, and having a total DNA of A. colbecki. Sequence motifs similar to very low copy number, 0.003%. The longest repeat CENP-B box and CDEIII were found within the unit, family 6, consists of two consecutive 163-bp sub- monomer sequence, suggesting that A. colbecki satel- repeats, exhibiting 75% similarity to each other. The lite DNA possesses some of the features associated authors have also found that it possesses a certain level with organisation and activity of the centromere. of similarity with the DTE satellite from the species Subsequent research has shown that it is indeed trunculus Linnaeus, 1758 (Plohl & Cornudella located in the centromeric region of several A. col- 1996; see below). Fluorescent in-situ hybridisation becki chromosomes (Odierna et al. 2006). (FISH) has shown that family 2 was concentrated on In the scallop Pecten maximus (Linnaeus, 1758), six a single chromosome pair, while family 1 showed many different repetitive DNA sequences have been discrete hybridisation sites varying in size, indicating its described (Biscotti et al. 2007). Three repetitive presence at multiple chromosomal locations. Family 5 98 E. Šatović et al.

Table I. General features of satDNA sequences presented in the text.

Monomer AT content Similarity among Abundance Conserved Connection to mobile length (bp) (%) monomers (%) (%) motifs elements

Pectinidae pACS ~170 59–73 57–95 0.2 + Family 1 178 89.9 0.13 Family 2 ~40 90 Family 3 ~40 Family 4 ~40 Family 5 156 0.003 Family 6 326 75 + PjHhaI 178 92–95 20 + ApaI 1 170–175 + + ApaI 2 159–162 + + ApaI 3 88–167 + + Mytilidae ApaI 1 173 55 91 0.10–0.79 + + ApaI 2 161 65 91 0.85–1.66 + + ApaI 3 166 55 91 0.01–0.12 + + Mg1 169 4.1 - + Mg2 260 2.7 - + Mg3 70 1 - + DTHS1 136–167 ~55 ≤ 95 0.043 DTHS2 136–167 ~55 ≤ 95 0.081 DTHS3 136–167 ~55 ≤ 95 0.035 DTHS4 136–167 ~55 ≤ 95 0.008 DTE 155 89 0.09–0.23 + + DTF1 169 62 ≥ 83 0.1 + DTF2 155,169 37.5 92–100 2 Mactridae SSU 315 56 97 1.3–4+ Ostreidae Cg170/HindIII 166 60 79–94 1–4+ BclI 150 50 90–96.6 + + Veneridae phBglII400 400 60 98 Broadly distributed satDNAs BIV160 158–166 61 46–100 (interspecies) 0.1–2+ + DTHS3 138–164 63 48–75 (interspecies) +

shows six to eight major sites with some more diffuse hybridisation signals on the chromosomes were scar- hybridisation signals, in agreement with its lower geno- cely visible. Quantitative analyses showed that the mic abundance. PjHhaI satDNA accounts for ~20% of the genome PjHhaI satDNA was described in Pecten jacobaeus of P. jacobaeus and P. maximus. In other bivalve spe- (Linnaeus, 1758) upon HhaI digestion of genomic cies, dot blot hybridisation signals were barely detect- DNA (Petraccioli et al. 2015). This satellite is wide- able, making it impossible to determine precisely the spread not only among scallops but also in other very small amount of this satellite DNA. Petraccioli bivalve species, including P. maximus, A. colbecki, et al. (2015) also found that in the genomes of C. Aequipecten opercularis (Linnaeus, 1758) and gigas and Capitella teleta Blake, Grassle and Chamelea gallina (Linnaeus, 1758). In P. jacobaeus Eckelbarger, 2009 sequences similar to the PjHhaI the monomer length is 178 bp, and similarity among satDNA exist, flanked by TEs. monomers is between 92.1 and 95.5%. Sequences obtained from other species ranged from 173 to 181 SatDNAs in Mytilidae bp, and showed high similarity to those from P. jaco- baeus (89.3–93.6%). FISH experiments localised the Following genomic DNA restriction with ApaI PjHhaI satDNA to restricted regions of all the biva- endonuclease, three different satDNAs were initially lents of P. jacobaeus, while for other bivalve species detected in Mytilus edulis Linnaeus, 1758 (Ruiz-Lara Satellite DNA in bivalves 99 et al. 1992), and were subsequently found in the Crassostrea virginica (Gmelin, 1791) (Gaffney et al. related species M. galloprovincialis, M. trossulus 2003). The central part of these elements is charac- Gould, 1850 and M. californianus Conrad, 1837 terised by a short array of tandemly repeated (Martinez-Lage et al. 2002). ApaI satellite sequences sequences, similar to satellite monomers. Pearl and named 1, 2 and 3 exhibit monomer lengths of 173, other comparably structured elements were recently 161 and 166 bp, respectively. All are AT rich (55– classified as members of Helentrons, a rolling-circle 65%), and sequence analysis in M. edulis showed a non-autonomous family of TEs (Thomas & Pritham high degree of homology between monomers, 91%. 2015). Being a putative TE, the structure described Comparison of satellite profiles between the afore- by Kourtidis et al. (2006) was named MgE. mentioned species enabled the reconstruction of Similarly, Mg2, and Mg3 repeats were found joined phylogenetic relationships between them. It was to the previously described ApaI type 2 repeats shown that M. californianus is the most diversified (Ruiz-Lara et al. 1992; Martínez-Lage et al. 2002), among them, which is also consistent with morpho- in structural arrangements similar to MgE. Kourtidis logical observations, as well as with studies based on et al. (2006) have reported that searches for con- mitochondrial DNA. The amount of satellite served motifs, such as CENP-B and the yeast sequences varies between species but satellite 2 is CDEIII sequence, yielded no significant hits. In the most represented in all of them, making up as addition, sequence alignments did not reveal any much as 1.66% of the genome, while satellite 3 is other conserved segments. least represented (0.01–0.12%). Chromosomal loca- lisation of these satDNAs was performed for M. SatDNAs in Donacidae edulis, M. galloprovincialis and M. trossulus. While satellite 2 exhibits a random distribution in small In the wedge clam, Donax trunculus, eight satellite groups across all chromosomes, satellites 1 and 3 DNAs have been described so far, making it, are located in the subtelomeric areas of individual together with Pecten maximus, one of the best- chromosomes (Martínez-Lage et al. 2002). In con- explored bivalve species from a repetitive DNA tinuation, these repetitive DNA sequences were ana- point of view (Figure 1). Four described satellite lysed in additional species belonging to Mytilida (M. DNAs of D. trunculus were assigned to the DTHS chilensis Hupé, 1854, M. coruscus Gould, 1861, Perna family, detected after the restriction of genomic canaliculus (Gmelin, 1791), Aulacomya ater (Molina, DNA with HindIII endonuclease (Plohl & 1782), Choromytilus chorus (Molina, 1782), Septifer Cornudella 1997). Monomer length of satDNAs virgatus (Wiegmann, 1837), Lithophaga lithophaga within this family ranges from 136 to 167 bp. All (Linnaeus, 1758), Geukensia demissa (Dillwyn, four satellites show low abundance: DTHS1 makes 1817)) as well as in a few other bivalves (Arca noae up 0.043% of the genome, DTHS2 0.081% and (Linnaeus, 1758), Pecten maximus, Mimachlamys DTHS3 0.035%, while DTHS4 constitutes only varia (Linnaeus, 1758)) by Martinez-Lage et al. 0.008% of the genome. Despite the difference in (2005). These results showed that monomer length nucleotide sequence, members of the DTHS family and sequence similarities correspond to those have related oligonucleotide motifs repeated up to 15 reported for satellites from the four initially analysed times within the satellite monomer. The main build- Mytilus species (Ruiz-Lara et al. 1992; Martínez- ing block is a TTAGGG motif and its variations. Lage et al. 2002). The genomic abundance of these TTAGGG is a characteristic telomeric sequence in satellites is significantly lower in the examined spe- vertebrates, and also builds chromosome ends of D. cies of Mytilidae. Nevertheless, sequence alignments trunculus and some other sea invertebrates (Plohl revealed that all of these sequences contain regions et al. 2002). of similarity to tRNA A and B boxes, and the authors DTE satellite DNA with its 155-bp-long mono- suggest that they could be spread as tRNA-derived mer was detected after genomic restriction with pseudogenes (Martinez-Lage et al. 2005). EcoRV endonuclease (Plohl & Cornudella 1996). Another three repetitive sequences, with monomer This satellite DNA contains two groups of mono- lengths of 169, 260 and 70 bp, respectively, were mers, DTE1 and DTE2, whose nucleotide reported in Mytilus galloprovincialis (Kourtidis et al. sequences show 11% divergence and represent sub- 2006). They were named Mg1, Mg2 and Mg3, con- families that can be clearly distinguished based on stituting 4.1, 2.7 and 1% of the M. galloprovincialis diagnostic mutations. DTE1 constitutes 0.23% of genome, respectively. The Mg1 repetitive region the genome, and DTE2 0.09%. This satellite DNA with its flanking sequences showed significant shows similarity to the part of the aforementioned homology to the CvE, a member of the pearl family satellite DNA found in P. maximus (Biscotti et al. of mobile elements described in the oyster 2007) and to the HindIII satellite DNA of oysters 100 E. Šatović et al.

(Wang et al. 2001; López-Flores et al. 2004) as well (3.38%), triplicating the mean of the S. subtruncata as to the central repeats of the pearl element CvA genome. Methylation of the monomer sequence is (Gaffney et al. 2003). inversely correlated with nucleotide diversity, as Following HinfI-cleavage of D. trunculus genomic regions of highest methylation coincide with two of DNA, two different satellite DNAs were detected the three most conserved sequence segments. (Petrović & Plohl 2005). They were separated from Interestingly, SSU satDNA located on one S. sub- the same electrophoretic band, with the only com- truncata chromosome pair did not reveal any sign of mon feature being a repeat length of 169 bp. One of methylation. Although exhibiting 44% of GC the two satellites is DTF1, which is AT rich (62%) nucleotides itself, SSU satDNA is predominantly and constitutes 0.1% of the D. trunculus genome. located at the GC-rich intercalary heterochromatin DTF1 monomers are divided into two subgroups of S. subtruncata. exhibiting 17% nucleotide divergence, while mono- mers within each subgroup are highly conserved. SatDNAs in Ostreidae Nucleotide diversity analysis of the repeats revealed unequal distribution of mutations, where all DTF1 Clabby et al. (1996) discovered the Cg170 satellite monomers contain a 30-bp segment showing extre- DNA of 166 bp monomer length that occupies 1–4% mely strong sequence conservation, probably as a of the Crassostrea gigas genome. The nucleotide consequence of functional constraints on segments sequence of this satellite is AT rich (60%) and the of this satDNA. A higher order repeat unit, DTRS, analysis showed a high degree of homology between also exists, composed of a complete and a shortened monomers, 79–94%. Subsequent studies have DTF1 monomer and the insertion of a 14-bp-long shown that these satellite sequences are located in unrelated nucleotide sequence. the centromeric areas of C. gigas chromosomes A second HinfI-retrieved satDNA is DTF2. In (Wang et al. 2001). The same satDNA was also contrast to other D. trunculus satellites, it has an found after HindIII restriction of genomic DNA, unexpectedly high GC nucleotide content in the thereafter named HindIII satDNA, and was monomer (62.5%), and shows high abundance, described in more detail in seven species belonging 2%. Another peculiarity of the DTF2 satellite is to the genera Ostrea and Crassostrea (Figure 1; partial methylation of monomer sequences. López-Flores et al. 2004). Authors reported the con- Nucleotide sequence analysis showed that DTF2 sensus length of monomers from these species to be monomers have low variability, with nucleotide dif- ~166 bp. Species-specific differences among repeats ferences uniformly distributed throughout the whole were taxonomically informative and enabled the monomer length. In addition to a 169-bp-long reconstruction of phylogenetic relationships among repeat unit (DTF2L variant), a shorter variant of examined taxa. In addition, two entries in Repbase 155 bp was found (DTF2S). FISH showed the pre- (database of repetitive elements, Bao et al. 2015): sence of this satellite on 14 out of 19 D. trunculus SAT-1_CGi (Jurka 2012) and SATREP_CGi chromosomes, residing in telomeric and subtelo- (Clabby et al. 1996), also belong to the same repeti- meric regions and, interestingly, mostly not coincid- tive sequence. The presence of Cg170/HindIII ing with the prominent GC-rich heterochromatic repeats was also reported by Šatović et al. (2016) blocks (Petrović et al. 2009). within structures that were assigned to DNA trans- posons of the Helitron superfamily, with varying repeat copy numbers in the internal array. SatDNAs in Mactridae After digestion with BclI enzyme, a 150-bp-long García-Souto et al. (2017) reported the presence of repetitive DNA sequence was found in the genome SSUsat in the genome of Spisula subtruncata (da of the European flat oyster, Ostrea edulis Linnaeus, Costa, 1778), with a monomer length of 315 bp. 1758 (López-Flores et al. 2010). Its presence was This satDNA is also present in two other determined in five other species belonging to the Mactridae species, Spisula solida (Linnaeus, 1758) orders of Ostrea and Crassostrea (Figure 1). and Mactra stultorum (Linnaeus, 1758). The Intraspecific variability is between 3.4% in SSUsat monomer sequence is highly conserved, Crassostrea angulata (Lamarck, 1819) and 10% in does not show species-specific grouping, and exhi- C. ariakensis (Fujita, 1913), with interspecific diver- bits only 3% of both intra- and inter-specific nucleo- gence being similar to intraspecific divergence. AT tide sequence diversity. These repeats represent at composition of this satDNA is 50%. In-situ hybridi- least 1.3% of the S. subtruncata genome, and its sation analysis showed many signals along all C. presence in the other two species is significantly angulata chromosomes, exhibiting interspersed loca- lower. Methylation levels for SSUsat are high lisation of BclI repeats. Together with this Satellite DNA in bivalves 101 organisational pattern, sequence blocks similar to pattern (R. philippinarum). BIV160 is the most abun- box A and B, characteristic for SINE elements, dant in R. decussatus, constituting 2% of the genome, have been found in BclI repeats. The authors did with similarity among monomers ranging from 69 to not determine the exact abundance of this repetitive 97%. Within BIV160 monomers two conserved DNA, but they did report faint hybridisation signals motifs have been noted, which are also shared with in both Southern and dot blot analysis, indicating a the HindIII satDNA (López- Flores et al. 2004), low representation of the BclI sequence in the gen- with DTE (Plohl & Cornudella 1996) and with the omes of Ostrea stentina Payraudeau, 1826, C. angu- CvA mobile element (Gaffney et al. 2003). lata and C. gigas, with respect to the genome of O. The similarity of BIV160 sequences and edulis. Crassostrea gigas Cg170/HindIII repeats (Clabby et al. 1996; López-Flores et al. 2004), the DTE satellite (Plohl & Cornudella 1996), and the CvA SatDNAs in Veneridae element (Gaffney et al. 2003) indicate extreme inter- phBglII400 satDNA was described by Passamonti connections between satDNAs in bivalve species, et al. (1998)inRuditapes philippinarum (Adams & both in the evolutionary sense and in potential func- Reeve, 1850). This is an AT-rich satellite DNA and tional roles, particularly with regard to conserved has 400-bp-long monomers composed of two 200- motifs retained in all of them. bp subunits. Sequence analysis shows a high degree The previously mentioned DTHS3 satDNA, ori- of homology between monomers (~98%), and FISH ginally discovered in D. trunculus (Plohl & analysis showed that these satellite DNAs are found Cornudella 1997), was further characterised within mainly in pericentromeric chromosomal regions. Its the class Bivalvia (Šatović & Plohl in press). presence was also detected in the species Venerupis Monomer variants of DTHS3 satDNA were com- aurea (Gmelin, 1791) and Paphia undulata (Born, pared in 12 clam species belonging to subclasses 1778). and Pteriomorphia (D. trunculus, Ruditapes decussatus, R. philippinarum, Sinonovacula constricta (Lamarck, 1818), Mercenaria mercenaria SatDNA families broadly distributed among (Linnaeus, 1758), Meretrix lusoria (Röding, 1798), bivalve mollusks M. meretrix (Linnaeus, 1758), Spisula solidissima Although broad interspecies distribution was (Dillwyn, 1817), Crassostrea gigas, C. virginica, observed for different mollusk satDNAs (for exam- Mytilus galloprovincialis and Dreissena bugensis ple, satellites in Pectinidae and Mytilidae, Figure 1), (Andrusov, 1897)). The results suggest that particularly interesting in this context is the BIV160 DTHS3 and BIV160 behave quite similarly, both satDNA. It was initially detected in nine species in estimated age (516 MY in case of DTHS3) and (Glycymeris glycymeris (Linnaeus, 1758), Mya are- in the presence of distribution patterns of monomer naria (Linnaeus, 1758), Dosinia exoleta (Linnaeus, variants, which can be species specific, intermingled 1758), Venus verrucosa (Linnaeus, 1758), Venerupis interspecifically, or a combination of these two pat- pullastra (Montagu, 1803), Venerupis rhomboids terns. Sequence comparisons also invoked horizontal (Pennant, 1777), Ruditapes decussatus (Linnaeus, transfer as a possible pathway by which DTHS3 1758), Ruditapes philippinarum and Nucula sp. monomers may become shared between some dis- Lamarck, 1799), belonging to all of the main sub- tant species belonging to different subclasses classes of bivalves (Protobranchia, Pteriomorphia (Šatović & Plohl In press). and Heteroconchia). According to its distribution, it was estimated to be about 540 million years old, Integrating features common for bivalve with the longest evolutionary track record known so mollusk satDNAs far (Plohl et al. 2010). In the first survey it remained undetected in D. trunculus, but its presence in this One striking characteristic of satellite DNAs present species was subsequently confirmed by using specific in bivalve genomes is their relatively low genomic primers (Šatović et al. 2016). The average AT com- abundance, compared to that found in many other position of BIV160 satDNA is 61%, and the mono- organisms (Garrido-Ramos 2017). With the strong mer length ranges from 158 to 166 bp. Isolated exception of the PjHhaI satDNA, which constitutes monomers showed three different distribution pat- 20% of the P. jacobaeus and P. maximus genomes terns: species-specific sets of variants (M. arenaria), (Petraccioli et al. 2015), the maximum genome intermixed variants without any species specificity occupancy by a certain satDNA is 1–4% for Cg170 (for example in R. decussatus), and monomer variants in C. gigas, being significantly lower for other satel- which can follow both the first and the second lites. What also characterises the satDNA profile of 102 E. Šatović et al. bivalve mollusks is the coexistence of many divergent motif could be related to the mobility of satDNA subgroups, often interrelated, with none of them monomers (Meštrović et al. 2013), based on the expanding into a dominant satDNA of the respective similarity between the CENP-B protein and transpo- species. An example is D. trunculus where eight dif- sases of the pogo family of mobile elements (Kipling ferent satDNAs have been detected, none of them & Warburton 1997). exceeding 2% (Petrović et al. 2009; Plohl et al. Sequence alignments have revealed that ApaI- 2010), or a family of six repeats detected in the repeats of Mytilus species contain segments showing scallop P. maximus (Biscotti et al. 2007). similarity to RNA Pol III boxes A and B (Martínez- Regarding AT content, bivalves show only a few Lage et al. 2005), features characteristic of SINE exceptions to the usual abundance of those nucleo- elements. Degenerate sequences similar to boxes A tides in the monomer sequence. BclI repeats of O. and B have been found within BclI repeats as well edulis (López-Flores et al. 2010) harbour 50% AT, (López-Flores et al. 2010). Because the BclI while DTF2 of D. trunculus shows stronger GC sequence of O. edulis exhibits a length and inter- enrichment, 62.5% (Petrović et al. 2009). spersed distribution pattern which is expected for SatDNA physical mapping in bivalve mollusks SINE elements, it is likely that the conserved boxes encompasses all possible distribution patterns. in those repeats could be connected to sequence ThepresenceofBglIIsatDNAofA. colbecki and origin and not a consequence of functional Cg170 of C. gigas was found at centromeres interactions. (Canapa et al. 2000;Wangetal.2001; Odierna The connection of these sequences to mobile et al. 2006). Mytilus ApaI satellite 2 shows a ran- elements goes even further, and might suggest a dom distribution across all chromosomes, while tight interconnection particularly discernible in satellites 1 and 3 are located in the subtelomeric bivalves, although different forms of interconnec- areas (Martínez-Lage et al. 2002). Family 2 of P. tion were observed in other species as well (for a maximus repeats (Biscotti et al. 2007)ispresenton review, see Meštrović et al. 2015). As described asinglechromosomepair,whilefamily1shows above, sequence similarity expressed especially many hybridisation signals at multiple chromoso- through shared conserved boxes has been observed mal locations. An interspersed pattern is also char- for the CvA element of the pearl family (Gaffney acteristic of BclI repeats of C. angulata (López- et al. 2003) and DTE, BIV160 and HindIII/Cg170 Flores et al. 2010). The distribution pattern of repeats (Clabby et al. 1996; Plohl & Cornudella satDNAs is probably correlated with the mechan- 1996; López-Flores et al. 2004; Plohl et al. 2010). ism of their dispersal; in other words, sequences Crassostrea gigas HindIII/Cg170 repeats together associated with mobile elements or having the abil- with their flanking sequences were identified as ity to transpose might be expected to build arrays parts of putative TEs (Šatović et al. 2016). MgE, dispersed on many locations along the chromo- the putative mobile element characterised in M. somes. Although this hypothesis still has to be galloprovincialis holdsMg1repeats(Kourtidisetal. confirmed, there are indications of such connec- 2006), while sequences showing similarity to tions, for example in C. gigas (Šatović et al. 2016). PjHhaI sat in C. gigas and Capitella teleta were also Conserved boxes and different sequence motifs found surrounded with structures belonging to can be found within monomer sequences of bivalve TEs (Petraccioli et al. 2015). Additionally, satDNAs, and potential functional constraints have DTHS3 satDNA in M. mercenaria, S. solidissima, already been mentioned. C. virginica and M. galloprovincialis was found It is interesting that different types of repetitive embedded in structures that might be responsible sequences, three belonging to satDNAs – DTE for their mobility (Šatović &Plohlin press). (Plohl & Cornudella 1996), BIV160 (Plohl et al. Although scarce and far from being understood, 2010), HindIII (López-Flores et al. 2004) – and results concerning methylation profiles in repetitive one belonging to the mobile element CvA (Gaffney DNAs of bivalve species have disclosed some distinct et al. 2003), share the same conserved boxes, features. In the oyster C. gigas, repeat classes such as although functional studies that would suggest invol- DNA transposons, helitrons, satellites, simple repeats vement in similar functional interactions do not exist and tandem repeats all displayed average methylation yet. In this respect, it is no surprise that pACS satel- levels 2 times higher than the genome background lite DNA, which contains motifs similar to the (Wang et al. 2014). These authors noticed that if CENP-B box and CDEIII (Canapa et al. 2000), they divide the repeats into methylated and unmethy- was found localised at the centromeres of A. colbecki lated, the divergence rate of the methylated group was chromosomes (Odierna et al. 2006). In addition to a significantly lower than that of the unmethylated putative centromeric function, the CENP-B box-like group. This observation is also in agreement with Satellite DNA in bivalves 103 data from SSU satDNA from S. subtruncata,whose References methylation is inversely correlated with nucleotide Bao W, Kojima KK, Kohany O. 2015. Repbase update, a database diversity in segments of monomer sequences of repetitive elements in eukaryotic genomes. Mobile DNA 6: (García-Souto et al. 2017). Methylated DTF2 11. DOI: 10.1186/s13100-015-0041-9. satDNA of D. trunculus shows uniform and high Bieler R, Mikkelsen PM, Collins T, Glover E, González V, Graf sequence conservation throughout the whole mono- D, Harper EM, Healy J, Kawauchi GY, Sharma PP, Staubach mer sequence (Petrović et al. 2009). S, Strong EE, Taylor JD, Tëmkin I, Zardus JD, Clark S, fi Guzmán A, McIntyre E, Sharp P, Giribet G. 2014. Speci cities in the methylation pattern of certain Investigating the bivalve tree of life – An exemplar-based types of repetitive sequences in bivalves suggest role approach combining molecular and novel morphological char- (s) alternative to mere DNA silencing. Furthermore, acters. Invertebrate Systematics 28:32–115. http://www. theexistenceofasinglechromosomepairinS. bioone.org/doi/abs/10.1071/IS13010. DOI: 10.1071/IS13010. subtruncata in which SSUsat is non-methylated Biscotti MA, Canapa A, Olmo E, Barucca M, Teo CH, Schwarzacher T, Dennerlein S, Richter R, Heslop-Harrison (García-Souto et al. 2017) leads to the assumption JSP. 2007. Repetitive DNA, molecular cytogenetics and gen- that methylation processes in the repetitive part of ome organization in the King scallop (Pecten maximus). Gene bivalve mollusk genomes are non-random and com- 406:91–98. DOI: 10.1016/j.gene.2007.06.027. plex, and the significance of these patterns remains Biscotti MA, Olmo E, Heslop-Harrison JS. 2015. Repetitive DNA – to be revealed. in eukaryotic genomes. Chromosome Research 23:415 420. DOI: 10.1007/s10577-015-9499-z. Canapa A, Barucca M, Cerioni PN, Olmo E. 2000. A satellite Concluding remarks DNA containing CENP-B box-like motifs is present in the antarctic scallop Adamussium colbecki. Gene 247:175–180. Research on bivalve mollusk satDNAs has brought DOI:10.1016/S0378-1119(00)00101-3. important information which has broadened our Clabby C, Goswami U, Flavin F, Wilkins NP, Houghton JA, Powell R. 1996. Cloning, characterization and chromosomal location of generalviewsonsatDNAsasaclassofubiquitous a satellite DNA from the pacificoyster,Crassostrea-gigas.Gene genomicsequences.Forbivalvemollusks,themost 168:205–209. DOI:10.1016/0378-1119(95)00749-0. striking features are (i) the low contribution of het- Gaffney PM, Pierce JC, Mackinley AG, Titchen DA, Glenn WK. erochromatin and satDNAs in the genomes of these 2003. Pearl, a novel family of putative transposable elements in – ancient organisms, including in the centromeric bivalve mollusks. Journal of Molecular Evolution 56:308 316. DOI: 10.1007/s00239-002-2402-5. region of some species; (ii) in some cases, an extre- García-Souto D, Mravinac B, Šatović E, Plohl M, Morán P, mely large palette of related monomer variants Pasantes JJ. 2017. Methylation profile of a satellite DNA con- comprising a library of a satDNA family; (iii) the stituting the intercalary G+C-rich heterochromatin of the cut extremely long ancestry of some of them; (iv) the trough shell Spisula subtruncata (Bivalvia, Mactridae). Scientific abilitytoeasilylinkmanyofthesesequenceswith Reports 7:6930. DOI: 10.1038/s41598-017-07231-7. Garrido-Ramos MA. 2017. Satellite DNA: An evolving topic. certain groups of mobile elements; and (v) unusual Genes 8:9. DOI: 10.3390/genes8090230. methylation patterns. Despite the generally known Gosling E. 2003. Bivalve molluscs: Biology, ecology and culture. genomic roles of satDNAs, at this moment it is not Oxford: Fishing News Books. clear whether and how specificfeaturesofthese Henikoff S, Ahmad K, Malik HS. 2001. The centromere paradox: sequences present in bivalves influence the biology Stable inheritance with rapidly evolving DNA. Science 293:1098–1102. DOI: 10.1126/science.1062939. of these organisms. Nevertheless, abundance, Heslop-Harrison JSP, Schwarzacher T 2013. Nucleosomes and dynamics and differences between satDNAs make centromeric DNA packaging. Proceedings of the National bivalve mollusks important organisms in exploring Academy of Sciences of the United States of America organisational and functional aspects of satelli- 110:19974–19975. DOI: 10.1073/pnas.1319945110 fi tomes. We are sure that future research will bring Jurka J. 2012. Satellite DNA from the Paci c oyster genome SAT- 1_CGi. Repbase Reports 12:2722. a lot of additional information that will help to Kipling D, Warburton PE. 1997. Centromeres, CENP-B and unravel roles of this specific form of noncoding tigger too. Trends in Genetics 13:1–5. DOI: 10.1016/S0168- DNA sequences. 9525(97)01098-6. Kourtidis A, Drosopoulou E, Pantzartzi CN, Chintiroglou CC, Scouras ZG. 2006. Three new satellite sequences and a mobile Acknowledgements element found inside HSP70 introns of the Mediterranean mussel (Mytilus galloprovincialis). Genome 49:1451–1458. We thank Dr Mary Sopta for critical reading and DOI: 10.1139/g06-111. language editing of the manuscript. Kuhn GCS, Teo CH, Schwarzacher T, Heslop-Harrison JS. 2009. Evolutionary dynamics and sites of illegitimate recombination revealed in the interspersion and sequence junctions of two Disclosure statement nonhomologous satellite DNAs in cactophilic Drosophila spe- cies. Heredity 102:453–464. DOI: 10.1038/hdy.2009.9. No potential conflict of interest was reported by the López-Flores I, de la Herrán R, Garrido-Ramos MA, Boudry P, authors. Ruiz-Rejón C, Ruiz-Rejón M. 2004. The molecular phylogeny 104 E. Šatović et al.

of oysters based on a satellite DNA related to transposons. Plohl M, Cornudella L. 1997. Characterization of interrelated Gene 339:181–188. DOI: 10.1016/j.gene.2004.06.049. sequence motifs in four satellite DNAs and their distribution López-Flores I, Garrido-Ramos MA. 2012. The repetitive DNA in the genome of the mollusc Donax trunculus.Journalof content of eukaryotic genomes. In: Garrido-Ramos MA, edi- Molecular Evolution 44:189–198. DOI:10.1007/PL00006135. tor. Repetitive DNA. Genome dynamics. Vol. 7: Basel: Plohl M, Luchetti A, Meštrović N, Mantovani B. 2008. Satellite Karger. pp. 1–28. DOI: 10.1159/000337118. DNAs between selfishness and functionality: Structure, geno- López-Flores I, Ruiz-Rejón C, Cross I, Rebordinos L, Robles F, mics and evolution of tandem repeats in centromeric (hetero) Navajas-Pérez R, de la Herrán R. 2010. Molecular characterization chromatin. Gene 409:72–82. DOI:10.1016/j.gene.2007.11.013. and evolution of an interspersed repetitive DNA family of oysters. Plohl M, Meštrović N, Mravinac B. 2012. Satellite DNA evolu- Genetica 138:1211–1219. DOI: 10.1007/s10709-010-9517-1. tion. In: Garrido- Ramos MA, editor. Genome dynamics. Vol. Martínez-Lage A, Rodríguez F, González-Tizón A, Prats E, 7: Basel: Karger. pp. 126–152. DOI: 10.1159/000337122. Cornudella L, Méndez J. 2002. Comparative analysis of differ- Plohl M, Petrović V, Luchetti A, Ricci A, Šatović E, Passamonti ent satellite DNAs in four Mytilus species. Genome 45:922– M, Mantovani B. 2010. Long-term conservation vs high 929. DOI:10.1139/g02-056. sequence divergence: The case of an extraordinarily old satel- Martínez-Lage A, Rodríguez-Fariña F, González-Tizón A, lite DNA in bivalve mollusks. Heredity 104:543–551. DOI: Méndez J. 2005. Origin and evolution of Mytilus mussel satel- 10.1038/hdy.2009.141. lite DNAs. Genome 48:247–256. DOI: 10.1139/G04-115. Plohl M, Prats E, Martinez-Lage A, Gonzalez-Tizon A, Mendez J, Masumoto H, Masukata H, Muro Y, Nozaki N, Okazaki T. 1989. Cornudella L. 2002. Telomeric localization of the vertebrate- A human centromere antigen (CENP-B) interacts with a short type hexamer repeat, (TTAGGG)n, in the wedgeshell clam specific sequence in alphoid DNA, a human centromeric satel- Donax trunculus and other marine invertebrate genomes. The lite. The Journal of Cell Biology 109:1963–1973. Journal of Biological Chemistry 277:19839–19846. DOI: DOI:10.1083/jcb.109.5.1963. 10.1074/jbc.M201032200. Meštrović N, Mravinac B, Pavlek M, Vojvoda-Zeljko T, Šatović E, Ruiz-Lara S, Prats E, Sainz J, Cornudella L. 1992. Cloning and Plohl M. 2015. Structural and functional liaisons between characterization of a highly conserved satellite DNA from the transposable elements and satellite DNAs. Chromosome mollusc Mytilus edulis. Gene 117:237–242. Research 23:583–596. DOI: 10.1007/s10577-015-9483-7. Ruiz-Ruano FJ, López-León MD, Cabrero J, Camacho JPM. Meštrović N, Pavlek M, Car A, Castagnone-Sereno P, Abad P, 2016. High-throughput analysis of the satellitome illuminates Plohl M. 2013. Conserved DNA motifs, including the CENP- satellite DNA evolution. Scientific Reports 6:28333. DOI: B Box-like, are possible promoters of satellite DNA array 10.1038/srep28333. rearrangements in Nematodes. PLoS ONE 8:e67328. DOI: Saavedra C, Bachère E. 2006. Bivalve genomics. Aquaculture 10.1371/journal.pone.0067328. 256:1–14. DOI:10.1016/j.aquaculture.2006.02.023. Murgarella M, Puiu D, Novoa B, Figueras A, Posada D, Šatović E, Plohl M. In press. Distribution of DTHS3 satellite Canchaya C. 2016.Afirst insight into the genome of the DNA across 12 bivalve species. Journal Od Genetics. filter-feeder mussel Mytilus galloprovincialis. PloS One 11:1– Šatović E, Vojvoda Zeljko T, Luchetti A, Mantovani B, Plohl M. 22. DOI: 10.1371/journal.pone.0151561. 2016. Adjacent sequences disclose potential for intra-genomic Odierna G, Aprea G, Barucca M, Canapa A, Capriglione T, Olmo dispersal of satellite DNA repeats and suggest a complex net- E. 2006. Karyology of the Antarctic scallop Adamussium colbecki, work with transposable elements. BMC Genomics 17:997. with some comments on the karyological evolution of pectinids. DOI: 10.1186/s12864-016-3347-1. Genetica 127:341–349. DOI: 10.1007/s10709-005-5366-8. Silva DMZDA, Utsunomia R, Ruiz-Ruano FJ, Daniel SN, Porto- Passamonti M, Mantovani B, Scali V. 1998. Characterization of a Foresti F, Hashimoto DT, Oliveira C, Porto-Foresti F, Foresti F. highly repeated DNA family in tapetinae species ( 2017. High-throughput analysis unveils a highly shared satellite Bivalvia: Veneridae). Zoological Science 15:599–605. DOI: DNA library among three species of fish genus Astyanax. 10.2108/0289-0003(1998)15[599:COAHRD]2.0.CO;2. Scientific Reports 7:12726. DOI: 10.1038/s41598-017-12939-7. Petraccioli A, Odierna G, Capriglione T, Barucca M, Forconi M, Takeuchi T, Kawashima T, Koyanagi R, Gyoja F, Tanaka M, Olmo E, Biscotti MA. 2015. A novel satellite DNA isolated in Ikuta T, et al. 2012. Draft genome of the pearl oyster Pecten jacobaeus shows high sequence similarity among mol- Pinctada fucata: A platform for understanding bivalve biology. luscs. Molecular Genetics and Genomics 290:1717–1725. DNA Research: An International Journal for Rapid DOI: 10.1007/s00438-015-1036-4. Publication of Reports on Genes and Genomes 19:117–130. Petrović V, Pérez-García C, Pasantes JJ, Šatović E, Prats E, Plohl DOI: 10.1093/dnares/dss005. M. 2009. A GC-rich satellite DNA and karyology of the Thomas J, Pritham EJ. 2015. Helitrons, the Eukaryotic rolling- bivalve mollusk Donax trunculus: A dominance of GC-rich circle transposable elements. Microbiology Spectrum 3:1–32. heterochromatin. Cytogenetic and Genome Research 124:63– DOI: 10.1128/microbiolspec.MDNA3-0049-2014. 71. DOI: 10.1159/000200089. Wang X, Li Q, Lian J, Li L, Jin L, Cai H, Xu F, Qi H, Zhang L, Petrović V, Plohl M. 2005. Sequence divergence and conservation Wu F, Meng J, Que H, Fang X, Guo X, Zhang G. 2014. in organizationally distinct subfamilies of Donax trunculus satel- Genome-wide and single-base resolution DNA methylomes lite DNA. Gene 362:37–43. DOI: 10.1016/j.gene.2005.06.044. of the Pacific oyster Crassostrea gigas provide insight into the Pita S, Panzera F, Mora P, Vela J, Cuadrado Á, Sánchez A, evolution of invertebrate CpG methylation. BMC Genomics Palomeque T, Lorite P. 2017. Comparative repeatome analysis 15:1119. DOI: 10.1186/1471-2164-15-1119. on Triatoma infestans Andean and Non-Andean lineages, main Wang Y, Xu Z, Guo X. 2001. A centromeric satellite sequence in vector of Chagas disease. PloS One 12:e0181635. DOI: the Pacific oyster (Crassostrea gigas Thunberg) identified by 10.1371/journal.pone.0181635. fluorescence in situ hybridization. Marine Biotechnology Plohl M, Cornudella L. 1996. Characterization of a complex 3:486–492. DOI: 10.1007/s10126-001-0063-3. satellite DNA in the mollusc Donax trunculus: Analysis of Zhang G, Fang X, Guo X, Li L, Luo R, Xu F, et al. 2012. The sequence variations and divergence. Gene 169:157–164. oyster genome reveals stress adaptation and complexity of shell DOI:10.1016/0378-1119(95)00734-2. formation. Nature 490:49–54. DOI: 10.1038/nature11413.