Comparative Biochemistry and Physiology, Part D 5 (2010) 256–264


Complete mtDNA of Meretrix lusoria (Bivalvia: Veneridae) reveals the presence of an atp8 gene, length variation and heteroplasmy in the control region

Hongxia Wang a, Suping Zhang a, Yang Li a,b, Baozhong Liu a,⁎

a Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China
b Graduate School of the Chinese Academy of Sciences, Beijing 100039, China

Article history: Received 12 April 2010; Received in revised form 16 July 2010; Accepted 19 July 2010; Available online 3 August 2010

Keywords: Meretrix lusoria; mtDNA; atp8; D-loop; VNTR; Heteroplasmy

Abstract

The complete nucleotide sequence of the mitochondrial genome of the clam Meretrix lusoria (Bivalvia: Veneridae) was determined. It comprises 20,268 base pairs (bp) and contains 13 protein-coding genes, including ATPase subunit 8 (atp8), two ribosomal RNAs, 22 transfer RNAs, and a non-coding control region. The atp8 encodes a protein of 39 amino acids. All genes are encoded on the same strand. A putative control region (CR or D-loop) was identified in the major non-coding region (NCR) between the tRNAGly and tRNAGln. A 1087 bp tandem repeat fragment was identified that comprises nearly 11 copies of a 101 bp motif and accounts for approximately 41% of the NCR. The 101 bp tandem repeat motif of the NCR can be folded into a stem–loop secondary structure. Samples of eight individuals from Hainan and Fujian provinces were collected and their NCR regions were successfully amplified and sequenced. The data revealed a highly polymorphic VNTR (variable number of tandem repeats) associated with high levels of heteroplasmy in the D-loop region. The size of the CR ranged from 1942 to 3354 bp depending upon the copy number of the repeat sequence. © 2010 Elsevier Inc. All rights reserved.

⁎ Corresponding author. Tel.: +86 532 82898696; fax: +86 532 82898578. E-mail address: [email protected] (B. Liu).

1744-117X/$ – see front matter © 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.cbd.2010.07.003

1. Introduction

The genus Meretrix contains few species, most of which are commercially important. The genus is widely distributed along the coastal and estuarine areas of the western Indian Ocean and Western Pacific, including China, North Korea, Japan, and Southeast Asia (Zhuang, 2001). Although the aquaculture and fishery of Meretrix species have received much attention, their classification is ambiguous because of their low morphological variability and similar soft-tissue anatomies. Until now, the taxonomy of Meretrix species has been controversial (Jukes-Browne, 1914; Habe, 1977; Zhuang, 2001). Among them, the clam Meretrix lusoria is the major commercial marine bivalve species currently being cultured in the benthic environment of the South China area. Due to its commercial value, accurate species identification is needed for resource management and conservation.

The various Meretrix species show morphological parallelism, systematic descriptions of Meretrix species are often confusing, and the specific name M. meretrix has apparently been used for various species (Yoosukh and Matsukuma, 2001). DNA sequence information therefore offers a promising alternative for their identification. MtDNA has been used for studying population structure, phylogeography, and phylogenetic relationships at various taxonomic levels due to its compact nature, maternal inheritance, and fast evolutionary rate compared to nuclear DNA (Moore, 1995). Recently, the number of available complete mitogenome sequences has increased considerably. To date, the whole bivalve mtDNA sequences in NCBI (http://www.ncbi.nlm.nih.gov) represent six orders and nine families within the class. For the genus Meretrix, the complete mitochondrial (mt) genome of M. petechialis (Ren et al., 2009a) is available. More mitochondrial genomic information from various Meretrix species is required to effectively advance this research, because genome-level characteristics are much more informative than single genes, such as the COI and lrRNA.

Compared to other metazoans, the bivalve mitogenome displays an extraordinary amount of mtDNA variation, for example, the additional copy of the tRNA gene and the loss of tRNA (Hoffmann et al., 1992; Milbury and Gaffney, 2005), the duplicated rrnS and rrnL (Milbury and Gaffney, 2005), and the lack of an atp8 (Zbawickaa et al., 2007). Furthermore, bivalve mitogenomes reported so far have a high amount of mitochondrial gene rearrangement. Among the currently available taxa, there are very few shared gene blocks. An additional complication in the Bivalvia (seven families: Mytilidae, Unionidae, Margaritiferidae, Hyriidae, Donacidae, Solenidae, and Veneridae) is the unusual mode of inheritance for mtDNA, termed doubly uniparental inheritance (DUI) (Zouros et al., 1994; Mizi et al., 2005; Theologidis et al., 2008, and references therein). This biparental transmission system challenged the paradigm of strict maternal inheritance (SMI), and it provides an opportunity for studying nuclear-cytoplasmic genome interactions and the evolutionary significance of different modes of mitochondrial inheritance

(Breton et al., 2007). Theologidis et al. (2008) argued that the sporadic detection of DUI in Bivalvia might reflect the difficulty of detecting it rather than a genuine absence, which suggests that DUI might be far more common than is currently thought. Whether DUI is present in the genus Meretrix remains an open question.

In addition, mitochondrial DNA length variation and heteroplasmy are common among taxa, including fish species (Cesaroni et al., 1997), mammals (Mignotte et al., 1990; Wilkinson and Chapman, 1991), lizards (Moritz and Brown, 1987), crickets (Rand and Harrison, 1989), nematodes (Powers et al., 1986), etc. In bivalves, a variable copy number of a repeated sequence and heteroplasmy in the mtDNA have also been reported for the sea scallop Placopecten magellanicus (La Roche et al., 1990), M. petechialis (Ren et al., 2009a) and the genus Mytilus (Cao et al., 2009). A comparison of the characteristics of the repeated arrays in animal mtDNA might help to explain the generation and maintenance of this source of genetic diversity.

In this paper, we report the complete nucleotide sequence of the mitochondrial genome from the clam M. lusoria, which exhibits mtDNA size variation and heteroplasmy. We also detected, in the genus Meretrix, an atp8 gene encoding a protein of 39 amino acids with complete start and stop codons. Furthermore, the boundary annotations of rrnS and rrnL in some species of Veneroida were revised. The present study provides data that contribute to a better understanding of the phylogenetic relationships of the genus Meretrix, and supplies useful M. lusoria specific markers for identification. This work may also facilitate studies on population structure, genetic diversity, and broodstock management of the hard clam M. lusoria.

2. Materials and methods

2.1. Sample collection and DNA isolation

M. lusoria specimens were obtained from Sanya, Hainan, China (E 109°32′ N 18°13′) and Xiamen, Fujian, China (E 118°06′ N 24°27′). The adductor muscle was slit and the whole adult clam was immediately preserved in 95% ethanol. Total genomic DNA was extracted from the foot muscle tissues of eight individuals using a DNeasy tissue DNA extraction kit (Promega) following the manufacturer's instructions and stored at −20 °C.

2.2. Long PCR amplification of mtDNA

Initial determination of the partial mitochondrial genome sequences (cox1, rrnL, and nad5) was performed using three universal primer pairs: LCO+HCO (Folmer et al., 1994), 16Sar-5′+16Sbr-3′ (Burger et al., 2007), and nad5F+nad5R (Lavrov et al., 2004). The PCR products of the three primer pairs were gel purified using a QIAquick Gel Extraction kit (QIAGEN) and directly sequenced. The whole mitochondrial genome was then amplified using a long PCR protocol (Cheng et al., 1994). Based on the obtained partial sequences, three specific primer pairs were designed using Primer Premier 5.0 (http://www.premierbiosoft.com/) to amplify the entire mitochondrial genome in three long PCR reactions:

M-cox1-cytb-F (5′-TGG TGC TTC TTC TAT TAT GTC TGG TAT T-3′) (Ren et al., 2009a) and cox1-rnl-R (5′-CAG TCT CTC CGT GTC CAA CCA TTC ATA C-3′);
rnl-nad5-F (5′-GTA TGA ATG GTT GGA CAC GGA GAG ACT G-3′) and rnl-nad5-R (5′-AGT CAC CAA AGT AGA AGA GTG TAC CAA-3′);
nad5-cox1-F (5′-TTG GTA CAC TCT TCT ACT TTG GTG ACT-3′) and nad5-cox1-R (5′-AGC AAA AAA CCA GTC ACA GCA ATA CAC C-3′).

Long-distance PCR was carried out on a TaKaRa PCR thermal cycler (Dice Model TP600, Takara Bio Inc.). Each 25-μl PCR reaction contained approximately 50 ng DNA, 1× LA PCR buffer II (Mg2+ plus, Takara), 0.2 mM of each dNTP, 0.2 μM of each primer, and one unit of LA Taq polymerase (Takara). The thermal cycling profile was as follows: initial denaturation at 94 °C for 2 min followed by denaturation at 94 °C for 20 s, annealing at 56 °C for 40 s, and extension at 72 °C for 7 min, for 34 cycles. The amplified lengths of the three PCR products were approximately 7.5 kb, 4.6 kb, and 7.8 kb, respectively.

2.3. Cloning and DNA sequencing

The three long PCR products were sub-cloned and sequenced using primer walking. The PCR products were sheared into 1–2 kb fragments using a JY92-II ultrasonic cell crusher (Ningbo Scientz Biotechnology Co., Ltd.). Fragments of approximately 1.5 kb were recovered with a QIAquick gel extraction kit (QIAGEN). Subsequently, the 5′ ends of the purified DNA fragments were phosphorylated and the 3′ ends were blunted using the TaKaRa BKL Kit (Takara), followed by ligation into the pUC118-Hinc II vector, according to the manufacturer's instructions. The ligation products were transferred into competent Escherichia coli DH10α cells using the heat shock method. Three sub-clone plasmid libraries were constructed corresponding to the three long PCR products. The recombinant plasmids were screened by blue–white selection on LB plates containing X-gal, IPTG and ampicillin, and the positive white colonies that contained inserts of about 1.5 kb were screened by PCR using the vector universal primers M13F and M13R. Subsequently, the plasmids were extracted and sequenced on an ABI 3730xl DNA Analyzer. The complete mtDNA sequence was mostly obtained by clone sequencing and, to a lesser degree, by the primer walking strategy for gap filling and correction.

2.4. VNTR amplification and sequence

Characterization of the M. lusoria VNTRs (variable number of tandem repeats) was done through PCR amplification of the repeat region from two populations, Sanya and Xiamen. A pair of species-specific primers, NCR1218 F (5′-AGT TGG GCG TTA ATC ATA GGG-3′) and trn-Gln R (5′-CAA AAA CCA AAC AAC TAC AC-3′), was designed to amplify the tandem repeat region (Fig. 1A). For each individual sample, about 20 ng of total genomic DNA was used to amplify the VNTR of the NCR region; the DNA of the eight individuals was that described in Section 2.1. PCR conditions were as follows: initial denaturation at 94 °C for 2 min followed by denaturation at 94 °C for 20 s, annealing at 55 °C for 40 s, and extension at 72 °C for 2 min, for 34 cycles. The PCR product sizes were determined by 1.5% agarose gel electrophoresis and ethidium-bromide staining. Additional tests were carried out to reduce the probability of scoring PCR artifacts. For individual amplifications that produced more than one band at the standard annealing temperature, one specimen was reamplified several times using annealing temperatures from 50 °C to 57 °C. The fragments were then isolated, reamplified separately, and sequenced. If the reamplification of a single fragment produced the entire multiple-band pattern, that fragment was discarded.

2.5. Sequence analysis

Base calling was performed with phred (Ewing and Green, 1998; Ewing et al., 1998) and sequence reads were assembled in phrap with default parameters. All assembled sequences were manually checked using Consed to remove misassemblies (Gordon et al., 1998). The locations of the 13 protein-coding genes and two rRNA genes were determined by comparisons with nucleotide or amino acid sequences of previously determined complete mtDNA sequences from other Veneridae clams. The majority of tRNA genes were identified using tRNAscan-SE 1.21 (Lowe and Eddy, 1997) with the invertebrate mitochondrial genetic code in default search mode, and the remaining tRNA genes were identified by the ARWEN program in default search mode (http://130.235.46.10/ARWEN/) (Laslett and Canbäck, 2008). Repeat sequences were identified using Tandem Repeats Finder (Benson, 1999) (http://tandem.bu.edu/trf/trf.html). All sequences

Fig. 1. The structure of NCR and the VNTR elements of the control regions in the Veneroida mitochondrial genomes. The gray sections indicate the tandem repeat regions. Sequence segments are not drawn to scale. ⇄ indicates the location of the species-specific primers NCR1218 F and trn-Gln R in the NCR.
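As a rough illustration of how the two species-specific primers named in this caption compare, the sketch below computes their length, GC fraction and a rule-of-thumb melting temperature. The sequences are the ones printed in Section 2.4. The function names and the use of the Wallace rule (2 °C per A/T, 4 °C per G/C) are assumptions made for this example only; the paper states that the primers were designed with Primer Premier 5.0 and does not report how melting temperatures were estimated.

```python
# Primer sequences as printed in Section 2.4, with the spacing removed.
PRIMERS = {
    "NCR1218 F": "AGTTGGGCGTTAATCATAGGG",
    "trn-Gln R": "CAAAAACCAAACAACTACAC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in an oligonucleotide."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature in °C: 2 per A/T plus 4 per G/C.

    This is an illustrative approximation, not the method used in the paper.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.1%}, "
              f"Wallace Tm about {wallace_tm(seq)} °C")
```

The rule-of-thumb values bracket the 55 °C annealing temperature reported for the VNTR amplification, but they should be read only as an order-of-magnitude check.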

were aligned using ClustalW (Thompson et al., 1994). In addition, the presence and location of a transmembrane hydrophobic helix in the ATP8 protein were investigated using the DAS software (Cserzo et al., 1997) (http://www.sbc.su.se/~miklos/DAS/). Prediction of potential secondary structures was performed by the online version of the mfold software, version 3.2 (Zuker, 2003). Codon usage was analyzed with MEGA 4 (Tamura et al., 2007). The complete mtDNA sequence was deposited in the GenBank database under accession number GQ903339.

2.6. Phylogenetic analysis

Sequence data from 7 species of Veneroida were included in the analyses (M. lusoria, M. petechialis (NC_012767), V. philippinarum (NC_003354), S. constricta (NC_011075), L. lacteus (NC_013271), L. divaricata (NC_013275) and A. tuberculata (NC_008452)). The Polyplacophora species Katharina tunicata served as the outgroup. The amino acid sequences of 12 protein-coding genes (all except the atp8 gene) were subjected to concatenated alignments using ClustalW with the default settings (Thompson et al., 1994). Phylogenetic trees were built by maximum-likelihood (ML) analysis using PhyML 3.0 (Guindon and Gascuel, 2003). The WAG model was chosen for the amino acid dataset by ProtTest version 1.4 (Abascal et al., 2005). For ML analysis, 1000 bootstraps were used to estimate the node reliability. Phylogenetic relationships among Veneroida were reconstructed based on genome arrangement data using Bayesian inference with the Badger program (Simon and Larget, 2004).

3. Results and discussion

3.1. Genome size, base composition and codon usage

The complete mitochondrial genome of M. lusoria is a circular molecule of 20,268 base pairs (bp), with 67.9% AT content. The genome contains 37 genes, including 13 protein-coding genes, two ribosomal RNA genes, and 22 transfer RNA genes (Table 1). The 13 protein-coding genes add up to a length of 12,291 bp, accounting for 60.6% of the genome. All the genes are encoded on the same strand. In addition, 4062 of the nucleotides in the genome are non-coding, with the largest single non-coding region (2647 bp) located between the tRNAGly and the tRNAGln. The other 29 non-coding regions range in size from 1 bp to 158 bp. There are two overlapping regions located between three adjacent tRNA genes in the genome: the genes for tRNAHis and tRNAGlu overlap by 1 bp, and tRNAGlu and tRNASer (UCN) overlap by 3 bp.

Table 1
Characteristics of the mitogenome of M. lusoria.

                 Position           Size                       Codon          Intergenic
Gene             From     To        Nucleotides  Amino acids   Start  Stop    nucleotides(a)
COI              1        1740      1740         579           ATG    TAA     77
tRNALeu (CUN)    1788     1852      65           -             -      -       47
ND1              1853     2755      903          300           ATG    TAG     0
ND2              2891     3946      1056         351           ATG    TAG     135
ND4L             4012     4311      300          99            ATA    TAA     65
tRNAIle          4392     4457      66           -             -      -       80
tRNAAsp          4521     4585      65           -             -      -       63
COII             4654     5814      1161         386           ATG    TAG     68
tRNAPro          5840     5905      66           -             -      -       25
Cyt b            5971     7197      1227         408           ATA    TAG     65
16S rRNA         7198     8641      1444         -             -      -       0
ATPase8          8642     8761      120          39            ATG    TAG     0
ND4              8763     10,124    1362         453           ATG    TAA     1
tRNAHis          10,134   10,195    62           -             -      -       9
tRNAGlu          10,195   10,260    66           -             -      -       −1
tRNASer (UCN)    10,258   10,322    65           -             -      -       −3
ATPase6          10,323   11,171    849          282           ATG    TAA     0
ND3              11,198   11,632    435          144           ATG    TAG     26
ND5              11,707   13,407    1701         566           ATA    TAG     74
ND6              13,458   13,988    531          176           GTA    TAA     50
tRNATrp          14,031   14,095    65           -             -      -       42
tRNAMet          14,098   14,164    67           -             -      -       2
tRNAVal          14,268   14,335    68           -             -      -       103
tRNALys          14,370   14,436    67           -             -      -       34
tRNAPhe          14,461   14,528    68           -             -      -       24
tRNALeu (UUR)    14,535   14,600    66           -             -      -       6
tRNAGly          14,628   14,691    64           -             -      -       27
tRNAGln          17,339   17,406    68           -             -      -       2647
tRNAArg          17,461   17,526    66           -             -      -       54
tRNAAsn          17,535   17,596    62           -             -      -       8
tRNAThr          17,686   17,750    65           -             -      -       89
12S rRNA         17,751   18,776    1026         -             -      -       0
tRNACys          18,935   19,001    67           -             -      -       158
tRNATyr          19,029   19,095    67           -             -      -       27
tRNASer (AGN)    19,143   19,207    65           -             -      -       47
COIII            19,209   20,114    906          301           ATG    TAG     1
tRNAAla          20,123   20,191    69           -             -      -       8

(a) Numbers correspond to the nucleotide positions separating different loci. Negative numbers indicate overlapping nucleotides between adjacent loci.

The M. lusoria mt genome encodes 4084 amino acids, excluding stop codons, for all the protein-coding genes. In M. lusoria all codons are used, but in varying frequencies. The most frequent codon is UUU (Phe; n=418), followed by GUU (Val; n=274). The lowest frequency was observed for CGC (Arg; n=1). Third codon positions show a strong bias for the bases A and U, which is a common feature of most bivalve genomes. In M. lusoria, A+U is present at the third position in 3023 codons (74%); this bias appears to be similar to that of M. petechialis (75.6%), and is slightly higher than that of other bivalves (Ren et al., 2009a).

3.2. Protein-coding genes, the putative atp8

The M. lusoria mt genome encodes the full set of 13 proteins. Except for the nad6, which starts with a GTA codon, all the protein-coding genes start with ATG or ATA (ATN), which is typical for metazoan mitogenomes. Five genes are terminated by TAA and eight by TAG (Table 1). No incomplete stop codons were identified.

We identified a short open reading frame located between the rrnL and nad4 genes, which encodes the complete atp8 in M. lusoria. The putative atp8 encodes a protein of 39 amino acids that starts with methionine and ends with a complete stop codon. The annotation of atp8 in bivalves and nematodes has been hampered by its high amino acid variability and by the lack of the conserved MPQL motif at the N-terminus, which is regarded as a common characteristic that might enable detection of the gene (Breton et al., 2010). In addition to its high degree of variation in amino acid composition, the ATP8 protein is quite heterogeneous in length among organisms (Gissi et al., 2008). The 39 aa ATP8 in M. lusoria is markedly shorter than typical metazoan ATP8 proteins, which range from 50 to 65 aa (Jameson et al., 2003). A long ATP8 protein of 112 aa has been found in Mytilus spp. (Breton et al., 2010). Although this gene is not highly conserved in the primary sequence, it is more conserved in the secondary structure, which is characterized by a hydrophobic N-terminus domain and a positively charged C-terminus domain (Papakonstantinou et al., 1996; Gray, 1999). An alignment of atp8 amino acid sequences from bivalve clams showed that the M. lusoria sequence, which starts with MAQF, has only weak similarity to the MPQL motif at the N-terminus, but that all sequences have a similar transmembrane spanning domain. The predicted transmembrane helix is in a similar position, and positively charged amino acids are observed in the C-terminal region (Fig. 2). A similar ORF is present in the same position in M. petechialis mtDNA, which was originally reported as part of rrnL (Ren et al., 2009a). Although the ATP8 amino acid composition of M. lusoria showed a high degree of variation (22%–48% identities) compared with other species of bivalve clams, the ATP8 of M. lusoria shared 97.4% amino acid similarity with the ORF of M. petechialis. Our results support the presence of atp8 in M. lusoria and M. petechialis.

Commonly, the atp6 and atp8 are found adjacent to one another on the same strand in metazoan mt genomes. In several taxa, e.g. Platyhelminthes and Nematoda, the atp8 gene is missing from the mt genome (Gissi et al., 2008). Putative atp8 genes appear sporadically in some species. Lavrov and Brown (2001) reported a similarly short atp8 encoding a 41 amino acid protein in the nematode Trichinella spiralis. Dreyer and Steiner (2006) reported a putative atp8 (encoding a 53 aa protein) in Hiatella arctica and a short atp8

Fig. 2. A) Alignment of ATP8 amino acid sequences from Bivalvia. Highly conserved amino acids are indicated by asterisks (*) and positively charged amino acids are shaded gray. Transmembrane domains are underlined. These sequences were obtained from NCBI under the following accession numbers: Cristaria plicata (NC_012716), Pyganodon grandis (NC_013661), Hyriopsis cumingii (NC_011763), Lampsilis ornata (NC_005335), Quadrula quadrula (NC_013658), Venustaconcha ellipsiformis (NC_013659), Lucinella divaricata (NC_013275), Loripes lacteus (NC_013271), M. petechialis (NC_012767), V. philippinarum (NC_003354) and Hiatella arctica (NC_008451). B) Prediction of transmembrane helices in ATP8 of M. lusoria by TMHMM Server v. 2.0.
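The size figures quoted for the protein-coding genes, including the 39-residue ATP8, can be cross-checked against the coordinates in Table 1. The sketch below is a minimal consistency check, assuming that every gene is a contiguous interval ending in a complete three-base stop codon (the text reports no incomplete stop codons). The helper names are hypothetical. Under these assumptions the 13 genes total 12,291 bp and 4084 encoded residues, the values given in Section 3.1.

```python
# (gene, first position, last position) for the 13 protein-coding genes,
# copied from Table 1.
PROTEIN_CODING = [
    ("COI", 1, 1740), ("ND1", 1853, 2755), ("ND2", 2891, 3946),
    ("ND4L", 4012, 4311), ("COII", 4654, 5814), ("Cyt b", 5971, 7197),
    ("ATPase8", 8642, 8761), ("ND4", 8763, 10124),
    ("ATPase6", 10323, 11171), ("ND3", 11198, 11632),
    ("ND5", 11707, 13407), ("ND6", 13458, 13988), ("COIII", 19209, 20114),
]
GENOME_LENGTH = 20268  # bp, from Section 3.1


def gene_length(first: int, last: int) -> int:
    """Length in bp of a gene given inclusive 1-based coordinates."""
    return last - first + 1


def encoded_residues(first: int, last: int) -> int:
    """Amino acids encoded, assuming one complete stop codon at the end."""
    return gene_length(first, last) // 3 - 1


total_bp = sum(gene_length(a, b) for _, a, b in PROTEIN_CODING)
total_aa = sum(encoded_residues(a, b) for _, a, b in PROTEIN_CODING)
atp8_aa = next(encoded_residues(a, b)
               for name, a, b in PROTEIN_CODING if name == "ATPase8")
coding_share = 100 * total_bp / GENOME_LENGTH
# 3023 codons with A or U at the third position, as reported in Section 3.1.
third_position_au = 100 * 3023 / total_aa

if __name__ == "__main__":
    print(f"protein-coding total: {total_bp} bp ({coding_share:.1f}% of genome)")
    print(f"encoded residues: {total_aa}; ATP8: {atp8_aa} aa")
    print(f"A+U at third codon position: {third_position_au:.1f}%")
```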

(encoding a 27 aa protein) for the scaphopod Siphonodentalium lobatum. Serb and Lydeard (2003) discussed a non-functional version of the atp8 in the freshwater mussel Inversidens, and Milbury and Gaffney (2005) described a potential remnant of the atp8 in the eastern oyster Crassostrea virginica. These patterns of atp6–atp8 coupling, separated atp6 and atp8, truncated atp8, and complete loss of the atp8 in these metazoan mt genomes clearly indicate that the loss of the atp8 occurred independently several times in metazoan evolution (Dreyer and Steiner, 2006). It is possible that the situation in M. lusoria represents an evolutionary stepping stone from the fully functional atp6–atp8 coupling to decoupled but complete genes.

3.3. Transfer and ribosomal RNA genes

Twenty-two tRNA genes are predicted in the M. lusoria mitogenome, which is typical for metazoans. They are interspersed in the mt genome, and range in size from 62 to 69 nucleotides. They can all be folded into the typical cloverleaf secondary structures, with several mismatched pairs within the acceptor and anticodon stems (Fig. 3). tRNASer (AGN) lacks the DHU arm; however, this feature has been frequently observed in metazoan mtDNAs (Wolstenholme, 1992). A highly homologous sequence, annotated as a non-coding region between the tRNATyr and the cox3 in the M. petechialis genome, can also be folded into a tRNASer (AGN). Thus, 23 tRNA genes, including the standard complement of 22 tRNA genes and a duplication of trnQ, can be identified in M. petechialis, which differs from the previously reported data. The new putative tRNA gene of M. petechialis is shown in Fig. 3.

Fig. 3. Cloverleaf structures of the 22 tRNA genes in the mitochondrial genome of M. lusoria and the potential secondary structure of a newly identified tRNA-Ser(AGN) in M. petechialis.

It is necessary to re-annotate the boundary of rrnS in some species of Veneroida. The alignment of Veneroida rrnS shows great differences in length at the 3′ ends of the predicted genes. The 12S genes of V. philippinarum and M. petechialis mtDNA are 1249 bp and 1187 bp in length, respectively, which are longer than those of Sinonovacula constricta (911 bp), L. lacteus (840 bp), L. divaricata (836 bp), and Acanthocardia tuberculata (824 bp). We observed a similar 23 bp-long sequence in the alignment of Veneroida rrnS (Fig. 4A). This 23 bp element was used in the identification of the Ciona intestinalis rrnS gene (Gissi et al., 2004); the motif is reported as part of a small stem–loop structure, with the loop implicated in tertiary interactions in the human rRNA secondary structure (Cannone et al., 2002). Furthermore, this structure is conserved in Caenorhabditis elegans and ascidian rrnS, although the primary sequence is not identical (Gissi et al., 2004). We also found that the conserved motif in the Veneroida rrnS alignment could be folded into a stem–loop structure (10-bp stem, 4-nucleotide loop) (Fig. 4B), which is similar to the 3′ subterminal stem–loops (10-bp stems, 4-nucleotide loops) found in the secondary structure models for all 16S and 16S-like RNAs, including the mitochondrial rrnS from C. elegans and Drosophila virilis (Cannone et al., 2002). This stem–loop structure was also used as the basis to infer the 3′ terminus of rrnS in Crassostrea virginica (Milbury and Gaffney, 2005). In M. lusoria, the stem–loop is formed by nucleotides 18,749 to 18,776, leaving a 4-nucleotide tail (Fig. 4B). Thus, we inferred that the rrnS gene is 1026 nucleotides long (17,751–18,776). The strong sequence similarity of the 3′ termini of rrnS in Veneroida clams also suggested that the more accurate putative lengths of rrnS are 1036 bp (13,141–14,176) in V. philippinarum, 1028 bp (16,949–17,976) in M. petechialis, and 923 bp (8466–9388) in S. constricta. Thus, a non-coding region was added in the V. philippinarum and M. petechialis mitochondrial genomes, respectively, truncating the previously long rrnS. On the other hand, the S. constricta rrnS needs to be extended by 12 nt at its 3′ end, into the cox3 as defined by the original authors.

As for the rrnL, its boundaries have been tentatively defined as immediately adjacent to the ends of the flanking genes, cytb and atp8. Boundaries were identified by sequence similarity to other Veneroida rrnL and by searching for the well-conserved heptamer TGGCAGA(N)5G box, the mitochondrial rRNA transcription termination signal, which is well conserved near the 3′ end of rrnL in a wide range of organisms (Valverde et al., 1994). This motif is present in the 5′ end of the atp8 in M. lusoria, which is adjacent to the rrnL (Fig. 5). The same situation also occurs in the close relatives M. petechialis and V. philippinarum, where the putative atp8 was annotated as part of rrnL by the authors. The heptamer termination motif is also found in the atp8 in S. constricta, in the tRNATyr in L. lacteus and L. divaricata, and in the tRNAAsn of the A. tuberculata mitochondrial genome. The first 7 nt of the box deduced for vertebrates, arthropods, and echinoderms are maintained, with two to four mismatches, in the seven Veneroida clams (Fig. 5). The gene arrangement is not identical among Veneroida clams, but the occurrence of the heptamer motif near the 3′ end of the large rRNA is independent of the adjacent gene. The putative boundaries of M. lusoria rrnS and rrnL cannot be precisely determined until transcript mapping is carried out.

3.4. The large non-coding region, VNTR and heteroplasmy

The mt genome of M. lusoria contains a large non-coding region of up to 2647 bp, which is longer than those of other Veneroida clams, such as M. petechialis, L. lacteus, L. divaricata, V. philippinarum, S. constricta, and A. tuberculata. The NCR has a slightly higher AT content (68.2%) than the average value of the whole genome (67.9%) of M. lusoria. Two distinct tandem repeat units were found in the largest non-coding region of the M. lusoria mt genome. One comprises 2.2

Fig. 4. A) Alignment of the 3′ region sequences of small subunit ribosomal DNAs. The numbers indicate the 5′ and 3′ boundary positions of rrnS published in NCBI. The number in brackets indicates the modified 3′ end boundary positions in this article. Positions involved in the stem–loop structure are shaded gray. These sequences were obtained from NCBI under the following accession numbers: M. petechialis (NC_012767), V. philippinarum (NC_003354), A. tuberculata (NC_008452), L. lacteus (NC_013271), L. divaricata (NC_013275) and S. constricta (NC_011075). B) Stem–loop structure of the 3′ end of rrnS folded by the gray shaded sequences.
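The revised rrnS boundaries discussed in Section 3.3 and shown in this figure can be checked with simple interval arithmetic. The sketch below uses only numbers quoted in the text. The dictionary layout and helper name are hypothetical, and the stem–loop check assumes that the quoted interval 18,749–18,776 includes the 4-nucleotide tail as well as the 10-bp stem and 4-nucleotide loop.

```python
# species: (revised first position, revised last position, stated length in bp)
REVISED_RRNS = {
    "M. lusoria": (17751, 18776, 1026),
    "V. philippinarum": (13141, 14176, 1036),
    "M. petechialis": (16949, 17976, 1028),
    "S. constricta": (8466, 9388, 923),
}

# Lengths of the earlier annotations, as quoted in Section 3.3.
PREVIOUS_LENGTH = {
    "V. philippinarum": 1249,
    "M. petechialis": 1187,
    "S. constricta": 911,
}


def span(first: int, last: int) -> int:
    """Number of nucleotides in an inclusive 1-based interval."""
    return last - first + 1


# Change in annotated length per species (negative means truncated).
length_change = {
    species: REVISED_RRNS[species][2] - old
    for species, old in PREVIOUS_LENGTH.items()
}

# A 10-bp stem uses 20 nucleotides; the loop adds 4 more.
stem_loop_nt = 2 * 10 + 4
# Interval quoted for the M. lusoria stem-loop region.
quoted_region_nt = span(18749, 18776)
tail_nt = quoted_region_nt - stem_loop_nt

if __name__ == "__main__":
    for species, (first, last, stated) in REVISED_RRNS.items():
        print(f"{species}: {span(first, last)} bp (stated {stated} bp)")
    print("length changes:", length_change)
    print(f"stem-loop {stem_loop_nt} nt + tail {tail_nt} nt")
```

Under these assumptions every revised interval matches its stated length, the S. constricta gene grows by the 12 nt mentioned in the text, and the quoted stem–loop interval leaves the reported 4-nucleotide tail.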

Fig. 5. Conservation of the rrnL gene termination signal in mtDNA from Veneroida clams. The gene or region of the mitochondrial genome immediately 3′ to the rrnL is shown schematically on the right; the conserved heptanucleotide sequence is boxed and hatched. The gray shading indicates the mismatched base pair with the first 7 nt of the conserved heptanucleotide sequence. These sequences were obtained from NCBI under the following accession numbers: M. petechialis (NC_012767), V. philippinarum (NC_003354), S. constricta (NC_011075), L. lacteus (NC_013271), L. divaricata (NC_013275) and A. tuberculata (NC_008452).

nearly identical copies of a 15 bp unit, which occurs 506 nt from the 5′ end of the non-coding region. The other comprises 10 nearly identical copies of a 101 bp sequence and one incomplete motif with direct tandem orientation, which is located at the 3′ end of this region (Fig. 1A). Furthermore, the 101 bp tandem repeat motif of the NCR forms a secondary structure with a stem–loop when the sequence is folded to minimize the free energy of the structure (Fig. 6). The organization of the repeat array differs among animal mtDNAs. For example, in European sea bass (Dicentrarchus labrax), the array consists of two complete repeat elements, 17 bp and 48 bp motifs, which are separated by an ‘imperfect’ 17-bp repeat (Cesaroni et al., 1997). In crickets and nematodes the array consists of two or more ‘complete’ repeats flanked with ‘incomplete’ runs of the repeat (La Roche et al., 1990). Tandem repeats are also present in other Veneroida mitochondrial genomes; e.g., M. petechialis has a 585-bp tandem repeat comprising two nearly identical motifs in the area covering three adjacent non-coding regions (Ren et al., 2009a). Each of the motifs contains one tRNAGln; thus the second copy of tRNAGln occurs in the duplicated motif (Fig. 1B). The two closely related species M. lusoria and M. petechialis show great divergence in the tandem repeat region. Moreover, the M. lusoria specific primers designed in the NCR region (Fig. 1A) failed, as expected, to amplify a product in M. petechialis because of the great sequence differences in the region. This region should therefore be a useful marker for the specific identification of M. lusoria. V. philippinarum has 2.5 copies of an imperfect 80 bp unit and four tandem repeats of an imperfect 203 bp unit between the nad2 and the nad4L gene (Fig. 1C). A. tuberculata contains a 599 bp fragment composed of 3.5 nearly identical copies of a 167 bp motif between tRNAMet and tRNAHis (Fig. 1D).

The pattern of VNTRs (variable number of tandem repeats) in the NCR region of M. lusoria mtDNA showed that the copy number of the 101 bp tandem repeat element varied from 4 to 18 copies when the sequence was compared among eight individuals from the Sanya and Xiamen populations. The size of the CR (control region) ranged from 1942 to 3354 bp depending on the copy number of the repeat sequence. Three of the eight individuals were heteroplasmic, with amplifications that produced multiple bands. These individuals carried 2–5 length variants. More populations and individual samples are required to assess the heteroplasmy and genetic diversity between populations of M. lusoria.
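The reported size range of the control region can be related to the repeat copy number with a short calculation. The sketch below is a consistency check only, using figures quoted in the text and abstract: the 101 bp motif, the 1087 bp repeat array, the 2647 bp NCR, and the observed extremes of 4 to 18 copies and 1942 to 3354 bp. It assumes, as a simplification, that the smallest and largest control regions correspond to the lowest and highest copy numbers; the variable names are hypothetical.

```python
MOTIF_BP = 101          # tandem repeat unit
REPEAT_ARRAY_BP = 1087  # repeat array in the sequenced genome
NCR_BP = 2647           # major non-coding region in the sequenced genome

CR_MIN_BP, CR_MAX_BP = 1942, 3354  # observed control-region sizes
COPIES_MIN, COPIES_MAX = 4, 18     # observed copy-number range

# Size change per added copy implied by the two extremes
# (assumes the extremes pair up as described in the lead-in).
implied_bp_per_copy = (CR_MAX_BP - CR_MIN_BP) / (COPIES_MAX - COPIES_MIN)

# Copy number and share of the NCR in the sequenced reference individual.
reference_copies = REPEAT_ARRAY_BP / MOTIF_BP
repeat_share_pct = 100 * REPEAT_ARRAY_BP / NCR_BP

if __name__ == "__main__":
    print(f"implied size per copy: {implied_bp_per_copy:.1f} bp "
          f"(motif is {MOTIF_BP} bp)")
    print(f"reference array: {reference_copies:.1f} copies, "
          f"{repeat_share_pct:.0f}% of the NCR")
```

The implied increment of about 101 bp per copy agrees with the motif length, and the reference values agree with the "nearly 11 copies" and "approximately 41%" stated in the abstract.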

Fig. 6. Stem–loop structure of the tandem repeat motif in the large non-coding region of M. lusoria.

3.5. Gene arrangement and phylogenetic analyses

Generally, gene order rearrangements in mt genomes are rare in Metazoa and, if shared by two taxa, can be considered molecular synapomorphies that might provide useful data for phylogenetic reconstruction (Grande et al., 2008). On the other hand, Mollusca, the second largest animal phylum, show frequent and extensive variation in gene arrangement (Ren et al., 2009b). Bivalves show more gene order variation in the mt genome than other sampled Mollusca (Serb and Lydeard, 2003). The gene orders of the two scallop species Argopecten irradians and Chlamys farreri appear almost completely rearranged (Ren et al., 2009b). Here, we focus on the gene arrangement of M. lusoria in the order Veneroida. The gene arrangement of M. lusoria mtDNA was completely identical to that of M. petechialis, except for the duplicated tRNAGln in M. petechialis. If we ignore the frequent translocation of tRNA genes and the lack of the atp8, the gene arrangements among M. lusoria, M. petechialis, and V. philippinarum, which belong to the same family, Veneridae, are the same except for translocation of the rrnS and cox3. In addition, L. divaricata and L. lacteus mtDNAs share the same gene

Fig. 7. Phylogenetic relationships among Veneroida from the ML analysis based on data set of 12 protein-coding genes (except atp8) of mitochondrial genomes and gene arrangement map of Veneroida mitochondrial genomes. All genes are transcribed from left-to-right. The bars show identical gene blocks. The tRNA and atp8 gene are not represented. Gene segments are not drawn to scale. arrangement in the same family, Lucinidae. However, even without References considering the tRNA and atp8 genes, the gene arrangements of the Abascal, F., Zardoya, R., Posada, D., 2005. ProtTest: selection of best-fit models of protein Veneroida order have many potential autapomorphic characteristics. evolution. Bioinformatics 21, 2104–2105. No general patterns could be inferred for the observed intra-order Benson, G., 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic diversity of gene arrangements in the Veneroida. The Veneridae Acids Res. 27, 573–580. Breton, S., Beaupré, H.D., Stewart, D.T., Hoeh, W.R., Blier, P.U., 2007. The unusual system family shares only one gene block, cytb-rrnL-nad4-atp6 with A. of doubly uniparental inheritance of mtDNA: isn't one enough? Trends Genet. 23, tuberculata. The gene block 12s-cox3-cox1-nad1 is complete in M. 465–474. lusoria and M. petechialis, whereas two small separated gene blocks, Breton, S., Stewart, D.T., Hoeh, W.R., 2010. Characterization of a mitochondrial ORF from fi 12s-cox3 and cox1-nad1, appear in S. constricta. The gene order the gender-associated mtDNAs of Mytilus spp. (Bivalvia: Mytilidae): identi cation of the “missing” ATPase 8 gene. Mar. Geonomics 3, 11–18. patterns are completely different between the three families, Burger, G., Lavrov, D.V., Forget, L., Lang, B.F., 2007. Sequencing complete mitochondrial Veneridae, Lucinidae, and Solecurtidae (Fig. 7). Based on all the and plastid genomes. Nat. Protoc. 2, 603–614. available Veneroida data so far is shown (http://www.ncbi.nlm.nih. 
Cannone, J.J., Subramanian, S., Schnare, M.N., Collett, J.R., D'Souza, L.M., Du, Y., Feng, B., Lin, N., Madabusi, L.V., Muller, K.M., Pande, N., Shang, Z., Yu, N., Gutell, R.R., 2002. gov), maximum-likelihood (ML) analysis was conducted on amino The Comparative RNA Web (CRW) site: an online database of comparative acid data from 12 protein-coding genes (except atp8) to generate a sequence and structure information for ribosomal, intron, and other RNAs: putative topology (Fig. 7). Phylogenetic tree showed that Veneridae correction. BMC Bioinform. 3, 15. Cao, L., Ort, B.S., Mizi, A., Pogson, G., Kenchington, E., Zouros, E., Rodakis, G.C., 2009. The family species including M. petechialis, M. lusoria and V. philippinarum control region of maternally and paternally inherited mitochondrial genomes of were clustered together, and then had a closer relationship with three species of the sea mussel genus Mytilus. Genetics 181, 1045–1056. Cardioidea family, Tellinoidea family and Lucinoidea family. The gene Cesaroni, D., Venanzetti, F., Allegrucci, G., Sbordoni, V., 1997. Mitochondrial DNA length variation and heteroplasmy in natural populations of the European Sea Bass, order matrix data, excluding the tRNA and atp8 genes, also supported Dicentrarchus labrax. Mol. Biol. Evol. 14, 560–568. the relationships within the Veneroida (data not shown). Clearly, an Cheng, S., Chang, S.Y., Gravitt, P., Respress, R., 1994. Long PCR. Nature 369, 684–685. analysis that uses both arrangement data as well as sequence data has Cserzo, M., Wallin, E., Simon, I., von Heijne, G., Elofsson, A., 1997. Prediction of transmembrane alpha-helices in procariotic membrane proteins: the Dense the potential to be more informative. Other mitochondrial informa- Alignment Surface method. Prot. Eng. 10, 673–676. tion, such as loss of the atp8, duplication events, and truncations, are Dreyer, H., Steiner, G., 2006. 
The complete sequences and gene organisation of the potentially useful and promising for inferring the phylogeny of the mitochondrial genomes of the heterodont bivalves Acanthocardia tuberculata and — fi major bivalve taxa. Hiatella arctica and the rst record for a putative Atpase subunit 8 gene in marine bivalves. Front. Zool. 3, 13. Ewing, B., Green, P., 1998. Basecalling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 8, 186–194. Acknowledgments Ewing, B., Hillier, L., Wendl, M., Green, P., 1998. Basecalling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 8, 175–185. Folmer, O., Black, M., Hoeh, R., Lutz, R.A., Vrijenhoek, R., 1994. DNA primers for This work was supported by the Chinese National High-Tech R & D amplification of mitochondrial cytochrome c oxidase subunit I from diverse Program (2006AA10A410), the National Basic Research Program of metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 3, 294–299. Gissi, C., Iannelli, F., Pesole, G., 2004. Complete mtDNA of Ciona intestinalis reveals China (2010CB126403) and the Knowledge Innovation Program of extensive gene rearrangement and the presence of an atp8 and an extra trnM gene Institute of Oceanology, CAS under contract No. 2007-12. in Ascidians. J. Mol. Evol. 58, 376–389. 264 H. Wang et al. / Comparative Biochemistry and Physiology, Part D 5 (2010) 256–264

Gissi, C., Iannelli, F., Pesole, G., 2008. Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity 101, 301–320.
Gordon, D., Abajian, C., Green, P., 1998. Consed: a graphical tool for sequence finishing. Genome Res. 8, 195–202.
Grande, C., Templado, J., Zardoya, R., 2008. Evolution of gastropod mitochondrial genome arrangements. BMC Evol. Biol. 8, 61.
Gray, M.W., 1999. Evolution of organellar genomes. Curr. Opin. Genet. Dev. 9, 678–687.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.
Habe, T., 1977. Systematics of Mollusca in Japan, Bivalvia and Scaphopoda (in Japanese). Hokyoryukan, Tokyo, pp. 147–270.
Hoffmann, R.J., Boore, J.L., Brown, W.M., 1992. A novel mitochondrial genome organization for the blue mussel, Mytilus edulis. Genetics 131, 397–412.
Jameson, D., Gibson, A.P., Hudelot, C., Higgs, P.G., 2003. OGRe: a relational database for comparative analysis of mitochondrial genomes. Nucleic Acids Res. 31, 202–206.
Jukes-Browne, A.J., 1914. A synopsis of the family Veneridae Part I and II. Proc. Malac. Soc. London 11, 58–94.
La Roche, J., Snyder, M., Cook, D.I., Fuller, K., Zouros, E., 1990. Molecular characterization of a repeat element causing large-scale size variation in the mitochondrial DNA of the sea scallop Placopecten magellanicus. Mol. Biol. Evol. 7, 45–64.
Laslett, D., Canbäck, B., 2008. ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics 24, 172–175.
Lavrov, D.V., Brown, W.M., 2001. Trichinella spiralis mtDNA: a nematode mitochondrial genome that encodes a putative ATP8 and normally structured tRNAs and has a gene arrangement relatable to those of coelomate metazoans. Genetics 157, 621–637.
Lavrov, D.V., Brown, W.M., Boore, J.L., 2004. Phylogenetic position of the Pentastomida and (pan) crustacean relationships. Proc. Biol. Sci. 271, 537–544.
Lowe, T.M., Eddy, S.R., 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964.
Mignotte, F., Gueride, M., Champagne, A.M., Mounolou, J.C., 1990. Direct repeats in the noncoding region of rabbit mitochondrial DNA: involvement in the generation of intra and inter-individual heterogeneity. Eur. J. Biochem. 194, 561–571.
Milbury, C.A., Gaffney, P.M., 2005. Complete mitochondrial DNA sequence of the eastern oyster Crassostrea virginica. Mar. Biotechnol. 7, 697–712.
Mizi, A., Zouros, E., Moschonas, N., Rodakis, G.C., 2005. The complete maternal and paternal mitochondrial genomes of the Mediterranean mussel Mytilus galloprovincialis: implications for the doubly uniparental inheritance mode of mtDNA. Mol. Biol. Evol. 22, 952–967.
Moore, W.S., 1995. Inferring phylogenies from mtDNA variation: mitochondrial gene trees versus nuclear-gene trees. Evolution 49, 718–726.
Moritz, C., Brown, W.M., 1987. Tandem duplications in animal mitochondrial DNAs: variation in incidence and gene content among lizards. Proc. Natl Acad. Sci. USA 84, 7183–7187.
Papakonstantinou, T., Law, R.H., Nesbitt, W.S., Nagley, P., Devenish, R.J., 1996. Molecular genetic analysis of the central hydrophobic domain of subunit 8 of yeast mitochondrial ATP synthase. Curr. Genet. 30, 12–18.
Powers, T.O., Platzer, E.G., Hyman, B.C., 1986. Large mitochondrial genome and mitochondrial DNA size polymorphism in the mosquito parasite, Romanomermis culicivorax. Curr. Genet. 11, 71–77.
Rand, D.M., Harrison, R.G., 1989. Molecular population genetics of mtDNA size variation in crickets. Genetics 121, 551–569.
Ren, J., Shen, X., Sun, M., Jiang, F., Yu, Y., Chi, Z., Liu, B., 2009a. The complete mitochondrial genome of the clam Meretrix petechialis (Mollusca: Bivalvia: Veneridae). Mitochondrial DNA 20, 78–87.
Ren, J., Shen, X., Jiang, F., Liu, B., 2009b. The mitochondrial genomes of two scallops, Argopecten irradians and Chlamys farreri (Mollusca: Bivalvia): the most highly rearranged gene order in the family Pectinidae. J. Mol. Evol. 70, 57–68.
Serb, J.M., Lydeard, C., 2003. Complete mtDNA sequence of the North American freshwater mussel, Lampsilis ornata (Unionidae): an examination of the evolution and phylogenetic utility of mitochondrial genome organization in Bivalvia (Mollusca). Mol. Biol. Evol. 20, 1854–1866.
Simon, D.L., Larget, B., 2004. BADGER, version 1.01 beta. Department of Mathematics and Computer Science, Duquesne University. [http://badger.duq.edu/].
Tamura, K., Dudley, J., Nei, M., Kumar, S., 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596–1599.
Theologidis, I., Fodelianakis, S., Gaspar, M.B., Zouros, E., 2008. Doubly uniparental inheritance (DUI) of mitochondrial DNA in Donax trunculus (Bivalvia: Donacidae) and the problem of its sporadic detection in Bivalvia. Evolution 62, 959–970.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Valverde, J.R., Marco, R., Garesse, R., 1994. A conserved heptamer motif for ribosomal RNA transcription termination in animal mitochondria. Proc. Natl Acad. Sci. USA 91, 5368–5371.
Wilkinson, G.S., Chapman, A.M., 1991. Length and sequence variation in evening bat D-loop mtDNA. Genetics 128, 607–617.
Wolstenholme, D.R., 1992. Genetic novelties in mitochondrial genomes of multicellular animals. Curr. Opin. Genet. Dev. 2, 918–925.
Yoosukh, W., Matsukuma, A., 2001. Taxonomic study on Meretrix (Mollusca: Bivalvia) from Thailand. Spec. Publ. Phuket. Mar. Biol. Cent. 25, 451–460.
Zbawickaa, M., Burzyńskia, A., Wenne, R., 2007. Complete sequences of mitochondrial genomes from the Baltic mussel Mytilus trossulus. Gene 406, 191–198.
Zhuang, Q., 2001. Fauna Sinica: Invertebrata: Mollusca: Bivalvia: Veneridae. Science Press, Beijing, China, pp. 229–236.
Zouros, E., Oberhauser, B.A., Saavedra, C., Freeman, K.R., 1994. An unusual type of mitochondrial DNA inheritance in the blue mussel Mytilus. Proc. Natl Acad. Sci. USA 91, 7463–7467.
Zuker, M., 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415.