
ARTICLE in GENE · AUGUST 2012 Impact Factor: 2.14 · DOI: 10.1016/j.gene.2012.08.037 · Source: PubMed



Gene 510 (2012) 22–31


A unique tRNA gene family and a novel, highly expressed ORF in the mitochondrial genome of the silver-lip pearl oyster, Pinctada maxima (Bivalvia: Pteriidae)

Xiangyun Wu a, Xiaoling Li a,b, Lu Li a, Ziniu Yu a,⁎

a Key Laboratory of Marine Bio-resource Sustainable Utilization, Chinese Academy of Sciences, Guangdong Key Laboratory of Applied Marine Biology, South China Sea Institute of Oceanology, Chinese Academy of Sciences, Guangzhou 510301, China
b Graduate School of the Chinese Academy of Sciences, 19A Yuquan Road, Beijing 100049, China

Article info

Article history: Accepted 23 August 2012. Available online 31 August 2012.

Keywords: Pinctada maxima; Mitochondrial genome; Pearl oyster; tRNA gene recruitment; atp8; Control region

Abstract

Characteristics of mitochondrial (mt) DNA such as gene content and arrangement, as well as mt tRNA secondary structure, are frequently used in comparative genomic analyses because they provide valuable phylogenetic information. However, most analyses do not characterize the relationships among tRNA genes from the same mt genome and, in some cases, analyses overlook possible novel open reading frames (ORFs) when the 13 expected protein-coding genes are already annotated. In this study, we describe the sequence and characterization of the complete mt genome of the silver-lip pearl oyster, Pinctada maxima. The 16,994-bp mt genome contains the same 13 protein-coding genes (PCGs) and two ribosomal RNA genes typical of metazoans. The gene arrangement, however, is completely distinct from that of all other available bivalve mt genomes, and a unique tRNA gene family is observed in this genome. The unique tRNA gene family includes two trnS−AGY and two trnQ genes and a trnM isomerism, but it lacks trnS−CUN. We also report the first clear evidence of alloacceptor tRNA gene recruitment (trnP→trnS−AGY) in mollusks. In addition, a novel ORF (orfUR1) expressed at high levels is present in the mt genome of this pearl oyster. This gene contains a conserved domain, "Oxidored_q1_N", which is a member of Complex I and thus may play an important role in key biological functions. Because orfUR1 has a very similar composition and codon bias to that of other genes in this genome, we hypothesize that this gene may have been moved to the mt genome via gene transfer from the nuclear genome at an early stage of speciation of P. maxima, or it may have evolved as a result of gene duplication, followed by rapid sequence divergence. Lastly, a 319-bp region was identified as the possible control region (CR) even though it does not correspond to the longest non-coding region in the genome. Unlike other studies of mt genomes, this study compares the evolutionary patterns of all available bivalve mt tRNA and atp8 genes.

© 2012 Elsevier B.V. All rights reserved.

Abbreviations: atp6 and atp8, ATPase subunit 6 and 8 genes; cob, cytochrome b gene; cox1–3, cytochrome c oxidase subunit I–III genes; nad1–6 and 4L, NADH dehydrogenase subunits 1–6 and 4L genes; rRNA, ribosomal RNA; rrnL and rrnS, large and small subunits of ribosomal RNA genes; tRNA, transfer RNA; trnM, Methionine transfer RNA gene.

⁎ Corresponding author. Tel./fax: +86 20 8910 2507. E-mail address: [email protected] (Z. Yu).

1. Introduction

Though only 55 bivalve mitochondrial (mt) genomes are available in GenBank, they are considered to provide an extreme example of gene rearrangements that occur even among species from the same genus (Milbury and Gaffney, 2005; Xu et al., 2012). Unlike mt gene rearrangements in insects (e.g. Dowton and Austin, 1999), amphibians (e.g. Kurabayashi et al., 2005; Zhang et al., 2008), and even gastropods (e.g. Grande et al., 2008; Rawlings et al., 2010), the mechanism(s) underlying the high level of genome rearrangement seen in the Bivalvia are currently unknown. It is very difficult to trace the evolutionary pattern of these rearrangements by comparative analyses using the limited number of representatives currently available (Boore et al., 2004). Therefore, it is not easy to use bivalve mt genomic architecture as a character for phylogenetic inference at high taxonomic levels (e.g., between different orders). Our recent comparative mt genomic analyses of four intrageneric clams (Paphia spp.) revealed that mt genome reorganization among congeneric species is not random but follows phylogenetic trends (Xu et al., 2012).

Furthermore, there are many special features of bivalve mt genomes that do provide useful characters for evolutionary studies. Wu et al. (2009) reported the loss of a large number of tRNA genes in three scallops (Mimachlamys nobilis, Mizuhopecten yessoensis, and Chlamys farreri), while Smith and Snyder (2007) found that the scallop, Placopecten magellanicus, contains 23 additional tRNA genes. Several bivalve lineages (e.g. some clams) lack the atp8 gene, while in Mytilus species and in freshwater mussels a fourteenth mt gene,

0378-1119/$ – see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.gene.2012.08.037


Fig. 1. Gene content and organization of Pinctada maxima mitochondrial genome, and assembly indication of the six overlapping large fragments (①–⑥) through PCR amplification. All genes are encoded by the H-strand. Protein coding and rRNA genes are abbreviated as in the text, and transfer RNA genes are depicted by their corresponding one-letter code.

thought to function in doubly uniparental inheritance (DUI), was identified (Breton et al., 2011a, 2011b).

Previous studies provide the basis for questions to which further genome annotation can be directed. For example, the loss of the atp8 gene in many bivalves raises the following questions: have these species completely lost the atp8 gene? Has the gene been transferred to the nucleus? Or do these bivalves possess a highly modified version of the gene that has been overlooked during genome annotation? Similarly, previous comparative mt genomic analyses focused primarily on the differences in number and/or location of tRNA genes among different genomes but usually did not survey relationships among the copies of each tRNA gene from the same genome. Such investigation will not only provide direct evidence for tRNA gene recruitment (Lavrov and Lang, 2005; Saks et al., 1998; Wang and Lavrov, 2011) but also help us to understand the organization of tRNA gene duplication. Lastly, once all 13 expected protein-coding genes (PCGs) had been annotated, the remaining mt regions were typically classified as intergenic and "non-coding" (Breton et al., 2011b). A novel mt open reading frame (ORF) would thus be overlooked if it lacked significant similarity to known proteins.

A total of sixteen mt genomes from two families of the order Pterioida have been sequenced thus far; in this study, we report the complete mt genome of the silver-lip pearl oyster, Pinctada maxima. This mt genome sequence is the first representative from the third family, Pteriidae, in the order Pterioida. A unique tRNA gene set lacking trnS−CUN but containing two trnS−AGY genes and an additional trnQ gene was observed in this genome. We compare all available bivalve mt tRNA genes and discuss their evolutionary patterns. We also report the first clear evidence for alloacceptor tRNA gene recruitment in mollusks. In addition, we describe a novel, highly expressed gene and discuss its potential function and possible origin.

2. Materials and methods

2.1. Specimens, DNA extraction, PCR amplification and sequencing

Whole genomic DNA was extracted from adductor muscles of P. maxima using the TIANamp Marine Animals DNA kit (Tiangen, Beijing). Short fragments from the genes cox1, cob, atp6, nad2, rrnS and nad5 were amplified by PCR with universal primer pairs designed based on the alignment of the published bivalve mt genome sequences. Based on the sequences of these fragments, long-PCR primers were designed and employed to amplify overlapping segments of the entire mt genome (Supplementary file 1). PCR reactions were performed in a 25 μL volume with 0.5 μL template DNA (approximately 30 ng), 2.5 μL of 10× LA-buffer (Mg2+ plus), 0.5 μL of 10 mM dNTP mix, 1 μL of each primer (10 μM), and 0.25 μL (1 U) of LA Taq polymerase (Takara, Dalian, China). The PCR reactions were performed on an ABI Veriti thermal cycler (Applied Biosystems, California, USA) with the following parameters: pre-denaturation at 94 °C for 1 min followed by 35 cycles of 94 °C for 20 s, 45–55 °C annealing temperature for 20 s, extension at 68 °C for 2–8 min, and a final extension step at 68 °C for 10 min. PCR products were separated by electrophoresis on a 1.0% agarose gel, purified with the QIAquick PCR Purification kit (QIAGEN, California, USA) and bi-directionally sequenced using a primer-walking strategy on an ABI 3730xl DNA Sequencer (Applied Biosystems, California, USA).

2.2. Sequence assembly, annotation and analysis

Sequences were assembled using the SeqMan program (DNAstar, Madison, Wisconsin). Manual examinations were applied to ensure correct assembly. Protein-coding genes (PCGs) were identified by


Fig. 2. (A) Neighbor-joining tree based on uncorrected p distances among mitochondrial tRNA genes from Pinctada maxima. Portions of the tree discussed in the main text are in color; (B) secondary structure of the hypothesized trnM isomerism. Canonical and G–T base pairs are differently indicated; (C) alignment of trnQ1 and trnQ2 gene sequences (above), and that of trnP and trnS1 gene sequences (below). tRNA secondary structure is displayed above the alignment, and the position of the loop regions is marked with red circles.

comparison with orthologous mt DNA sequences of previously published bivalves. The ends of rrnL and rrnS genes were assumed to extend to the boundaries of their flanking genes. tRNA genes were identified by the programs tRNAscan-SE 1.21 (Lowe and Eddy, 1997) and ARWEN (Laslett and Canbäck, 2008). Mt genome maps were generated by CGview (Stothard and Wishart, 2005). The complete mt genome sequence has been deposited in GenBank under accession number GQ452847.

MEGA 4.0 (Tamura et al., 2007) was used for sequence alignments. Examination of ORFs was performed using the NCBI ORF Finder program (http://www.ncbi.nlm.nih.gov/projects/gorf/). Sequence similarity searches were performed in GenBank using BlastX and PSI-Blast (Altschul et al., 1997) against the following databases: 1) nonredundant protein sequences, 2) SWISSPROT, 3) protein data bank, and 4) environmental samples. Protein domains were predicted with the Simple Modular Architecture Research Tool (SMART) program (http://smart.embl-heidelberg.de/). The relative synonymous codon usage (RSCU) values of the 13 PCGs and of the newly identified ORF were calculated with MEGA 4.0. The presence and location of a transmembrane hydrophobic helix in the atp8 protein were investigated using the TMHMM Server v. 2.0 (http://www.cbs.dtu.dk/services/TMHMM/) and DAS software (http://www.sbc.su.se/~miklos/DAS/). We utilized the software package GraphDNA (Thomas et al., 2007) to identify features associated with the control region (CR). Included in the package is the DNA-Walk method (Lobry, 1996), an inferential technique that generates a graphical representation of cumulative skew across a complete genome by assigning a direction to each consecutive nucleotide in the sequence (C = North, T = East, G = South, and A = West). We also utilized Tandem Repeats Finder (Benson, 1999) and Mfold (http://mfold.rna.albany.edu/?q=mfold) to locate structures characteristic of the CR (i.e., tandem repeats and stem-loop structures, respectively). A neighbor-joining tree of tRNA genes based on p-distances was constructed using the MEGA 4.0 program.

3. Results and discussion

The mt genome of the silver-lip pearl oyster, P. maxima, is a 16,994-bp circular molecule that contains the expected 13 PCGs and two ribosomal RNA genes typical of metazoan mt genomes (Fig. 1). However, several unusual features were observed in this genome: 1) a total of 24 tRNA genes, including two copies of each of the following genes: trnS−AGY, trnM and trnQ; 2) a novel ORF (orfUR1) encoding a putative protein with 320 amino acids located between the trnS1−AGY and trnQ1 genes; 3) a putative control region (CR) of only 319 bp between the trnP and trnD genes; and 4) a gene arrangement distinct from those of all other available bivalve mt genomes.

3.1. A unique tRNA gene family

Based on their aminoacylation identity, tRNAs are subdivided into 20 amino acid accepting groups (alloacceptors). Metazoan mt genomes are able to translate all codons with as few as 22 tRNAs, with two genes (trnL and trnS) independently comprising two tRNAs (isoacceptors) that translate synonymous codons (Saks et al., 1998). However, the presence of two trnM genes has been treated as a common feature of bivalve mt genomes (Xu et al., 2012; Yu and Li, 2012). It is reasonable to infer that the existence of two trnM genes per mt genome originates from a gene duplication event at an early stage of bivalve radiation. Xu et al. (2012) found that the paralogous similarity of the two trnM genes from different bivalve mt genomes is highly variable, ranging


Fig. 3. A comparison of all available 56 bivalve tRNA genes to show abnormal gene families being produced via gene loss (black block), complete duplication (blue block), gene multiplication (brown block), isomerism (red point), and recruitment (blue triangle).

from as low as 42.4% similarity in Acanthocardia tuberculata to as high as 76.8% in Paphia undulata (Xu et al., 2012). Differences in paralogous divergence of the two trnM genes may reflect diversity in the ability of different lineages to maintain the double trnM genes during concerted evolution after their early duplication. In this study, the two trnM genes share the lowest sequence similarity (34.3%) reported in bivalves so


Fig. 4. (A) The full-length nucleotide and deduced amino acid sequence of orfUR1; (B) result of database searches for conserved amino acid patterns by BlastX and PSI-Blast in NCBI; (C) comparison of relative synonymous codon usage (RSCU) in 13 protein coding genes (left column) and that in orfUR1 (right column). Codon families are provided on the x axis.
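The RSCU comparison in panel (C) rests on a simple ratio: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is only an illustration of that ratio, not the MEGA 4.0 procedure used in the study. It assumes NCBI translation table 5 (invertebrate mitochondrial) and groups codons by amino acid rather than by the codon families plotted in the figure; the function name `rscu` is hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Assumption: NCBI translation table 5 (invertebrate mitochondrial),
# listed in the conventional TTT, TTC, TTA, TTG, TCT, ... codon order.
TABLE5 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), TABLE5)}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TO_AA)

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TO_AA.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for members in families.values():
        total = sum(counts[c] for c in members)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(members)  # equal-usage expectation
        for codon in members:
            values[codon] = counts[codon] / expected
    return values
```

An RSCU of 1.0 means a codon is used exactly as often as expected under equal usage; values above 1.0 indicate preferred codons. Comparing such profiles between a candidate ORF and the annotated PCGs is the kind of check the figure summarizes.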

far, in spite of the fact that these two genes are still identified as a sister group on the phylogenetic tree by the algorithms used in this study (Fig. 2A). Furthermore, the putative secondary structure of trnM1 lacks a TψC arm (Fig. 2B). This first report of trnM isomerism in bivalves also suggests a relatively poor degree of maintenance of paralogous trnM genes in one genome.

It is widely believed that different alloacceptors evolved before the last common ancestor of all living organisms (Lavrov and Lang, 2005; Xue et al., 2003). However, it is also possible for tRNA genes to evolve independently of the genetic code by changing both the anticodon and acceptor identities of duplicated alloacceptor tRNA genes (Cedergren et al., 1980; Lavrov and Lang, 2005). The possibility of such a process, called "tRNA gene recruitment", has been demonstrated experimentally in Escherichia coli (Saks et al., 1998) and unambiguously observed in the mt genomes of several organisms (Lavrov and Lang, 2005; Wang and Lavrov, 2011). In this study, phylogenetic


Fig. 5. Evidence for the existence of the atp8 gene in the Pinctada maxima mitochondrial genome. (A) Alignment of putative ATP8 amino acid sequences from P. maxima (Pmax) and P. margaritifera (Pmar). Highly conserved amino acids are indicated by asterisks; (B) nucleotide pair frequencies between P. maxima and P. margaritifera. ii = identical pairs; si = transitional pairs; sv = transversional pairs; total = si+sv; (C) prediction of transmembrane helices in ATP8 by TMHMM Server v. 2.0; (D) prediction of transmembrane helices in ATP8 by DAS program.
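The tally in panel (B) can be illustrated with a short sketch. It counts identical (ii), transitional (si) and transversional (sv) pairs at each codon position for two aligned, in-frame coding sequences, which is the information needed to check a 3rd > 1st > 2nd substitution pattern. This is an illustrative reimplementation under stated assumptions, not the tool used in the study: the function name is hypothetical, the alignment is assumed to start at codon position 1, and columns containing gaps or ambiguous bases are skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pair_counts_by_codon_position(seq_a, seq_b):
    """Count ii/si/sv pairs per codon position (1, 2, 3) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    counts = {pos: {"ii": 0, "si": 0, "sv": 0} for pos in (1, 2, 3)}
    for index, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: ignore gaps and ambiguity codes
        position = index % 3 + 1
        if a == b:
            kind = "ii"
        elif {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            kind = "si"  # purine<->purine or pyrimidine<->pyrimidine
        else:
            kind = "sv"  # purine<->pyrimidine
        counts[position][kind] += 1
    return counts


if __name__ == "__main__":
    # Toy sequences, not data from the paper.
    result = pair_counts_by_codon_position("ATGGCAAAA", "ATGGCCAAG")
    for position in (1, 2, 3):
        print(position, result[position])
```

For a gene under purifying selection, one would expect the si + sv total to be largest at the third codon position and smallest at the second.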

analyses indicate a very close relationship between trnP and trnS1−AGY (Fig. 2A), and from the alignments of these two tRNA sequences, we can see that the stem and anticodon regions are highly conserved, a pattern that is expected for functional tRNA genes (Fig. 2C). Loss of the DHU arm is typical of metazoan trnS−AGY (Kumazawa and Nishida, 1993), as is true for trnS2−AGY but not for trnS1−AGY. A possible parsimonious explanation for the existence of trnS1−AGY in the mt genome of P. maxima is that this tRNA gene was derived from a recently duplicated trnP gene via an alloacceptor tRNA gene recruitment process that concurrently changes tRNA amino acid charging identity and mRNA coupling capacity. The example presented here is informative as it is the first evidence of alloacceptor tRNA gene recruitment in mollusks.

Another unique feature of the P. maxima tRNA gene family is the existence of two trnQ genes. These genes share 73.8% sequence similarity (the loop region was excluded for this calculation), strongly suggesting that the second trnQ gene may be derived by gene duplication (Fig. 2C). The finding of trnM isomerism, duplicate trnQ genes, and a tRNA recruitment event in P. maxima suggests that the evolution of the tRNA gene family in bivalves has been more complex than is currently appreciated. It is worthwhile to note that most tRNA gene duplication and recruitment events occurred recently and are easily missed as they can be concealed by rapid evolution (Wang and Lavrov, 2011).

In order to evaluate whether this unique tRNA gene family is a species-specific character of the P. maxima mt genome or is common to the highly variable bivalve mt genomes, we compared all available bivalve tRNA genes. A total of 56 different bivalve species were compared. Several interesting findings are worth noting (Fig. 3): First, a total of five abnormal gene families were detected among the 56 genomes analyzed. These novel gene families appear to be produced via gene loss, complete duplication, gene multiplication, isomerism, and recruitment. Second, the novel tRNA gene family structures are lineage-specific rather than species-specific with the exception of Mytilus trossulus, the only mt genome from the family Mytilidae that contains two trnQ genes. Third, we note that "tRNA gene multiplication" is the most frequent type of alteration to tRNA family structure while "tRNA gene recruitment" is least frequently observed. Lastly, the trnQ gene duplication or multiplication events observed in ten species from five different families suggest that different alloacceptor tRNAs may undergo differential concerted evolution in mt genomes. Based on these findings, we have reason to believe that the P. maxima mt genome is a suitable model to study the evolution of tRNA multigene families when additional sequences from closely related species are available in the future.

3.2. Characterization of a novel highly expressed ORF

Intergenic regions have commonly been classified as "non-coding" in published animal mt genomes. In this study, a 1085-bp region between the trnS1−AGY and trnQ1 genes was initially annotated as a "major non-coding region (MNR)" because the expected number of mt genes was already annotated, and the size of the region is very close to that of other bivalve MNRs. The MNR is the expected location of the control region (CR), a sequence that plays a key role in mtDNA replication and transcription. However, no typical or identifiable elements of the CR (e.g., high AT content, stem-loop structures and repeats) are found in this region, making identification of this region as the functional CR less likely. Surprisingly, an ORF (orfUR1) of 963 bp was discovered in this region, beginning with an ATG start codon and ending with a TAA stop codon. The ORF encodes a putative protein of 320 amino acids (Fig. 4A). To investigate the biological significance of orfUR1, we downloaded all the ESTs (9498 items) of P. maxima from GenBank and assembled them using the complete


mtDNA sequence as a reference in the CLC Genomics Workbench software. Though the EST dataset is too small to supply a high-resolution map of transcription, a set of expression patterns expected for bivalve mt genes was found, namely rrnL is the most highly transcribed gene followed by cox1 (Supplementary file 2). Surprisingly, a comparatively large number of ESTs perfectly targeted the location of orfUR1 (between 5500 and 6500), suggesting that orfUR1 is a highly expressed gene, similar in expression level to cox1 (Supplementary file 2).

Although subsequent SMART analysis yielded no information on the possible function of orfUR1, database searches for conserved amino acid patterns via BlastX and PSI-Blast detected a putative conserved domain called "Oxidored_q1_N", a domain found in NADH-Ubiquinone oxidoreductase (complex I), chain 5 at the N-terminus (Fig. 4B). This sub-family represents an amino terminal extension of pfam00361. It is part of complex I which catalyzes the transfer of two electrons from NADH to ubiquinone in a reaction associated with proton translocation across the membrane. Only NADH-Ubiquinone chain 5 and eubacterial chain L are in this family. We searched for similar domain architectures in NCBI using CDART (Geer et al., 2002). A total of 15 similar domain architectures were detected which all share the domain "Oxidored_q1_N" (Supplementary file 3). Among these 15 architectures, only two represent mt-encoded genes, namely NADH dehydrogenase subunit 5 (nad5) in all organisms and ATPase 6 (atp6) in Strongylocentrotus species. However, alignment of the primary sequence (both nucleotides and amino acids) of orfUR1 with that of nad5 revealed little to no orthology, suggesting that these two genes do not have a homologous origin. Nonetheless, we infer that orfUR1 is a typical mt gene based on two lines of evidence: 1) the nucleotide composition pattern of orfUR1 (AT%=42%; GC%=58%) is identical to that of the whole genome and very close to that of all PCGs (AT%=43%; GC%=57%); and 2) amino acid RSCU values of most of the expected 13 PCGs are similar to that of orfUR1 with a few exceptions (G, S1, V) (Fig. 4C).

At least two possible mechanisms could have created orfUR1 in P. maxima. First, the orfUR1 gene, as a member of Complex I, might have been transferred from the nuclear to the mt genome. This hypothesis is supported by the following two points: 1) the "Oxidored_q1_N" domain is frequently found in nuclear genes (Supplementary file 3), and 2) orfUR1 and other mt genes do not share any detectably conserved domains. Since orfUR1 presents a very similar nucleotide composition and codon usage to that of the mt genome, the gene transfer event would have occurred soon after the speciation of P. maxima, and thus co-evolved with the mt genome for a long time. The second plausible scenario is that orfUR1 emerged as a result of mt gene duplication, followed by a period of rapid sequence divergence, making any sequence similarity to a "parent" gene undetectable using current bioinformatics algorithms. In this study, the most likely "parent" gene of orfUR1 may be nad5 because both genes have the "Oxidored_q1_N" domain. However, the present study cannot confirm whether orfUR1 and nad5 have similar functions. Several recent studies report that mtDNA-encoded proteins display extra-mt functions. For example, a C-terminus extended, male-transmitted Cox2 protein in the freshwater mussel, Venustaconcha ellipsiformis, plays a role in reproduction through gamete maturation, fertilization, and/or embryogenesis (Chakrabarti et al., 2007), while Breton et al. (2011a, 2011b) recently identified a fourteenth mtDNA-encoded protein that is involved in key biological functions in bivalve species with DUI. Therefore, the highly expressed orfUR1 gene may play an

Table 1
The atp8 genes of 56 bivalves: gene length, start and stop codons, position in the genome, and GenBank accession number. Items marked with asterisks (*) are newly annotated in the present study.

Species | Position (5′–3′) | Length (a.a.) | Start/stop codon | GenBank number

Subclass Pteriomorphia
Family Pectinidae
Argopecten irradians | 1801–1953 | 50 | GTG/TAG | DQ665851
Chlamys farreri | 10,446–10,598 | 50 | TTG/TAA | FJ595957
Mimachlamys nobilis | 7937–8089 | 50 | ATG/TAA | FJ595958
Mizuhopecten yessoensis | 9233–9385 | 50 | TTG/TAG | FJ595959
Placopecten magellanicus | 23,508–23,660 | 50 | TTG/TAA | DQ088274
Family Ostreidae
Crassostrea gigas | 3543–3662 | 39 | ATG/TAA | NC_001276
Crassostrea ariakensis | 6214–6333 | 39 | ATG/TAA | NC_012650
Crassostrea angulata | 6269–6388 | 39 | ATG/TAA | NC_012648
Crassostrea sikamea | 6275–6394 | 39 | GTG/TAA | NC_012649
Crassostrea hongkongensis | 6210–6329 | 39 | ATG/TAG | FJ841963
Crassostrea virginica | 9255–9374 | 39 | ATG/TAA | NC_007175
Crassostrea iredalei | 6335–6454 | 39 | ATG/TAA | NC_013997
Crassostrea nippona | 6401–6520 | 39 | ATG/TAG | HM015198
Saccostrea mordax | 6113–6235 * | 40 | GTG/TAA | FJ841968
Ostrea denselamellosa | 4898–5017 * | 39 | ATG/TAA | NC_015231
Ostrea edulis | 4837–4956 * | 39 | ATG/TAA | NC_016180
Family Pteriidae
Pinctada maxima | 2609–2746 * | 45 | ATG/TAA | GQ452847
Pinctada margaritifera | 1–138 * | 45 | ATG/TAA | JX069978
Family Mytilidae
Mytilus edulis | 10,396–10,659 | 87 | GTG/TAA | NC_006161
Mytilus galloprovincialis | 8801–9064 | 87 | GTG/TAA | NC_006886
Mytilus trossulus | 11,954–12,217 | 87 | GTG/TAA | NC_007687
Mytilus californianus | 8777–9040 | 87 | GTG/TAA | NC_015993
Musculista senhousia | 7403–7594 * | 63 | ATG/TAG | GU001954

Subclass Heterodonta
Family Lucinidae
Loripes lacteus | 14,442–14,559 | 39 | ATT/T | NC_013271
Lucinella divaricata | 15,861–15,974 | 37 | ATT/TAA | NC_013275
Family Cardiidae
Acanthocardia tuberculata | 12,546–12,659 * | 37 | GTG/TAG | NC_008452
Family Solecurtidae
Sinonovacula constricta | 14,289–14,402 * | 37 | ATG/TAA | NC_011075
Family Veneridae
Meretrix lamarckii | 8835–8954 | 39 | ATG/TAA | NC_016174
Meretrix petechialis | 8553–8672 * | 39 | ATG/TAG | EU145977
Meretrix meretrix | 8553–8672 * | 39 | ATG/TAG | NC_013188
Meretrix lusoria | 8642–8761 | 39 | ATG/TAG | GQ903339
Paphia amabilis | 14,035–14,148 | 37 | ATG/TAG | JF969276
Paphia textile | 13,019–13,132 | 37 | ATG/TAA | JF969277
Paphia undulata | 12,642–12,755 | 37 | ATG/TAA | JF969278
Paphia euglypta | 12,997–13,110 | 37 | GTG/TAA | GU269271
Venerupis philippinarum | 5968–6087 | 39 | ATT/TAG | NC_003354
Family Hiatellidae
Hiatella arctica | 10,367–10,570 | 67 | ATG/TAA | NC_008451

Subclass Palaeoheterodonta
Family Unionidae
Cristaria plicata | 6974–6783 | 63 | ATG/TAA | NC_012716
Lampsilis ornata | 1640–1852 | 70 | ATG/TAG | NC_005335
Pyganodon grandis | 14,247–14,023 | 74 | ATG/TAA | NC_013661
Quadrula quadrula | 14,404–14,246 | 52 | ATG/TAA | NC_013658
Hyriopsis cumingii | 11,767–11,618 | 49 | ATG/TAA | NC_011763
Venustaconcha ellipsiformis | 15,549–15,376 | 57 | GTG/TAG | NC_013659
Inversidens japanensis | 12,523–12,711 * | 62 | GTT/TAA | AB055625
Hyriopsis schlegelii | 11,733–11,572 | 53 | ATG/TAA | HQ641406
Unio pictorum | 3537–3731 | 64 | ATG/TAG | HM014130
Toxolasma parvus | 14,322–14,143 | 59 | GTG/TAG | HM856639
Lasmigona compressa | 14,304–14,080 | 74 | ATG/TAA | HM856638
Utterbackia imbecillis | 14,498–14,313 | 61 | ATG/TAA | HM856637
Utterbackia peninsularis | 15,196–15,011 | 61 | ATG/TAG | HM856635
Family Margaritiferidae
Margaritifera falcata | 14,449–14,243 | 68 | ATG/TAA | HM856634


Fig. 6. (A) Sequence of the putative control region (CR) of Pinctada maxima. The region contains a section with tandem repeats (2× 32 bp, highlighted with red and blue colors separately) and an AT-rich region containing seven putative hairpin structures; (B) DNA walk result for detecting control region. The line represents successive compositional changes of nucleotides along the mitochondrial DNA sequence, and the arrows show sharp directional changes and indeed represent abrupt base composition changes around them.
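The DNA walk in panel (B) can be sketched in a few lines. The following Python is a minimal illustration, assuming the direction mapping stated in the Methods (C = North, T = East, G = South, A = West); it is not the GraphDNA implementation, and the function name `dna_walk` and the choice to leave the position unchanged for ambiguous bases are illustrative assumptions.

```python
# Unit steps on an (x, y) grid: East is +x, North is +y.
STEPS = {
    "C": (0, 1),    # North
    "T": (1, 0),    # East
    "G": (0, -1),   # South
    "A": (-1, 0),   # West
}


def dna_walk(sequence):
    """Return the list of (x, y) points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for base in sequence.upper():
        dx, dy = STEPS.get(base, (0, 0))  # assumption: N and gaps do not move
        x += dx
        y += dy
        path.append((x, y))
    return path


if __name__ == "__main__":
    # Toy sequence, not data from the paper.
    print(dna_walk("CCTTGA"))
```

With this mapping, the x coordinate after n bases equals the cumulative T minus A count and the y coordinate equals the cumulative C minus G count, so a sharp turn in the plotted path corresponds to an abrupt change in base composition bias.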

important role in key biological functions, or in the speciation of P. maxima, and it deserves further study.

3.3. Characterization of the atp8 gene

The atp8 gene was previously thought to be missing in many species of bivalves, platyhelminthes, and nematodes (Gissi et al., 2008), but recently Breton et al. (2010) discovered that nearly all bivalves contain a functional atp8 gene. Atp8 is a short gene that tolerates almost every kind of change, and remains under low selective pressure. The gene exhibits high levels of variability in both size and primary sequence, and lacks the typical "MPQL" N-terminus amino acid signature in bivalves. Thus, this gene is difficult to detect by homology searches or comparisons.

We investigated all intergenic regions and found that an ORF (138 bp long, between the trnL1 and nad6 genes) is a putative candidate for atp8. To verify the biological significance of this ORF, the same region was sequenced in the mt genome of Pinctada margaritifera, a species closely related to P. maxima. Alignment of the two amino acid sequences (Fig. 5A) reveals high similarities between the two putative proteins, and the gene possesses a pattern of evolution expected for a PCG evolving under purifying selection (i.e., mutations follow the expected 3rd>1st>2nd codon position pattern) (Fig. 5B), strongly suggesting that the protein is expressed. The putative protein product has one transmembrane helix similar to that of other metazoan atp8 proteins (Figs. 5C and D). Comparison to the atp8 gene from other available bivalve mt genomes reveals that the length of the pearl oyster gene (45 a.a.) is similar to that of venerids (37–40 a.a.), oysters (39–40 a.a.) and scallops (50 a.a.), slightly shorter than those of freshwater mussels (49–74 a.a.), and much shorter than those of marine mussels (63–87 a.a.) (Table 1). From our survey, the gene size of atp8 appears lineage-specific in most marine bivalves, that is, species from one family have almost identical gene sizes. However, it is interesting that the atp8 genes of freshwater mussels

show a high diversity of gene size, with species from different genera having different gene sizes. In contrast, usage of start and stop codons is more variable. It is common to see different start or terminal codons employed by co-familial species (Table 1).

3.4. Discovery of the putative control region (CR)

The CR is thought to play a crucial role in replication and transcription of the mt genome and its location typically corresponds to the longest non-coding region. In most animal lineages, the CR can be recognized based on the following: (1) a relatively high AT content, (2) frequent stable stem-loop structures containing AT-rich loops, (3) conserved sequence blocks, (4) frequent repetitive elements, and (5) abrupt changes in base composition bias. The putative CR has been identified in several mollusks, such as sea mussels Mytilus spp. (Cao et al., 2004) and sessile snails (Rawlings et al., 2010). In sea mussels, for example, the putative CR is capable of producing characteristic secondary structures and contains conserved motifs that are broadly similar to the mammalian CR (Cao et al., 2004). The location of the CR is comparatively less well-defined in many invertebrate lineages. Chen et al. (2008) noted that CR locations are highly variable among scleractinian corals, Van Oppen et al. (2002) hypothesized a large degree of heterogeneity in the mechanisms mediating mt replication in hexacorals, and to our knowledge, the CRs of scallop and clam mt genomes have not yet been described. A high degree of genomic rearrangement and the loss of typical recognition elements, such as the apparent loss of both conserved sequence blocks and repetitive elements, may account for the difficulty in identifying the CR in invertebrate lineages. In this study, a region between trnP and trnD may be the most likely location of the CR because it meets most criteria of a typical CR. This 319-bp segment is AT rich (65%) compared to other parts of the genome (57%), and it contains seven stem-loop structures and two tandem repeats (Fig. 6A). DNA-walk results show sharp directional changes in this region, representing abrupt changes in base composition (Fig. 6B).

4. Conclusions

In this study, we sequenced and characterized the complete mt genome of the pearl oyster, P. maxima. The gene arrangement of this genome is completely distinct from those of all other available bi-

Acknowledgments

This work was financially supported by the National Science Foundation of China (no. 40906077), the Knowledge Innovation Program of the Chinese Academy of Sciences (no. SQ200804) and the earmarked fund for Modern Agro-industry Technology Research System. The authors thank Prof. Elizabeth De Stasio for her English review and suggestive comments.

References

Altschul, S.F., et al., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
Benson, G., 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580.
Boore, J.L., Medina, M., Rosenberg, L.A., 2004. Complete sequences of the highly rearranged molluscan mitochondrial genomes of the scaphopod Graptacme eborea and the bivalve Mytilus edulis. Mol. Biol. Evol. 21, 1492–1503.
Breton, S., Stewart, D.T., Hoeh, W.R., 2010. Characterization of a mitochondrial ORF from the gender-associated mtDNAs of Mytilus spp. (Bivalvia: Mytilidae): Identification of the "missing" ATPase 8 gene. Mar Genomics 3, 11–18.
Breton, S., Ghiselli, F., Passamonti, M., Milani, L., Stewart, D.T., Hoeh, W.R., 2011a. Evidence for a fourteenth mtDNA-encoded protein in the female-transmitted mtDNA of marine mussels (Bivalvia: Mytilidae). PLoS One 6, e19365.
Breton, S., et al., 2011b. Novel protein genes in animal mtDNA: a new sex determination system in freshwater mussels (Bivalvia: Unionoida)? Mol. Biol. Evol. 28, 1645–1659.
Cao, L., Kenchington, E., Zouros, E., Rodakis, G.C., 2004. Evidence that the large noncoding sequence is the main control region of maternally and paternally transmitted mitochondrial genomes of the marine mussel (Mytilus spp.). Genetics 167, 835–850.
Cedergren, R.J., LaRue, B., Sankoff, D., Lapalme, G., Grosjean, H., 1980. Convergence and minimal mutation criteria for evaluating early events in tRNA evolution. Proc. Natl. Acad. Sci. U. S. A. 77, 2791–2795.
Chakrabarti, R., et al., 2007. Reproductive function for a C-terminus extended, male-transmitted cytochrome c oxidase subunit II protein expressed in both spermatozoa and eggs. FEBS Lett. 581, 5213–5219.
Chen, C., Chiou, C., Dai, C., Chen, C., 2008. Unique mitogenomic features in the scleractinian family Pocilloporidae (Scleractinia: Astrocoeniina). Mar. Biotechnol. 10, 538–553.
Dowton, M., Austin, A.D., 1999. Evolutionary dynamics of a mitochondrial rearrangement "Hot Spot" in the hymenoptera. Mol. Biol. Evol. 16, 298–309.
Geer, L.Y., Domrachev, M., Lipman, D.J., Bryant, S.H., 2002. CDART: protein homology by domain architecture. Genome Res. 12, 1619–1623.
Gissi, C., Iannelli, F., Pesole, G., 2008. Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity 101, 301–320.
Grande, C., Templado, J., Zardoya, R., 2008. Evolution of gastropod mitochondrial genome arrangements. BMC Evol. Biol. 8, 61.
Kumazawa, Y., Nishida, M., 1993. Sequence evolution of mitochondrial tRNA genes and deep-branch animal phylogenetics. J. Mol. Evol. 37, 380–398.
Kurabayashi, A., Sumida, M., Yonekawa, H., Glaw, F., Vences, M., Hasegawa, M., 2005. Phylogeny, recombination, and mechanisms of stepwise mitochondrial genome reorganization in mantellid frogs from Madagascar. Mol. Biol. Evol. 25, 874–891.
Laslett, D., Canbäck, B., 2008. ARWEN, a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics 24, 172–175.
valves, another example of the highly variable nature of bivalve mt Lavrov, D.V., Lang, B.F., 2005. Transfer RNA gene recruitment in mitochondrial DNA. – genomes. We compared the atp8 gene in available bivalve genome se- Trends Genet. 21, 129 133. Lobry, J.R., 1996. Asymmetric substitution patterns in the two DNA strands of bacteria. quences and found that the length of atp8 is lineage-specific in most Mol. Biol. Evol. 13, 660–665. marine bivalves. Though there is no large non-coding region in the Lowe, T.M., Eddy, S.R., 1997. A program for improved detection of transfer RNA genes in – mt genome of P. maxima, a region between trnP and trnD was identi- genomic sequence. Nucleic Acids Res. 25, 955 964. fi fi Milbury, C.A., Gaffney, P.M., 2005. Complete mitochondrial DNA sequence of the east- ed as the putative CR. In addition, this study made two novel nd- ern oyster Crassostrea virginica. Mar. Biotechnol. 7, 697–712. ings. First, a unique tRNA gene family with trnM isomerism, with Rawlings, T.A., MacInnis, M.J., Bieler, R., Boore, J.L., Collins, T.M., 2010. Sessile snails, dy- two trnS−AGY and trnQ genes, but which lacks trnS−CUN was observed namic genomes: gene rearrangements within the mitochondrial genome of a fam- fi ily of caenogastropod mollusks. BMC Genomics 11, 440. in this genome. We report the rst clear evidence for alloacceptor Saks, M.E., Sampson, J.R., Abelson, J., 1998. Evolution of a transfer RNA gene through a −AGY tRNA gene recruitment (trnP→trnS ) in mollusks, and discuss point mutation in the anticodon. Science 279, 1665–1670. of the evolutionary patterns of bivalve mt tRNA genes based on cur- Smith, D.R., Snyder, M., 2007. Complete mitochondrial DNA sequence of the scallop Placopecten magellanicus: evidence of transposition leading to an uncharacteristi- rently available mt sequences. Second, a novel highly expressed ORF cally large mitochondrial genome. J. Mol. Evol. 65, 380–391. 
(orfUR1) was described and its potential function and two possible Stothard, P., Wishart, D.S., 2005. Circular genome visualization and exploration using mechanisms of origin were compared. The ORF contains a conserved CGView. Bioinformatics 21, 537–539. Tamura, K., Dudley, J., Nei, M., Kumar, S., 2007. MEGA4: molecular evolutionary genet- domain called “Oxidored_q1_N” which is a member of Complex I. ics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596–1599. Confirmation of orfUR1 function, whether in proton translocation or Thomas, J., Horspool, D., Brown, G., Tcherepanov, V., Upton, C., 2007. GraphDNA: a Java extra-mt function, awaits further study. It will be of interest to learn program for graphical display of DNA composition analyses. BMC Bioinforma. 8, 21. whether orfUR1 is linked to speciation of P. maxima. Similarly, there van Oppen, M.J., Catmull, J., McDonald, B.J., Hislop, N.R., Hagerman, P.J., Miller, D.J., 2002. The mitochondrial genome of Acropora tenuis (Cnidaria; Scleractinia) con- is yet no evidence linking the unique tRNA gene family found tains a large group I and a candidate control region. J. Mol. Evol. 55, 1–13. in the mt genome of P. maxima to this organism's adaptive Wang, X., Lavrov, D.V., 2011. Gene recruitment — a common mechanism in the evolu- – processes. tion of transfer RNA gene families. Gene 475, 22 29. Wu, X., Xu, X., Yu, Z., Kong, X., 2009. Comparative mitogenomic analyses of three scal- Supplementary data to this article can be found online at http:// lops (Bivalvia: Pectinidae) reveal high level variation of genomic organization and dx.doi.org/10.1016/j.gene.2012.08.037. a diversity of transfer RNA gene sets. BMC Res. Notes 2, 69. Author's personal copy


Xu, X.D., Wu, X.Y., Yu, Z.N., 2012. Comparative studies of the complete mitochondrial genomes of four Paphia clams and reconsideration of subgenus Neotapes (Bivalvia: Veneridae). Gene 494, 17–23.
Xue, H., Tong, K.L., Marck, C., Grosjean, H., Wong, J.T.F., 2003. Transfer RNA paralogs: evidence for genetic code–amino acid biosynthesis coevolution and an archaeal root of life. Gene 310, 59–66.
Yu, H., Li, Q., 2012. Complete mitochondrial DNA sequence of Crassostrea nippona: comparative and phylogenomic studies on seven commercial Crassostrea species. Mol. Biol. Rep. 39, 999–1009.
Zhang, P., Papenfuss, T.J., Wake, M.H., Qu, L., Wake, D.B., 2008. Phylogeny and biogeography of the family Salamandridae (Amphibia: Caudata) inferred from complete mitochondrial genomes. Mol. Phylogenet. Evol. 49, 586–597.
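
Illustrative note on the base-composition criteria of Section 3.4. Two of the criteria used there to recognize a putative CR, a relatively high AT content and abrupt changes in base composition seen in a DNA walk, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not the procedure or software used in this study: the function names (`at_content`, `windowed_at`, `dna_walk`) are hypothetical, the step convention of the walk (A left, T right, G up, C down) is an assumed one that may differ from the convention of any particular program, and the sequences in the usage note are toy strings, not P. maxima data.

```python
def at_content(seq: str) -> float:
    """Fraction of A/T among the unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("A") + seq.count("T")) / counted


def windowed_at(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """AT content of successive windows, paired with each window's start index."""
    last_start = max(len(seq) - window, 0)
    return [
        (start, at_content(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


# Assumed step convention for a simple two-dimensional DNA walk.
# Other tools may assign the four directions differently.
STEPS = {"A": (-1, 0), "T": (1, 0), "G": (0, 1), "C": (0, -1)}


def dna_walk(seq: str) -> list[tuple[int, int]]:
    """Cumulative (x, y) positions of the walk; ambiguous bases do not move it."""
    x = y = 0
    path = [(0, 0)]
    for base in seq.upper():
        dx, dy = STEPS.get(base, (0, 0))
        x += dx
        y += dy
        path.append((x, y))
    return path
```

On toy input, `windowed_at("AAAAGGGG", 4, 4)` reports an AT fraction of 1.0 for the first window and 0.0 for the second, and `dna_walk("AAAAGGGG")` moves left for four steps and then turns upward. A sharp turn of this kind in the walk is the sort of directional change that indicates a shift in base composition.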