中国科技论文在线 http://www.paper.edu.cn

The mitochondrial genome of Ruspolia dubia (Orthoptera: Conocephalidae) contains a short A+T-rich region of 70 bp in length

Zhijun Zhou, Yuan Huang, and Fuming Shi

Abstract: The complete sequence (14 971 bp) of the Ruspolia dubia mitochondrial genome was determined and annotated. The genome contains the gene content, base composition, and codon usage typical of metazoan mitochondrial genomes. All 37 genes are conserved in the positions observed most frequently in insect mitochondrial genome structures. The secondary structures of both small subunit and large subunit rRNA were predicted. The most unusual features found were the initiation codon (TTA) of COI and a short A+T-rich region of 70 bp in length. In addition, a short, highly conserved polythymidine stretch that was previously described in Orthoptera and Diptera was also present in the A+T-rich region.

Key words: mitochondrial DNA, A+T-rich region, RNA secondary structure, Tettigonioidea, Ensifera.

Résumé : La séquence complète (14 971 pb) du génome mitochondrial du Ruspolia dubia a été déterminée et annotée. Le génome affiche un contenu génique, une composition nucléotidique et un usage des codons typiques des génomes mitochondriaux chez les métazoaires. Les 37 gènes sont conservés à des positions observées le plus fréquemment chez les génomes mitochondriaux des insectes. Les structures secondaires des petits et grands ARNr ont été prédites. Les caractéristiques les plus inhabituelles étaient le codon d'initiation (TTA) du COI et une petite région de 70 pb riche en A+T. De plus, seule une courte suite polythymidinique fortement conservée et décrite antérieurement chez les orthoptères et les diptères est également présente au sein de la région riche en A+T.

Mots-clés : ADN mitochondrial, région riche en A+T, structure secondaire de l'ARN, Tettigonioidea, Ensifera.

[Traduit par la Rédaction]

Received 9 April 2007. Accepted 12 June 2007. Published on the NRC Research Press Web site at genome.nrc.ca on 18 September 2007.

Corresponding Editor: L. Bonen.

Z. Zhou and Y. Huang.1 College of Life Science, Shaanxi Normal University, Xi'an, Shaanxi, 710062, China.
F. Shi. College of Life Science, Hebei University, Baoding, Hebei, 071002, China.

1Corresponding author (e-mail: [email protected]).

Genome 50: 855–866 (2007) doi:10.1139/G07-057 © 2007 NRC Canada

Introduction

The mitochondrial DNA (mtDNA) is a double-stranded, circular molecule of 14–19 kb in length. It usually contains a remarkably conserved set of 37 genes: 22 encoding tRNAs, 2 encoding rRNAs (lrRNA and srRNA), and 13 encoding proteins that are involved in electron transport and oxidative phosphorylation (Wolstenholme 1992; Boore 1999). Additionally, the insect mitochondrial genome (mitogenome) contains a major non-coding region known as the A+T-rich region that plays a role in the initiation of transcription and replication (Wolstenholme 1992). The length of the A+T-rich region ranges from tens to several thousands of base pairs (Lewis et al. 1995; Shao et al. 2001). It is one of the most variable regions of mtDNA, not only because of its high rates of nucleotide substitution and insertions/deletions, but also because of the presence of varying copy numbers of tandemly repeated elements (Zhang and Hewitt 1997; Duenas et al. 2006). The structural organization of the insect mtDNA A+T-rich region (also known as the control region) has been studied extensively (Zhang et al. 1995; Zhang and Hewitt 1997; Duenas et al. 2006; Cha et al. 2007).

Mitochondrial genes are commonly used molecular markers for phylogenetic studies at various taxonomic levels (Simon et al. 1994). Recently, mtDNA has been extensively used in studies of phylogenetics, phylogeography, population structure and dynamics, and molecular evolution at the genomic level. The growing interest in mitochondrial genomes has triggered a rapid increase in published complete mitochondrial genome sequences. Complete mtDNA sequences have been determined for nearly 1000 metazoans and are available in GenBank. However, most of these sequences are from vertebrates, while the numbers for other phyla are lagging behind (Boore 1999). Sixty-five complete insect mtDNA sequences from 17 orders have been determined, 19 of them from Diptera, 11 from Hemiptera, 7 from Lepidoptera, and only 2 from Orthoptera: those for Locusta migratoria and Gryllotalpa orientalis.

In this paper, we report the complete mitogenome of Ruspolia dubia, which belongs to the family Conocephalidae, superfamily Tettigonioidea, suborder Ensifera, order Orthoptera. Our goal in sequencing the mtDNA of other orthopteran taxa was to help characterize the evolution of the mitogenome structure of Orthoptera.

Materials and methods

Insect collection and DNA extraction

Specimens of R. dubia were collected by Yuan Huang at Zulu, Hebei, China, in August 2005. All specimens were preserved in 100% ethanol and stored at 4 °C until DNA extraction was performed.

Whole genomic DNA was extracted from the leg muscle tissue of a single adult male of R. dubia and digested in 600 µL of protease buffer (0.01 mol/L Tris, pH 7.8, 5 mmol/L EDTA, 5% (w/v) SDS, 50 ng/µL proteinase K) at 65 °C for 3–5 h. The digestion was extracted twice with one volume of Tris buffer–saturated phenol and then one volume of chloroform : isoamyl alcohol (24:1). DNA was precipitated overnight in 2.5 volumes of 100% ethanol at –20 °C. The precipitates were washed once with 70% ethanol, dissolved in 50 µL of double-distilled water, and stored at –20 °C. The DNA was diluted to 50 ng/µL with double-distilled water before being used as template in long-range PCR (L-PCR).

Fig. 1. Gene map of the mitochondrial genome of Ruspolia dubia. Protein-coding genes are transcribed in the clockwise direction, except for ND1, ND4L, ND4, and ND5 (encoding NADH dehydrogenase subunits 1, 4L, 4, and 5) (underlined). The 2 genes encoding large subunit ribosomal RNA (lrRNA) and small subunit ribosomal RNA (srRNA) are on the L strand (underlined). Transfer RNA genes are designated by single-letter amino acid codes, and those on the H and L strands are shown outside and inside the circular gene map, respectively. CR denotes the A+T-rich control region, and L1, L2, S1, and S2 denote tRNALeu(CUN), tRNALeu(UUN), tRNASer(AGN), and tRNASer(UCN), respectively. Two pairs of L-PCR primers, LPA-L2123 + LPA-H11240 and LPB-L11187 + LPB-H3651 (LP03 + LP04 and LPCyt b + LPCO II in Liu et al. 2006), amplify 2 segments that cover the entire mitochondrial genome.

PCR amplification and sequencing

Two pairs of L-PCR primers were used to amplify the entire mtDNA of R. dubia in 2 overlapping portions (Liu et al. 2006). L-PCR amplifications were performed with 150 ng of total DNA, 1.5 µL of 10× LA PCR Buffer (TaKaRa Bio Inc.), 3.0 mmol/L dNTPs (2.5 mmol/L each dNTP), 15 mmol/L each primer, 37.5 mmol/L MgCl2, and 0.9 unit of LA Taq polymerase (TaKaRa) per 15 µL of total reaction volume on the MyCycler™ thermal cycler. The cycling protocol consisted of a primary denaturation step at 94 °C for 2 min followed by 40 cycles of denaturation at 92 °C for 10 s, annealing at 44 °C for 30 s, and elongation at 68 °C for 8 min during the first 20 cycles and then an additional 20 s per cycle during the last 20 cycles. The final elongation step was continued at 68 °C for 7 min. L-PCR products were purified with a DNA Gel Purification Kit (U-Gene) after separation by electrophoresis on a 1.0% agarose gel.

Sub-PCR primers were designed from comparison of 12 hemimetabola sequences available in GenBank, except that those for ND2 and srRNA were designed from "universal" primers of insect mtDNA (Simon et al. 1994). Sub-PCR amplifications were performed with 50 ng of the L-PCR products, 2.5 µL of 10× LA PCR Buffer, 5.0 mmol/L dNTPs (2.5 mmol/L each dNTP), 25–50 mmol/L each primer, 62.5 mmol/L MgCl2, and 1.5 units of TaKaRa Taq (or LA Taq) polymerase (TaKaRa) per 25 µL of total reaction volume on the MyCycler™ thermal cycler. The cycling protocol consisted of a primary denaturation step at 94 °C for 2 min followed by 30–35 cycles of denaturation at 94 °C for 10 s, annealing at 38–58 °C for 30 s, and elongation at 72 °C for 1–2 min. The final elongation step was continued at 72 °C for 7 min.

The 28 sub-PCR fragments for R. dubia were directly sequenced after separation and purification. Primers used for cycle sequencing were the same as those used for sub-PCR. DNA sequencing was performed using the ABI PRISM BigDye® Terminator v3.1 Cycle Sequencing Kit and the ABI PRISM™ 3100-Avant Genetic Analyzer. All fragments were sequenced from both strands. Sequencing PCR conditions were a primary denaturation step at 96 °C for 1 min followed by 35 cycles of denaturation at 96 °C for 10 s, annealing at 38–50 °C for 1 min, and elongation at 60 °C for 4 min.

Sequence assembly, annotation, and analysis

We used the Staden package (http://staden.sourceforge.net/) for sequence assembly and annotation. Protein-coding genes (PCGs) and rRNA genes were identified by sequence comparison with L. migratoria mtDNA sequences (NC_001712) available in GenBank. In PCGs, the start codon was assumed to be that nearest to the beginning of the sequence alignment. The secondary structures of the rRNA were derived by analogy with models available for other arthropods (Cannone et al. 2002; Gillespie et al. 2006; Podsiadlowski et al. 2006). Most tRNAs were identified by tRNAscan-SE Search Server v.1.21 (Lowe and Eddy 1997), and the remaining ones were found by sequence comparison with L. migratoria. PCG sequences were aligned using Clustal X (Thompson et al. 1997). The aligned data were further analyzed by MEGA 3.0 (Kumar et al. 2004).

The complete mtDNA sequence of R. dubia has been deposited in GenBank under accession No. EF583824.

Results and discussion

Genome structure, organization, and composition

The entire mitogenome of R. dubia was amplified in 2 overlapping fragments by L-PCR (Fig. 1). The mtDNA is a


circular molecule of 14 971 bp, and thus it is the shortest among known mitogenomes of orthopteran insects: 15 722 bp in L. migratoria (Flook et al. 1995), 15 521 bp in G. orientalis (Kim et al. 2005), and 15 929 bp in G. gratiosa (Z. Zhou et al., unpublished data). The sequence analysis revealed a gene content typically found in arthropod mitogenomes: 13 protein-coding genes, 22 tRNAs, 2 rRNAs (lrRNA and srRNA), and the A+T-rich region (Wolstenholme 1992; Boore 1999). The orientation and gene order of the R. dubia mitogenome are identical to those of Drosophila yakuba (Clary and Wolstenholme 1985) (Fig. 1). This arrangement is conserved in divergent insect orders and is inferred to be ancestral to insects. The genes overlap over a total of 56 bp in 13 locations (1–8 bp at each location) in the R. dubia mitogenome. The intergenic spacer sequence (86 bp) is spread over 14 regions ranging in size from 1 to 22 bp. The largest region (22 bp) is located between tRNASer(UCN) and ND1 and corresponds to 19 bp in L. migratoria (Flook et al. 1995) and 34 bp in G. orientalis (Kim et al. 2005). In contrast to L. migratoria and G. orientalis, which contain only one larger spacer, R. dubia contains another 21 bp intergenic spacer sequence between tRNAArg and tRNAAsn (Table 1).

The nucleotide composition of the R. dubia mitogenome is biased toward adenine and thymine (70.8%), as are other insect mtDNA sequences (Table 2). The A+T content of R. dubia is intermediate when compared with that of L. migratoria (75.3%) and G. orientalis (70.5%).

Protein-coding genes

The invertebrate mitochondrial genetic code was used to identify open reading frames that were then matched to

Table 1. Organization of the Ruspolia dubia mitochondrial genome.

Gene or region   Position (bp)  Size (bp)  Direction          Non   OL   Start/stop codons
tRNAIle          1–66           66         Forward (5'→3')    -     3    -
tRNAGln          64–132         69         Reverse (3'→5')    -     1    -
tRNAMet          132–201        70         Forward            -     5    -
ND2              197–1216       1020       Forward            3     -    ATT/TAA
tRNATrp          1220–1287      68         Forward            -     8    -
tRNACys          1280–1345      66         Reverse            3     -    -
tRNATyr          1349–1414      66         Reverse            -     5    -
COI              1410–2951      1542       Forward            -     5    TTA/TAA
tRNALeu(UUR)     2947–3012      66         Forward            6     -    -
COII             3019–3709      711        Forward            -     -    ATG/T-tRNALys
tRNALys          3710–3779      70         Forward            -     1    -
tRNAAsp          3779–3845      67         Forward            -     -    -
ATP8             3846–4007      162        Forward            -     7    ATT/TAA
ATP6             4001–4675      675        Forward            2     -    ATG/TAA
COIII            4678–5466      789        Forward            -     1    ATG/TAA
tRNAGly          5466–5529      64         Forward            -     3    -
ND3              5527–5883      357        Forward            5     -    ATA/TAA
tRNAAla          5889–5951      63         Forward            -     -    -
tRNAArg          5952–6015      64         Forward            21    -    -
tRNAAsn          6037–6103      67         Forward            2     -    -
tRNASer(AGN)     6106–6172      67         Forward            3     -    -
tRNAGlu          6176–6242      67         Forward            -     -    -
tRNAPhe          6243–6309      67         Reverse            -     -    -
ND5              6310–8041      1732       Reverse            -     -    ATT/T-tRNAPhe
tRNAHis          8042–8107      66         Reverse            5     -    -
ND4              8113–9453      1341       Reverse            -     7    ATG/TAA
ND4L             9447–9743      297        Reverse            2     -    ATG/TAA
tRNAThr          9746–9812      70         Forward            -     1    -
tRNAPro          9812–9875      64         Reverse            1     -    -
ND6              9877–10407     531        Forward            -     8    ATA/TAA
Cytb             10400–11536    1137       Forward            11    -    ATG/TAG
tRNASer(UCN)     11548–11616    69         Forward            22    -    -
ND1              11639–12586    948        Reverse            -     6    ATA/TAA
tRNALeu(CUN)     12581–12645    65         Reverse            -     -    -
lrRNA            12646–13947    1302       Reverse            -     -    -
tRNAVal          13948–14019    72         Reverse            -     -    -
srRNA            14020–14901    882        Reverse            -     -    -
A+T-rich region  14902–14971    70         Forward            70    -    -

Note: tRNA, transfer RNA; ND1–ND6 and ND4L, NADH dehydrogenase subunits 1–6 and 4L; COI–COIII, cytochrome c oxidase subunits I–III; ATP6 and ATP8, ATPase subunits 6 and 8; Cytb, cytochrome b; srRNA and lrRNA, small subunit and large subunit ribosomal RNA. Non, non-coding region; OL, overlapping region.
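The spacer (Non) and overlap (OL) sizes in Table 1 appear to follow directly from the coordinates of neighbouring genes. The Python sketch below illustrates that arithmetic for a few gene pairs, assuming 1-based inclusive positions as listed in Table 1. The function name `junction` and the tuple layout are illustrative choices, not part of the original analysis.

```python
# Minimal sketch: classify the junction between two adjacent genes from their
# Table 1 coordinates (1-based, inclusive). `junction` is a hypothetical helper.

def junction(upstream_end: int, downstream_start: int) -> tuple[str, int]:
    """Return ("Non", n) for an n-bp spacer, ("OL", n) for an n-bp overlap,
    or ("adjacent", 0) when the two genes abut."""
    gap = downstream_start - upstream_end - 1
    if gap > 0:
        return ("Non", gap)
    if gap < 0:
        return ("OL", -gap)
    return ("adjacent", 0)


# (upstream gene, its end, downstream gene, its start), copied from Table 1.
PAIRS = [
    ("tRNAIle", 66, "tRNAGln", 64),
    ("tRNAArg", 6015, "tRNAAsn", 6037),
    ("tRNASer(UCN)", 11616, "ND1", 11639),
    ("tRNAGlu", 6242, "tRNAPhe", 6243),
]

for up, up_end, down, down_start in PAIRS:
    kind, size = junction(up_end, down_start)
    print(f"{up} / {down}: {kind} {size} bp")
```

For the pairs shown, this reproduces the 3 bp overlap between tRNAIle and tRNAGln and the 21 bp and 22 bp spacers discussed in the text.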


Table 2. The nucleotide composition of different features of the R. dubia mitochondrial genome.

Feature                  % T    % C    % A    % G    % A+T   No. of nucleotides
Whole genome             34.6   18.1   36.2   11.1   70.8    14971
Protein-coding genes*    40.7   15.5   29.1   14.7   69.8    11166
  1st codon position     34.6   14.3   29.3   21.8   63.9    3722
  2nd codon position     44.9   20.9   19.1   15.1   64.0    3722
  3rd codon position     42.6   11.3   39.0   7.1    81.6    3722
tRNA genes               35.6   12.4   37.8   14.1   73.4    1470
rRNA genes               38.4   8.8    35.1   17.6   73.5    2184
A+T-rich region          41.4   20.0   30.0   8.6    71.4    70

Note: The % T, % C, % A, % G, and % A+T columns give the proportion of nucleotides. Overlapping nucleotides are counted in each region in which they appear.
*The initiation and termination codons are excluded.
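The A+T column of Table 2 is the sum of the % A and % T columns. The Python sketch below shows how such a value could be computed from a sequence and checks that each row of the table is internally consistent. The names `at_percent` and `TABLE2` are illustrative, and the 70-nucleotide string is a synthetic stand-in built from base counts implied by the reported percentages, not the actual A+T-rich region sequence.

```python
# Sketch only: A+T percentage of a DNA string, plus a consistency check of Table 2.

def at_percent(sequence: str) -> float:
    """Percentage of A and T in a DNA sequence (case-insensitive)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


# feature: (%T, %C, %A, %G, reported %A+T), copied from Table 2.
TABLE2 = {
    "Whole genome": (34.6, 18.1, 36.2, 11.1, 70.8),
    "Protein-coding genes": (40.7, 15.5, 29.1, 14.7, 69.8),
    "1st codon position": (34.6, 14.3, 29.3, 21.8, 63.9),
    "2nd codon position": (44.9, 20.9, 19.1, 15.1, 64.0),
    "3rd codon position": (42.6, 11.3, 39.0, 7.1, 81.6),
    "tRNA genes": (35.6, 12.4, 37.8, 14.1, 73.4),
    "rRNA genes": (38.4, 8.8, 35.1, 17.6, 73.5),
    "A+T-rich region": (41.4, 20.0, 30.0, 8.6, 71.4),
}


def row_is_consistent(t, c, a, g, reported, tol=0.05):
    """True if %A + %T equals the reported %A+T within a small tolerance."""
    return abs((a + t) - reported) <= tol


for feature, row in TABLE2.items():
    print(feature, row_is_consistent(*row))

# Synthetic 70-nt example with 29 T, 14 C, 21 A, and 6 G (assumed counts).
synthetic_region = "T" * 29 + "C" * 14 + "A" * 21 + "G" * 6
print(round(at_percent(synthetic_region), 1))
```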

Fig. 2. The alignment of the start region of the cytochrome c oxidase subunit I gene (COI) of 3 orthopteran insects sequenced in their entire mitogenome. Underlined nucleotides in G. orientalis and L. migratoria sequences have been postulated to act as initiation codons of COI (Flook et al. 1995; Kim et al. 2005).

Table 3. The codon usage of R. dubia mitochondrial genome protein-coding genes.

Codon (aa)  n (RSCU)       Codon (aa)  n (RSCU)      Codon (aa)  n (RSCU)       Codon (aa)  n (RSCU)
UUU (F)     262.0 (1.53)   UCU (S)     87.0 (2.12)   UAU (Y)     123.0 (1.54)   UGU (C)     37.0 (1.80)
UUC (F)     80.0 (0.47)    UCC (S)     25.0 (0.61)   UAC (Y)     37.0 (0.46)    UGC (C)     4.0 (0.20)
UUA (L)     374.0 (3.83)   UCA (S)     95.0 (2.31)   UAA (*)     0.0 (0.00)     UGA (W)     95.0 (1.74)
UUG (L)     47.0 (0.48)    UCG (S)     8.0 (0.19)    UAG (*)     0.0 (0.00)     UGG (W)     14.0 (0.26)
CUU (L)     87.0 (0.89)    CCU (P)     79.0 (2.14)   CAU (H)     53.0 (1.32)    CGU (R)     15.0 (0.97)
CUC (L)     18.0 (0.18)    CCC (P)     19.0 (0.51)   CAC (H)     27.0 (0.68)    CGC (R)     3.0 (0.19)
CUA (L)     55.0 (0.56)    CCA (P)     42.0 (1.14)   CAA (Q)     62.0 (1.63)    CGA (R)     32.0 (2.06)
CUG (L)     5.0 (0.05)     CCG (P)     8.0 (0.22)    CAG (Q)     14.0 (0.37)    CGG (R)     12.0 (0.77)
AUU (I)     273.0 (1.76)   ACU (T)     80.0 (1.52)   AAU (N)     131.0 (1.58)   AGU (S)     46.0 (1.12)
AUC (I)     38.0 (0.24)    ACC (T)     31.0 (0.59)   AAC (N)     35.0 (0.42)    AGC (S)     9.0 (0.22)
AUA (M)     185.0 (1.71)   ACA (T)     92.0 (1.75)   AAA (K)     47.0 (1.31)    AGA (S)     55.0 (1.34)
AUG (M)     32.0 (0.29)    ACG (T)     7.0 (0.13)    AAG (K)     25.0 (0.69)    AGG (S)     4.0 (0.10)
GUU (V)     100.0 (1.85)   GCU (A)     82.0 (1.61)   GAU (D)     59.0 (1.51)    GGU (G)     71.0 (1.21)
GUC (V)     14.0 (0.26)    GCC (A)     49.0 (0.96)   GAC (D)     19.0 (0.49)    GGC (G)     12.0 (0.20)
GUA (V)     87.0 (1.61)    GCA (A)     57.0 (1.12)   GAA (E)     68.0 (1.70)    GGA (G)     105.0 (1.79)
GUG (V)     15.0 (0.28)    GCG (A)     16.0 (0.31)   GAG (E)     12.0 (0.30)    GGG (G)     47.0 (0.80)

Note: A total of 3722 codons were analyzed, excluding the initiation and termination codons. n, frequency of each codon. RSCU, relative synonymous codon usage. Codons that are underlined correspond to the 22 mitochondrial tRNAs.
*Stop codons.
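Relative synonymous codon usage is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below applies that standard definition to the phenylalanine and leucine counts from Table 3. Treating the six leucine codons as a single synonymous family is an assumption that happens to reproduce the tabulated values, and the names `rscu`, `PHE`, and `LEU` are illustrative.

```python
# Sketch of the standard RSCU formula: RSCU = n * k / sum(n over the k synonyms).

def rscu(counts: dict[str, float]) -> dict[str, float]:
    """RSCU for each codon in one synonymous family."""
    total = sum(counts.values())
    k = len(counts)
    if total == 0:
        return {codon: 0.0 for codon in counts}
    return {codon: n * k / total for codon, n in counts.items()}


# Codon counts copied from Table 3.
PHE = {"UUU": 262, "UUC": 80}
LEU = {"UUA": 374, "UUG": 47, "CUU": 87, "CUC": 18, "CUA": 55, "CUG": 5}

TOTAL_CODONS = 3722  # from the note to Table 3

for family in (PHE, LEU):
    for codon, value in rscu(family).items():
        print(codon, round(value, 2))

# Share of leucine among all analyzed codons, in percent.
print(round(100 * sum(LEU.values()) / TOTAL_CODONS, 2))
```

The same total can be used to express any amino acid as a percentage of all analyzed codons, as in the final line.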

those of L. migratoria to identify the 13 PCGs. All R. dubia protein-coding sequences except for COI start with a typical ATN codon (Table 1). The problematic translational start of the COI locus has been extensively discussed in several arthropod species; suggestions include tetranucleotides (ATAA, TTAA, and ATTA) (Flook et al. 1995; Lewis et al. 1995) and a hexanucleotide (ATTTAA) (Junqueira et al. 2004). In COI of R. dubia mtDNA, however, neither a typical ATN initiator codon nor a tetranucleotide or hexanucleotide is found in the start region or the neighboring tRNATyr gene. We found that CTTA is located in the same position as the initiation codon (ATTA) of L. migratoria COI, based on sequence alignment with L. migratoria and G. orientalis mtDNA (Fig. 2); however, CTTA has not been reported as an initiation codon of mtDNA PCGs. Instead, TTA, which is a rare but possible initiation codon (Nardi et al. 2001), is the only possible initiation codon in the start region of R. dubia COI and the upstream neighboring tRNATyr gene.

Fig. 3. Inferred secondary structure of 22 tRNAs of R. dubia. The tRNAs are labeled with the abbreviations of their corresponding amino acids. Dashes (–) indicate Watson–Crick base pairing and centered asterisks (*) indicate G–U base pairing.

Conventional termination codons (TAA and TAG) could be assigned to most of the putative protein sequences. Only COII and ND5 have incomplete termination codons (T-tRNA). The presence of incomplete stop codons is a common phenomenon in a number of invertebrate mtDNA PCGs (Carapelli et al. 2006; Kim et al. 2006; Cha et al. 2007). The termination codon (TAA) is created by polyadenylation of the RNA message after cleavage, as observed in other animal phyla (Anderson et al. 1981; Bibb et al. 1981; Ojala et al. 1981; Okimoto et al. 1990; Lavrov et al. 2002).
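The completion of a truncated stop codon by polyadenylation can be illustrated with a short Python sketch. It uses DNA-sense notation (T rather than U), as in Table 1, and both helper names are hypothetical. The ND5 length of 1732 bp is taken from Table 1.

```python
# Illustrative sketch: how a truncated stop codon (T or TA) becomes TAA once
# adenines are appended to the transcript. Helper names are assumptions.

def complete_stop(partial: str) -> str:
    """Pad an incomplete stop codon with A residues up to a full TAA codon."""
    if partial not in ("T", "TA", "TAA"):
        raise ValueError(f"not a TAA prefix: {partial!r}")
    return partial + "A" * (3 - len(partial))


def trailing_nucleotides(gene_length: int) -> int:
    """Number of nucleotides left over after the last complete codon."""
    return gene_length % 3


ND5_LENGTH = 1732  # bp, from Table 1

leftover = trailing_nucleotides(ND5_LENGTH)
print(leftover)                       # nucleotides beyond the last full codon
print(complete_stop("T" * leftover))  # stop codon after polyadenylation
```

A gene length that leaves a single trailing nucleotide is consistent with a lone T that is extended to TAA in the mature transcript.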


The A+T content of the combined PCGs in R. dubia mtDNA was 69.8%. This value is very similar to that of G. orientalis (69.4%) but lower than that of L. migratoria (74.1%). Analysis of base composition at each codon position of PCGs (Table 2) showed that the A+T content (81.6%) at the third codon position was higher than that at the first (63.9%) and second (64%) positions.

The total number of codons in the PCGs of R. dubia mtDNA was 3722, excluding the initiation and termination codons. The codon usage of PCGs in R. dubia mtDNA and the relative synonymous codon usage values are given in Table 3. All codons are present in the R. dubia mtDNA PCGs. The 4 most frequent amino acids in the R. dubia mtDNA PCGs are leucine (15.74%), phenylalanine (9.19%), serine (8.84%), and isoleucine (8.36%), and the total content (42.13%) of these amino acids is similar to that in L. migratoria (42.87%) and G. orientalis (42.69%) (Flook et al. 1995; Kim et al. 2005).

tRNA-coding genes

The whole set of 22 tRNAs typical of arthropod mitogenomes was found in R. dubia, and the secondary structures were drawn for each one (Fig. 3). All R. dubia mitochondrial tRNAs could be folded into typical cloverleaf secondary structures, except for tRNASer(AGN). The secondary structure of tRNASer(AGN) is identical to the Type-9 structure, but the optimal base pairing (9 bp in contrast to the normal 5) exposes the presence of a bulged nucleotide in the middle of the anticodon (AC) stem, there are 6 bp in the TΨC (T) stem in contrast to the normal 5 and only 1 bp in the dihydrouridine (DHU) stem, and there are no connector nucleotides (Steinberg and Cedergren 1994, 1997). The length of R. dubia mitochondrial tRNA genes ranges from 63 to 72 bp. All R. dubia mitochondrial tRNAs possess invariable lengths for the aminoacyl (AA) stem (7 bp), the AC loop (7 nucleotides), and the AC stem (5 bp). Most of the size variation stems from length variation in the DHU and T arms, within which the loop size (3–9 bp) is more variable than the stem size (3–5 bp, except for tRNASer(AGN)). The anticodons are identical to those observed in D. yakuba (Clary and Wolstenholme 1985) and L. migratoria (Flook et al. 1995).

A total of 37 unmatched base pairs occur in the R. dubia mitochondrial tRNAs. Twenty-four of them are G–U pairs, which form a weak bond, located in the AA stem (6 bp), the T stem (2 bp), the AC stem (10 bp), and the DHU stem (6 bp). The remaining 13 consist of A–C mismatches (3 bp) in the AA stem of tRNAIle, the T stem of tRNAMet, and the AC stem of tRNATrp; A–G mismatches (3 bp) in the T stems of tRNAHis and tRNALeu(CUN) and the DHU stem of tRNAIle; U–U mismatches (5 bp) in the AA stems of tRNAAla and tRNASer(UCN) and the AA stem (2 bp) and T stem of tRNAVal; and U–C mismatches (2 bp) in the AC stems of tRNALys and tRNAArg.

rRNA-coding genes

As in all other insect mitogenomes, the 2 rRNA genes (lrRNA and srRNA) in R. dubia are located between tRNALeu(CUN) and tRNAVal and between tRNAVal and the

A+T-rich region, respectively (Fig. 1 and Table 1). The lengths of lrRNA and srRNA were determined to be 1302 bp and 882 bp, respectively (Table 1). The A+T content (73.6%) of the rRNA genes (Table 2) was the highest among different regions of the R. dubia mitogenome, and the G content (17.6%) was almost twice the C content (8.9%).

The secondary structures of the R. dubia mitochondrial srRNA and lrRNA were drawn following the previously published models for Apis mellifera (Gillespie et al. 2006), and the data fit the models reasonably well. The secondary structure of the R. dubia mitochondrial srRNA consists of 3 structural domains and 33 helices (Fig. 4A). Three parts (a–c) of the R. dubia srRNA show some dissimilarity to the A. mellifera model (Gillespie et al. 2006). Three newly proposed helices are located in domains I, II, and III of the srRNA: H5', H12', and H20', respectively. Domain I is probably the most variable fragment of the entire srRNA, differing in both length and secondary structure. Part a, which was proposed to form a helix (H47) in A. mellifera, with a 30 bp larger loop, can be folded into 2 helices, H4 and H5', in R. dubia. In part b, one newly proposed helix (H12') can be folded between H12 and H13 (H577 and H673 in A. mellifera). Domain III is probably the most conserved part of the srRNA, and only one newly proposed helix (H20') in part c can be folded between H20 and H21 (H984 and H1047 in A. mellifera).

The secondary structure of R. dubia lrRNA consists of 6 structural domains (domain III is absent in arthropods) and 47 helices (Fig. 4B). Two parts (a and b) of the R. dubia lrRNA contain 4 newly proposed helices that show some dissimilarity to the A. mellifera model (Gillespie et al. 2006). Part a is in domain I of the R. dubia lrRNA and contains 3 newly proposed helices (H-1, H-2, and H-3). Part b contains another newly proposed helix (H16') between H16 and H17 (H991 and H1057 in A. mellifera) in domain II.

Fig. 4. The secondary structure of mitochondrial rRNA (srRNA and lrRNA) in R. dubia: (A) srRNA; (B) lrRNA. The differences between our structure and the previously published Apis mellifera structure (Gillespie et al. 2006) are within boxes. Parts (a), (b), and (c) show some dissimilarity between R. dubia and A. mellifera (see text).

A+T-rich region

The A+T-rich region is well known for the initiation of replication in both vertebrates and invertebrates, and its particularly high content of adenine and thymine is one of its most outstanding features (Simon et al. 1994; Boore 1999). The 70 bp A+T-rich region of R. dubia is located in the conserved position between srRNA and the tRNAIle–tRNAGln–tRNAMet gene cluster (Fig. 1). In contrast to the A+T-rich region in most insects, the A+T content (71.4%) of this region in R. dubia mtDNA is lower than that of tRNAs (73.4%) and rRNAs (73.6%) (Table 2). The length of the R. dubia A+T-rich region is the shortest among those reported, which range from 73 bp + 47 bp in Heterodoxus macropus (Shao et al. 2001) to 4061 bp in D. melanogaster (Lewis et al. 1995). In H. macropus, there are 2 A+T-rich regions (73 bp and 47 bp), which is a very rare genomic organization.

The R. dubia A+T-rich region contains a polythymidine stretch that is highly conserved in Orthoptera and Diptera (Fig. 5A), which may be involved in the control of transcription and (or) replication (Zhang et al. 1995; Cha et al. 2007). An alignment of this region with 2 orthopteran species, Schistocerca gregaria and Chorthippus parallelus, showed only a short block of high sequence similarity. These results support the idea that the poly(T) region may be important in the initiation of mtDNA replication or have some other unknown function (Cha et al. 2007).

Fig. 5. Structural element and stem-loop structure found in the A+T-rich region of R. dubia mtDNA. (A) Alignment of the polythymidine stretch (Zhang et al. 1995) in the A+T-rich region of Schistocerca gregaria, Chorthippus parallelus, and R. dubia. The poly(T) stretch runs from 14 918 to 14 954 (37 bp) in the R. dubia mitogenome (5'→3'). Asterisks indicate consensus in the alignment. (B) Possible secondary structures of the R. dubia A+T-rich region.

Potential secondary structures were obtained for the R. dubia A+T-rich region (Fig. 5B). The R. dubia A+T-rich region may fold into 2 stem-loop structures. However, no highly conserved flanking TATA sequences at the 5' end or G(A)nT sequences at the 3' end (Zhang et al. 1995) of these stem-loop structures were found. Saito et al. (2005) found a potential stem-loop structure located at a similar position in the A+T-rich region among L. migratoria, S. gregaria, and C. parallelus and proposed that it may be involved in replication initiation because it is highly similar to the stem-loop

structure located around the L-strand origin of vertebrate mtDNA, but the 2 stem-loop structures of R. dubia have scarcely any similarity to it.

Acknowledgements

We thank Dr. Lu Hui-meng for primer design and discussion during the experiments. This study was supported by the National Natural Science Foundation of China (grant Nos. 30470238, 30670279, and 30670252) and the Innovation Foundation of the Graduate Student Cultivation of the Shaanxi Normal University (grant No. 2006CXB003).

References

Anderson, S., Bankier, A.T., Barrell, B.G., de Bruijin, M.H.L., Droujn, A.R.J., Eperon, I.C., et al. 1981. Sequence and organization of the human mitochondrial genome. Nature (London), 290: 457–465. doi:10.1038/290457a0. PMID:7219534.

Bibb, M.J., VanEtten, R.A., Wright, C.T., Walberg, M.W., and Clayton, D.A. 1981. Sequence and gene organization of mouse mitochondrial DNA. Cell, 26: 167–180. doi:10.1016/0092-8674(81)90300-7. PMID:7332926.

Boore, J.L. 1999. Animal mitochondrial genomes. Nucleic Acids Res. 27: 1767–1780. doi:10.1093/nar/27.8.1767. PMID:10101183.

Cannone, J.J., Subramanian, S., Schnare, M.N., Collett, J.R., D'Souza, L.M., and Du, Y. 2002. The Comparative RNA Web (CRW) Site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs: Correction. BMC Bioinformatics, 3: 15. doi:10.1186/1471-2105-3-15.

Carapelli, A., Vannini, L., Nardi, F., Boore, J.L., Beani, L., Dallai, R., and Frati, F. 2006. The mitochondrial genome of the entomophagous endoparasite Xenos vesparum (Insecta: Strepsiptera). Gene, 376: 248–259. doi:10.1016/j.gene.2006.04.005. PMID:16766140.

Cha, S.Y., Yoon, H.J., Lee, E.M., Yoon, M.H., Hwang, J.S., Jin, B.R., Han, Y.S., and Kim, I. 2007. The complete nucleotide sequence and gene organization of the mitochondrial genome of the bumblebee, Bombus ignitus (Hymenoptera: Apidae). Gene, 392: 206–220. doi:10.1016/j.gene.2006.12.031. PMID:17321076.

Clary, D.O., and Wolstenholme, D.R. 1985. The mitochondrial DNA molecule of Drosophila yakuba: nucleotide sequence, gene organization, and genetic code. J. Mol. Evol. 22: 252–271. doi:10.1007/BF02099755. PMID:3001325.

Duenas, J.C., Gardenal, C.N., Llinas, G.A., and Panzetta-Dutari, G.M. 2006. Structural organization of the mitochondrial DNA control region in Aedes aegypti. Genome, 49: 931–937. doi:10.1139/G06-053. PMID:17036068.

Flook, P.K., Rowell, C.H., and Gellissen, G. 1995. The sequence, organization, and evolution of the Locusta migratoria mitochondrial genome. J. Mol. Evol. 41: 928–941. doi:10.1007/BF00173173. PMID:8587138.

Gillespie, J.J., Johnston, J.S., Cannone, J.J., and Gutell, R.R. 2006. Characteristics of the nuclear (18S, 5.8S, 28S and 5S) and mitochondrial (12S and 16S) rRNA genes of Apis mellifera (Insecta: Hymenoptera): structure, organization, and retrotransposable elements. Insect Mol. Biol. 15: 657–686. doi:10.1111/j.1365-2583.2006.00689.x. PMID:17069639.

Junqueira, A.C.M., Lessinger, A.C., Torresa, T.T., da Silva, F.R., Vettore, A.L., Arruda, P., and Azeredo Espin, A.M.L. 2004. The mitochondrial genome of the blowfly Chrysomya chloropyga (Diptera: Calliphoridae). Gene, 339: 7–15. doi:10.1016/j.gene.2004.06.031. PMID:15363841.

Kim, I., Cha, S.Y., Yoon, M.H., Hwang, J.S., Lee, S.M., Sohn, H.D., and Jin, B.R. 2005. The complete nucleotide sequence and gene organization of the mitochondrial genome of the oriental mole cricket, Gryllotalpa orientalis (Orthoptera: Gryllotalpidae). Gene, 353: 155–168. doi:10.1016/j.gene.2005.04.019. PMID:15950403.

Kim, I., Lee, E.M., Seol, K.Y., Yun, E.Y., Lee, Y.B., Hwang, J.S., and Jin, B.R. 2006. The mitochondrial genome of the Korean hairstreak, Coreana raphaelis (Lepidoptera: Lycaenidae). Insect Mol. Biol. 15: 217–225. doi:10.1111/j.1365-2583.2006.00630.x. PMID:16640732.

Kumar, S., Tamura, K., and Nei, M. 2004. MEGA3: integrated software for molecular evolutionary genetics analysis and sequence alignment. Brief Bioinform. 5: 150–163. doi:10.1093/bib/5.2.150. PMID:15260895.

Lavrov, D.V., Boore, J.L., and Brown, W.M. 2002. Complete mtDNA sequences of two millipedes suggest a new model for mitochondrial gene rearrangements: duplication and nonrandom loss. Mol. Biol. Evol. 19: 163–169. PMID:11801744.

Lewis, D.L., Farr, C.L., and Kaguni, L.S. 1995. Drosophila melanogaster mitochondrial DNA: completion of the nucleotide sequence and evolutionary comparisons. Insect Mol. Biol. 4: 263–278. PMID:8825764.

Liu, N., Hu, J., and Huang, Y. 2006. Amplification of grasshoppers complete mitochondrial genomes using long PCR. Chin. J. Zool. 41: 61–65. [In Chinese.]

Lowe, T.M., and Eddy, S.R. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25: 955–964. doi:10.1093/nar/25.5.955. PMID:9023104.

Nardi, F., Carapelli, A., Fanciulli, P.P., Dallai, R., and Frati, F. 2001. The complete mitochondrial DNA sequence of the basal hexapod Tetrodontophora bielanensis: evidence for heteroplasmy and tRNA translocations. Mol. Biol. Evol. 18: 1293–1304. PMID:11420368.

Ojala, D., Montoya, J., and Attardi, G. 1981. tRNA punctuation model of RNA processing in human mitochondria. Nature (London), 290: 470–474. doi:10.1038/290470a0. PMID:7219536.

Okimoto, R., Macfarlane, J.L., and Wolstenholme, D.R. 1990. Evidence for the frequent use of TTG as the translation initiation codon of mitochondrial protein genes in the nematodes, Ascaris suum and Caenorhabditis elegans. Nucleic Acids Res. 18: 6113–6118. doi:10.1093/nar/18.20.6113. PMID:2235493.

Podsiadlowski, L., Carapelli, A., Nardi, F., Dallai, R., Koch, M., Boore, J.L., and Frati, F. 2006. The mitochondrial genomes of Campodea fragilis and Campodea lubbocki (Hexapoda: Diplura): high genetic divergence in a morphologically uniform taxon. Gene, 381: 49–61. doi:10.1016/j.gene.2006.06.009. PMID:16919404.

Saito, S., Tamura, K., and Aotsuka, T. 2005. Replication origin of mitochondrial DNA in insects. Genetics, 171: 1695–1705. doi:10.1534/genetics.105.046243. PMID:16118189.

Shao, R., Campbell, N.J., and Barker, S.C. 2001. Numerous gene rearrangements in the mitochondrial genome of the wallaby louse, Heterodoxus macropus (Phthiraptera). Mol. Biol. Evol. 18: 858–865. PMID:11319269.

Simon, C., Frati, F., Bekenbach, A., Crespi, B., Liu, H., and Flook, P. 1994. Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain-reaction primers. Ann. Entomol. Soc. Am. 87: 651–701.

Steinberg, S., and Cedergren, R. 1994. Structural compensation in atypical mitochondrial tRNAs. Nat. Struct. Biol. 1: 507–510. doi:10.1038/nsb0894-507. PMID:7664076.

Steinberg, S., Leclerc, F., and Cedergren, R. 1997. Structural rules and conformational compensations in the tRNA L-Form. J. Mol. Biol. 266: 269–282. doi:10.1006/jmbi.1996.0803. PMID:9047362.

Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., and Higgins, D.G. 1997. The CLUSTAL_X window interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25: 4876–4882. doi:10.1093/nar/25.24.4876. PMID:9396791.

Wolstenholme, D.R. 1992. Animal mitochondrial DNA: structure and evolution. Int. Rev. Cytol. 141: 173–216. PMID:1452431.

Zhang, D.X., and Hewitt, G.M. 1997. Insect mitochondrial control region: a review of its structure, evolution and usefulness in evolutionary studies. Biochem. Syst. Ecol. 25: 99–120. doi:10.1016/S0305-1978(96)00042-7.

Zhang, D.X., Szymura, J.M., and Hewitt, G.M. 1995. Evolution and structural conservation of the control region of insect mitochondrial DNA. J. Mol. Evol. 40: 382–391. doi:10.1007/BF00164024. PMID:7769615.
