View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Genomics 101 (2013) 64–73

Contents lists available at SciVerse ScienceDirect

Genomics

journal homepage: www.elsevier.com/locate/ygeno

The complete mitogenome of mori strain Dazao (: ) and comparison with other lepidopteran

Qiu-Ning Liu 1, Bao-Jian Zhu 1, Li-Shang Dai, Chao-Liang Liu ⁎

College of Life Science, Anhui Agricultural University, 130 Changjiang West Road 230036, PR China

article info abstract

Article history: The complete mitochondrial genome (mitogenome) of strain Dazao (Lepidoptera: Bombycidae) Received 31 August 2012 was determined to be 15,653 bp, including 13 protein-coding genes (PCGs), two rRNA genes, 22 tRNA genes Accepted 6 October 2012 and a A+T-rich region. It has the typical gene organization and order of mitogenomes from lepidopteran in- Available online 13 October 2012 sects. The AT skew of this mitogenome was slightly positive and the nucleotide composition was also biased toward A+T nucleotides (81.31%). All PCGs were initiated by ATN codons, except for cytochrome c oxidase Keywords: subunit 1 (cox1) gene which was initiated by CGA. The cox1 and cox2 genes had incomplete stop codons Mitochondrial genome Lepidoptera consisting of just a T. All the tRNA genes displayed a typical clover-leaf structure of mitochondrial tRNA. Bombyx mori The A+T-rich region of the mitogenome was 495 bp in length and consisted of several features common Phylogeny to the lepidopteras. Phylogenetic analysis showed that the B. mori Dazao was close to Bombycidae. © 2012 Elsevier Inc. All rights reserved.

1. Introduction phylogenetic analyses to the selected from Lepidoptera, Orthoptera and Diptera based on the mitogenome sequences were mitogenome DNA (mtDNA) is a circular molecule with performed using the neighbor-joining (NJ) method. 14–19 kb in length and contains 13 protein coding genes (PCGs), sub- units 6 and 8 of the ATPase (atp6 and atp8), cytochrome c oxidase subunits 1–3(cox1-cox3), cytochrome B (cob), NADH dehydrogenase Table 1 The complete mitochondrial genome of Lepidoptera. subunits 1–6 and 4 L (nad1-6 and nad4L), two ribosomal RNA genes encoding the small and large subunit rRNAs (rrnL and rrnS), 22 trans- Species Length (bp) Accession number References fer RNA (tRNA) genes and a A+T-rich region which contains initia- Bombyx mori Dazao 15, 653 This study tion sites for transcription and replication [1–3]. Mitochondrial DNA Bombyx mori Xiafang 15, 664 AY048187 [4] has been widely used as an informative molecular marker for diverse Bombyx mori C108 15, 656 AB070264 [5] Japanese 15, 928 NC_003395 [5] evolutionary studies among species [3]. Chinese Bombyx mandarina 15, 682 AY301620 [6] Although Lepidoptera is the 2nd most numerous order of insects, only pernyi 15, 575 AY242996 [7] a few complete or near-complete mitogenomes are currently available in Antheraea yamamai 15, 338 EU726630 [8] GenBank (Table 1). The silk-producing insects with economic value in Eriogyna pyretorum 15, 327 FJ685653 [9] Lepidoptera belong to two families of , Bombycidae and . Samia cynthia ricini 15, 366 JN215366 [10] Actias selene 15, 236 JX186589 [11] The complete mitogenomes of three species of Bombycidae and six spe- Caligula boisduvalii 15, 360 NC_010613 [12] cies of Saturniidae were sequenced: Bombyx mori Xiafang [4], Bombyx Manduca sexta 15, 516 EU286785 [13] mori C108 [5], Japanese B. mandarina [5] and Chinese B. mandarina [6] be- Hyphantria cunea 15, 481 GU592049 [14] longing to Bombycidae; Antheraea pernyi [7], Antheraea yamamai Helicoverpa armigera 15, 347 GU188273 [15] Ochrogaster lunifer 15, 593 AM946601 [16] [8], Eriogyna pyretorum [9], Samia cynthia ricini [10], Actias selene [11] Ostrinia nubilalis 14, 535 NC_003367 [17] and Caligula boisduvalii [12] belonging to the family Saturniidae. Chilo suppressalis 15, 456 HQ860290 [18] Bombyx mori strain Dazao, a member of the Bombycidae family, Diatraea saccharalis 15,490 FJ240227 [19] is an important model organism and used as a bioreactor for the Phthonandria atrilineata 15, 499 EU569764 [20] production of recombinant proteins [28]. In the present study, the Adoxophyes honmai 15, 680 DQ073916 [21] Grapholita molesta 15, 776 HQ116416 [22] complete mitogenome of B. mori strain Dazao was sequenced. The Spilonota lechriaspis 15, 368 HM204705 [23] Artogeia melete 15, 140 EU597124 [24] Coreana raphaelis 15, 314 NC_007976 [25] ⁎ Corresponding author. Fax: +86 551 5786201. Acraea issoria 15, 245 GQ376195 [26] E-mail address: [email protected] (C.-L. Liu). nerippe 15, 140 JF504707 [27] 1 These authors contributed equally to this work.

0888-7543/$ – see front matter © 2012 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ygeno.2012.10.002 Q.-N. Liu et al. / Genomics 101 (2013) 64–73 65

Table 2 2.2. PCR amplification and sequencing Primers used for amplification of the mitogenome of B. mori Dazao.

Primer pair Primer sequence (5′→3′) For amplification of the entire mitogenome of B. mori strain Dazao, nine primer sets were designed according to the conserved nucleo- F1 GCTTTTGGGCTCATACCTCA R1 GATGAAATACCTGCAAGATGAAG tide sequences of known mitochondrial sequences from Lepidoptera F2 TGGAGCAGGAACAGGATGAAC insects [4–12] and synthesized in Sunbiotech Co., Ltd. (Beijing, R2 GAGACCADTACTTGCTTTCAG China) (Table 2). The fragments were amplified using Aidlab Taq Kit F3 ATTTGTGGAGCTAATCATAG (Aidlab Co., Beijing, China) according to the manufacturer's instruc- R3 GGTCAGGGACTATAATCTAC F4 TCGACCTGGAACTTTAGC tions. The PCR was performed under the following conditions: R4 GCAGCTATAGCCGCTCCTACT 3 min at 94 °C, followed by 35 cycles of 30 s at 94 °C, 1–3 min at F5 TAAAGCAGAAACAGGAGTAG 55–60 °C, 10 min at 72 °C. The PCR products were separated by aga- R5 ATTGCGATATTATTTCTTTTG gel electrophoresis (1% w/v) and purified using a DNA gel extrac- F6 ACATTCTTAGGTGGATTA tion kit (Aidlab Co., Beijing, China). The purified PCR products were R6 GTTAAAGTGGCATTATCT F7 GGAGCTTCTACATGAGCTTTTGG ligated into the T-vector (TaKaRa Co., Dalian, China) and sequenced R7 GTTTGCGACCTCGATGTTG at least three times at Sunbiotech Co., Ltd., (Beijing, China). F8 GGTCCCTTACGAATTTGAATATATCCT R8 AAACTAGGATTAGATACCCTATTAT 2.3. Sequence assembly and gene annotation F9 CTCTACTTTGTTACGACTTATT R9 TCTAGGCCAATTCAACAACC Sequence annotation was performed using the blast tools in NCBI web site (http://blast.ncbi.nlm.nih.gov/Blast)andDNAStarpackage 2. Materials and methods (DNAStar Inc. Madison, USA). Identification of tRNA genes was verified using the tRNAscan-SE program. The potential stem–loop secondary 2.1. DNA extraction structures within these tRNA gene sequences were calculated using the tRNAscan-SE Search Server (http://lowelab.ucsc.edu/tRNAscan-SE/) The silkworm strain Dazao was maintained in our laboratory. All [29]. The secondary structures of tRNA genes that could not be predicted larvae were fed with fresh mulberry leaves under normal conditions. using the tRNAscan-SE were analyzed by comparison with the Total DNA was isolated from the third day of fifth instar larvae using nucleotide sequences of other insect tRNA sequences [4–12].The the Aidlab Genomic DNA Extraction Kit (Aidlab Co., Beijing, China) PCGs were identified by sequence similarity with B. mori C108 [5]. according to the manufacturer's instructions. Extracted DNA was The nucleotide sequences of PCGs were translated with the inverte- used for PCR amplification of the complete mitogenome. brate mitogenome genetic code. Alignments of PCGs from various

Fig. 1. Map of the mitogenome of B. mori strain Dazao. tRNA genes are labeled according to the IUPAC-IUB. Single letter amino acids above the bar indicate coding sequence on major strand whereas below the bar indicates coding on minor strand. Anti-clockwise PCGs or rRNA genes are located on L strand and others are located on H strand. 66 Q.-N. Liu et al. / Genomics 101 (2013) 64–73 lepidopteran mitogenomes were performed using Clustal X [30]. A+T-rich region (Fig. 1). The order of genes and the orientation of Composition skewness was calculated according to the formulas: the mitogenome of B. mori Dazao are identical to the sequenced lep- AT skew=[A− T]/[A+T]; GC skew=[G− C]/[G+ C] [31].Tandem idopteran mitogenomes. The placement of the trnM tRNA gene in repeats in the A+ T-rich region were predicted using the Tandem the B. mori Dazao mitogenome differs from the ancestral gene order Repeats Finder program (http://tandem.bu.edu/trf/trf.html) [32]. observed in insects. The ancestral order is: A+T-rich region, trnI, trnQ, trnM, nad2 [39]. The placement of trnM may therefore represent 2.4. Phylogenetic analysis a molecular feature exclusive to lepidopteran mtDNAs [3,37]. The nu- cleotide composition of the mitogenome of B. mori Dazao is: A=6732 To reconstruct the phylogenetic relationship among lepidopteran (43.01%), T=5995 (38.30%), G=1154 (7.37%) and C=1772 insects, the complete mitogenomes of 26 lepidopteran species were (11.32%). As observed in other lepidopterans, the nucleotide compo- obtained from the GenBank database (Table 1). These mitogenomes sition of the B. mori Dazao mitogenome is biased toward A+T were divided into 11 lepidopteran superfamilies within the lepidopter- (81.31%) (Table 4). an suborder. The mitogenomes of Locusta migratoria (NC_001712), Drosophila yakuba (NC_001322) and Anopheles gambiae (NC_002084) were used as outgroups [33–35]. The amino acid sequences of each of 3.2. Overlapping and intergenic spacer regions the 13 mitochondrial PCGs were aligned using default settings and concatenated. The concatenated set of amino acid sequences from the The B. mori Dazao mitogenome has a total of 31 bp overlap be- 13 PCGs was used in phylogenetic analysis, which was performed using tween genes in eight locations. Two longest eight nucleotides over- neighbor-joining (NJ) method with the MEGA version 5.0 program [36]. laps located between trnW and trnC as well as trnF and nad5 were found (Table 3), however, the eight nucleotides overlap between 3. Results trnF and nad5 are not detected in other lepidopteran species. Seven nucleotides overlap between atp8 and atp6 is common to many 3.1. Genome organization and base composition other insect mitogenomes. The overlapping between trnW and trnC are also detected in other lepidopteran species such as C. boisduvalii The mitogenome of B. mori Dazao is a closed circular molecule of [12], A. pernyi [7], Ostrinia furnacalis and Ostrinia nubilalis [17]. The 15,653 bp. The gene content is typical of other insect mitochondrial B. mori Dazao mitogenome contains 338 bp of intergenic spacer se- genomes, including 22 tRNA genes (one for each amino acid, two quences spread over 19 regions ranging in size from 1 to 55 bp. The for leucine and serine), 13 PCGs (cox1-3, nad1-6, nad4L, cob, atp6 longest spacer sequence is 55 bp, located between the trnA and and atp8), two mitochondrial ribosomal RNAs (rrnS and rrnL), and a nad3 genes, is extremely A+T rich (100.00%) (Table 4).

Table 3 Summary of the mitogenome of B. mori strain Dazao.

Gene Direction Location Size Anticodon Start codon Stop codon Intergenic nucleotides*

trnM F1–68 68 CAT ––−2 trnI F67–132 66 GAT ––−3 trnQ R 130–198 69 TTG ––47 nad2 F 246–1268 1023 – ATA TAA 5 trnW F 1274–1343 70 TCA ––−8 trnC R 1336–1402 67 GCA ––0 trnY R 1409–1474 66 GTA ––6 cox1 F 1490–3020 1531 – TTAG T 0 trnL2(UUR) F 3021–3087 67 TAA ––0 cox2 F 3088–3769 682 – ATG T 0 trnK F 3770–3840 71 CTT ––−1 trnD F 3840–3906 67 GTC ––0 atp8 F 3907–4068 162 – ATA TAA −7 atp6 F 4062–4739 678 – ATG TAA 14 cox3 F 4754–5542 789 – ATG TAA 2 trnG F 5545–5610 66 TCC ––3 nad3 F 5614–5964 351 – ATA TAA 55 trnA F 6020–6087 68 TGC ––28 trnR F 6116–6179 64 TCG ––1 trnN F 6181–6247 67 GTT ––−1 trnS1(AGN) F 6247–6315 69 GCT ––16 trnE F 6332–6396 65 TTC ––11 trnF R 6408–6474 67 GAA ––−8 nad5 R 6467–8185 1719 – ATT TAA 21 trnH R 8207–8273 67 GTG ––44 nad4 R 8318–9670 1353 – ATG TAA −1 nad4L R 9670–9960 291 – ATG TAA 4 trnT F 9965–10029 65 TGT ––0 trnP R 10030–10095 66 TGG ––2 nad6 F 10098–10628 531 – ATT TAA 45 cob F 10674–11825 1152 – ATA TAA 3 trnS2(UCN) F 11829–11894 66 TGA ––25 nad1 R 11920–12852 933 – ATT TAA 6 trnL1(CUN) R 12859–12929 71 TAG ––0 rrnL R 12930–14307 1378 –– – 0 trnV R 14308–14375 68 TAC ––0 rrnS R 14376–15158 783 –– – 0 A+T-rich region 15159–15653 495 –– – 0 Q.-N. Liu et al. / Genomics 101 (2013) 64–73 67

Table 4 Composition and skewness in the Lepidopteran mitogenomes.

Species Size (bp) A% G% T% C% A+T % AT skewness GC skewness

B. mori Dazao 15,653 43.01 7.37 38.30 11.32 81.31 0.058 −0.211 B. mori Xiafang 15,664 43.05 7.32 38.30 11.33 81.35 0.058 −0.215 A. selene 15,236 38.54 8.05 40.37 13.03 78.91 −0.023 −0.236 O. nubilalis 14,535 41.36 8.02 38.81 11.82 80.17 0.031 −0.192 O. furnicalis 14,536 41.46 7.91 38.92 11.71 80.37 0.032 −0.194 C. raphaelis 15,314 39.37 7.30 43.29 10.04 82.66 −0.047 −0.158 E. pyretorum 15,327 39.17 7.63 41.65 11.55 80.82 −0.031 −0.204 A. yamamai 15,338 39.26 7.69 41.04 12.02 80.29 −0.022 −0.220 C. boisduvalii 15,360 39.34 7.58 41.28 11.79 80.62 −0.024 −0.217 S. cynthia ricini 15,384 39.65 7.81 40.13 12.41 79.78 −0.006 −0.227 P. atrilineata 15,499 40.78 7.67 40.24 11.31 81.02 0.007 −0.192 M. sexta 15,516 40.67 7.46 41.11 10.76 81.79 −0.005 −0.181 A. pernyi 15,566 39.22 7.77 40.94 12.07 80.16 −0.021 −0.216 O. lunifer 15,593 40.09 7.56 37.75 14.60 77.84 0.030 −0.318 B. mori C108 15,656 43.06 7.31 38.30 11.33 81.36 0.059 −0.216 A. honmai 15,680 40.15 7.88 40.24 11.73 80.39 −0.001 −0.178 Chinese B. mandarina 15,682 43.11 7.40 38.48 11.01 81.59 0.057 −0.196 Japanese B. mandarina 15,928 43.08 7.21 38.60 11.11 81.68 0.055 −0.213 PCGs Size (bp) A% G% T% C% A+T % AT skewness GC skewness B. mori Dazao 11,195 42.83 8.21 36.74 12.22 79.57 0.077 −0.196 B. mori Xiafang 11,177 42.93 8.17 36.65 12.26 79.57 0.079 −0.200 B. mori C108 11,187 42.91 8.15 36.68 12.26 79.58 0.078 −0.201 Japanese B. mandarina 11,193 42.78 8.14 36.86 12.22 79.64 0.074 −0.200 Chinese B. mandarina 11,196 42.83 8.26 37.04 11.87 79.87 0.072 −0.179 A. selene 11,231 37.93 8.74 39.44 13.89 77.37 −0.020 −0.228 E. pyretorum 11,228 33.18 10.50 46.23 10.09 79.41 −0.164 0.020 A. pernyi 11,204 38.50 8.56 40.02 12.92 78.52 −0.019 −0.203 A. yamamai 11,269 33.04 10.71 45.90 10.35 78.94 −0.163 0.017 C. boisduvalii 11,227 38.83 8.32 40.31 12.53 79.15 −0.019 −0.202 O. nubilalis 11,184 41.06 8.57 38.10 12.27 79.16 0.037 −0.178 O. furnicalis 11,186 41.15 8.42 38.27 12.15 79.42 0.036 −0.181 A. honmai 11,245 39.65 8.77 38.83 12.74 78.48 0.010 −0.181 C. raphaelis 11,145 39.10 7.94 42.40 10.55 81.51 −0.04 −0.141 M. sexta 11,207 40.39 8.20 39.95 11.46 80.34 0.005 −0.163 A. melete 11,180 40.13 8.47 38.38 13.01 78.52 0.022 −0.211 P. atrilineata 11,202 40.23 8.59 38.87 12.31 79.10 0.017 −0.178 O. lunifer 11,266 32.47 12.08 43.26 12.19 75.73 −0.142 −0.004 tRNAs Size (bp) A% G% T% C% A+T % AT skewness GC skewness B. mori Dazao 1480 42.23 7.77 39.53 10.47 81.76 0.033 −0.148 B. mori Xiafang 1468 42.10 7.90 39.37 10.63 81.47 0.034 −0.147 B. mori C108 1461 42.20 7.80 39.70 10.47 81.72 0.031 −0.146 Japanese B. mandarina 1463 41.90 7.79 39.71 10.59 81.61 0.026 −0.152 Chinese B. mandarina 1472 41.78 7.81 39.95 10.46 81.73 0.022 −0.145 A. selene 1459 40.37 8.16 40.23 11.24 80.60 0.002 −0.159 E. pyretorum 1424 42.59 10.61 39.35 7.45 81.94 0.039 0.174 A. pernyi 1459 40.71 8.02 40.71 10.56 81.43 0 −0.137 A. yamamai 1473 41.07 8.08 40.26 10.59 81.33 0.010 −0.134 C. boisduvalii 1466 40.38 7.78 41.61 10.23 81.99 −0.015 −0.136 O. nubilalis 1425 42.11 7.86 39.58 10.46 81.68 0.031 −0.142 O. furnicalis 1424 42.21 8.01 39.12 10.67 81.32 0.038 −0.142 A. honmai 1474 41.11 8.41 39.89 10.58 81.00 0.015 −0.114 C. raphaelis 1516 40.90 7.72 42.15 9.23 83.05 −0.015 −0.089 M. sexta 1499 41.09 8.07 40.76 10.07 81.85 0.004 −0.110 A. melete 1509 41.42 11.33 38.070 8.55 80.12 0.034 0.142 P. atrilineata 1602 41.20 8.30 40.26 10.24 81.46 0.012 −0.105 O. lunifer 1666 41.78 7.32 39.86 11.04 81.63 0.023 −0.202 rRNAs Size (bp) A% G% T% C% A+T % AT skewness GC skewness B. mori Dazao 2162 43.64 4.67 41.09 10.60 84.73 0.030 −0.388 B. mori Xiafang 2159 43.68 4.63 41.08 10.61 84.76 0.031 −0.392 B. mori C108 2162 43.73 4.58 41.09 10.60 84.82 0.031 −0.397 Japanese B. mandarina 2160 43.89 4.63 41.30 10.19 85.19 0.030 −0.375 ChineseB. mandarina 2134 43.86 4.78 41.05 10.31 84.91 0.028 −0.366 A. selene 2126 39.93 4.99 43.79 11.29 83.73 −0.046 −0.387 E. pyretorum 2116 41.16 4.82 43.38 10.63 84.55 −0.026 −0.376 A. pernyi 2144 40.86 4.90 43.10 11.15 83.96 −0.027 −0.390 A. yamamai 2156 40.68 5.06 43.46 10.81 84.14 −0.033 −0.363 C. boisduvalii 2165 40.60 5.13 43.93 10.44 84.53 −0.039 −0.341 O. nubilalis 1773 42.41 5.13 41.79 10.66 84.21 0.007 −0.350 O. furnicalis 1774 42.39 5.07 42.05 10.48 84.44 0.004 −0.348 A. honmai 2166 40.49 4.94 43.72 10.85 84.21 −0.038 −0.374 C. raphaelis 2107 38.93 5.03 46.56 9.49 85.48 −0.089 −0.307 M. sexta 2168 41.37 4.84 44.05 9.73 85.42 −0.031 −0.335 A. melete 2096 40.65 5.25 43.56 10.54 84.21 −0.035 −0.335 P. atrilineata 2203 42.85 4.58 43.08 9.49 85.93 −0.003 −0.349

(continued on next page) 68 Q.-N. Liu et al. / Genomics 101 (2013) 64–73

Table 4 (continued) Species Size (bp) A% G% T% C% A+T % AT skewness GC skewness

O. lunifer 2157 41.96 4.82 40.19 13.03 82.15 0.022 −0.460 A+T-rich region Size (bp) A% G% T% C% A+T % AT skewness GC skewness B. mori Dazao 495 45.05 2.02 50.10 2.83 95.15 −0.053 −0.167 B. mori Xiafang 496 44.76 1.81 50.60 2.82 95.36 −0.061 −0.218 B. mori C108 494 44.94 1.62 50.61 2.83 95.55 −0.059 −0.272 Japanese B. mandarina 747 45.52 2.41 49.67 2.41 95.18 −0.043 0 Chinese B. mandarina 484 46.49 2.69 47.93 2.89 94.42 −0.015 −0.036 A. selene 339 43.07 5.90 44.84 6.19 87.91 −0.020 −0.024 E. pyretorum 358 42.18 2.51 50.00 5.31 92.18 −0.085 −0.358 A. pernyi 552 41.12 4.17 49.28 5.43 90.40 −0.090 −0.127 A. yamamai 334 41.62 3.59 47.90 6.89 90.40 −0.070 −0.314 C. boisduvalii 330 42.12 2.12 49.39 6.36 91.52 −0.079 −0.500 A. honmai 489 48.47 2.86 45.81 2.86 94.27 0.028 0 C. raphaelis 375 44.27 1.33 49.87 4.53 94.13 −0.059 −0.545 M. sexta 324 45.00 1.54 50.31 3.29 95.37 −0.055 −0.334 A. melete 351 43.87 3.13 45.30 7.69 89.17 −0.016 −0.421 P. atrilineata 457 40.70 0.66 57.55 1.09 98.25 −0.172 −0.246 O. lunifer 319 44.5 1.6 48.9 5.0 93.4 −0.047 −0.524

3.3. Protein-coding genes ‘ATAGA’ and poly-T stretch at the 18 bp downstream of the rrnS gene. A macrosatellite (AT)n element and a12 bp poly-A commonly All protein-coding sequences in the B. mori Dazao mitogenome observed in other lepidopteran mitogenomes was also found imme- were initiated by typical ATN codons (ATA for nad2, atp8, nad2 and diately upstream of the trnM gene. Within the A+T-rich region of cob genes, ATG for cox2, atp6, cox3, nad4 and nad4L genes, ATT for B. mori Dazao mitogenomes, we also identified two tandem repeat el- nad1, nad5 and nad6 genes) with the exception of the cytochrome ements, which contains two 31 bp repeat elements and three 36 bp oxidase subunit 1(cox1)(Table 3). The sequence of the 13 PCGs is repeat elements (Fig. 3). 11195 bp in length and the arrangement of the PCGs is same as that in the other sequenced lepidopterans. The A+T nucleotides composi- 3.7. Phylogenetic analysis tion of 13 PCGs in the mitogenome of B. mori Dazao is 79.57% (Table 4). Sequence alignment revealed that the open reading frame According to the most recent consensus view of lepidopteran rela- of cox1 in B. mori Dazao also starts with a CGA codon for encoding ar- tionships, , ,Papillonoidea and ginine [10]. The canonical termination codon (TAA) occurs in 11 PCGs are designated as the Macrolepidoptera. together with in the B. mori Dazao mitogenome and the other two PCGs (cox1 and Macrolepidoptera are designated as Obtectornera; Tortricoidea is cox2) were terminated with a single T. The cox1 and cox2 have incom- the sisters to the remaining lepidopteran superfamilies [38]. In the plete stop codons found in all sequenced lepidopterans. present study, the concatenated amino acid sequences of the 13 PCGs of the mitochondrial genome were used to reconstruct the phy- 3.4. tRNA genes logenetic relationships among the superfamilies of lepidopteran by the neighbor-joining (NJ) method (Fig. 4). These 26 mitochondrial The structure of tRNA genes was predicted using the tRNAscan-SE genomes represent six superfamilies within the lepidopteran suborder: Search Server [12]. The B. mori Dazao mitogenome contains 22 tRNA Bombycoidea, Noctuoidea, Pyraloidea, Tortricidea, Papillonoidea and genes ranging in size from 64 (trnR) to 71 nucleotides (trnL1 and Geometroidea. The phylogenetic analyses show that Bombycidae trnK). As shown in Fig. 2, all tRNAs displayed the typical clover-leaf (B. mori and B. mandarina), Sphingoidae (M. sexta), and Saturniidae secondary structure observed in mitochondrial tRNA genes from sev- (A. selene, A. pernyi, A. yamamai, S. cynthia ricini, E. pyretorum and eral lepidopteran and metazoan insects [1,4]. Among the 22 tRNA C. boisduvalli) were clustered in one branch in the phylogenetic tree. genes, 14 were encoded by the H-strand and eight were encoded by H. cunea and H. armigera within Noctuoidea are close to Bombycoidea. the L-strand. The total length of 22 tRNAs in the mitogenome of Our phylogenetic analyses of these superfamilies support the traditional B. mori Dazao is 1480 bp and the A+T contents is 81.76% (Table 4). morphology-based classification [38].

3.5. rRNA genes 4. Discussion

As typically observed in other insect mitogenomes, the rrnL and Using the polymerase chain reaction method, the mitogenome of rrnS genes of the B. mori Dazao mitogenome are 1378 bp and B. mori Dazao is determined as a circular molecule of 15,653 bp and 783 bp in length, respectively. These two rRNA genes are located be- longer than other known complete lepidopteran mitogenomes. The tween trnL1(CUN) and trnV, and between trnV and the A+T-rich re- order of genes and the orientation of the mitogenome of B. mori gion, respectively. The A+T content of the two rRNA genes are Dazao are identical to the sequenced lepidopteran mitogenomes. Dif- 84.73%, which is within the range reported for other lepidopterans fering from the ancestral gene order observed in some insects, the (Table 4). order of the trnM tRNA gene in the B. mori Dazao mitogenome is as follows: A+T-rich region, trnM, trnI, trnQ, nad2 (Fig. 1); whereas 3.6. The A+T-rich region the ancestral order is: A+T-rich region, trnI, trnQ, trnM, nad2 [39]. The similar results were also found in other lepidopteran insects The A+T-rich region of B. mori Dazao was located between the [4–12], suggesting that the lepidopteran insects may have acquired rrnS and trnM and 495 bp nucleotides long. The A+T-rich region such an orientation and gene order independently after splitting contained 95.15% A+T contents, which was the highest in the from the ancestral insects [13]. mitogenome (Table 4). Several conserved structures of other lepidop- The nucleotide composition of the B. mori Dazao mitogenome is teran mitogenomes were also observed in the A+T-rich region of biased toward A+T (81.31%) (Table 4) and the A+ T content is B. mori Dazao mitogenomes. There is a structure including the motif higher than that in other lepidopteran species such as O. Lunifer Q.-N. Liu et al. / Genomics 101 (2013) 64–73 69

Fig. 2. Putative secondary structures for the tRNA genes of the B. mori strain Dazao mitogenome. 70 Q.-N. Liu et al. / Genomics 101 (2013) 64–73

Fig. 2 (continued). Q.-N. Liu et al. / Genomics 101 (2013) 64–73 71

Fig. 3. Features present in the A+T-rich region of B. mori strain Dazao.

(77.84%), A. selene (78.91%), A. pernyi (80.15%), A. yamamai (80.29%), B. mori Dazao mitogenome was 0.058 (Table 4), indicating a higher Adoxophyes honmai (80.39%), C. boisduvalii (80.62%), E. pyretorum occurrence of A compared to T nucleotides. Similar results were (80.82%), P. atrilineata (81.02%); However, it is slightly lower in the also observed in B. mori C108 (0.059), Chinese B. mandarina case of B. mori C108 (81.36%), Japanese B. mandarina (81.68%), (0.057), Japanese B. Mandarina (0.055), O. nubilalis (0.031), O. M. sexta (81.79%) and C. raphaelis (82.66%). The AT skew for the furnicalis (0.032), O. lunifer (0.030), Artogeia melete (0.012) and P.

Fig. 4. Phylogeny of lepidopteran insects. Phylogenetic trees inferred from the amino acid sequences of 13 PCGs of the mitogenome using neighbor joining analysis. L. migratoria, D. yakuba and A. gambiae were used as outgroups. The numbers above the branches specify bootstrap percentages (1000 replicates). 72 Q.-N. Liu et al. / Genomics 101 (2013) 64–73 atrilineata (0.007). In contrast, the AT skew of the complete [5] K. Yukuhiro, H. Sezutsu, M. Itoh, K. Shimizu, Y. Banno, Significant levels of se- − quence divergence and gene rearrangements have occurred between the mito- mitogenome was slightly negative in C. raphaelis ( 0.047), C. chondrial genomes of the wild mulberry silkmoth, Bombyx mandarina, and its boisduvalii (−0.024), A. selene (−0.023), A. yamamai (−0.022), A. close relative, the domesticated silkmoth, Bombyx mori, Mol. Biol. Evol. 19 pernyi (−0.021), M. sexta (−0.005) and A. honmai (−0.001). Similar (2002) 1385–1389. [6] M.H. Pan, Q.Y. Yu, Y.L. Xia, F.Y. Dai, Y.Q. Liu, C. Lu, Z. Xiang, Characterization of to mitogenomes from B. mori C108, Chinese B. mandarina, Japanese B. mitochondrial genome of Chinese wild mulberry silkworm, Bomyx mandarina Mandarina, O. nubilalis and O. furnicalis,theATskewsoftRNAand (Lepidoptera:Bombycidae), Sci. China, Ser. C Life Sci. 51 (2008) 693–701. rRNA genes in the B. mori Dazao mitogenome were 0.033 and [7] Y.Q. Liu, Y.P. Li, M.H. Pan, F.Y. Dai, X. Zhu, C. Lu, Z.H. Xiang, The complete mito- 0.030, respectively. The AT skew in the B. mori Dazao mitogenome chondrial genome of the Chinese silkmoth, Antheraea pernyi (Lepidoptera: Saturniidae), Acta Biochim. Biophys. Sin. 40 (2008) 694–703. A+T-rich region was −0.053, indicating a bias for T over A nucleo- [8] S.R. Kim, M.I. Kim, M.Y. Hong, K.Y. Kim, P.D. Kang, J.S. Hwang, Y.S. Han, B.R. Jin, I. Kim, tides. The GC skew values were negative in all sequenced lepidopter- The complete mitogenome sequence of the Japanese oak silkmoth, Antheraea – an mitogenomes, indicating a higher content of C compared to G yamamai (Lepidoptera:Saturniidae), Mol. Biol. Rep. 36 (2008) 1871 1880. [9] S.T. Jiang, G.Y. Hong, M. Yu, N. Li, Y. Yang, Y.Q. Liu, Z.J. Wei, Characterization of the nucleotides. According to the result, the GC skew for rRNA genes is complete mitochondrial genome of the giant silkworm moth, Eriogyna pyretorum lower than that of tRNA, PCGs and the A+T-rich region, indicating (Lepidoptera:Saturniidae), Int. J. Biol. Sci. 5 (2009) 351–365. a heavy bias toward C versus G nucleotides in genes encoding rRNAs. [10] J.S. Kim, J.S. Park, M.J. Kim, P.D. Kang, S.G. Kim, B.R. Jin, Y.S. Han, I. Kim, Complete nucle- otide sequence and organization of the mitochondrial genome of eri-silkworm, Samia The overlapping between trnW and trnC in B. mori Dazao mitogenome cynthia ricini (Lepidoptera: Saturniidae), J. Asia-Pac. Entomol. 15 (2012) 162–173. are also detected in other lepidopteran species such as C. boisduvalii, [11] Q.N. Liu, B.J. Zhu, L.S. Dai, G.Q. Wei, C.L. Liu, The complete mitochondrial genome A. pernyi, Ostrinia furnicalis and Ostrinia nubilalis. The 338 bp intergenic of the wild silkworm moth, Actias selene, Gene 505 (2012) 291–299. [12] M.Y. Hong, E.M. Lee, Y.H. Jo, H.C. Park, S.R. Kim, J.S. Hwang, B.R. Jin, P.D. Kang, K.G. spacer sequence spreading over 19 regions is considerably longer than Kim, Y.S. Han, I. Kim, Complete nucleotide sequence and organization of the that of other Lepidopteran species including M. sexta (115 bp over 13 re- mitogenome of the silk moth Caligula boisduvalii (Lepidoptera: Saturniidae) and gions), A. selene (137 bp over 13 regions), C. boisduvalii (194 bp over 16 comparison with other Lepidopteran insects, Gene 413 (2008) 49–57. [13] S.L. Cameron, M.F. Whiting, The complete mitochondrial genome of the tobacco regions), Coreana raphaelis (178 bp over 17 regions), while is shorter hornworm, Manduca sexta (Insecta: Lepidoptera: Sphingidae), and an examina- than Ochrogaster lunifer (371 bp over 20 regions). tion of mitochondrial gene variability within butterflies and , Gene 408 Different from the other PCGs, cox1 is initiated by CGA codon and (2008) 112–113. this non-canonical putative start codon is also found in some insects [14] F. Liao, L. Wang, S. Wu, Y.P. Li, L. Zhao, G.M. Huang, C.J. Niu, Y.Q. Liu, M.G. Li, The complete mitochondrial genome of the fall webworm, Hyphantria cunea [5,20,21,24]. The TTAG tetra-nucleotide is a putative start codon in (Lepidoptera: Arctiidae), Int. J. Biol. Sci. 6 (2010) 172–186. C. raphaelis [25] and the TATTAG also represents a putative start [15] J. Yin, G.Y. Hong, A.M. Wang, Y.Z. Cao, Z.J. Wei, Mitochondrial genome of the cot- codon in the moth species, O. nubilalis and O. furnicalis [17]. An un- ton bollworm Helicoverpa armigera (Lepidoptera: ) and comparison with other Lepidopterans, Mitochondrial DNA 21 (2010) 160–169. usual start codon for the cox1 gene has also been described in various [16] P. Salvato, M. Simonato, A. Battisti, E. Negrisolo, The complete mitochondrial ge- mtDNAs [40]. nome of the bag-shelter moth Ochrogaster lunifer (Lepidoptera: Notodontidae), Like the mitogenomes of most , all tRNA genes have a typical BMC Genomics 9 (2008) 331. [17] B.S. Coates, D.V. Sumerford, R.L. Hellmich, L.C. Lewis, Partial mitochondrial genome clover-leaf structure of mitochondrial tRNA [3]. However, the DHU sequences of Ostrinia nubilalis and Ostrinia furnicalis, Int. J. Biol. Sci. 1 (2005) 13–18. arm of trnS1(AGN) could not form a stable stem–loop structure [9–11]. [18] J. Yin, A.M. Wang, G.Y. Hong, Y.Z. Cao, Z.J. Wei, Complete mitochondrial genome of Compared with other lepidopteran species, the A+T-rich region is lon- Chilo suppressalis (Walker)(Lepidoptera: Crambidae), Mitochondrial DNA 22 (2011) 41–43. ger than that of A. yamamai (334 bp), C. boisduvalii (330 bp), M. sexta [19] W.W. Li, X.Y. Zhang, Z.X. Fan, B.S. Yue, F.N. Huang, E. King, J.H. Ran, Structural char- (324 bp), O. lunifer (319 bp), C. raphaelis (375 bp), P. atrilineata acteristics and phylogenetic analysis of the mitochondrial genome of the sugarcane (457 bp), A. melete (351 bp), Chinese B. mandarina (484 bp), A. honmai borer, Diatraea saccharalis (Lepidoptera: Crambidae), DNA Cell Biol. 30 (2011) 3–8. [20] L. Yang, Z.J. Wei, G.Y. Hong, S.T. Jiang, L.P. Wen, The complete nucleotide sequence of (489 bp) and B. mori C108 (494 bp), and shorter than that of A. pernyi the mitochondrial genome of Phthonandria atrilineata (Lepidoptera: Geometridae), (552 bp) and Japanese B. mandarina (747 bp). The A+T-rich region Mol. Biol. Rep. 36 (2008) 1441–1449. contained 95.15% A+T content, which was the highest in the [21] E.S. Lee, K.S. Shin, M.S. Kim, H. Park, S. Cho, C.B. Kim, The mitochondrial genome of mitogenome (Table 4). There are some common structures such as the the smaller tea tortix Adoxophyes honmai (Lepidoptera: ), Gene 373 (2008) 52–57. motif ‘ATAGA’ and poly-T stretch at the 18 bp downstream of the rrnS [22] Y.J. Gong, B.C. Shi, Z.J. Kang, F. Zhang, S.J. Wei, The complete mitochondrial ge- gene in lepidopteran mitogenomes [4–12], which may represent the or- nome of the oriental fruit moth Grapholita molesta (Busck) (Lepidoptera: – igin of minority or light strand replication [41]. Within the A+T-rich re- Tortricidae), Mol. Biol. Rep. 39 (2011) 2893 2900. fi [23] J.L. Zhao, Y.Y. Zhang, A.R. Luo, G.F. Jiang, S.L. Cameron, C.D. Zhu, The complete mi- gion of B. mori Dazao mitogenomes, we also identi ed two 31 bp repeat tochondrial genome of Spilonota lechriaspis Meyrick (Lepidoptera: Tortricidae), elements and three 36 bp repeat elements. The presence of multiple tan- Mol. Biol. Rep. 38 (2011) 3757–3764. dem repeat elements is a characteristic of the insect A+T-rich region [24] G.Y. Hong, S.T. Jiang, M. Yu, Y. Yang, F. Li, F.S. Xue, Z.J. Wei, The complete nucleo- fl – tide sequence of the mitochondrial genome of the cabbage butter y, Artogeia [42 44].InA. pernyi, the A+T-rich region harbors a repeat element of melete (Lepidoptera:Pieridae), Acta Biochim. Biophys. Sin. 41 (2009) 446–455. 38 bp, occurring six times in tandem [7], while in A. roylei and the wild [25] I. Kim, E.M. Lee, K.Y. Seol, E.Y. Yun, Y.B. Lee, J.S. Hwang, B.R. Jin, The mitochondrial type of A. pernyi this element is present in five repeats [45,46]. genome of the Korean hairstreak, Coreana raphaelis (Lepidoptera:Lycaenidae), Insect Mol. Biol. 15 (2006) 17–225. [26] J. Hu, D.X. Zhang, J.S. Hao, D.Y. Huang, S. Cameron, C.D. Zhu, The complete mitochon- Acknowledgments drial genome of the yellow coaster, Acraea issoria (Lepidoptera:: :Acraeini): sequence, gene organization and a unique tRNA transloca- tion event, Mol. Biol. Rep. 37 (2010) 3431–3438. This work was supported by the earmarked fund for Modern [27] M.J. Kim, H.C. Jeong, S.R. Kim, I. Kim, Complete mitochondrial genome of the Agro-industry Technology Research System (CARS-22-SYZ10), Youth nerippe fritillary butterfly, Argynnis nerippe (Lepidoptera:Nymphalidae), Mito- Foundation of Anhui Agricultural University (2010zd10) and Natural chondrial DNA 22 (2011) 86–88. Science Foundation of Anhui Province of China (11040606M98). [28] International Silkworm Genome Consortium, The genome of a lepidopteran model insect, the silkworm Bombyx mori, Insect Biochem. Mol. Biol. 38 (2008) 1036–1045. [29] T.M. Lowe, S.R. Eddy, tRNAscan-SE: a program for improved detection of transfer References RNA genes in genomic sequence, Nucleic Acids Res. 25 (1997) 955–964. [30] J.D. Thompson, T.J. Gibson, F. Plewniak, F. Jeanmougin, D.G. Higgins, The [1] D.R. Wolstenholme, mitochondrial DNA: structure and , Int. Rev. CLUSTAL_X windows interface: flexible strategies for multiple sequence align- Cytol. 141 (1992) 173–216. ment aided by quality analysis tools, Nucleic Acids Res. 25 (1997) 4876–4882. [2] C. Moritz, T.E. Dowling, W.M. Brown, Evolution of animal mitochondrial DNA: [31] A.C. Junqueira, A.C. Lessinger, T.T. Torres, F.R. da Silva, A.L. Vettore, P. Arruda, A.M. relevance for population genetics and systematics, Annu. Rev. Ecol. Syst. 18 (1987) Azeredo Espin, The mitochondrial genome of the blowfly Chrysomya chloropyga 269–292. (Diptera: Calliphoridae), Gene 339 (2004) 7–15. [3] J.L. Boore, Animal mitochondrial genomes, Nucleic Acids Res. 27 (1999) 1767–1780. [32] G. Benson, Tandem repeats finder: a program to analyze DNA sequences, Nucleic [4] C. Lu, Y.Q. Liu, S.Y. Liao, B. Li, Z.H. Xiang, H. Han, X.G. Xue, Complete sequence de- Acids Res. 27 (1999) 573–580. termination and analysis of Bombyx mori mitochondrial genome, J. Agric. [33] P.K. Flook, C.H. Rowell, G. Gellissen, The sequence, organization, and evolution of Biotechnol. 10 (2002) 163–170. the Locusta migratoria mitochondrial genome, J. Mol. Evol. 41 (1995) 928–941. Q.-N. Liu et al. / Genomics 101 (2013) 64–73 73

[34] D.O. Clary, D.R. Wolstenholme, The mitochondrial DNA molecular of Drosophila [40] E. Negrisolo, A. Minelli, G. Valle, Extensive gene order rearrangement in the mitochon- yakuba: nucleotide sequence, gene organization, and genetic code, J. Mol. Evol. drial genome of the centipede Scutigera coleoptrata, J. Mol. Evol. 58 (2004) 413–423. 22 (1985) 252–271. [41] J.W. Taanman, The mitochondrial genome: structure, transcription, translation [35] C.B. Beard, D. Mills, F.H. Collins, The mitochondrial genome of the mosquito and replication, Biochim. Biophys. Acta 1410 (1999) 103–123. Anopheles gambiae: DNA sequence, genome organization, and comparisons [42] M. Vila, M. Björklund, The utility of the neglected mitochondrial control region for with mitochondrial sequences of other insects, Insect Mol. Biol. 2 (1993) evolutionary studies in Lepidoptera (insecta), J. Mol. Evol. 58 (2004) 280–290. 103–124. [43] D. Li, Y.R. Guo, H.J. Shao, L.C. Tellier, J. Wang, Z.H. Xiang, Q.Y. Xia, Genetic diversity, [36] K. Tamura, D. Peterson, N. Peterson, G. Stecher, M. Nei, S. Kumar, MEGA5: molec- molecular phylogeny and selection evidence of the silkworm mitochondria impli- ular evolutionary genetics analysis using maximum likelihood, evolutionary dis- cated by complete resequencing of 41 genomes, BMC Evol. Biol. 10 (2010) 81. tance, and maximum parsimony methods, Mol. Biol. Evol. 28 (2011) 2731–2739. [44] X.L. Hu, G.L. Cao, R.Y. Xue, X.J. Zheng, X. Zhang, H.R. Duan, C.L. Gong, The complete [37] D.V. Lavrov, W.M. Brown, J.L. Boore, A novel type of RNA editing occurs in the mito- mitogenome and phylogenetic analysis of Bombyx mandarina strain Qingzhou, chondrial tRNAs of the centipede Lithobius forficatus, Proc. Natl. Acad. Sci. U. S. A. 97 Mol. Biol. Rep. 37 (2010) 2599–2608. (1999) 13738–13742. [45] K.P. Arunkumar, M. Metta, J. Nagaraju, Molecular phylogeny of silk moths reveals [38] N.P. Kristensen, A.W. Skalski, Phylogeny and paleontology, in: N.P. Kristensen the origin of domesticated silk moth, Bombyx mori from Chinese Bombyx (Ed.), Lepidoptera: Moths and Butterflies, 1 Evolution, Systematics, and Biogeog- mandarina and paternal inheritance of Antheraea proylei mitochondrial DNA, raphy, Handbook of Zoology Vol IV, Part 35, De Gruyter, Berlin and New York, Mol. Phylogenet. Evol. 40 (2006) 419–427. 1999, pp. 7–25. [46] Y.Q. Liu, Y.P. Li, H. Wang, R.X. Xia, C.L. Chai, M.H. Pan, C. Lu, Z.H. Xiang, The com- [39] J.L. Boore, D. Lavrov, W.M. Brown, Gene translocation links insects and crusta- plete mitochondrial genome of the wild type of Antheraea pernyi (Lepidoptera: ceans, Nature 392 (1998) 667–668. Saturniidae), Ann. Entomol. Soc. Am. 105 (2012) 498–505.