Characterization of Mitochondrial Genomes of Three Andrena Bees (Apoidea: Andrenidae) and Insights Into the Phylogenetics
International Journal of Biological Macromolecules 127 (2019) 118–125
journal homepage: http://www.elsevier.com/locate/ijbiomac

Bo He a,b,1, Tianjuan Su c,1, Zeqing Niu c, Zeyang Zhou a, Zhanying Gu b,⁎, Dunyuan Huang a,⁎

a Chongqing Key Laboratory of Vector Insects, Chongqing Key Laboratory of Animal Biology, Chongqing Normal University, Chongqing 401331, China
b Key Laboratory of Cultivation and Protection for Non-Wood Forest Trees of Ministry of Education, Key Laboratory of Non-Wood Forest Products of State Forestry Administration, Central South University of Forestry and Technology, Changsha 410004, China
c Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China

⁎ Corresponding authors: Z. Gu and D. Huang.
1 These authors contributed equally to this work.

https://doi.org/10.1016/j.ijbiomac.2019.01.036
0141-8130/© 2019 Published by Elsevier B.V.

Article history: Received 2 October 2018; Received in revised form 8 January 2019; Accepted 8 January 2019; Available online 9 January 2019

Keywords: Andrena; Phylogeny; Mitochondrial genome

Abstract

Andrena is a large bee genus of >1500 species, which includes many important pollinators of agricultural systems. In this study, we present three mitochondrial genomes (mitogenomes) of Andrena species, which are pollinators of Camellia oleifera. Compared with the putative ancestral gene arrangement of insects, the three mitogenomes present identical gene rearrangement events, including local inversion (trnR) and gene shuffling (trnQ/trnM, trnK/trnD, and trnW/trnC-trnY). Most PCGs initiate with a standard ATN codon and share the stop codon TAA or TAG, whereas a truncated stop codon T was detected in the atp6 gene of A. chekiangensis. Furthermore, the nad4 gene ends with a single T in all three Andrena species. All tRNAs could be folded into the clover-leaf secondary structure except for trnS1, in which the dihydrouracil (DHU) arm forms a simple loop. Phylogenetic analysis was performed on 17 Andrena mitogenomes. Maximum likelihood and Bayesian methods generate an identical topology, in which A. hunanensis and A. striata form a group and are close to A. camellia. Although A. chekiangensis is also difficult to distinguish from A. camellia by morphological methods, A. chekiangensis and A. haemorrhoa form a clade and are grouped with the other taxa of the genus Andrena.

© 2019 Published by Elsevier B.V.

1. Introduction

Camellia oleifera Abel. is the most important woody oil tree in China, with a cultivated area of approximately 3.7 million hectares [1]. It is self-sterile because of prezygotic late-acting self-incompatibility [2,3], so its sexual reproduction depends on pollinators [4]. C. oleifera blooms from autumn to winter, when pollinators are limited by the low temperature. Andrena camellia (Andrenidae) and Colletes gigas (Colletidae) have been reported as the main pollinators of C. oleifera [4–8]. Although our field observations find that Andrena chekiangensis, Andrena hunanensis, and Andrena striata are also widely distributed in oil-tea forest, they are rarely described as pollinators. The reason may be that these three flower visitors exhibit morphological features similar to those of A. camellia [9], so they may be incorrectly identified as A. camellia. Therefore, it is important to clarify the phylogenetic relationships of the pollinators of C. oleifera.

Andrena Fabricius is a large bee genus of >1500 species, which is distributed predominantly in the Holarctic [10,11]. These bees are important pollinators of agricultural systems and nest solitarily or communally in the ground [12]. While most Andrena species fly in the spring, there are also bivoltine and univoltine taxa that emerge in summer and autumn. They exhibit a range of diet breadth, from polylectic to oligolectic [13,14]. Therefore, the genus Andrena is an excellent group in which to study the evolution of pollen diet [15,16]. However, although Andrena species have a wide range of biological characteristics, they are so morphologically uniform that they are difficult to distinguish from each other. It would also be a challenge to clarify their phylogenetic relationships.

The typical insect mitochondrial genome (mitogenome) is a 15–18 kb circular molecule, including 13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs), 22 transfer RNA genes (tRNAs), and a control region that contains the initial sites of replication and transcription [17–19]. Owing to characteristics such as small size, maternal inheritance, strictly orthologous genes, fast evolutionary rate, and low rate of recombination, the mitogenome has been widely used as a molecular marker for comparative and evolutionary genomics, and for phylogenetic analysis at different taxonomic levels [19–21].

To date, fifteen mitogenomes of Andrena are available in GenBank [22,23], which would increase our understanding of the taxonomic and phylogenetic relationships of this genus. In this study, we sequenced three other mitogenomes of Andrena species, compared their characters in detail, and analyzed their phylogenetic relationships.

Table 1
Summary of mitogenomes used in this study.

Family      Species                 Total size (bp)  GenBank accession no.
Andrenidae  Andrena angustior       15,252           KT164658
Andrenidae  Andrena bicolor         15,422           KT164666
Andrenidae  Andrena chrysosceles    15,692           KT164602
Andrenidae  Andrena cineraria       17,069           KT164628
Andrenidae  Andrena dorsata         16,333           KT164633
Andrenidae  Andrena haemorrhoa      15,936           KT164645, KT164635
Andrenidae  Andrena semilaevis      16,459           KT164629
Andrenidae  Andrena minutula        15,302           KT164675
Andrenidae  Andrena subopaca        14,747           KT164612
Andrenidae  Andrena flavipes        15,074           KT164679
Andrenidae  Andrena fulva           15,318           KT164623
Andrenidae  Andrena labiata         15,074           KT164613
Andrenidae  Andrena nigroaenea      15,376           KT164665
Andrenidae  Andrena nitida          14,996           KT164636
Andrenidae  Andrena camellia        15,065           KX241615
Andrenidae  Andrena chekiangensis   15,804           MH982580
Andrenidae  Andrena hunanensis      14,780           MH982581
Andrenidae  Andrena striata         14,736           MH982582
Halictidae  Seladonia tumulorum     15,268           KT164609
Colletidae  Colletes gigas          15,885           KM978210
Colletidae  Hylaeus dilatatus       15,475           NC_026468

2. Materials and methods

2.1. Samples collection and DNA extraction

Three Andrena species (A. chekiangensis, A. hunanensis, and A. striata) were collected from oil-tea forest in Jiangxi, China, in November 2017. All specimens were stored in absolute ethyl alcohol in a −20 °C freezer in the Key Laboratory of Animal Biology, Chongqing Normal University. Total genomic DNA was extracted separately from each specimen with the Tissue DNA Kit (Omega Bio-Tek, Norcross, GA, USA) following the manufacturer's instructions.

2.2. Sequencing and assembly

The mitogenomes were generated by next-generation sequencing. After the extracted total genomic DNA was quantified, it was fragmented to an average size of 450 bp using the Covaris M220 system. The library with two indexes was constructed using the Illumina TruSeq™ DNA Sample Prep Kit (Illumina, San Diego, CA, USA) and sequenced on the Illumina HiSeq 4000 platform with the strategy of 360 paired-ends. Approximately 10 Gb of paired-end reads of 150 bp length were generated. The quality of raw sequences was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc). The FASTX toolkit 0.013 (http://hannonlab.cshl.edu/fastx_toolkit/) was also used to filter out low-quality sequences. The downstream analyses were performed on clean data of high quality (Q20 > 90% and Q30 > 85%). The mitogenomes were reconstructed by MITObim v1.7 [24] with the default parameters, and the mitogenome of A. camellia (GenBank accession no: KX241615) was employed as a reference.

Fig. 1. Circular map of the three Andrena mitogenomes. Different colors indicate the nucleotide identity from BLAST searches. From outside to inside, the species are A. chekiangensis, A. striata, and A. hunanensis.

Fig. 2. The relative synonymous codon usage (RSCU) of PCGs in Andrena mitogenomes. The x-axis and y-axis indicate the hierarchical clustering of codon frequencies and Andrena species, respectively.

2.3. Bioinformatic analysis

The secondary structures of tRNAs were predicted by the Mitos WebServer [25] under the invertebrate mitochondrial genetic code. The PCG and rRNA boundaries were determined by the positions of [...]

[...] substitution models were confirmed by PartitionFinder 2.1.1 [37] with the Bayesian Information Criterion (BIC). The sequences were predefined by both gene types (13 PCGs, 22 tRNAs, and two rRNAs) and codon positions (the first, second, and third codon positions of each PCG). The maximum likelihood (ML) analysis was inferred using IQ-TREE [38]. Branch support was assessed with 1000 replicates of ultrafast likelihood bootstrap. Bayesian inference (BI) analysis was performed using MrBayes 3.2.6 [39]. Two independent runs were performed, each with three hot chains and one cold chain, with posterior distributions estimated using Markov Chain Monte Carlo (MCMC) sampling.
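
The predefinition of data blocks by codon position (the first, second, and third positions of each PCG) can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which relied on PartitionFinder, IQ-TREE, and MrBayes. The function names `split_codon_positions` and `codon_position_blocks`, the toy sequence, and the toy gene lengths are all hypothetical, and the sketch assumes in-frame, gap-free sequences concatenated in a fixed order.

```python
def split_codon_positions(cds):
    """Split an in-frame coding sequence into its three codon-position subsets."""
    if len(cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    seq = cds.upper()
    # A stride of 3 collects every first, second, or third codon base.
    return {pos: seq[pos - 1::3] for pos in (1, 2, 3)}


def codon_position_blocks(gene_lengths):
    """Return (block_name, start, end, step) for each codon position of each gene.

    Coordinates are 1-based and inclusive, for genes concatenated in the
    order in which they appear in `gene_lengths`.
    """
    blocks = []
    offset = 0
    for gene, length in gene_lengths.items():
        if length % 3 != 0:
            raise ValueError(f"{gene}: length {length} is not a multiple of 3")
        for pos in (1, 2, 3):
            blocks.append((f"{gene}_pos{pos}", offset + pos, offset + length, 3))
        offset += length
    return blocks


if __name__ == "__main__":
    # Toy sequence and toy gene lengths (hypothetical, not real mitogenome data).
    print(split_codon_positions("ATGAAATAA"))
    for block in codon_position_blocks({"atp6": 9, "nad4": 6}):
        print(block)
```

Each PCG contributes three data blocks, so the 13 PCGs yield 39 codon-position blocks before any merging by a partitioning tool.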