Large-Scale Trna Intron Transposition In
Total Page:16
File Type:pdf, Size:1020Kb
Large-Scale tRNA Intron Transposition in the Archaeal Order Thermoproteales Represents a Novel Mechanism of Intron Gain Kosuke Fujishima,1,2 Junichi Sugahara,1,3 Masaru Tomita,1,2,3 and Akio Kanai*,1,2,3 1Institute for Advanced Biosciences, Keio University, Tsuruoka, Japan 2Department of Environmental Information, Keio University, Fujisawa, Japan 3Systems Biology Program, Graduate School of Media and Governance, Keio University, Fujisawa, Japan *Corresponding author: E-mail: [email protected]. Associate editor: Martin Embley Research article Abstract Downloaded from Recently, diverse arrangements of transfer RNA (tRNA) genes have been found in the domain Archaea, in which the tRNA is interrupted by a maximum of three introns or is even fragmented into two or three genes. Whereas most of the eukaryotic tRNA introns are inserted strictly at the canonical nucleotide position (37/38), archaeal intron-containing tRNAs have a wide diversity of small tRNA introns, which differ in their numbers and locations. This feature is especially pronounced in the archaeal order Thermoproteales. In this study, we performed a comprehensive sequence comparison of mbe.oxfordjournals.org 286 tRNA introns and their genes in seven Thermoproteales species to clarify how these introns have emerged and diversified during tRNA gene evolution. We identified 46 intron groups containing sets of highly similar sequences (.70%) and showed that 16 of them contain sequences from evolutionarily distinct tRNA genes. The phylogeny of these 16 intron groups indicates that transposition events have occurred at least seven times throughout the evolution of Thermoproteales. These findings suggest that frequent intron transposition occurs among the tRNA genes of Thermoproteales. Further computational analysis revealed limited insertion positions and corresponding amino acid types of tRNA genes. This has arisen because the bulge–helix–bulge splicing motif is required at the newly transposed position if at Keio University (PULC Consortium) on September 23, 2010 the pre-tRNA is to be correctly processed. These results clearly demonstrate a newly identified mechanism that facilitates the late gain of short introns at various noncanonical positions in archaeal tRNAs. Key words: Archaea, tRNA intron, transposition, Thermoproteales. Introduction tions within a single tRNA gene (Kawach et al. 2005; Soma et al. 2007; Maruyama et al. 2009). Transfer RNA (tRNA) is a universally conserved essential Similarly, archaeal species have one of three types of ho- molecule that decodes genetic information during protein mologous tRNA splicing endonucleases (homodimeric, ho- biosynthesis. tRNA introns are found in all three domains motetrameric, or heterotetrameric), which recognize an of life (bacteria, Archaea, and Eukarya), representing the exon–intron junction characterized by the bulge–helix– most common disruption of the host tRNA genes. How- ever, their splicing mechanisms and locations in the tRNA bulge (BHB) motif (Diener and Moore 1998; Xue, Calvin, genes of the three biological domains differ. Bacterial tRNA and Li 2006). Archaeal species expressing the heterotetra- introns are known to be self-splicing group I introns and are meric splicing endonuclease tend to have increased pro- located in the anticodon loop of the tRNA genes of the a-, portions of tRNA introns inserted at noncanonical b-, and c-Proteobacteria (Reinhold-Hurek and Shub 1992). positions (other than 37/38) compared with the archaeal In contrast, the tRNA introns of the Archaea and Eukarya species that express the other types of endonucleases. Spe- are processed by the specific enzymatic reactions of tRNA cies belonging to the thermoacidophilic order Thermopro- splicing endonucleases. Eukaryotic tRNA introns are pre- teales display a particularly elevated proportion of tRNA dominantly inserted at canonical position 37/38, a univer- introns at noncanonical sites, which are found at over sally conserved tRNA intron insertion position in Archaea 30 nucleotide positions within various tRNA genes and Eukarya, and are removed by the heterotetrameric eu- (Sugahara et al. 2008). Unusual tRNA genes that contain karyotic splicing endonucleases of the Sen family, based on three introns are currently known in only three species, a measuring system that recognizes the distance between Thermofilum pendens, Pyrobaculum calidifontis, and P. is- the mature domain and the splice sites (Li et al. 1998). landicum, all members of the Thermoproteales. However, A few exceptions occur in the nuclear genomes of the numbers of intron-containing tRNA genes do not fol- photosynthetic eukaryotes and the nucleomorph, the low the phylogenetic trends of the Thermoproteales spe- endosymbiont-derived eukaryotic nucleus found in certain cies, which suggests an active gain and loss of tRNA introns plastids, in which 1–3 introns are located at various loca- within this order (Sugahara et al. 2009). © The Author 2010. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: [email protected] Mol. Biol. Evol. 27(10):2233–2243. 2010 doi:10.1093/molbev/msq111 Advance Access publication April 29, 2010 2233 Fujishima et al. · doi:10.1093/molbev/msq111 MBE Several models have been proposed for the gain and loss 70%. Transposable introns were further extracted by man- of spliceosomal introns in eukaryotes. For example, reverse ually investigating each tRNA intron groups to see whether transcription and recombination of the spliced RNA could it satisfied either of the following criteria: 1) the introns lead to intron loss, whereas reverse splicing, the insertion of belong to more than two evolutionarily distinct tRNA transposable elements, and gene duplication could result in genes (e.g., tRNAThr and tRNAVal), or 2) the introns are lo- intron gain (Roy and Gilbert 2006). In contrast, the origin cated at more than two different nucleotide positions. The and evolution of tRNA introns are poorly understood. same criteria were also applied to 217 tRNA introns from Therefore, whether the models proposed for the spliceoso- 12 crenarchaeal species: Aeropyrum pernix, Cenarchaeum mal introns can also be applied to the enzymatically symbiosum, Desulfurococcus kamchatkensis, Hyperthermus cleaved tRNA introns remains a major question. A unique butyricus, Ignicoccus hospitalis, Metallosphaera sedula, type of disrupted tRNA, the so-called ‘‘split tRNA’’ (tRNA Staphylothermus marinus, Sulfolobus acidocaldarius, Su. is- encoded on two or three separate minigenes), has recently landicus L.S.2.15, Su. islandicus M.14.25, Su. solfataricus, and been discovered in two hyperthermophilic Archaea, Nano- Su. tokodaii. archaeum equitans and Caldivirga maquilingensis (Randau Downloaded from et al. 2005; Fujishima et al. 2009). Our recent studies have Construction of a Phylogenetic Tree of tRNAs shown that the precursor sequences of these split tRNAs A multiple sequence alignment of the tRNAs was con- have high sequence similarity to the tRNA intron sequen- structed by masking all the tRNA intron sequences to con- ces in related archaeal species (Fujishima et al. 2008, 2009). sider the phylogeny of the mature tRNA sequences. All the This finding clearly suggests an evolutionary relationship mature tRNA sequences (input as tDNA sequences) were mbe.oxfordjournals.org between the two types of tRNA genes, although the tran- aligned using ClustalW 2.0 (Larkin et al. 2007). The aligned sition state between the two is unknown. One explanation tRNA sequences (.aln file) were manually improved by is that the tRNA intron is a remnant of the residual leader matching the consensus nucleotides conserved among sequence produced during the trans splicing of the split the archaeal tRNAs: base 8 (T or C), bases 14–15 (AG), ba- tRNA gene (‘‘split-early’’ hypothesis; Di Giulio 2006, ses 18–19 (GG), base 21 (A or G), base 33 (T), base 48 (T or 2008). An alternative explanation is that the insertion of C), bases 53–58 (GTTC[A/G]A), bases 60–61 (TC), and base the intron occurred first and the gene subsequently under- 72 (T or C) (Marck and Grosjean 2002) to generate a struc- at Keio University (PULC Consortium) on September 23, 2010 went fragmentation at the site of insertion (‘‘split-late’’ hy- tural alignment. An unrooted neighbor-joining tree was pothesis; Randau and Soll 2008; Heinemann et al. 2009). constructed using ClustalW 2.0 and was visualized with In this study, we performed a comprehensive sequence iToL (Letunic and Bork 2007). analysis of all the tRNA intron sequences in seven members of the Thermoproteales and characterized the sets of ‘‘in- In Silico Splicing Analysis tron groups’’ with similar sequences. Approximately one- The intron sequences belonging to the tRNAs listed in third of the groups contained introns from evolutionarily table 2 were computationally inserted into every nucleo- distinct tRNA genes, which we defined as the ‘‘transposable tide position (from 1/2 to 72/73) of the original tRNA se- intron’’ group. Phylogenetic analysis revealed that these quence, and into the same nucleotide position in the groups originated at various times during the evolution mature tRNAs with different isoacceptors derived from of the Thermoproteales and have contributed to the late the same species, to produce two types of artificial intron- gain of introns in the tRNA genes. This new phenomenon is containing tRNAs (supplementary fig. S1, Supplementary inconsistent with any of the known