Is There Any Intron Sliding in Mammals? Irina V
Total Page:16
File Type:pdf, Size:1020Kb
Poverennaya et al. BMC Evol Biol (2020) 20:164 https://doi.org/10.1186/s12862-020-01726-0 RESEARCH ARTICLE Open Access Is there any intron sliding in mammals? Irina V. Poverennaya1,2*† , Nadezhda A. Potapova3,4† and Sergey A. Spirin5,6,7 Abstract Background: Eukaryotic protein-coding genes consist of exons and introns. Exon–intron borders are conserved between species and thus their changes might be observed only on quite long evolutionary distances. One of the rarest types of change, in which intron relocates over a short distance, is called "intron sliding", but the reality of this event has been debated for a long time. The main idea of a search for intron sliding is to use the most accurate genome annotation and genome sequence, as well as high-quality transcriptome data. We applied them in a search for sliding introns in mammals in order to widen knowledge about the presence or absence of such phenomena in this group. Results: We didn’t fnd any signifcant evidence of intron sliding in the primate group (human, chimpanzee, rhesus macaque, crab-eating macaque, green monkey, marmoset). Only one possible intron sliding event supported by a set of high quality transcriptomes was observed between EIF1AX human and sheep gene orthologs. Also, we checked a list of previously observed intron sliding events in mammals and showed that most likely they are artifacts of genome annotations and are not shown in subsequent annotation versions as well as are not supported by transcriptomic data. Conclusions: We assume that intron sliding is indeed a very rare evolutionary event if it exists at all. Every case of intron sliding needs a lot of supportive data for detection and confrmation. Keywords: Intron sliding, Introns, Gene annotation, Transcriptome, Intron evolution Background mutations in splice sites or nearby, such as a birth or Eukaryotic genes consist of exons and introns whose death of new splice sites resulted in a loss or acquisi- borders, i.e. genomic coordinates, are evolutionarily tion of exons or their part [6–9]. But the rarest type of conservative which means they are under the pressure such alterations is called “intron sliding” (also known of negative selection [1–4]. Te changes of exon–intron as “intron shifting”, “intron drift”, “intron slippage” or boundaries might afect coding protein, therefore they “intron migration”). Intron sliding is a movement of are rare events and can be seen only on the long evolu- intron from one position to another within a gene [10]. tionary distances [5]. Such changes difer from the alter- In other words, during sliding the start and end of the native splicing, which is widespread in eukaryotes and certain intron move in the same direction for the same does not afect exon–intron coordinates. number of nucleotides. For example, if initially the intron Various reasons for exon–intron boundaries altera- start coordinate was 100 and the end coordinate was 500, tions were observed. All of them happen as a result of after hypothetical intron sliding event coordinates would be 101 and 501, respectively. Tis means that coordinate changes for both sides are the same (in this case 1 bp *Correspondence: [email protected] shift). Potential mechanisms of intron sliding appearance †Irina V. Poverennaya and Nadezhda A. Potapova contributed equally to this work as frst authors were proposed [10–12] and it was shown that almost 1 Vavilov Institute of General Genetics, Russian Academy of Sciences, always they are no longer than 1 bp [12], but potentially Moscow, Russia might be of a length of 15 bp [13] or even 60 bp [14]. Full list of author information is available at the end of the article © The Author(s) 2020. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creat iveco mmons .org/publi cdoma in/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Poverennaya et al. BMC Evol Biol (2020) 20:164 Page 2 of 8 According to the diferent data [13, 15, 16] introns relo- for intron sliding in a one certain gene. Second, essential cate their positions over the distances of 1–15 bp or even are high-quality genome annotations and transcriptomes, around 5% of the intron length. However, the longer the to avoid annotation errors. In the very frst intron slid- distance is, the more sceptical we might be about the ing studies limited sets of data were used, e.g. only gene presence of an intron sliding event rather than intron loss annotations and genomes, or even just genes [18, 19], and gain in diferent positions. and authors were cautious with claims about the exist- As proper intron recognition and removal from tran- ence of any intron sliding event. Later, a step with check- scripts require correct splice signals at the intron start ing intron sliding using RNA-seq data was added, which and end, i.e. donor and acceptor splice sites, intron slid- helps to consider each potential case more accurately ing might happen as a result of a series of mutations lead- (e.g. see [20]). But yet authors of only a few studies are ing to creation of the alternative splice sites and loss of confdent with the results of sliding intron presence [15, the old ones. It remains unclear whether canonical donor 16], while the majority of found cases failed after check- and acceptor splice sites GT-AG, which are the most ing [21]. For instance, in [10] a list of 32 sliding events common in eukaryotes, stay intact after intron sliding, or was examined and 20 cases were fltered out. However, intron sliding rather happens in introns with non-canon- the authors stayed skeptical about the remaining cases ical splice sites. Te recently suggested molecular mech- and suspected that most, if not all, of them were artifacts. anism for 1 bp sliding [12] considers the importance of In another study [22], 9 oomycete genomes were used canonical splice sites and focuses on so-called stwintron and it was concluded that intron sliding is only acciden- (spliceosomal twin intron) that is an intron with nested tal and plays a minor role in eukaryotic genome evolu- internal introns that could be alternatively spliced. tion [22]. Other works also showed that there is no intron Intron sliding does not afect the sequence of the tran- sliding in Cryptococcus species [23] and in Daphnia [24]. script but can lead to intron phase changes [15], which Summarizing, there is a very small possibility for intron might be important for alternative splicing. Let us con- sliding occurrences. All of such events were observed sider a hypothetical example with the sequence at the using outdated data or genome annotations that had exon–intron boundaries GAC|gtcct…tgag|CTAA where changed drastically in recent years. Some studies did not the splice site is shown by the vertical streak, donor and take into account transcriptomic data. Tus, each previ- acceptor splice sites are underlined, intron’s sequence is ously found possible case needs an accurate interpreta- highlighted in lowercase letters, while the sequences of tion and verifcation. neighbouring exons are in capital letters. If intron slid- We approached this task using the latest genome anno- ing, as it is generally predicted, happens as a result of tations, reference genome versions and high quality tran- the cut-and-paste mechanism then the sequence in the scriptomic data for some mammalian species (including case of 1 bp sliding will look like this: GACC|gtcct… human, chimpanzee, rhesus macaque and mouse), for tgag|TAA. Te intron sequence stays intact, while the which the mentioned list of data was available. In order exon sequences gain/lose 1 bp each at their border lead- to understand the prevalence of intron sliding phenom- ing to intron phase change. However, if intron sliding ena and, if it takes place, its characteristics, we identi- happens in a series of point mutations at the exon–intron fed and then verifed each possible case of intron sliding border or the presence of canonical splice signals stops between human and other mammals. being crucial for correct intron splicing for this particu- lar case (according to [17], the number of non-canonical sites is probably greatly underestimated due to imperfect Results genome annotations), we might fnd the intermediate Search for intron sliding events cases, e.g. GACG|tc ct…tgaac|TAA (the point mutation Search for parallel shifts of exon–intron borders in the G → A occurs at the end of intron leading to the creation pairwise genome alignments between human and other of a new acceptor site AC| and shifting the exon–intron species resulted in 35 potential intron sliding events. border). Here we might observe the changes not only in Te maximum number of sliding hits (13) were found intron sequences but in the afecting codons as well.