The Role of Long Non-Coding Rnas in the Pathogenesis of Hereditary Diseases Peter Sparber1*, Alexandra Filatova1, Mira Khantemirova2,3 and Mikhail Skoblov1,4
Total Page:16
File Type:pdf, Size:1020Kb
Sparber et al. BMC Medical Genomics 2019, 12(Suppl 2):42 https://doi.org/10.1186/s12920-019-0487-6 REVIEW Open Access The role of long non-coding RNAs in the pathogenesis of hereditary diseases Peter Sparber1*, Alexandra Filatova1, Mira Khantemirova2,3 and Mikhail Skoblov1,4 From 11th International Multiconference “Bioinformatics of Genome Regulation and Structure\Systems Biology” - BGRS\SB- 2018 Novosibirsk, Russia. 20-25 August 2018 Abstract Background: Thousands of long non-coding RNA (lncRNA) genes are annotated in the human genome. Recent studies showed the key role of lncRNAs in a variety of fundamental cellular processes. Dysregulation of lncRNAs can drive tumorigenesis and they are now considered to be a promising therapeutic target in cancer. However, how lncRNAs contribute to the development of hereditary diseases in human is still mostly unknown. Results: This review is focused on hereditary diseases in the pathogenesis of which long non-coding RNAs play an important role. Conclusions: Fundamental research in the field of molecular genetics of lncRNA is necessary for a more complete understanding of their significance. Future research will help translate this knowledge into clinical practice which will not only lead to an increase in the diagnostic rate but also in the future can help with the development of etiotropic treatments for hereditary diseases. Keywords: Long non-coding RNA, lncRNA, Hereditary disease, Gene regulation, Medical genetics Introduction The second group includes rRNAs and long non-coding In 1958, Francis Crick proposed the central dogma of mo- RNAs (lncRNAs), which to date are very poorly function- lecular biology in which he explained the flow of genetic ally annotated. information within a biological system. In the central LncRNAs are defined as transcripts of more than 200 dogma RNA acts as a simple intermediary between the nucleotides in length not containing an extended open DNA that carries the genetic information and the proteins reading frame. Based on the annotations of the FAN- that define the whole variety of biological processes in the TOM5 project the human genome contains almost 28,000 cell. Since then our understanding of the central dogma lncRNA genes, a number that is comparable to the has dramatically changed. Large international consortiums amount of protein-coding genes [2] . In-depth analysis re- such as ENCODE (The Encyclopedia of DNA Elements) vealed that lncRNAs is an extensive and very heterogenic has shown that up to 80% of the genome is transcribed group from mRNA-like transcripts to circular RNAs. while only 1,5% of it is protein –coding sequences [1]. Their length can vary from several hundred nucleotides This showed that non-coding RNAs are a lot more abun- for BC-200 to 90 thousands like in the case of Kcnqot1. dant than it was considered before. LncRNA are found in all branches of life and the com- Based on the length of the transcript non-coding RNAs plexity of different organisms is well correlated with the are divided into two groups: short and long non-coding amount and the diversity of these transcripts [3]. Accord- RNAs. The first group includes well-known classes such as ing to the GENCODE project one third of all human tRNAs, snRNAs, snoRNAs, miRNAs, piRNAs and others. lncRNAs genes are primate-specific [4]. LncRNAs share a lot of common features with * Correspondence: [email protected] mRNAs, they are often capped, polyadenylated and 1Research Center for Medical Genetics, Moscow, Russia Full list of author information is available at the end of the article undergo splicing. Nevertheless, lncRNA has its own © The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. Sparber et al. BMC Medical Genomics 2019, 12(Suppl 2):42 Page 64 of 140 peculiarities [4]. On average lncRNAs have less exons in them is genome wide association studies – GWAS. In this their structure. 42% of lncRNA contains two exons in com- case, single nucleotide polymorphisms (SNP), associated parison with only 6% in mRNAs. In addition, on average with a particular condition are investigated. To date there lncRNA are shorter than mRNA. Five hundred ninety-two are only a few examples of using GWAS in the field of nucleotides compared to 2453 for mRNA. Whereas 95% of lncRNA analysis. The reason is that GWAS were focused all multi-exon mRNAs are alternatively spliced only 25% of primarily on protein-coding genes [15], despite the fact that lncRNAs undergo alternative splicing. Generally, lncRNA most SNPs are located outside of protein-coding regions of has a lower expression level, their expression is more the genome [16]. One of the last researches that used tissue-specific, and the majority of lncRNA has nuclear GWAS revealed almost 150 lncRNAs associated with the localization. LncRNA is less conserved than mRNA, but the cardiometabolic phenotype [15]. More detailed analysis was sequence conservation in lncRNA is not always correlated to performed for lncRNA linc-NFE2L3–1 and identified a their function. For example, XIST (X-inactive specific tran- possible molecular mechanism by which linc-NFE2L3–1 script), one of the first described lncRNAs has a low level of can contribute to the development of the investigated sequence conservation, but a highly conserved function phenotype. Another example of using GWAS analysis for across placental mammals – inactivation of the X chromo- identification of possible lncRNA association with human some [5]. On the other hand, MALAT1 whose sequence is diseases is research conducted into the DISC genomic locus highly conserved between human and mice do not tend to in the field of psychiatric disorders such as schizophrenia, be conserved on the functional level. Knockdown experi- bipolar disorder, depression and autism spectrum disorders. ments on human and mice lung cancer cell lines showed a This study revealed that a protein-coding gene DICS1 is as- decrease migration and metastatic rate in human. On the sociated with these conditions. This genomiс locus also other hand in mice knockdown of MALAT1 apparently do contains a lncRNA gene DICS2, that is transcribed from not lead to a distinct phenotype [6]. These examples demon- the opposite strand and may possibly regulate DICS1 ex- strate that for lncRNA the sequence conservation is not al- pression [17]. However, there is no experimental data con- ways a predictor for functionality, and that functional firming this hypothesis and the role of DICS2 in the conservation may have a more complex nature in lncRNAs. development of schizophrenia. So far, only a small amount of lncRNAs is functionally Another approach for investigating the role of lncRNA annotated [7, 8]. However various studies showed that in the development of human disease is differential ex- lncRNAs can participate in many cellular processes, in- pression analysis both in normal conditions and in disease cluding regulation of gene expression at the transcrip- state. Many papers focusing on protein-coding genes tional [9–11] and post transcriptional levels [12–14]. showed that the change in gene expression might be the Moreover, it is known that many lncRNAs can contribute main event in disease pathogenesis or it can be a second- to the development of many human diseases. Currently ary incident. Although the first situation is more interest- several databases containing entries about the association ing for studying the molecular mechanisms of disease of non-coding RNAs and human diseases are available pathogenesis, in both cases the altered expression level such as Lnc2Cancer, MNDR, LncRNADisease. For ex- can be used as a very important biomarker of the patho- ample, the last update of LncRNADisease (from july 26, logical processes. 2017) contains almost 2000 entries about 914 lncRNAs An example of such a work is a study of gene expression that are associated with 329 diseases. Different approaches in hereditary hemorrhagic telangiectasia, also known as can be used for investigating the role of lncRNAs in the Osler–Weber–Rendu disease (OMIM 187300) [18]. Using pathogenesis of human diseases. For some lncRNAs the microarrays the expression level of RNA extracted from connection with human diseases was made through normal and affected tissues of patients with Osler– GWAS (genome wide association studies) or investigating Weber–Rendu disease was obtained. As a result, 42 differ- differential gene expression, but the molecular mecha- entially expressed lncRNAs were identified. In their work, nisms are still unknown. For others several aspects of the the authors note that further study is necessary to estab- pathogenesis were experimentally investigated in-vitro. Fi- lish the role of these lncRNAs in disease progression. nally for some lncRNAs their role in disease progression However, the expression level of these genes can already was obtained during in-vivo experiments on mice models be used as a diagnostic biomarker. Another work, empha- and in some cases lncRNAs were suggested as therapeutic sizing the importance of lncRNAs in human disease is a targets. This review presents an analysis of the papers per- recently published paper investigating autism spectrum taining to each of the groups described above. disorder (ASD) [19]. Authors compared RNA-seq data of samples from affected and control brain samples. 60 Analysis of lncRNA association with hereditary diseases lncRNAs with a dramatically altered expression level were Several approaches are used for investigating the role of identified in autism. Moreover, authors showed that most lncRNA in the development of human diseases.