Current Challenges of Computational Intelligent Techniques for Functional Annotation of Ncrna Genes
Total Page:16
File Type:pdf, Size:1020Kb
Available online at www.ijmrhs.com al R edic ese M a of rc l h a & n r H u e o a J l l t h International Journal of Medical Research & a n S ISSN No: 2319-5886 o c i t i Health Sciences, 2019, 8(6): 54-63 e a n n c r e e t s n I • • I J M R H S Current Challenges of Computational Intelligent Techniques for Functional Annotation of ncRNA Genes Qaisar Abbas* College of Computer and Information Sciences, Al Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia *Corresponding e-mail: [email protected] ABSTRACT The limited understanding of functional annotation of non-coding RNAs (ncRNAs) has been largely due to the complex functionalities of ncRNAs. They perform a vital part in the operation of the cell. There are many ncRNAs available such as micro RNAs or long non-coding RNAs that play important functions in the cell. In practice, there is a strong binding of the function of RNAs that must be considered to develop computational intelligent techniques. Comprehensive modeling of the structure of an ncRNA is essential that may provide the first clue towards an understanding of its functions. Many computational techniques have been developed to predict ncRNAs structures but few of them focused on the functions of ncRNA genes. Nevertheless, the accuracy of the functional annotation of ncRNAs is still facing computational challenges and results are not satisfactory. Here, many computational intelligent methods were described in this paper to predict the functional annotation of ncRNAs. The current literature review is suggested that there is still a dire need to develop advanced computational techniques for functional annotating of ncRNA genes in terms of accuracy and computational time. Keywords: Genes, Non-coding RNAs, Artificial intelligence, Machine learning, Functional annotation, Computational methods, Deep learning INTRODUCTION The discovery of non-coding RNA has made us realize that coding DNA is highly relevant in the regulation of gene expression. Due to their vital functional role in the cell operation, this type of genes has been used in many studies. Compare to coding RNAs, the recent studies showed that the non-coding RNAs (ncRNAs) play a vital role in the functional side of the cell. Nowadays, the researchers are building computational intelligent techniques for prediction of ncRNAs structure and functional annotation of ncRNAs genes [1,2]. In the case of the genome sequence, the authors utilized many ncRNA genes for classification of RNA structures and functional discovery [3-5]. As a result, the authors described in different studies that it became obvious important to predict RNA structure along with the function. However, a few studies computational intelligence techniques focused on developing intelligent system for ncRNAs genes. MicroRNAs (miRNAs/miRs) are newly discovered that have changed the understanding of how genes are regulated. Several studies were developed based on conventional and advanced machine learning algorithms to detect the functional annotation of ncRNAs genes. But still, it is unknown for many scientists, some of the important functions of ncRNAs that must be discovered. Long non-coding RNA (lncRNAs) transcripts micro RNAs (mRNAs) genes but lacks in coding for a novel function of ncRNAs. So discovering a function of RNAs it is essential to have a good model of its structure. It is not only challenging to know the complete structure of RNA [6]. This process is also tiresome and took time as well. Few computational models for predicting RNA secondary structure have been around somewhere in the early to mid-seventies. In Figure 1, we have given a broad overview of the type of structures and resources to analyze them in many different characteristics. 54 Abbas, Q (mRNAs) es la in co or no unction of ncR S discoveri unct As it essential ha a model structure. only hallengin t complete ure RNA [ This also resom too tim well computational odel fo predicting NA seconda struc arou somewhe i the early mid-seventies. Figure 1, e give Kadhim, et al. Abbas,broa ervieQ type uctu a resou analyInt J Med Resm Health iffere Sci aracteristi 2019, 8(6): 54-63 Figure 1 The branch structure of genomic variations in long ncRNAs (lncRNAs) i h ah ctur enomi aiai i long (lncRNAs) Good modeling of RNA structure is critical that provides some of the evidence or information towards a clear explanation of its function. Despite the advancement in high throughput computing, it is still a tedious and time- consumingGoo odelin task to determineof N struc the tertiary critical structure of provi RNA. In som past studies,of the authors formationhave developed r a number cl of computationalexplanatio algorithms it functi to predict Despite structure dvancem models of ncRNA throu genes [7]. Primarycomputing structure, it still of RNA dious is a sequence- e- specific process that determines some functional properties, like mature miRNA and siRNA molecules base pair consuming sk determ rtia tur RNA I ie, ha developed number to their targets [8]. Modification in the primary, secondary structure and expression levels of lncRNAs, as well as their cognateputational RNA-binding gorithms proteins, predict can causeuctu diseases model ranging from genesneurodegeneration [7] Prim tostructure cancers [9-13]. RNA That a maysequence question-s the methods roce based etermon high-throughput onal sequencing, properti, especially afor larger miRN structured a N RNAs with long- range tertiary interactions. Therefore, deriving the tertiary structure in a hierarchic way from the predicted secondary ai t the et [ Modificatio , seconda an expr le , structure is not straightforward [14]. RNA tertiary structure can be model in 2 main ways [15,16]. gnat RNA-bindin eins, cause rangin from generation cancer Since the ncRNAs are completely described the biological process in the operation of the cell for the formation of tertiary[9-13 structures]. T ay [17]. esti Those tertiary metho structures bas of ncRNA gh-thr genes are enabled sequencin, to interact specially with eachr r other structuredto produce proteins [18]. The long authors-ra utilizedrtia ainteractions. simple folding Therefore, technique de to detect t a transitionterti ucture from secondary i hierarc to tertiary structure rom [19]. Instead of structure, the authors have also described another category of ncRNAs that is long non-coding RNA Abbas predicte, Q seconda uc is straightforwa [4 RNA rtia uctu ca model 2 main classes [20]. All those classes of ncRNAs are described in Figure 2. In fact, many kinds of ncRNA genes are used to develop non-protein5,6 coding that will ultimately generate disease in the cell [21-23]. NA a completely ibed biological operati th cel formation rtia uctures [17] os tertia structu ncRNA nes enable t inte with othe to produ [18]. Th author utiliz simple lding ch detect transition onda tertia re [19]. Instea uctur, hav al rib another catego n-codin asses [20] thos cl of nc describe Figure 2 In , s cR ene are use t el n-protei codi at il imately gene t cell [21-23]. Figure 2 The branch structure of genomic variations in microRNAs i h ah ctur enomi aiai i miRNAs It is cleared from Figure 1 that there are many classes based on functional transcripts of ncRNAs in case of humans. All the classes are described that are difficult to cover in this review article [24]. However, the lncRNAs, miRNAs from Figu t ar m cl base unctional script are mostly included in this study. The literature review suggested that there is also long non-coding RNA (lncRNA) classhumans. instead of just thecl ncRNAs annotationdescribe [25].tha In practice, diffic it was cove noticed byth medical ie expert article that [4 the. humanHowe, ncRNAs compriselncRN, 9882 iRNAs small RNAs mostl according include to the i tcurrent stu comprehensive T literatur annotation vie suggeste of genes that [26-29]. Accordingly, als non the- lncRNA genes were also included in this review article to find their functional annotation in the development of codithe cell. RN cl instead ncR nnotation [5. ic, it notice by edical expe the m ncRN omprise smal ordin curre comprehens annotation nes [26-29]. Accordin, ncR genes 55 al incl in revi articl to t functional otation t evelopm cell. re articl focuses omputational gorithm predicting functional io of A includi NA ass I cont functional notation NA gene, ha also develo som chnique pre ncRNA ucture. em wer discusse paper uch Fold ign, Pfol, old, RNAfol, RNAshapes, RNAstructu, N, iFoldR, and RNA [30-38] The ethods will rib in future pape I t coming , te-of-the-ar ficial intell chnique briefly describe to nctiona NA genes [39-4. Compuaia Intelli hi Functional notation le identifyi gene larg seq of NAs. ye, le of ng ncRN (lncRNAs gaine lots no appeared produce otein- codi 49]. S -protein-codin contain transcri noi expe to likely unctiona i res, unctio nnotation ncRNAs ncRN e researc topic the ai xt-generation quencer ( 50 regulati gene xpres an epigenetic, th ha rucial unct suc as devel Ther, m ie have been conducte determin As 5 pical mputational et, classification depe heavily prote making it ic distinguis codin fr non-co ranscri To om challenge, the is develo a adva computational em approaches e SV, , NN or tistical model for entification protein-codin subseque paragra briefly describ putational techniques functional otation genes 52-76] omputational system presented . Kadhim, et al. Abbas, Q Int J Med Res Health Sci 2019, 8(6): 54-63 This review article focuses on the computational algorithms for predicting functional annotation of ncRNA including lncRNA class. In contrast to functional annotation of ncRNA genes, the authors have also developed some techniques to predict ncRNA structure.