Prashanth et al. BMC Endocrine Disorders (2021) 21:61
https://doi.org/10.1186/s12902-021-00709-6
RESEARCH ARTICLE Open Access

Identification of hub genes related to the progression of type 1 diabetes by computational analysis
G. Prashanth1, Basavaraj Vastrad2, Anandkumar Tengli3, Chanabasayya Vastrad4* and Iranna Kotturshetti5

Abstract
Background: Type 1 diabetes (T1D) is a serious threat to childhood life and has a fairly complicated pathogenesis. Extensive efforts have been made to elucidate the pathogenesis, but the molecular mechanisms of T1D are still not well understood.
Methods: To identify candidate genes in the progression of T1D, the high throughput sequencing expression profiling dataset GSE123658 was downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identified, and gene ontology (GO) and pathway enrichment analyses were performed. The protein-protein interaction (PPI) network, modules, target gene - miRNA regulatory network and target gene - TF regulatory network were constructed and analyzed using HIPPIE, miRNet, NetworkAnalyst and Cytoscape. Finally, hub genes were validated using receiver operating characteristic (ROC) curve and RT-PCR analysis. A molecular docking study was also performed.
Results: A total of 284 DEGs were identified, consisting of 142 up regulated genes and 142 down regulated genes. The GO terms and pathways enriched in the DEGs include cell-cell signaling, vesicle fusion, plasma membrane, signaling receptor activity, lipid binding, signaling by GPCR and innate immune system. Ten hub genes were identified and biological process analysis revealed that these genes were mainly enriched in cell-cell signaling, cytokine signaling in immune system, signaling by GPCR and innate immune system. ROC curve and RT-PCR analysis showed that EGFR, GRIN2B, GJA1, CAP2, MIF, POLR2A, PRKACA, GABARAP, TLN1 and PXN might be involved in the advancement of T1D.
Molecular docking studies showed high docking score.
Conclusions: DEGs and hub genes identified in the present investigation help us understand the molecular mechanisms underlying the advancement of T1D, and provide candidate targets for diagnosis and treatment of T1D.
Keywords: bioinformatics, type 1 diabetes, differentially expressed genes, enrichment analysis, pathways

* Correspondence: [email protected]
4Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad, Karnataka 580001, India
Full list of author information is available at the end of the article

© The Author(s). 2021 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Introduction
Type 1 diabetes (T1D) (insulin-dependent) is a core challenge for endocrine research around the world [1]. Approximately 5 to 10% of the childhood population is affected with T1D worldwide [2]. T1D affects the eyes, kidneys, heart, peripheral and autonomic nervous systems [3]. Pancreatic cells, particularly β-cells, play a key role in the occurrence and progression of T1D [4]. Treatment for T1D includes targeting β-cells and β-cell regeneration [5]. However, T1D is a complex disease and its biology remains poorly understood [6].

There are several important risk factors for T1D, such as genetic and environmental factors [7, 8]. Previous studies identified aspects of the molecular mechanism of T1D advancement. T1D has been genetically associated with genes and signaling pathways, including CTLA-4 [9], SUMO4 [10], CYP27B1 [11], PD-1 [12], KIAA0350 [13], tumor necrosis factor alpha signaling pathways [14], NLRP3 and NLRP1 inflammasome signaling pathways [15], HIF-1/VEGF signaling pathway [16], l-arginine/NO pathway [17], and CaMKII/NF-κB/TGF-β1 and PPAR-γ signaling pathway [18]. Next-generation sequencing (NGS) has drastically increased the understanding of the mechanism of T1D, and analyses of these data can provide insight into effective diagnostic and therapeutic T1D treatments [19]. Thus, identifying key molecular biomarkers is essential for early diagnosis, prevention, and treatment of T1D.

Identifying disease related molecular biomarkers by experiment alone costs a lot of money and time. With the wide application of expression profiling by high throughput sequencing, large amounts of genomics data have been deposited in public databases [20]. The progress of computational tools gives us an alternative method to identify novel molecular biomarkers.

In this investigation, we employed a bioinformatics approach to discover the differentially expressed genes between T1D patients and healthy donors. The original expression profiling by high throughput sequencing dataset GSE123658 was downloaded. Samples from 39 T1D patients and 43 healthy donors were analyzed in our investigation. Commonly altered DEGs were isolated from integrated data. Additionally, GO/REACTOME pathway analysis, construction of protein–protein interaction network, modules, target gene - miRNA regulatory network and target gene - TF regulatory network analysis were performed to analyze these data. Ten hub genes (EGFR, GRIN2B, GJA1, CAP2, MIF, POLR2A, PRKACA, GABARAP, TLN1 and PXN) were identified. ROC (receiver operating characteristic) curve and RT-PCR analysis were used to verify clinically relevant hub genes. The aim of this investigation was to gain a better understanding of the underlying molecular mechanisms and to discover molecular biomarkers for T1D.

Material and methods
Data resources
Expression profiling by high throughput sequencing dataset GSE123658 was downloaded from the GEO database (http://www.ncbi.nlm.nih.gov/geo/) [21]. CPM count normalization was performed on the original GSE123658 dataset using the edgeR package [22], the voom function [23], and limma [24] in R software. The data were produced using the GPL18573 Illumina NextSeq 500 (Homo sapiens) platform. The GSE123658 dataset contained data from 82 samples, including 39 T1D patients' samples and 43 healthy donors' samples.

Identification of DEGs
The identification of DEGs between 39 T1D patients' samples and 43 healthy donors' samples was performed using the limma package in R Bioconductor. The lmFit function in the limma package was used to fit a linear model for each gene [25]. The makeContrasts function in the limma package was used to define the comparison between the T1D and healthy donor groups; the log fold-changes are obtained as contrasts of these fitted linear models. eBayes is a function in the limma package that computes empirical Bayes moderated statistics for differential expression [26]. The topTable function in the limma package was used to obtain a table of the most significant up and down regulated genes from the eBayes model fit. To control false positives in the discovery of statistically significant molecular biomarkers, we used the adjusted P-value with the Benjamini and Hochberg false discovery rate method [27]. Fold-change (FC) and adjusted P-values were used to find DEGs. A log2FC > 0.94 for up regulated genes, a log2FC < -0.39 for down regulated genes and a P-value < 0.05 were considered statistically significant. The volcano plot was generated using the ggplot2 package [28], and the heat map was generated using the gplots package in R.

Gene Ontology (GO) and pathway enrichment analyses of DEGs
Gene Ontology (GO) (http://www.geneontology.org) analysis is a routine analysis for annotating genes and determining biological attributes, including biological process (BP), cellular component (CC) and molecular function (MF) [29]. The REACTOME (https://reactome.org/) [30] pathway database is applied for classification by correlating gene sets with their respective pathways. ToppGene (ToppFun) (https://toppgene.cchmc.org/enrichment.jsp) [31] is a gene functional classification tool that aims to provide an extensive set of functional annotation tools for investigators to understand the biological meaning behind large lists of genes. P < 0.05 was considered statistically significant.

PPI network construction and module analysis
The online Human Integrated Protein-Protein Interaction rEference (HIPPIE) (http://cbdm.uni-mainz.de/hippie/) [32] database was used to predict the PPI network information. Analyzing the interactions and functions between DEGs may provide information

was constructed through network topology properties. The node degree was determined using the Network analysis plugin, and miRNAs with a node degree >12.

Construction of TF - target regulatory network