Downloaded from the Gene Expression Omnibus (GEO) Database, Which Contained 366 Samples, Including 200 Heart Failure Samples and 166 Non Heart Failure Samples
Total Page:16
File Type:pdf, Size:1020Kb
Kolur et al. BMC Cardiovasc Disord (2021) 21:329 https://doi.org/10.1186/s12872-021-02146-8 RESEARCH Open Access Identifcation of candidate biomarkers and therapeutic agents for heart failure by bioinformatics analysis Vijayakrishna Kolur1 , Basavaraj Vastrad2 , Chanabasayya Vastrad3* , Shivakumar Kotturshetti3 and Anandkumar Tengli4 Abstract Introduction: Heart failure (HF) is a heterogeneous clinical syndrome and afects millions of people all over the world. HF occurs when the cardiac overload and injury, which is a worldwide complaint. The aim of this study was to screen and verify hub genes involved in developmental HF as well as to explore active drug molecules. Methods: The expression profling by high throughput sequencing of GSE141910 dataset was downloaded from the Gene Expression Omnibus (GEO) database, which contained 366 samples, including 200 heart failure samples and 166 non heart failure samples. The raw data was integrated to fnd diferentially expressed genes (DEGs) and were further analyzed with bioinformatics analysis. Gene ontology (GO) and REACTOME enrichment analyses were performed via ToppGene; protein–protein interaction (PPI) networks of the DEGs was constructed based on data from the HiPPIE interactome database; modules analysis was performed; target gene—miRNA regulatory network and target gene— TF regulatory network were constructed and analyzed; hub genes were validated; molecular docking studies was performed. Results: A total of 881 DEGs, including 442 up regulated genes and 439 down regulated genes were observed. Most of the DEGs were signifcantly enriched in biological adhesion, extracellular matrix, signaling receptor binding, secre- tion, intrinsic component of plasma membrane, signaling receptor activity, extracellular matrix organization and neu- trophil degranulation. The top hub genes ESR1, PYHIN1, PPP2R2B, LCK, TP63, PCLAF, CFTR, TK1, ECT2 and FKBP5 were identifed from the PPI network. Module analysis revealed that HF was associated with adaptive immune system and neutrophil degranulation. The target genes, miRNAs and TFs were identifed from the target gene—miRNA regulatory network and target gene—TF regulatory network. Furthermore, receiver operating characteristic (ROC) curve analysis and RT-PCR analysis revealed that ESR1, PYHIN1, PPP2R2B, LCK, TP63, PCLAF, CFTR, TK1, ECT2 and FKBP5 might serve as prognostic, diagnostic biomarkers and therapeutic target for HF. The predicted targets of these active molecules were then confrmed. Conclusion: The current investigation identifed a series of key genes and pathways that might be involved in the progression of HF, providing a new understanding of the underlying molecular mechanisms of HF. Keywords: Heart failure, Diferentially expressed genes, Molecular docking, Enrichment analysis, Prognosis Introduction *Correspondence: [email protected] Heart failure (HF) is a cardiovascular disease character- 3 Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, ized by tachycardia, tachypnoea, pulmonary rales, pleu- Dharwad 580001, Karnataka, India Full list of author information is available at the end of the article ral efusion, raised jugular venous pressure, peripheral © The Author(s) 2021. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/. The Creative Commons Public Domain Dedication waiver (http:// creat iveco mmons. org/ publi cdoma in/ zero/1. 0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Kolur et al. BMC Cardiovasc Disord (2021) 21:329 Page 2 of 33 oedema and hepatomegaly [1]. Morbidity and mortality Finally, we performed molecular docking studies for over linked with HF is a prevalent worldwide health problem expressed hub genes. Results from the present investiga- holding a universal position as the leading cause of death tion might provide new vision into potential prognostic [2]. Te numbers of cases of HF are rising globally and and therapeutic targets for HF. it has become a key health issue. According to a survey, the prevalence HF is expected to exceed 50% of the global Materials and methods population [3]. Research suggests that modifcation in Data resource multiple genes and signaling pathways are associated in Expression profling by high throughput sequencing with controlling the advancement of HF. However, a lack of series number GSE141910 based on platform GPL16791 investigation on the precise molecular mechanisms of HF was downloaded from the GEO database. Te dataset development limits the treatment efcacy of the disease of GSE141910 contained 200 heart failure samples and at present. 166 non heart failure samples. It was downloaded from Previous study showed that HF was related to the the GEO database in NCBI based on the platform of expression of MECP2 [4] RBM20 [5], CaMKII [6], tro- GPL16791 Illumina HiSeq 2500 (Homo sapiens). ponin I [7] and SERCA2a [8]. Toll-Like receptor signaling pathway [9], activin type II receptor signaling pathway Identifcation of DEGs in HF [10], CaMKII signaling pathways [11], Drp1 signaling DEGs of dataset GSE141910 between HF groups and non pathways [12] and JAK-STAT signaling pathway [13] were heart failure groups were respectively analyzed using liable for progression of HF. More investigations are the limma package in R [21]. Fold changes (FCs) in the required to focus on treatments that enhance the out- expression of individual genes were calculated and DEGs come of patients with HF, to strictly make the diagnosis with P < 0.05, |log FC|> 1.158 for up regulated genes and of the disease based on screening of biomarkers. Tese |log FC|< − 0.83 for down regulated genes were consid- investigations can upgrade prognosis of patients by low- ered to be signifcant. Hierarchical clustering and visuali- ering the risk of advancement of HF and related compli- zation were used by Heat-map package of R. cations. So it is essential to recognize the mechanism and fnd biomarkers with a good specifcity and sensitivity. Functional enrichment analysis Te recent high-throughput RNA sequencing data Gene Ontology (GO) analysis and REACTOME pathway has been widely employed to screen the diferentially analysis were performed to determine the functions of expressed genes (DEGs) between normal samples and DEGs using the ToppGene (ToppFun) (https:// toppg ene. HF samples in human beings, which makes it accessible cchmc. org/ enric hment. jsp) [22] GO terms (http:// geneo for us to further explore the entire molecular alterations ntolo gy. org/) [23] included biological processes (BP), cel- in HF at multiple levels involving DNA, RNA, proteins, lular components (CC) and molecular functions (MF) epigenetic alterations, and metabolism [14]. However, of genomic products. REACTOME (https:// react ome. there still exist obstacles to put these RNA seq data in org/) [24] analyzes pathways of important gene products. application in clinic for the reason that the number of ToppGene is a bioinformatics database for analyzing the DEGs found by expression profling by high throughput functional interpretation of lists of proteins and genes. sequencing were massive and the statistical analyses were Te cutof value was set to P < 0.05. also too sophisticated [15–19] In this study, frst, we had chosen dataset GSE141910 Protein–protein interaction network construction from Gene Expression Omnibus (GEO) (http:// www. and module screening ncbi. nlm. nih. gov/ geo/) [20]. Second, we applied for PPI networks are used to establish all protein coding limma tool in R software to obtain the diferentially genes into a massive biological network that serves an expressed genes (DEGs) in this dataset. Tird, the Top- advance compassionate of the functional system of the pGene was used to analyze these DEGs including biologi- proteome [25]. Te HiPPIE interactome (https:// cbdm. cal process (BP), cellular component (CC) and molecular uni- mainz. de/ hippie/) [26] database furnish informa- function (MF) REACTOME pathways. Fourth, we estab- tion regarding predicted and experimental interactions lished protein–protein interaction (PPI) network and of proteins. In the current investigation, the DEGs were then applied Cytotype PEWCC1 for module analysis of mapped into the HiPPIE interactome database to fnd the DEGs which would identify some hub genes. Fifth, signifcant protein pairs with a combined score of > 0.4. we established target gene—miRNA regulatory network Te PPI network was subsequently constructed using and target gene—TF regulatory network. In addition, Cytoscape software, version 3.8.2 (www. cytos cape. org) we further validated the hub genes by receiver operating [27]. Te nodes with a higher node degree [28], higher characteristic (ROC) curve analysis and RT-PCR