COVID-19: Viral–Host Interactome Analyzed by Network Based

Total Page:16

File Type:pdf, Size:1020Kb

COVID-19: Viral–Host Interactome Analyzed by Network Based Messina et al. J Transl Med (2020) 18:233 https://doi.org/10.1186/s12967-020-02405-w Journal of Translational Medicine RESEARCH Open Access COVID-19: viral–host interactome analyzed by network based-approach model to study pathogenesis of SARS-CoV-2 infection Francesco Messina1†, Emanuela Giombini1†, Chiara Agrati1, Francesco Vairo1, Tommaso Ascoli Bartoli1, Samir Al Moghazi1, Mauro Piacentini1,2, Franco Locatelli3, Gary Kobinger4, Markus Maeurer5,6, Alimuddin Zumla7,8, Maria R. Capobianchi1* , Francesco Nicola Lauria1†, Giuseppe Ippolito1† and COVID 19 INMI Network Medicine for IDs Study Group Abstract Background: Epidemiological, virological and pathogenetic characteristics of SARS-CoV-2 infection are under evaluation. A better understanding of the pathophysiology associated with COVID-19 is crucial to improve treatment modalities and to develop efective prevention strategies. Transcriptomic and proteomic data on the host response against SARS-CoV-2 still have anecdotic character; currently available data from other coronavirus infections are there- fore a key source of information. Methods: We investigated selected molecular aspects of three human coronavirus (HCoV) infections, namely SARS- CoV, MERS-CoV and HCoV-229E, through a network based-approach. A functional analysis of HCoV–host interactome was carried out in order to provide a theoretic host–pathogen interaction model for HCoV infections and in order to translate the results in prediction for SARS-CoV-2 pathogenesis. The 3D model of S-glycoprotein of SARS-CoV-2 was compared to the structure of the corresponding SARS-CoV, HCoV-229E and MERS-CoV S-glycoprotein. SARS-CoV, MERS-CoV, HCoV-229E and the host interactome were inferred through published protein–protein interactions (PPI) as well as gene co-expression, triggered by HCoV S-glycoprotein in host cells. Results: Although the amino acid sequences of the S-glycoprotein were found to be diferent between the various HCoV, the structures showed high similarity, but the best 3D structural overlap shared by SARS-CoV and SARS-CoV-2, consistent with the shared ACE2 predicted receptor. The host interactome, linked to the S-glycoprotein of SARS-CoV and MERS-CoV, mainly highlighted innate immunity pathway components, such as Toll Like receptors, cytokines and chemokines. Conclusions: In this paper, we developed a network-based model with the aim to defne molecular aspects of pathogenic phenotypes in HCoV infections. The resulting pattern may facilitate the process of structure-guided phar- maceutical and diagnostic research with the prospect to identify potential new biological targets. Keywords: Coronavirus infection, Virus–host interactome, Spike glycoprotein Background *Correspondence: [email protected] In December 2019, a novel coronavirus (SARS-CoV-2) †Francesco Messina, Emanuela Giombini, Francesco Nicola Lauria and Giuseppe Ippolito contributed equally to this work was frst identifed as a zoonotic pathogen of humans 1 National Institute for Infectious Diseases “Lazzaro Spallanzani” IRCCS, in Wuhan, China, causing a respiratory infection Rome, Italy with associated bilateral interstitial pneumonia. Te Full list of author information is available at the end of the article © The Author(s) 2020. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creat iveco mmons .org/licen ses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creat iveco mmons .org/publi cdoma in/ zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. Messina et al. J Transl Med (2020) 18:233 Page 2 of 10 disease caused by SARS-CoV-2 was named by the We investigated here biologically and clinically relevant World Health Organization as COVID-19 and it has molecular targets of three human coronaviruses (HCoV) been classifed as a global pandemic since it has spread infections using a network based approach. A functional rapidly to all continents. As of May 20, 2020, there have analysis of HCoV–host interactome was carried out in been 4.889.287 confrmed COVID-19 cases worldwide order to provide a theoretic host–pathogen interaction with 322.457 deaths reported to the WHO [1]. Whilst model for HCoV infections, and to predict viable mod- clinical and epidemiological data on COVID-19 have els for SARS-CoV-2 pathogenesis. Tree HCoV caus- become readily available, information on the pathogen- ing respiratory diseases were used as the model targets, esis of the SARS-CoV-2 infection has not been forth- namely: SARS-CoV, that shares with SARS-CoV-2 a coming [2]. Te transcriptomic and proteomic data on strong genetic similarity, including MERS-CoV, and host response against SARS-CoV-2 is scanty and not HCoV-229E. efective therapeutics and vaccines for COVID-19 are available yet. Methods Coronaviruses (CoVs) typically afect the respiratory Comparative reconstruction of S‑glycoprotein in HCoVs tract of mammals, including humans, and lead to mild Te reconstruction of virus–host interactome was car- to severe respiratory tract infections [3]. Many emerg- ried out using the RWR algorithm to explore the human ing HCoV infections have spilled-over from animal res- PPI network and the multilayer PPI platform enriched ervoirs, such as HCoV-OC43 and HCoV-229E which with gene expression data sets. 259 sequences of CoVs, cause mild diseases such as common colds [4, 5]. During infecting diferent animal hosts (Additional fle 1: the past 2 decades, two highly pathogenic HCoVs, severe Table S1), were downloaded by GSAID and NCBI data- acute respiratory syndrome coronavirus (SARS-CoV) and base in order to evaluate the variability in the S gene. Middle East respiratory syndrome coronavirus (MERS- SARS-CoV, HCoV-229E and MERS-CoV and other CoV CoV), have led to global epidemics with high morbid- full genome sequence groups were aligned with MAFFT ity and mortality [6]. In this period, a large amount of [14], synonymous and non-synonymous mutations, and experimental data associated with the two infections has amino acid similarity were calculated using the SSE pro- allowed to better understand molecular mechanism(s) of gram with a sliding windows of 250 nucleotides and a coronavirus infection, and enhance pathways for devel- pass of 25 nu [15]. A homology model was built for the oping new drugs, diagnostics and vaccines and identi- amino acid sequences of the S-glycoprotein, derived from fcation of host factors stimulating (proviral factors) or the full genome sequence obtained at “SARS-CoV-2/ restricting (antiviral factors) infection remains poorly INMI1/human/2020/ITA” (MT066156.1). Te Swiss pdb understood [7]. Structures of many proteins of SARS- server was used to construct three-dimensional mod- CoV and MERS-CoV, and biological interactions with els for the S-glycoprotein of SARS-CoV-2 [16]. Among other viral and host proteins have been widely explored; proteins with a 3D structure, the best match with the through experimental testing of small molecule inhibi- “SARS-CoV-2/INMI1/human/2020/ITA” sequence was tors with anti-viral efects [8, 9]. ACE2, expressed in type the 6VSB.1, that was evaluated considering the iden- 2 alveolar cells in the lung, has been identifed as recep- tity of two amino acid sequences and the QMEN value tor of SARS-CoV and SARS-CoV-2, while dipeptidyl included in Swiss pdb server. Te model of a single chain peptidase DPP4 was identifed as the specifc receptor for was overlapped with the three-dimensional structure MERS-CoV [10, 11]. of S-glycoprotein single chain belonging to SARS-CoV Te investigation of structural genomics and inter- (5WRG), HCoV-229E (6U7H.1) and MERS-CoV (5X59), actomics of SARS-CoV-2 can be implemented through using Chimera 1.14 [17]. In order to better evaluate the systematical mapping of protein–protein interactions conservation of the sequence in each site, all sequences (PPI) between SARS-CoV-2 and human host, and an were aligned with MAFFT and the topology of all struc- integrated bioinformatics approach [12, 13]. Structural tures were compared. Te detailed description of the analysis of specifc SARS-CoV-2 proteins, in particular reconstruction of S-glycoprotein structure is reported in Spike glycoproteins (S-glycoproteins), and their interac- Additional fle 2. tions with human proteins, can guide the identifcation of the putative functional sites and help to better defne the PPI and gene co‑expression network pathologic phenotype of the infection. Tis functional Network analysis, based on protein–protein interactions interaction analysis between the host and other HCoVs, and gene expression data, was performed in order to view combined with an evolutionary sequence analysis of all possible virus–host protein interactions during the SARS-CoV-2, can be used to guide new treatment and HCoV infections. Since the SARS-CoV-2 genome exhibits prevention interventions. substantial similarity to the SARS-CoV genome [18] and Messina et al. J Transl Med (2020) 18:233 Page 3 of 10 subsequently also the proteome [19], we hypothesized sequence obtained at Laboratory of Virology, National that several molecular interactions that were observed Institute for Infectious Diseases “L. Spallanzani” IRCCS, in the SARS-CoV interactome will be preserved in the using Swiss pdb viewer server (Additional fle 2: Figure SARS-CoV-2 interactome.
Recommended publications
  • Ingenuity Pathway Analysis of Human Facet Joint Tissues: Insight Into Facet Joint Osteoarthritis
    EXPERIMENTAL AND THERAPEUTIC MEDICINE 19: 2997-3008, 2020 Ingenuity pathway analysis of human facet joint tissues: Insight into facet joint osteoarthritis CHU CHEN*, SHENGYU CUI*, WEIDONG LI, HURICHA JIN, JIANBO FAN, YUYU SUN and ZHIMING CUI Department of Spine Surgery, The Second Affiliated Hospital of Nantong University, Nantong, Jiangsu 226001, P.R. China Received August 17, 2019; Accepted January 30, 2020 DOI: 10.3892/etm.2020.8555 Abstract. Facet joint osteoarthritis (FJOA) is a common the top 5 IPA networks (with a score >30). The present study degenerative joint disorder with high prevalence in the elderly. provides insight into the pathological processes of FJOA from FJOA causes lower back pain and lower extremity pain, and a genetic perspective and may thus benefit the clinical treat- thus severely impacts the quality of life of affected patients. ment of FJOA. Emerging studies have focused on the histomorphological and histomorphometric changes in FJOA. However, the dynamic Introduction genetic changes in FJOA have remained to be clearly deter- mined. In the present study, previously obtained RNA deep Facet joint osteoarthritis (FJOA) is a common degenerative sequencing data were subjected to an ingenuity pathway joint disorder that causes the degeneration and breakdown analysis (IPA) and canonical signaling pathways of differ- of cartilage and restricts the movement of joints (1). It has entially expressed genes (DEGs) in FJOA were studied. The been reported that lumbar FJOA occurs at high prevalence top 25 enriched canonical signaling pathways were identified and the presence of FJOA is highly associated with age (2,3). and canonical signaling pathways with high absolute values A community-based cross-sectional study indicated that of z-scores, specifically leukocyte extravasation signaling, in populations aged <50 years, the prevalence of FJOA was Tec kinase signaling and osteoarthritis pathway, were inves- <45%, while it was ~75% in populations aged >50 years (4).
    [Show full text]
  • Genomic-Wide Copy Number Variation Profile and a New IKZF1 Gene Variation in Chinese Adult Acute Lymphoblastic Leukemia
    Research Article Clinics in Oncology Published: 02 Mar, 2021 Genomic-wide Copy Number Variation Profile and a New IKZF1 Gene Variation in Chinese Adult Acute Lymphoblastic Leukemia Tian Yuan1,2, Yingchan Wang1, Xiaoyan Li1, Yan Li1 and Yingchang Mi1* 1State Key Laboratory of Experimental Hematology, Chinese Academy of Medical Sciences and Peking Union Medical College, China 2Department of Hematology, Tianjin Medical University Cancer Institute and Hospital, China Abstract Purpose: Copy Number Variation (CNV) and Loss of Heterozygosity (LOH) were investigated in adult Chinese patients with Acute Lymphoblastic Leukemia (ALL), and these patients were screened for adult ALL prognostic genes. Methods: The CNV and LOH were detected with Affymetrix SNP 6.0 array. Using java and software R to construct the signal-net based on KEGG, enrichment the significant CNV genes associated with disease. Check the copy number variation of IKZF1 gene by FISH. Results: In Ph+ B-ALL patients, the most frequent CNVs were gain at chr13q, loss at chr6 and recurrent LOH at chr17p and 1p33. In Ph- B-ALL patients, the most frequent CNVs were gain at chr7, loss at chr9p and recurrent LOH at chr15q. In T-ALL, the most frequent CNVs were gain at chr20 and 7q, loss at chr2p and recurrent LOH also at chr15q. These CNV regions included several candidate genes such as IKZF1, JAK2, CDKN2A/2B, ETV6 etc. EGFR and IKBKG are in the center position of the Signal-Net, which may play essential role in ALL. The copy number amplification of IKZF1 gene is confirmed by FISH. OPEN ACCESS Discussion: CNV and LOH identification in adult ALL patients can identify susceptible genes, *Correspondence: guide diagnostic classification, and help with treatment selection.
    [Show full text]
  • Recurrent Copy Number Variants Associated with Bronchopulmonary Dysplasia
    nature publishing group Articles Translational Investigation Recurrent copy number variants associated with bronchopulmonary dysplasia Ausaf Ahmad1,2, Soumyaroop Bhattacharya1,2, Arthi Sridhar3, Anwar M. Iqbal3 and Thomas J. Mariani1,2 BACKGROUND: Variability in the incidence and severity of and almost all infants (>97%) with birth weight <1,250 g are bronchopulmonary dysplasia (BPD) among premature infants diagnosed with BPD (5). suggests that genetic susceptibility plays a role in pathogen- A diagnosis of BPD is dependent upon the use of supple- esis. An assessment of copy number variants (CNV) in BPD sub- mental oxygen following preterm birth. In addition to pre- jects may help to identify loci that harbor genetic susceptibility mature birth, environmental factors such as oxidative stress, factors. mechanical ventilation, and infection play a significant role METHODS: We conducted a retrospective analysis of clinical in pathogenesis of BPD. Higher than physiological levels of DNA microarray data from our institution. We identified 19 BPD therapeutic oxygen (hyperoxia) induce stress with production subjects, and 2 controls groups (full-term and preterm) with of reactive oxygen species. The lungs of premature infants are no lung-related disease. We reanalyzed raw data from each of susceptible to injury with pre- and postnatal exposures. These these subjects to identify recurrent CNV loci in BPD subjects. exposures may cause lung damage and induce a deviation from RESULTS: We identified three loci (at 11q13.2, 16p13.3, and the normal developmental path (4). 22q11.23–q12.1) with recurrent CNV in BPD subjects. The fre- Variability in the incidence and severity of BPD among pre- quency of these CNV was significantly higher in BPD subjects mature infants with similar risk factors suggests that genetic when compared with at least one control group.
    [Show full text]
  • Gene Expression Profiling and Bioinformatic Analysis of Rabbit
    ogy & N ol eu ur e ro N p h f y o s l i Matsuo, et al., J Neurol Neurophysiol 2014, 5:3 o a l n o r g u y o J Journal of Neurology & Neurophysiology DOI: 10.4172/2155-9562.1000201 ISSN: 2155-9562 Research Article Open Access Gene Expression Profiling and Bioinformatic Analysis of Rabbit Basilar Artery after Experimental Subarachnoid Hemorrhage Satoshi Matsuo, Yuichiro Kikkawa*, Masaaki Hokama, Ryota Kurogi, Akira Nakamizo, Masahiro Mizoguchi and Tomio Sasaki Department of Neurosurgery, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan *Corresponding author: Yuichiro Kikkawa, Department of Neurosurgery, Graduate School of Medical Sciences, Kyushu University, 3-1-1 Maidashi, Higashi-ku, Fukuoka 812-8582, Japan, Tel: +81-92-642-5524; Fax: +81-92-642-5526; E-mail: [email protected] Received date: Feb 14, 2014, Accepted date: Mar 20, 2014, Published date: Mar 26, 2014 Copyright: © 2014 Matsuo S, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Abstract Objective: The molecular mechanisms which contribute to the development of vascular events including cerebral vasospasm after subarachnoid hemorrhage (SAH) in cerebral artery remain to be elucidated. In this study, we investigated the time course of changes in the gene expression of cerebral artery using rabbit SAH model and performed bioinformatic analysis of differentially expressed genes. Methods: Rabbit basilar arteries were harvested on days 3, 5, and 7 after initial hemorrhage.
    [Show full text]
  • A Pathway Analysis Method for Genome-Wide Association Studies Babak Shahbaba,A Catherine M
    Research Article Received 22 March 2011, Accepted 2 November 2011 Published online in Wiley Online Library (wileyonlinelibrary.com) DOI: 10.1002/sim.4477 A pathway analysis method for genome-wide association studies Babak Shahbaba,a Catherine M. Shachafb and Zhaoxia Yua*† For genome-wide association studies, we propose a new method for identifying significant biological pathways. In this approach, we aggregate data across single-nucleotide polymorphisms to obtain summary measures at the gene level. We then use a hierarchical Bayesian model, which takes the gene-level summary measures as data, in order to evaluate the relevance of each pathway to an outcome of interest (e.g., disease status). Although shifting the focus of analysis from individual genes to pathways has proven to improve the statistical power and provide more robust results, such methods tend to eliminate a large number of genes whose pathways are unknown. For these genes, we propose to use a Bayesian multinomial logit model to predict the associated pathways by using the genes with known pathways as the training data. Our hierarchical Bayesian model takes the uncertainty regard- ing the pathway predictions into account while assessing the significance of pathways. We apply our method to two independent studies on type 2 diabetes and show that the overlap between the results from the two studies is statistically significant. We also evaluate our approach on the basis of simulated data. Copyright © 2012 John Wiley & Sons, Ltd. Keywords: hierarchical; pathway; genome-wide; uncertainty 1. Introduction With the development of high-throughput genotyping technologies, genome-wide association studies (GWAS) have been widely used for identifying common genetic variants that contribute to human com- plex traits.
    [Show full text]
  • Pathway-Based Analysis of Genome-Wide Association Data
    fphar-09-00158 March 9, 2018 Time: 17:6 # 1 ORIGINAL RESEARCH published: 13 March 2018 doi: 10.3389/fphar.2018.00158 Pathway-Based Analysis of Genome-Wide Association Data Identified SNPs in HMMR as Biomarker for Chemotherapy- Induced Neutropenia in Breast Cancer Patients Edited by: Rick Kittles, Behzad Bidadi1†, Duan Liu1, Krishna R. Kalari2, Matthias Rubner3, Alexander Hein3, Irell and Manella Graduate School Matthias W. Beckmann3, Brigitte Rack4, Wolfgang Janni5, Peter A. Fasching3, of Biological Sciences, United States Richard M. Weinshilboum1 and Liewei Wang1* Reviewed by: Ming Ta Michael Lee, 1 Division of Clinical Pharmacology, Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Geisinger Health System, Rochester, MN, United States, 2 Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, United States Mayo Clinic, Rochester, MN, United States, 3 Department of Gynecology and Obstetrics, University Hospital Erlangen, Ken Batai, Friedrich-Alexander-University Erlangen-Nuremberg, Erlangen, Germany, 4 Department of Gynecology and Obstetrics, University of Arizona, United States Ludwig-Maximilians-Universität München, Munich, Germany, 5 Department of Gynecology and Obstetrics, University Hospital Ulm, Ulm, Germany *Correspondence: Liewei Wang [email protected] Neutropenia secondary to chemotherapy in breast cancer patients can be life- † Present address: threatening and there are no biomarkers available to predict the risk of drug-induced Behzad Bidadi, Merck & Co., Inc., North Wales, PA, neutropenia in those patients. We previously performed a genome-wide association United States study (GWAS) for neutropenia events in women with breast cancer who were treated with 5-fluorouracil, epirubicin and cyclophosphamide and recruited to the SUCCESS- Specialty section: This article was submitted to A trial.
    [Show full text]
  • Systematic Analysis of Copy Number Variation Associated with Congenital Diaphragmatic Hernia
    Systematic analysis of copy number variation associated with congenital diaphragmatic hernia Qihui Zhua,1, Frances A. Highb,c,d,1, Chengsheng Zhanga,1, Eliza Cerveiraa, Meaghan K. Russellb, Mauro Longonib,d, Maliackal P. Joyb, Mallory Ryana, Adam Mil-homensa, Lauren Bellfya, Caroline M. Colettib, Pooja Bhayanib, Regis Hilab, Jay M. Wilsonc,d, Patricia K. Donahoeb,d,2,3, and Charles Leea,e,2,3 aThe Jackson Laboratory for Genomic Medicine, Farmington, CT 06032; bPediatric Surgical Research Laboratories, Massachusetts General Hospital, Boston, MA 02114; cDepartment of Surgery, Boston Children’s Hospital, Boston, MA 02115; dHarvard Medical School, Boston, MA 02115; and eDepartment of Life Sciences, Ewha Womans University, Seoul 03760, South Korea Contributed by Patricia K. Donahoe, April 4, 2018 (sent for review August 22, 2017; reviewed by Sally A. Camper and Deborah Krakow) Congenital diaphragmatic hernia (CDH), characterized by malfor- mouse models were also later found to harbor rare and predicted mation of the diaphragm and hypoplasia of the lungs, is one of the pathogenic mutations in patients with CDH (8, 10, 13). most common and severe birth defects, and is associated with high Numerous studies have implicated copy number variants morbidity and mortality rates. There is growing evidence demon- (CNVs) in human disease with associated phenotypes ranging strating that genetic factors contribute to CDH, although the patho- from cognitive disabilities to predispositions to obesity, cancer, and genesis remains largely elusive. Single-nucleotide polymorphisms otherdiseases(14–17). Both inherited and de novo CNVs play a have been studied in recent whole-exome sequencing efforts, causative role in CDH (18–22), and at least 10% of CDH cases are but larger copy number variants (CNVs) have not yet been studied estimated to be caused by genomic imbalances (21).
    [Show full text]
  • Pathway/ Gene Set Analysis in Genome-Wide Association Studies
    Pathway/ Gene Set Analysis in Genome-Wide Association Studies Alison Motsinger-Reif, PhD Branch Chief, Senior Investigator Biostatistics and Computational Biology Branch National Institute of Environmental Health Sciences [email protected] Goals • Methods for GWAS with SNP chips – Integrating expression and SNP information Many Shared Issues • Many of the issues/choices/methodological approaches discussed for microarray data are true across all “-omics” • Many methods have been readily extended for other omic data • There are several biological and technological issues that may make just “off the shelf” use of pathway analysis tools inappropriate Genome-Wide Association Studies Population resources • trios • case-control samples Whole-genome genotyping • hundreds of thousands or million(s) of markers, typically SNPs Genome-wide Association • single SNP alleles • genotypes • multimarker haplotypes Advantages of GWAS • Compared to candidate gene studies – unbiased scan of the genome – potential to identify totally novel susceptibility factors • Compared to linkage-based approaches – capitalize on all meiotic recombination events in a population • Localize small regions of the chromosome • enables rapid detection causal gene – Identifies genes with smaller relative risks Concerns with GWAS • Assumes CDCV hypothesis • Study Design – Replication • Expense – Choice of SNPs • Power dependent on: • Analysis methods – Allele frequency – IT support, data – Relative risk management – Sample size – Variable selection – LD between genotyped – Multiple testing marker and the risk allele – disease prevalence – .ultiple testing – ……. Successes in GWAS Studies • 33989 GWAS papers published to date (GWAS catalogue) • Big Finds: – In 2005, it was learned through GWAS that age-related macular degeneration is associated with variation in the gene for complement factor H, which produces a protein that regulates inflammation (Klein et al.
    [Show full text]
  • Copy Number Variation Is Highly Correlated with Differential Gene Expression
    Shao et al. BMC Medical Genetics (2019) 20:175 https://doi.org/10.1186/s12881-019-0909-5 RESEARCH ARTICLE Open Access Copy number variation is highly correlated with differential gene expression: a pan- cancer study Xin Shao1, Ning Lv1, Jie Liao1, Jinbo Long1, Rui Xue1,NiAi1, Donghang Xu2* and Xiaohui Fan1* Abstract Background: Cancer is a heterogeneous disease with many genetic variations. Lines of evidence have shown copy number variations (CNVs) of certain genes are involved in development and progression of many cancers through the alterations of their gene expression levels on individual or several cancer types. However, it is not quite clear whether the correlation will be a general phenomenon across multiple cancer types. Methods: In this study we applied a bioinformatics approach integrating CNV and differential gene expression mathematically across 1025 cell lines and 9159 patient samples to detect their potential relationship. Results: Our results showed there is a close correlation between CNV and differential gene expression and the copy number displayed a positive linear influence on gene expression for the majority of genes, indicating that genetic variation generated a direct effect on gene transcriptional level. Another independent dataset is utilized to revalidate the relationship between copy number and expression level. Further analysis show genes with general positive linear influence on gene expression are clustered in certain disease-related pathways, which suggests the involvement of CNV in pathophysiology of diseases. Conclusions: This study shows the close correlation between CNV and differential gene expression revealing the qualitative relationship between genetic variation and its downstream effect, especially for oncogenes and tumor suppressor genes.
    [Show full text]
  • Download of Human Gene Expression Microarray Figuredatasets 6
    International Journal of Molecular Sciences Article Analyzing Gene Expression Profiles from Ataxia and Spasticity Phenotypes to Reveal Spastic Ataxia Related Pathways Andrea C. Kakouri 1,2,3 , Christina Votsi 2,3, Marios Tomazou 1,3 , George Minadakis 1,3, Evangelos Karatzas 4, Kyproula Christodoulou 2,3,* and George M. Spyrou 1,3,* 1 Department of Bioinformatics, The Cyprus Institute of Neurology and Genetics, Nicosia 2370, Cyprus; [email protected] (A.C.K.); [email protected] (M.T.); [email protected] (G.M.) 2 Department of Neurogenetics, The Cyprus Institute of Neurology and Genetics, Nicosia 2370, Cyprus; [email protected] 3 The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology and Genetics, Nicosia 2370, Cyprus 4 Institute for Fundamental Biomedical Research, BSRC “Alexander Fleming”, 16672 Vari, Greece; [email protected] * Correspondence: [email protected] (K.C.); [email protected] (G.M.S.) Received: 18 August 2020; Accepted: 9 September 2020; Published: 14 September 2020 Abstract: Spastic ataxia (SA) is a group of rare neurodegenerative diseases, characterized by mixed features of generalized ataxia and spasticity. The pathogenetic mechanisms that drive the development of the majority of these diseases remain unclear, although a number of studies have highlighted the involvement of mitochondrial and lipid metabolism, as well as calcium signaling. Our group has previously published the GBA2 c.1780G > C (p.Asp594His) missense variant as the cause of spastic ataxia in a Cypriot consanguineous family, and more recently the biochemical characterization of this variant in patients’ lymphoblastoid cell lines. GBA2 is a crucial enzyme of sphingolipid metabolism.
    [Show full text]
  • A Decision Analysis Model for KEGG Pathway Analysis Junli Du1,2†, Manlin Li1†, Zhifa Yuan1, Mancai Guo1, Jiuzhou Song3, Xiaozhen Xie1 and Yulin Chen2*
    Du et al. BMC Bioinformatics (2016) 17:407 DOI 10.1186/s12859-016-1285-1 RESEARCH ARTICLE Open Access A decision analysis model for KEGG pathway analysis Junli Du1,2†, Manlin Li1†, Zhifa Yuan1, Mancai Guo1, Jiuzhou Song3, Xiaozhen Xie1 and Yulin Chen2* Abstract Background: The knowledge base-driven pathway analysis is becoming the first choice for many investigators, in that it not only can reduce the complexity of functional analysis by grouping thousands of genes into just several hundred pathways, but also can increase the explanatory power for the experiment by identifying active pathways in different conditions. However, current approaches are designed to analyze a biological system assuming that each pathway is independent of the other pathways. Results: A decision analysis model is developed in this article that accounts for dependence among pathways in time-course experiments and multiple treatments experiments. This model introduces a decision coefficient—a designed index, to identify the most relevant pathways in a given experiment by taking into account not only the direct determination factor of each Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway itself, but also the indirect determination factors from its related pathways. Meanwhile, the direct and indirect determination factors of each pathway are employed to demonstrate the regulation mechanisms among KEGG pathways, and the sign of decision coefficient can be used to preliminarily estimate the impact direction of each KEGG pathway. The simulation study of decision analysis demonstrated the application of decision analysis model for KEGG pathway analysis. Conclusions: A microarray dataset from bovine mammary tissue over entire lactation cycle was used to further illustrate our strategy.
    [Show full text]
  • Pathway and Network Analysis for Mrna and Protein Profiling Data
    Pathway and network analysis for mRNA and protein profiling data Bing Zhang, Ph.D. Professor of Molecular and Human Genetics Lester & Sue Smith Breast Center Baylor College of Medicine [email protected] VU workshop, 2016 Gene expression DNA Transcription Transcriptome Transcriptome RNA mRNA decay profiling Translation Proteome Protein Proteome Protein degradation profiling Phenotype Networks VU workshop, 2016 Overall workflow of gene expression studies Biological question Experimental design Microarray RNA-Seq Shotgun proteomics Image analysis Reads mapping Peptide/protein ID Signal intensities Read counts Spectral counts; Intensities Data Analysis Experimental Hypothesis validation VU workshop, 2016 Data matrix Samples probe_set_id HNE0_1 HNE0_2 HNE0_3 HNE60_1 HNE60_2 HNE60_3 1007_s_at 8.6888 8.5025 8.5471 8.5412 8.5624 8.3073 1053_at 9.1558 9.1835 9.4294 9.2111 9.1204 9.2494 117_at 7.0700 7.0034 6.9047 9.0414 8.6382 9.2663 121_at 9.7174 9.7440 9.6120 9.7581 9.7422 9.7345 1255_g_at 4.2801 4.4669 4.2360 4.3700 4.4573 4.2979 1294_at 6.3556 6.2381 6.2053 6.4290 6.5074 6.2771 Genes 1316_at 6.5759 6.5330 6.4709 6.6636 6.6438 6.4688 1320_at 6.5497 6.5388 6.5410 6.6605 6.5987 6.7236 1405_i_at 4.3260 4.4640 4.1438 4.3462 4.3876 4.6849 1431_at 5.2191 5.2070 5.2657 5.2823 5.2522 5.1808 1438_at 7.0155 6.9359 6.9241 7.0248 7.0142 7.0971 1487_at 8.6361 8.4879 8.4498 8.4470 8.5311 8.4225 1494_f_at 7.3296 7.3901 7.0886 7.2648 7.6058 7.2949 1552256_a_at 10.6245 10.5235 10.6522 10.4205 10.2344 10.3144 1552257_a_at 10.3224 10.1749 10.1992 10.2464 10.2191
    [Show full text]