Functional Interpretation of Gene Lists

Total Page:16

File Type:pdf, Size:1020Kb

Functional Interpretation of Gene Lists Functional interpretation of gene lists Bing Zhang Department of Biomedical Informatics Vanderbilt University [email protected] Microarray experiment comparing metastatic and non- metastatic colon cancer cell lines Parental cell line Affymetrix RNA .cel files Mouse430_2 Metastatic cell line Smith et al., Gastroenterology, 138:958-968, 2010 2 Where to get the data n Gene Expression Omnibus (GEO) q http://www.ncbi.nlm.nih.gov/ geo/query/acc.cgi? acc=GSE19073 3 Download and prepare the data in Linux n In your home directory in Linux q mkdir gse19073 q cd gse19073 q wget ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE19nnn/GSE19073/suppl/GSE19073_RAW.tar q tar xvf GSE19073_RAW.tar q rm GSE19073_RAW.tar q gunzip *.gz q cp /home/igptest/diff/annotation.txt . n (can use nano to create the annotation file for future projects) first row contains one fewer field Columns separated by Sample names tab consistent with cel file names 4 Make sure you have the correct setting to use R In the first lecture, you were asked to get an ACCRE account, copy the file sample_file.txt under directory /home/igptest to your home directory, and add a line “setpkgs –a R” (without the quotation marks) to the end of your .bashrc file. 5 Analyze the data using R n R CMD BATCH /home/igptest/diff/limma.r limma.rout & limma.r 6 Explore the results using excel n Copy the output file data_rma_limma.csv to your local computer for exploring in Excel (SSH or Fugu) 863 significant probe set IDs (adjp<0.01 and fold-change>2), out of 45,101 probe sets *because the data are log2 based, fold-change>2 means abs(logFC)>1 7 Omics studies generate lists of interesting genes log2(ratio) 92546_r_at 92545_f_at 96055_at 102105_f_at Microarray 102700_at -log10(p value) 161361_s_at RNA-Seq Differential 92202_g_at expression 103548_at 100947_at Proteomics 101869_s_at 102727_at 160708_at …... Clustering Lists of genes with potential biological interest 8 Understanding a gene list n Level I 1451263_a_at 1436486_x_at q What are the genes behind the IDs and 1451780_at what do we know about the function of the 1438237_at 1417023_a_at genes? 1441054_at 1416203_at 1416295_a_at 1435012_x_at 1416069_at 1436485_s_at 1438148_at 1452740_at 1422184_a_at …… 9 One-gene-at-a-time information systems 10 Biomart: a batch information retrieval system http://central.biomart.org/ 11 Biomart analysis n Choose dataset q Choose database: Ensembl 74 Genes q Choose dataset: Mus musculus genes n Set filters q Gene: a list of genes identified by various database IDs (e.g. Affy probe set IDs) q Gene Ontology: filter for genes with specific GO terms (e.g. cell cycle) q Protein domains: filter for genes with specific protein domains (e.g. SH2 domain, signal domains ) q Region: filter for genes in a specific chromosome region (e.g. chr1 1:1000000 or 11q13) q Others n Select output attributes q Gene annotation information in the Ensembl database, e.g. gene description, chromosome name, gene start, gene end, strand, band, gene name, etc. q External data: Gene Ontology, IDs in other databases q Expression: anatomical system, development stage, cell type, pathology q Protein domains: SMART, PFAM, Interpro, etc. 12 Biomart: sample output 13 Understanding a gene list n Level I q What are the genes behind the IDs and what do we know about the function of the genes? n Level II q Which biological processes and pathways are the most interesting in terms of the experimental question? 14 Functional group enrichment analysis 98 Hoxa5 Hoxa11 Sash1 Ltbp3 Cd24a Agt Sox4 581 1842 Foxc1 Psrc1 Ctla2b Edn1 Ror2 Angptl4 Gnag Depdc7 Observe Sorbs1 Smad3 compare Wdr5 Macrod1 Enpp2 Trp63 Sox9 Tmem176a 65 Pax1 …… Acd Rai1 Pitx1 581 1842 …… Differentially expressed genes (581 genes) Expect System development n Is the observed overlap significantly (1842 genes) larger than the expected value? 15 Enrichment analysis: hypergeometric test Significant genes Non-significant genes Total genes in the group k j-k j Other genes n-k m-n-j+k m-j Total n m-n m Hypergeometric test: given a total of m genes where j genes are in the functional group, if we pick n genes randomly, what is the probability of having k or more genes from the group? Observed # m − j&# j& k min(n, j ) % (% ( $ n − i '$ i' p = n j ∑ # m& i= k % ( m $ n ' Zhang et.al. Nucleic Acids Res. 33:W741, 2005 16 € Commonly used functional groups n Gene Ontology (http://www.geneontology.org) q Structured, precisely defined, controlled vocabulary for describing the roles of genes and gene products q Three organizing principles: molecular function, biological process, and cellular component n Pathways q KEGG (http://www.genome.jp/kegg/pathway.html) q Pathway commons (http://www.pathwaycommons.org) q WikiPathways (http://www.wikipathways.org) n Cytogenetic bands n Targets of transcription factors/miRNAs 17 WebGestalt: Web-based Gene Set Analysis Toolkit 8 organisms Human, Mouse, Rat, Dog, Fruitfly, Worm, Zebrafish, Yeast Microarray Probe IDs Gene IDs Protein IDs • Affymetrix • Gene Symbol • UniProt • Agilent • GenBank • IPI • Codelink • Ensembl Gene • RefSeq Peptide • Illumina • RefSeq Gene • Ensembl Peptide • UniGene • Entrez Gene Genetic Variation IDs • SGD • dbSNP • MGI • Flybase ID • Wormbase ID • ZFIN 196 ID types with mapping to Entrez Gene ID WebGestalt 59,278 functional categories with genes identified by Entrez Gene IDs Gene Ontology Pathway Network module • Biological Process • KEGG • Transcription factor targets • Molecular Function • Pathway Commons • microRNA targets • Cellular Component • WikiPathways • Protein interaction modules http://www.webgestalt.org Disease and Drug Chromosomal location • Disease association genes • Cytogenetic bands Zhang et.al. Nucleic Acids Res. 33:W741, 2005 • Drug association genes Wang et al. Nucleic Acids Res. 41:W77, 2013 18 WebGestalt: ID mapping n Input list q 863 significant probe sets identified in the microarray study n Mapping result q Total number of User IDs: 863. Unambiguously mapped User IDs to Entrez IDs: 734. Unique User Entrez IDs: 581. The Enrichment Analysis will be based upon the unique IDs. 19 WebGestalt: top 10 enriched GO biological processes 20 WebGestalt: top 10 enriched KEGG pathways 21 Understanding a gene list n Level I q What are the genes behind the IDs and what do we know about the function of the genes? n Level II q Which biological processes and pathways are the most interesting in terms of the experimental question? n Level III q How do the gene products work together to form a functional network? 22 Biological networks Networks Nodes Edges Protein-protein Proteins Physical interaction, interaction network undirected Signaling network Proteins Modification, Physical directed interaction networks Gene regulatory TFs/miRNAs Physical interaction, network Target genes directed Metabolic network Metabolites Metabolic reaction, directed Co-expression Genes/ Co-expression, Functional network proteins undirected association networks Genetic network Genes Genetic interaction, undirected 23 Properties of complex networks Scale-free Modular Hierarchical 24 WebGestalt protein interaction module analysis 25 STRING (http://string-db.org) n A database of known and predicted protein interactions, including both direct (physical) and indirect associations (functional). n Quantitatively integrates interaction data from different sources for a large number of organisms, and transfers information between these organisms where applicable. n Covers 5,214,234 proteins from 1,133 organisms. 26 Understanding a gene list: summary n Level I q What are the genes behind the IDs and what do we know about the function of the genes? q Biomart (http://www.biomart.org/) n Level II q Which biological processes and pathways are the most interesting in terms of the experimental question? q WebGestalt (http://bioinfo.vanderbilt.edu/webgestalt) q Related tools: DAVID (http://david.abcc.ncifcrf.gov/), GenMAPP (http://www.genmapp.org/), GSEA (http://www.broadinstitute.org/gsea ) n Level III q How do the gene products work together to form a functional network? q STRING (http://string-db.org) q Related tools: Cytoscape (http://www.cytoscape.org/), Genemania (http://www.genemania.org), Ingenuity (http://www.ingenuity.com/), Pathway Studio ( http://www.ariadnegenomics.com/products/pathway-studio/) 27 .
Recommended publications
  • Whole Exome Sequencing in ADHD Trios from Single and Multi-Incident Families Implicates New Candidate Genes and Highlights Polygenic Transmission
    European Journal of Human Genetics (2020) 28:1098–1110 https://doi.org/10.1038/s41431-020-0619-7 ARTICLE Whole exome sequencing in ADHD trios from single and multi-incident families implicates new candidate genes and highlights polygenic transmission 1,2 1 1,2 1,3 1 Bashayer R. Al-Mubarak ● Aisha Omar ● Batoul Baz ● Basma Al-Abdulaziz ● Amna I. Magrashi ● 1,2 2 2,4 2,4,5 6 Eman Al-Yemni ● Amjad Jabaan ● Dorota Monies ● Mohamed Abouelhoda ● Dejene Abebe ● 7 1,2 Mohammad Ghaziuddin ● Nada A. Al-Tassan Received: 26 September 2019 / Revised: 26 February 2020 / Accepted: 10 March 2020 / Published online: 1 April 2020 © The Author(s) 2020. This article is published with open access Abstract Several types of genetic alterations occurring at numerous loci have been described in attention deficit hyperactivity disorder (ADHD). However, the role of rare single nucleotide variants (SNVs) remains under investigated. Here, we sought to identify rare SNVs with predicted deleterious effect that may contribute to ADHD risk. We chose to study ADHD families (including multi-incident) from a population with a high rate of consanguinity in which genetic risk factors tend to 1234567890();,: 1234567890();,: accumulate and therefore increasing the chance of detecting risk alleles. We employed whole exome sequencing (WES) to interrogate the entire coding region of 16 trios with ADHD. We also performed enrichment analysis on our final list of genes to identify the overrepresented biological processes. A total of 32 rare variants with predicted damaging effect were identified in 31 genes. At least two variants were detected per proband, most of which were not exclusive to the affected individuals.
    [Show full text]
  • Genome-Wide Screen of Otosclerosis in Population Biobanks
    medRxiv preprint doi: https://doi.org/10.1101/2020.11.15.20227868; this version posted November 16, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license . 1 Genome-wide Screen of Otosclerosis in 2 Population Biobanks: 18 Loci and Shared 3 Heritability with Skeletal Structure 4 Joel T. Rämö1, Tuomo Kiiskinen1, Juha Karjalainen1,2,3,4, Kristi Krebs5, Mitja Kurki1,2,3,4, Aki S. 5 Havulinna6, Eija Hämäläinen1, Paavo Häppölä1, Heidi Hautakangas1, FinnGen, Konrad J. 6 Karczewski1,2,3,4, Masahiro Kanai1,2,3,4, Reedik Mägi5, Priit Palta1,5, Tõnu Esko5, Andres Metspalu5, 7 Matti Pirinen1,7,8, Samuli Ripatti1,2,7, Lili Milani5, Antti Mäkitie9, Mark J. Daly1,2,3,4,10, and Aarno 8 Palotie1,2,3,4 9 1. Institute for Molecular Medicine Finland (FIMM), Helsinki Institute of Life Science (HiLIFE), University of 10 Helsinki, Helsinki, Finland 11 2. Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, 12 Massachusetts, USA 13 3. Stanley Center for Psychiatric Research, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, 14 USA 15 4. Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA 16 5. Estonian Genome Center, University of Tartu, Tartu, Estonia, Institute of Molecular and Cell Biology, 17 University of Tartu, Tartu, Estonia 18 6. Finnish Institute for Health and Welfare, Helsinki, Finland 19 7. Department of Public Health, Clinicum, Faculty of Medicine, University of Helsinki, Helsinki, Finland 20 8.
    [Show full text]
  • Supplementary Table 1: Adhesion Genes Data Set
    Supplementary Table 1: Adhesion genes data set PROBE Entrez Gene ID Celera Gene ID Gene_Symbol Gene_Name 160832 1 hCG201364.3 A1BG alpha-1-B glycoprotein 223658 1 hCG201364.3 A1BG alpha-1-B glycoprotein 212988 102 hCG40040.3 ADAM10 ADAM metallopeptidase domain 10 133411 4185 hCG28232.2 ADAM11 ADAM metallopeptidase domain 11 110695 8038 hCG40937.4 ADAM12 ADAM metallopeptidase domain 12 (meltrin alpha) 195222 8038 hCG40937.4 ADAM12 ADAM metallopeptidase domain 12 (meltrin alpha) 165344 8751 hCG20021.3 ADAM15 ADAM metallopeptidase domain 15 (metargidin) 189065 6868 null ADAM17 ADAM metallopeptidase domain 17 (tumor necrosis factor, alpha, converting enzyme) 108119 8728 hCG15398.4 ADAM19 ADAM metallopeptidase domain 19 (meltrin beta) 117763 8748 hCG20675.3 ADAM20 ADAM metallopeptidase domain 20 126448 8747 hCG1785634.2 ADAM21 ADAM metallopeptidase domain 21 208981 8747 hCG1785634.2|hCG2042897 ADAM21 ADAM metallopeptidase domain 21 180903 53616 hCG17212.4 ADAM22 ADAM metallopeptidase domain 22 177272 8745 hCG1811623.1 ADAM23 ADAM metallopeptidase domain 23 102384 10863 hCG1818505.1 ADAM28 ADAM metallopeptidase domain 28 119968 11086 hCG1786734.2 ADAM29 ADAM metallopeptidase domain 29 205542 11085 hCG1997196.1 ADAM30 ADAM metallopeptidase domain 30 148417 80332 hCG39255.4 ADAM33 ADAM metallopeptidase domain 33 140492 8756 hCG1789002.2 ADAM7 ADAM metallopeptidase domain 7 122603 101 hCG1816947.1 ADAM8 ADAM metallopeptidase domain 8 183965 8754 hCG1996391 ADAM9 ADAM metallopeptidase domain 9 (meltrin gamma) 129974 27299 hCG15447.3 ADAMDEC1 ADAM-like,
    [Show full text]
  • Association of the PSRC1 Rs599839 Variant with Coronary Artery Disease in a Mexican Population
    medicina Communication Association of the PSRC1 rs599839 Variant with Coronary Artery Disease in a Mexican Population Martha Eunice Rodríguez-Arellano 1, Jacqueline Solares-Tlapechco 1, Paula Costa-Urrutia 1 , Helios Cárdenas-Hernández 1, Marajael Vallejo-Gómez 1, Julio Granados 2 and Sergio Salas-Padilla 3,* 1 Laboratorio de Medicina Genómica, Hospital Regional Lic. Adolfo López Mateos ISSSTE, Ciudad de Mexico 01030, Mexico; [email protected] (M.E.R.-A.); [email protected] (J.S.-T.); [email protected] (P.C.-U.); [email protected] (H.C.-H.); [email protected] (M.V.-G.) 2 División de Inmunogenética, Departamento de Trasplantes, Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Ciudad de Mexico 14080, Mexico; [email protected] 3 Servicio de Cardiología, Hospital Regional Lic. Adolfo López Mateos ISSSTE, Ciudad de Mexico 01030, Mexico * Correspondence: [email protected]; Tel.: +52-555-322-2200 Received: 3 July 2020; Accepted: 12 August 2020; Published: 26 August 2020 Abstract: Background and Objectives: Coronary artery disease (CAD) is a major health problem in México. The identification of modifiable risk factors and genetic biomarkers is crucial for an integrative and personalized CAD risk evaluation. In this work, we aimed to validate in a Mexican population a set of eight selected polymorphisms previously associated with CAD, myocardial infarction (MI), or dyslipidemia. Materials and Methods: A sample of 907 subjects (394 CAD cases and 513 controls) 40–80 years old was genotyped for eight loci: PSRC1 (rs599839), MRAS (rs9818870), BTN2A1 (rs6929846), MTHFD1L (rs6922269), CDKN2B (rs1333049), KIAA1462 (rs3739998), CXCL12 (rs501120), and HNF1A (rs2259816). The association between single nucleotide polymorphisms (SNPs) and CAD was evaluated by logistic regression models.
    [Show full text]
  • Role and Regulation of the P53-Homolog P73 in the Transformation of Normal Human Fibroblasts
    Role and regulation of the p53-homolog p73 in the transformation of normal human fibroblasts Dissertation zur Erlangung des naturwissenschaftlichen Doktorgrades der Bayerischen Julius-Maximilians-Universität Würzburg vorgelegt von Lars Hofmann aus Aschaffenburg Würzburg 2007 Eingereicht am Mitglieder der Promotionskommission: Vorsitzender: Prof. Dr. Dr. Martin J. Müller Gutachter: Prof. Dr. Michael P. Schön Gutachter : Prof. Dr. Georg Krohne Tag des Promotionskolloquiums: Doktorurkunde ausgehändigt am Erklärung Hiermit erkläre ich, dass ich die vorliegende Arbeit selbständig angefertigt und keine anderen als die angegebenen Hilfsmittel und Quellen verwendet habe. Diese Arbeit wurde weder in gleicher noch in ähnlicher Form in einem anderen Prüfungsverfahren vorgelegt. Ich habe früher, außer den mit dem Zulassungsgesuch urkundlichen Graden, keine weiteren akademischen Grade erworben und zu erwerben gesucht. Würzburg, Lars Hofmann Content SUMMARY ................................................................................................................ IV ZUSAMMENFASSUNG ............................................................................................. V 1. INTRODUCTION ................................................................................................. 1 1.1. Molecular basics of cancer .......................................................................................... 1 1.2. Early research on tumorigenesis ................................................................................. 3 1.3. Developing
    [Show full text]
  • Gene Regulation Underlies Environmental Adaptation in House Mice
    Downloaded from genome.cshlp.org on September 28, 2021 - Published by Cold Spring Harbor Laboratory Press Research Gene regulation underlies environmental adaptation in house mice Katya L. Mack,1 Mallory A. Ballinger,1 Megan Phifer-Rixey,2 and Michael W. Nachman1 1Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, California 94720, USA; 2Department of Biology, Monmouth University, West Long Branch, New Jersey 07764, USA Changes in cis-regulatory regions are thought to play a major role in the genetic basis of adaptation. However, few studies have linked cis-regulatory variation with adaptation in natural populations. Here, using a combination of exome and RNA- seq data, we performed expression quantitative trait locus (eQTL) mapping and allele-specific expression analyses to study the genetic architecture of regulatory variation in wild house mice (Mus musculus domesticus) using individuals from five pop- ulations collected along a latitudinal cline in eastern North America. Mice in this transect showed clinal patterns of variation in several traits, including body mass. Mice were larger in more northern latitudes, in accordance with Bergmann’s rule. We identified 17 genes where cis-eQTLs were clinal outliers and for which expression level was correlated with latitude. Among these clinal outliers, we identified two genes (Adam17 and Bcat2) with cis-eQTLs that were associated with adaptive body mass variation and for which expression is correlated with body mass both within and between populations. Finally, we per- formed a weighted gene co-expression network analysis (WGCNA) to identify expression modules associated with measures of body size variation in these mice.
    [Show full text]
  • Autoantibodies As Potential Biomarkers in Breast Cancer
    biosensors Review Autoantibodies as Potential Biomarkers in Breast Cancer Jingyi Qiu, Bailey Keyser, Zuan-Tao Lin and Tianfu Wu * ID Department of Biomedical Engineering, University of Houston, 3517 Cullen BLVD, SERC 2008, Houston, TX 77204, USA; [email protected] (J.Q.); [email protected] (B.K.); [email protected] (Z.-T.L.) * Correspondence: [email protected]; Tel.: +1-713-743-0142 Received: 14 June 2018; Accepted: 11 July 2018; Published: 13 July 2018 Abstract: Breast cancer is a major cause of mortality in women; however, technologies for early stage screening and diagnosis (e.g., mammography and other imaging technologies) are not optimal for the accurate detection of cancer. This creates demand for a more effective diagnostic means to replace or be complementary to existing technologies for early discovery of breast cancer. Cancer neoantigens could reflect tumorigenesis, but they are hardly detectable at the early stage. Autoantibodies, however, are biologically amplified and hence may be measurable early on, making them promising biomarkers to discriminate breast cancer from healthy tissue accurately. In this review, we summarized the recent findings of breast cancer specific antigens and autoantibodies, which may be useful in early detection, disease stratification, and monitoring of treatment responses of breast cancer. Keywords: autoantibody; breast cancer; early diagnosis; immunotherapy 1. Introduction Breast cancer is the prevailing cancer among women in developing and developed countries [1]. As such, screening and early diagnosis with respect to risk stratification are critical for prevention and early intervention of the disease, leading to better therapeutic outcomes [2,3]. Breast cancer itself is genetically heterogeneous and expresses a variety of aberrant proteins that, until recently, were un-utilizable.
    [Show full text]
  • Functional Interpretation of Gene Lists
    Functional interpretation of gene lists Bing Zhang Department of Biomedical Informatics Vanderbilt University [email protected] Microarray data analysis workflow log2(ratio) 92546_r_at 92545_f_at 96055_at 102105_f_at 102700_at -log10(p value) 161361_s_at Microarray data Differential 92202_g_at expression 103548_at 100947_at 101869_s_at 102727_at 160708_at …... Normalization Clustering Lists of genes with potential biological interest 2 Applied Bioinformatics, Spring 2013 Omics studies generate gene/protein lists n Genomics q Genome Wide Association Study (GWAS) q Next generation sequencing (NGS) n Transcriptomics q mRNA profiling ……….. n Microarrays ……….. ……….. n Serial analysis of gene expression (SAGE) ……….. n RNA-Seq ………. q Protein-DNA interaction ………. n Chromatin immunoprecipitation ………. n Proteomics ………. ……… q Protein profiling n LC-MS/MS ……… ……… q Protein-protein interaction n Yeast two hybrid n Affinity pull-down/LC-MS/MS 3 Applied Bioinformatics, Spring 2013 Microarray experiment comparing metastatic and non- metastatic colon cancer cell lines Parental cell line Affymetrix RNA Mouse430_2 Metastatic cell line Smith et al., Gastroenterology, 138:958-968, 2010 4 Applied Bioinformatics, Spring 2013 Data matrix and differential expression analysis 863 significant probe set IDs (adjp<0.01 and fold-change>2), out of 45,101 probe sets *because the data are log2 based, fold-change>2 means abs(logFC)>1 5 Applied Bioinformatics, Spring 2013 Understanding a gene list n Level I 1451263_a_at 1436486_x_at q What are the genes behind the IDs and 1451780_at what do we know about the function of the 1438237_at 1417023_a_at genes? 1441054_at 1416203_at 1416295_a_at 1435012_x_at 1416069_at 1436485_s_at 1438148_at 1452740_at 1422184_a_at …… 6 Applied Bioinformatics, Spring 2013 One-gene-at-a-time information systems 7 Applied Bioinformatics, Spring 2013 Biomart: a batch information retrieval system n In contrast to the “one-gene-at-a-time” systems, e.g.
    [Show full text]
  • Identification of Subgroups Along the Glycolysis-Cholesterol Synthesis
    Zhang et al. Human Genomics (2021) 15:53 https://doi.org/10.1186/s40246-021-00350-3 PRIMARY RESEARCH Open Access Identification of subgroups along the glycolysis-cholesterol synthesis axis and the development of an associated prognostic risk model Enchong Zhang1†, Yijing Chen2,3†, Shurui Bao4, Xueying Hou3, Jing Hu3, Oscar Yong Nan Mu5, Yongsheng Song1 and Liping Shan1* Abstract Background: Skin cutaneous melanoma (SKCM) is one of the most highly prevalent and complicated malignancies. Glycolysis and cholesterogenesis pathways both play important roles in cancer metabolic adaptations. The main aims of this study are to subtype SKCM based on glycolytic and cholesterogenic genes and to build a clinical outcome predictive algorithm based on the subtypes. Methods: A dataset with 471 SKCM specimens was downloaded from The Cancer Genome Atlas (TCGA) database. We extracted and clustered genes from the Molecular Signatures Database v7.2 and acquired co-expressed glycolytic and cholesterogenic genes. We then subtyped the SKCM samples and validated the efficacy of subtypes with respect to simple nucleotide variations (SNVs), copy number variation (CNV), patients’ survival statuses, tumor microenvironment, and proliferation scores. We also constructed a risk score model based on metabolic subclassification and verified the model using validating datasets. Finally, we explored potential drugs for high-risk SKCM patients. Results: SKCM patients were divided into four subtype groups: glycolytic, cholesterogenic, mixed, and quiescent subgroups. The glycolytic subtype had the worst prognosis and MGAM SNV extent. Compared with the cholesterogenic subgroup, the glycolytic subgroup had higher rates of DDR2 and TPR CNV and higher proliferation scores and MK167 expression levels, but a lower tumor purity proportion.
    [Show full text]
  • The Rs599839 A>G Variant Disentangles
    Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 15 March 2021 doi:10.20944/preprints202103.0400.v1 Article The rs599839 A>G variant disentangles cardiovascular risk and hepatocellular carcinoma in NAFLD patients Meroni Marica PhD1,2, Longo Miriam MSc1,3, Paolini Erika MSc1,4, Alisi Anna PhD5, Miele Luca MD6, De Caro Emilia Rita MSc1, Pisano Giuseppina MD1, Maggioni Marco MD7, Soardo Giorgio MD8, Valenti Luca MD2,9, Fracanzani Anna Ludovica MD1,2, Dongiovanni Paola MSc1. 1 General Medicine and Metabolic Diseases, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, Milano, Italy; 2 Department of Pathophysiology and Transplantation, Università degli Studi di Milano, Milano, Italy; 3 Department of Clinical Sciences and Community Health, Università degli Studi di Milano, Italy; 4 Department of Pharmacological and Biomolecular Sciences, Università degli Studi di Milano, Italy; 5 Research Unit of Molecular Genetics of Complex Phenotypes, Bambino Gesù Children Hospital, IRCCS, Rome, Italy; 6 Area Medicina Interna, Gastroenterologia e Oncologia Medica, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy; 7 Department of Pathology, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, Milano, Italy; 8 Clinic of Internal Medicine-Liver Unit Department of Medical Area (DAME) University School of Medicine, Udine, Italy and Italian Liver Foundation AREA Science Park - Basovizza Campus, Trieste, Italy. 9 Translational Medicine, Department of Transfusion Medicine and Hematology, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico Milano; * Correspondence: [email protected]. Phone: +39-02-5503-3467. Fax +39-02-5503-4229. Abstract: Background and Aims: Dyslipidemia and cardiovascular diseases (CAD) are comorbidi- Citation: PhD, M.M.; MSc, L.M.; ties of nonalcoholic fatty liver disease (NAFLD), which ranges from steatosis to hepatocellular car- MSc, P.E.; PhD, A.A.; MD, M.L.; cinoma (HCC).
    [Show full text]
  • Genome-Wide Enhancer Annotations Differ Significantly in Genomic Distribution, Evolution, and Function
    Benton et al. BMC Genomics (2019) 20:511 https://doi.org/10.1186/s12864-019-5779-x RESEARCH ARTICLE Open Access Genome-wide enhancer annotations differ significantly in genomic distribution, evolution, and function Mary Lauren Benton1, Sai Charan Talipineni2, Dennis Kostka2,3* and John A. Capra1,4* Abstract Background: Non-coding gene regulatory enhancers are essential to transcription in mammalian cells. As a result, a large variety of experimental and computational strategies have been developed to identify cis-regulatory enhancer sequences. Given the differences in the biological signals assayed, some variation in the enhancers identified by different methods is expected; however, the concordance of enhancers identified by different methods has not been comprehensively evaluated. This is critically needed, since in practice, most studies consider enhancers identified by only a single method. Here, we compare enhancer sets from eleven representative strategies in four biological contexts. Results: All sets we evaluated overlap significantly more than expected by chance; however, there is significant dissimilarity in their genomic, evolutionary, and functional characteristics, both at the element and base-pair level, within each context. The disagreement is sufficient to influence interpretation of candidate SNPs from GWAS studies, and to lead to disparate conclusions about enhancer and disease mechanisms. Most regions identified as enhancers are supported by only one method, and we find limited evidence that regions identified by multiple methods are better candidates than those identified by a single method. As a result, we cannot recommend the use of any single enhancer identification strategy in all settings. Conclusions: Our results highlight the inherent complexity of enhancer biology and identify an important challenge to mapping the genetic architecture of complex disease.
    [Show full text]
  • The Rs599839 A>G Variant Disentangles Cardiovascular Risk
    cancers Article The rs599839 A>G Variant Disentangles Cardiovascular Risk and Hepatocellular Carcinoma in NAFLD Patients Marica Meroni 1,2 , Miriam Longo 1,3 , Erika Paolini 1,4, Anna Alisi 5 , Luca Miele 6 , Emilia Rita De Caro 1, Giuseppina Pisano 1 , Marco Maggioni 7, Giorgio Soardo 8 , Luca Vittorio Valenti 2,9 , Anna Ludovica Fracanzani 1,2 and Paola Dongiovanni 1,* 1 General Medicine and Metabolic Diseases, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, 20122 Milano, Italy; [email protected] (M.M.); [email protected] (M.L.); [email protected] (E.P.); [email protected] (E.R.D.C.); [email protected] (G.P.); [email protected] (A.L.F.) 2 Department of Pathophysiology and Transplantation, Università degli Studi di Milano, 20122 Milano, Italy; [email protected] 3 Department of Clinical Sciences and Community Health, Università degli Studi di Milano, 20122 Milano, Italy 4 Department of Pharmacological and Biomolecular Sciences, Università degli Studi di Milano, 20133 Milano, Italy 5 Research Unit of Molecular Genetics of Complex Phenotypes, Bambino Gesù Children Hospital, IRCCS, 00165 Rome, Italy; [email protected] 6 Area Medicina Interna, Gastroenterologia e Oncologia Medica, Fondazione Policlinico Universitario A. Gemelli IRCCS, 00168 Rome, Italy; [email protected] 7 Department of Pathology, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, 20122 Milano, Italy; Citation: Meroni, M.; Longo, M.; [email protected] 8 Paolini, E.; Alisi, A.; Miele, L.; De Clinic of Internal Medicine-Liver Unit Department of Medical Area (DAME), University School of Medicine, Caro, E.R.; Pisano, G.; Maggioni, M.; Udine, Italy and Italian Liver Foundation AREA Science Park—Basovizza Campus, 34149 Trieste, Italy; [email protected] Soardo, G.; Valenti, L.V.; et al.
    [Show full text]