Functional Interpretation of Gene Lists


Bing Zhang
Department of Biomedical Informatics, Vanderbilt University
Applied Bioinformatics, Spring 2013

Microarray data analysis workflow
  - Microarray data
  - Normalization
  - Differential expression (plot of log2(ratio) against -log10(p value)) and clustering
  - Lists of genes with potential biological interest, e.g. probe set IDs 92546_r_at, 92545_f_at, 96055_at, 102105_f_at, 102700_at, 161361_s_at, 92202_g_at, 103548_at, 100947_at, 101869_s_at, 102727_at, 160708_at, ...

Omics studies generate gene/protein lists
  - Genomics
      - Genome Wide Association Study (GWAS)
      - Next generation sequencing (NGS)
  - Transcriptomics
      - mRNA profiling
          - Microarrays
          - Serial analysis of gene expression (SAGE)
          - RNA-Seq
      - Protein-DNA interaction
          - Chromatin immunoprecipitation
  - Proteomics
      - Protein profiling
          - LC-MS/MS
      - Protein-protein interaction
          - Yeast two hybrid
          - Affinity pull-down/LC-MS/MS

Microarray experiment comparing metastatic and non-metastatic colon cancer cell lines
  - Samples: parental cell line and metastatic cell line
  - RNA profiled on the Affymetrix Mouse430_2 array
  - Smith et al., Gastroenterology, 138:958-968, 2010

Data matrix and differential expression analysis
  - 863 significant probe set IDs (adjp<0.01 and fold-change>2), out of 45,101 probe sets
  - Because the data are log2 based, fold-change>2 means abs(logFC)>1

Understanding a gene list
  - Example probe set IDs from the list: 1451263_a_at, 1436486_x_at, 1451780_at, 1438237_at, 1417023_a_at, ...
  - Level I
      - What are the genes behind the IDs and what do we know about the function of the genes?
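The significance filter on the data matrix slide (adjp<0.01 and fold-change>2, i.e. abs(logFC)>1 on log2 data) can be sketched as follows. This is only an illustration: the function name, the row layout, and the example values are hypothetical and are not taken from the study's data.

```python
import math

def is_significant(adj_p, log2_fc, max_adj_p=0.01, min_fold_change=2.0):
    """Return True when a probe set passes both thresholds from the slide."""
    # On the log2 scale, a fold-change above 2 in either direction
    # is the same as abs(logFC) > log2(2) = 1.
    return adj_p < max_adj_p and abs(log2_fc) > math.log2(min_fold_change)

# Hypothetical rows: (probe set ID, adjusted p-value, log2 fold change).
rows = [
    ("probe_A", 0.001, 1.8),   # up more than 2-fold, small adjusted p
    ("probe_B", 0.004, -1.3),  # down more than 2-fold, small adjusted p
    ("probe_C", 0.200, 2.5),   # large change, but adjusted p too high
    ("probe_D", 0.002, 0.6),   # small adjusted p, but change below 2-fold
]

significant = [pid for pid, adj_p, lfc in rows if is_significant(adj_p, lfc)]
print(significant)
```

Taking the absolute value keeps both up- and down-regulated probe sets, which is why the slide states the threshold as abs(logFC)>1 rather than logFC>1.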
Further example probe set IDs shown on the Level I slide: 1441054_at, 1416203_at, 1416295_a_at, 1435012_x_at, 1416069_at, 1436485_s_at, 1438148_at, 1452740_at, 1422184_a_at, ...

One-gene-at-a-time information systems

Biomart: a batch information retrieval system
  - In contrast to the "one-gene-at-a-time" systems, e.g. Entrez Gene
  - Originally developed for the Ensembl genome databases (http://www.ensembl.org)
  - Adopted by other projects including UniProt, InterPro, Reactome, Pancreatic Expression Database, and many others (see a complete list and get access to the tools from http://www.biomart.org/)

Biomart analysis
  - Choose dataset
      - Choose database: Ensembl Genes 69
      - Choose dataset: Mus musculus genes (NCBIM37)
  - Set filters
      - Gene: a list of genes identified by various database IDs (e.g. Affy probe set IDs)
      - Gene Ontology: filter for genes with specific GO terms (e.g. cell cycle)
      - Protein domains: filter for genes with specific protein domains (e.g. SH2 domain, signal domains)
      - Region: filter for genes in a specific chromosome region (e.g. chr1 1:1000000 or 11q13)
      - Others
  - Select output attributes
      - Gene annotation information in the Ensembl database, e.g. gene description, chromosome name, gene start, gene end, strand, band, gene name, etc.
      - External data: Gene Ontology, IDs in other databases
      - Expression: anatomical system, development stage, cell type, pathology
      - Protein domains: SMART, PFAM, Interpro, etc.

Biomart: sample output

Understanding a gene list
  - Level I
      - What are the genes behind the IDs and what do we know about the function of the genes?
  - Level II
      - Which biological processes and pathways are the most interesting in terms of the experimental question?
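The three Biomart steps above (choose dataset, set filters, select output attributes) map onto the XML query document that BioMart services accept. The sketch below only builds that document and does not send it anywhere. The dataset name `mmusculus_gene_ensembl`, the filter name `affy_mouse430_2`, and the attribute names are assumptions about the Ensembl mart and may differ between releases, so they should be checked against the mart's own listing before use.

```python
import xml.etree.ElementTree as ET

def build_biomart_query(dataset, filter_name, ids, attributes):
    """Build a BioMart XML query string: one dataset, one ID filter, several attributes."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",   # tab-separated output
        "header": "1",        # include a header row
        "uniqueRows": "1",    # collapse duplicate rows
    })
    # Step 1: choose dataset.
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    # Step 2: set filters (here a comma-separated list of probe set IDs).
    ET.SubElement(ds, "Filter", {"name": filter_name, "value": ",".join(ids)})
    # Step 3: select output attributes.
    for attr in attributes:
        ET.SubElement(ds, "Attribute", {"name": attr})
    return ET.tostring(query, encoding="unicode")

# Names below are assumed, not confirmed by the slides.
xml_query = build_biomart_query(
    dataset="mmusculus_gene_ensembl",
    filter_name="affy_mouse430_2",
    ids=["1451263_a_at", "1436486_x_at"],
    attributes=["ensembl_gene_id", "external_gene_name", "description"],
)
print(xml_query)
```

Keeping query construction separate from submission makes the batch request easy to inspect before it is pasted into the web interface or posted to a mart service.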
Functional group enrichment analysis
  [Figure: the differentially expressed genes (581 genes) compared with the functional group System development (1842 genes). Observed overlap: 98 genes. Expected overlap: 65 genes. Gene symbols shown include Hoxa5, Hoxa11, Sash1, Ltbp3, Cd24a, Agt, Sox4, Foxc1, Psrc1, Ctla2b, Edn1, Ror2, Angptl4, Gnag, Depdc7, Sorbs1, Smad3, Wdr5, Macrod1, Enpp2, Trp63, Sox9, Tmem176a, Pax1, Acd, Rai1, Pitx1, ...]
  - Is the observed overlap significantly larger than the expected value?

Enrichment analysis: hypergeometric test

                       Significant genes   Non-significant genes   Total
  Genes in the group   k                   j-k                     j
  Other genes          n-k                 m-n-j+k                 m-j
  Total                n                   m-n                     m

  Hypergeometric test: given a total of m genes where j genes are in the functional group, if we pick n genes randomly, what is the probability of having k or more genes from the group?

  Observed overlap: $k$. Expected overlap: $nj/m$.

  $$p = \sum_{i=k}^{\min(n,j)} \frac{\binom{m-j}{n-i}\binom{j}{i}}{\binom{m}{n}}$$

  Zhang et al., Nucleic Acids Res. 33:W741, 2005

Commonly used functional groups
  - Gene Ontology (http://www.geneontology.org)
      - Structured, precisely defined, controlled vocabulary for describing the roles of genes and gene products
      - Three organizing principles: molecular function, biological process, and cellular component
  - Pathways
      - KEGG (http://www.genome.jp/kegg/pathway.html)
      - Pathway Commons (http://www.pathwaycommons.org)
      - WikiPathways (http://www.wikipathways.org)
  - Cytogenetic bands
  - Targets of transcription factors/miRNAs

WebGestalt: Web-based Gene Set Analysis Toolkit
  - 8 organisms: Human, Mouse, Rat, Dog, Fruitfly, Worm, Zebrafish, Yeast
  - Microarray Probe IDs: Affymetrix, Agilent, Codelink, Illumina
  - Gene IDs: Gene Symbol, GenBank, Ensembl Gene, RefSeq Gene, UniGene, Entrez Gene, SGD, MGI, Flybase ID, Wormbase ID, ZFIN
  - Protein IDs: UniProt, IPI, RefSeq Peptide, Ensembl Peptide
  - Genetic Variation IDs: dbSNP
  - 196 ID types with mapping to Entrez Gene ID
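The hypergeometric test on the enrichment analysis slide can be written out directly from its definition. The sketch below uses the slide's symbols (m total genes, j genes in the functional group, n genes picked, k observed in the group). The function names and the toy numbers are hypothetical; the slides do not state the reference size m for the 581-gene example, so that case is not reproduced here.

```python
from math import comb

def enrichment_p_value(m, j, n, k):
    """P(overlap >= k) when n genes are drawn at random from m genes, j of them in the group."""
    upper = min(n, j)
    # Sum the hypergeometric probabilities for every overlap size i from k up to min(n, j).
    tail = sum(comb(m - j, n - i) * comb(j, i) for i in range(k, upper + 1))
    return tail / comb(m, n)

def expected_overlap(m, j, n):
    """Mean overlap under random sampling: n * j / m."""
    return n * j / m

# Toy example: 10 genes in total, 4 in the group, 3 picked, 2 observed in the group.
p = enrichment_p_value(m=10, j=4, n=3, k=2)   # (36 + 4) / 120
e = expected_overlap(m=10, j=4, n=3)          # 3 * 4 / 10
print(p, e)
```

Because the sum starts at k, the result is a one-sided p-value for over-representation, matching the question of whether the observed overlap is larger than expected.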
WebGestalt: http://bioinfo.vanderbilt.edu/webgestalt (Zhang et al., Nucleic Acids Res. 33:W741, 2005)
  - 59,278 functional categories with genes identified by Entrez Gene IDs
      - Gene Ontology: Biological Process, Molecular Function, Cellular Component
      - Pathway: KEGG, Pathway Commons, WikiPathways
      - Network module: Transcription factor targets, microRNA targets, Protein interaction modules
      - Disease and Drug: Disease association genes, Drug association genes
      - Chromosomal location: Cytogenetic bands

WebGestalt: ID mapping
  - Input list
      - 863 significant probe sets identified in the microarray study
  - Mapping result
      - Total number of User IDs: 863. Unambiguously mapped User IDs to Entrez IDs: 734. Unique User Entrez IDs: 581. The Enrichment Analysis will be based upon the unique IDs.

WebGestalt: top 10 enriched GO biological processes

WebGestalt: top 10 enriched KEGG pathways

Understanding a gene list
  - Level I
      - What are the genes behind the IDs and what do we know about the function of the genes?
  - Level II
      - Which biological processes and pathways are the most interesting in terms of the experimental question?
  - Level III
      - How do the gene products work together to form a functional network?
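The ID mapping result reported above (863 user IDs, 734 unambiguously mapped, 581 unique Entrez IDs) reflects two reductions: probe sets without a single clear gene are set aside, and several probe sets can point to the same gene. The sketch below shows one plausible reading of that logic. The slides do not spell out WebGestalt's exact rules, and the mapping table and function name here are hypothetical.

```python
def map_probes_to_genes(probe_ids, probe_to_entrez):
    """Keep probes that map to exactly one Entrez Gene ID, then deduplicate the genes."""
    unambiguous = {}
    for pid in probe_ids:
        genes = probe_to_entrez.get(pid, set())
        if len(genes) == 1:          # skip unmapped probes and probes hitting several genes
            unambiguous[pid] = next(iter(genes))
    unique_genes = sorted(set(unambiguous.values()))
    return unambiguous, unique_genes

# Hypothetical mapping table: probe ID -> set of Entrez Gene IDs.
probe_to_entrez = {
    "p1": {101},
    "p2": {101},        # second probe set for the same gene
    "p3": {202, 303},   # ambiguous: maps to two genes
    "p5": {404},
}
user_ids = ["p1", "p2", "p3", "p4", "p5"]   # p4 has no mapping at all

mapped, genes = map_probes_to_genes(user_ids, probe_to_entrez)
print(len(user_ids), len(mapped), len(genes))
```

In this toy input, five user IDs shrink to three unambiguously mapped probes and two unique genes, the same kind of reduction as 863 to 734 to 581.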
Biological networks

  Network                               Nodes                       Edges                               Class
  Protein-protein interaction network   Proteins                    Physical interaction, undirected    Physical interaction networks
  Signaling network                     Proteins                    Modification, directed              Physical interaction networks
  Gene regulatory network               TFs/miRNAs, target genes    Physical interaction, directed      Physical interaction networks
  Metabolic network                     Metabolites                 Metabolic reaction, directed        Physical interaction networks
  Co-expression network                 Genes/proteins              Co-expression, undirected           Functional association networks
  Genetic network                       Genes                       Genetic interaction, undirected     Functional association networks

Properties of complex networks
  - Scale-free
  - Modular
  - Hierarchical

WebGestalt protein interaction module analysis

STRING (http://string-db.org)
  - A database of known and predicted protein interactions, including both direct (physical) and indirect (functional) associations.
  - Quantitatively integrates interaction data from different sources for a large number of organisms, and transfers information between these organisms where applicable.
  - Covers 5,214,234 proteins from 1,133 organisms.

Understanding a gene list: summary
  - Level I
      - What are the genes behind the IDs and what do we know about the function of the genes?
      - Biomart (http://www.biomart.org/)
  - Level II
      - Which biological processes and pathways are the most interesting in terms of the experimental question?
      - WebGestalt (http://bioinfo.vanderbilt.edu/webgestalt)
      - Related tools: DAVID (http://david.abcc.ncifcrf.gov/), GenMAPP (http://www.genmapp.org/), GSEA (http://www.broadinstitute.org/gsea)
  - Level III
      - How do the gene products work together to form a functional network?
      - STRING (http://string-db.org)
      - Related tools: Cytoscape (http://www.cytoscape.org/), GeneMANIA (http://www.genemania.org), Ingenuity (http://www.ingenuity.com/), Pathway Studio (http://www.ariadnegenomics.com/products/pathway-studio/)
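For the Level III question, an undirected network such as a protein-protein interaction network can be held as a plain edge list, and node degree gives a first look at its structure. In a scale-free network the degree distribution is heavy-tailed, so a few hub nodes carry many of the connections. The sketch below is a toy illustration with hypothetical node names; it is not how STRING or WebGestalt store or analyze networks.

```python
from collections import Counter

def degrees(edges):
    """Count the number of interaction partners of each node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1   # an undirected edge contributes to both endpoints
        deg[v] += 1
    return deg

# Hypothetical undirected protein-protein interactions.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

deg = degrees(edges)
hub, hub_degree = deg.most_common(1)[0]    # node with the most partners
degree_distribution = Counter(deg.values())  # degree -> number of nodes with that degree
print(hub, hub_degree, dict(degree_distribution))
```

For the directed network types in the table (signaling, gene regulatory, metabolic), in-degree and out-degree would be counted separately rather than incrementing both endpoints.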