Download.Html)[80]

Total Page:16

File Type:pdf, Size:1020Kb

Download.Html)[80] 2 Epigenetic determinants of cellular differentiation, transcriptional reprogramming, and human disease by Khoi Thien Nguyen Submitted to the Department of Biological Engineering on April 3, 2020, in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Biological Engineering Abstract Much of the diversity we observe in cellular and organismal phenotypes can be at- tributed to epigenetic and genetic variation. DNA provides the instructions for life, while epigenetic modifications regulate which parts of the genetic information con- tained in DNA can be read out in a given cell and how this information is interpreted. In recent years, epigenetic and genetic variation has been profiled on a large scale with sequencing-based assays, generating many datasets to be explored. In this thesis, I present three projects which apply computational techniques to identify and char- acterize epigenetic mechanisms that may contribute to the regulation of phenotypic variance. First, we mine a dataset charactering the epigenomes of diverse cell types in order to discover signatures of adult stem cell differentiation. We identify a novel marker of the multipotent state, a chromatin state characterized by the histone marks H3K36me3 and H3K9me3, and describe biological processes that may be linked to the loss of this chromatin state in fully differentiated cell types. Next, I present what we learned from profiling the epigenetic state of cells before and after transplantation into Xenopus oocytes, a process that transcriptionally reprograms the cells. This analysis elucidates how the initial epigenetic state of a cell influences the success of cellular reprogramming and identifies transcription factors that help regulate this pro- cess. Finally, we integrate studies measuring the effects of genetic variants on disease with studies measuring the effects of genetic variants on transcriptional and epigenetic activity. This identifies specific mechanisms underlying disease processes, and demon- strates that transcriptional and epigenetic mechanisms may independently contribute to disease pathogenesis. Together, these projects demonstrate the biological insights that can be gained from epigenetic profiling, and expand our understanding of the potential effects of epigenetic modifications. Thesis Supervisor: Manolis Kellis Title: Professor 3 4 Acknowledgments I am grateful to many people for their contributions to my thesis work and my expe- rience as a PhD student. First, I would like to thank my research advisor, Manolis Kellis, for his support and patience, and for giving me the opportunity to be a member of the MIT Com- putational Biology group. The breadth of people and ideas I’ve encountered here has helped me grow immensely as a scientist. I would also like to thank Doug Lauf- fenburger, Ernest Fraenkel, and Mary Gehring, for serving on my thesis committee. Your support, mentorship and advice over the years is appreciated. Next, I would like to thank my collaborators, who have been integral to my achievements. Even if projects didn’t always go the way we envisioned, I always learned a lot. I thank Pouya Kheradpour, Andreas Pfenning, Melina Claussnitzer, Je-Yoel Cho, Kei Miyamoto, Dinesh Kumar, Yongjin Park, Hailiang Huang, Lei Hou, and Carles Boix for working with me over the years and teaching me so much. By extension, I would also like to thank everyone else who has been a member of the Kellis lab, including but not definitely not limited to Alvin, Bob, Kunal, Angela, Xinchen, Abhishek, Max, Yaping, Laura, Xushen, Patty, Jackie, Na, Suvi, Leandro, Remy, Eloi, Maria, Irwin, Peter, Nate, and Nicola. You have all helped me along in my scientific journey in one way or another and have spent many hours laughing and commiserating with me. Many other people have also been a source of knowledge and comraderie, and have helped me get me through the inevitable rough patches in my PhD. Thank you to the students, faculty, and staff of the Biological Engineering department, the Muddy Charles, and all of my friends. Special thanks go to the BE class of 2013, the humans (and animals!) I’ve had the pleasure to live with, the inspirational women in my life, and those I can count on when I need to get my mind off of science. Adrienne, Deena, Natasha, Lynna, Claire, Andee, Kelly, Ashvin, Ben, Shawn, Santi, Griffin, Isaak, Naveen: thank you. Finally, I would like to thank my parents, Thoa and Dung, and my siblings, Danh 5 and Qui. They have been beside me my entire life and I am forever indebted to them for the sacrifices they’ve made to give me the opportunity to be whereIam today. Thank you for teaching me that material (and even academic) success is not the measure of a life well lived. 6 Contents 1 Introduction 17 1.1 Motivation . 17 2 Chromatin state signatures of cellular differentiation 21 2.1 Introduction . 21 2.2 Materials and methods . 23 2.2.1 Epigenome and gene expression data . 23 2.2.2 Model of differentiation effect . 23 2.2.3 Transitions . 24 2.2.4 Gene state classification . 24 2.2.5 Biological pathways . 24 2.3 Results and discussion . 25 2.3.1 The loss of the "ZNF/Rpts" chromatin state is a signature of late-stage cellular differentiation . 25 2.3.2 Chromatin state dynamics of "ZNF/Rpts"-associated regions and genes . 29 2.3.3 Changes in gene expression linked to loss of "ZNF/Rpts" . 35 2.3.4 Biological processes linked to genes that lose "ZNF/Rpts" . 37 2.4 Conclusion . 39 3 Chromatin accessibility impacts transcriptional reprogramming in oocytes 41 3.1 Introduction . 42 7 3.2 Methods . 44 3.2.1 Experimental methods . 44 3.2.2 Statistical analysis . 44 3.2.3 ATAC-seq data processing and peak calling . 44 3.2.4 Genomic distribution of peaks . 45 3.2.5 RNA-seq data . 45 3.2.6 ATAC-seq read coverage . 45 3.2.7 Correlation between samples . 46 3.2.8 Pathway analysis . 46 3.2.9 Motif analysis . 46 3.3 Results . 47 3.3.1 Optimization and Evaluation of ATAC-seq for Analyzing Cells Transplanted into Oocytes . 47 3.3.2 ATAC-seq reveals changes in chromatin accessibility after NT of somatic cells to Xenopus oocytes . 50 3.3.3 Open promoters of silent genes are permissive for transcrip- tional activation in oocytes . 52 3.3.4 Open chromatin is often maintained near genes down-regulated after NT . 53 3.3.5 Oocyte-Mediated Reprogramming Involves Opening of Closed Chromatin . 58 3.3.6 Transcription Factors in Oocytes Are Involved in Transcrip- tional Reprogramming . 58 3.3.7 Closed Chromatin Is Associated with Reprogramming-Resistant Genes . 61 3.3.8 Chromatin Opening Facilitates Transcriptional Reprogramming 61 3.4 Conclusions . 63 4 Multi-omic mediation analysis across multiple disease traits and cell types 67 8 4.1 Introduction . 67 4.2 Materials and Methods . 69 4.2.1 Datasets and processing . 69 4.2.2 Mediation analysis . 70 4.2.3 Functional analysis . 73 4.3 Results and Discussion . 73 4.3.1 Discovering molecular mediators of disease . 73 4.3.2 Every QTL type contributes to heritability explained . 80 4.3.3 Genes are linked to disease through varied mechanisms . 86 4.3.4 Functional dissection of GWAS loci . 91 4.4 Conclusion . 95 5 Conclusion 97 5.1 Summary of contributions . 97 5.2 Limitations . 98 5.2.1 Chromatin state signatures of differentiation . 99 5.2.2 Impact of chromatin accessibility on transcriptional reprogram- ming . 100 5.2.3 Multi-omic mediation analysis . 100 5.3 Looking to the future . 101 A Supplementary information related to Chapter 2 103 B Supplementary information related to Chapter 4 105 9 10 List of Figures 2-1 Strategy to identify epigenomic signatures of cellular differentiation . 25 2-2 Effect of differentiation stage on chromatin state prevalence. .26 2-3 Abundance of the "ZNF/Rpts" in cell types, grouped by differentiation stage and cell lineage . 26 2-4 Effect of differentiation stage on other chromatin features. .27 2-5 Annotation enrichment of chromatin states from the 15-state ChromHMM model . 28 2-6 Differentiation-associated change in chromatin state proportion at "ZNF/Rpts" regions . 30 2-7 Genes clustered by chromatin state composition . 31 2-8 Relative frequency distribution of genes across chromatin state clusters 32 2-9 Chromatin state composition of repressed and active genes with "ZNF/Rpts" 33 2-10 Cluster membership of genes that lose "ZNF/Rpts" during differentiation 33 2-11 Track image of a region that switches from the "ZNF/Rpts" to the "EnhG" chromatin state . 34 2-12 Gene expression changes associated with "ZNF/Rpts" loss . 36 2-13 Expression of genes in active or repressed epigenetic configuration, with or without "ZNF/Rpts" . 37 2-14 Gene sets enriched in genes that lose "ZNF/Rpts" during differentiation 38 2-15 Overlap between genes that lose "ZNF/Rpts" during differentiation in each cell lineage . 39 3-1 ATAC-seq enables the identification of open chromatin regions . 48 11 3-2 Correlation between mapped ATAC-seq reads for samples with varying numbers of cells . 49 3-3 Chromatin accessibility changes in oocyte reprogramming . 51 3-4 Track image of MEF ATAC-seq compared to reference datasets . 52 3-5 ATAC-seq reads in MEFs before and after NT were compared around TSSs in groups of genes . 54 3-6 ATAC-seq, DNase-seq, H3K4me3 ChIP-seq, and Pol II ChIP-seq at TSSs of reprogramming genes . 55 3-7 ATAC-seq reads for MEFs before and after NT around Eif4e2 and Ift46 56 3-8 Transcription factor analysis of genes downregulated after NT . 57 3-9 Track image of newly produced open chromatin during reprogramming 59 3-10 Gene categories associated with newly opened chromatin peaks after NT 59 3-11 Sequence motifs enriched in open chromatin regions after NT .
Recommended publications
  • A Method to Infer Changed Activity of Metabolic Function from Transcript Profiles
    ModeScore: A Method to Infer Changed Activity of Metabolic Function from Transcript Profiles Andreas Hoppe and Hermann-Georg Holzhütter Charité University Medicine Berlin, Institute for Biochemistry, Computational Systems Biochemistry Group [email protected] Abstract Genome-wide transcript profiles are often the only available quantitative data for a particular perturbation of a cellular system and their interpretation with respect to the metabolism is a major challenge in systems biology, especially beyond on/off distinction of genes. We present a method that predicts activity changes of metabolic functions by scoring reference flux distributions based on relative transcript profiles, providing a ranked list of most regulated functions. Then, for each metabolic function, the involved genes are ranked upon how much they represent a specific regulation pattern. Compared with the naïve pathway-based approach, the reference modes can be chosen freely, and they represent full metabolic functions, thus, directly provide testable hypotheses for the metabolic study. In conclusion, the novel method provides promising functions for subsequent experimental elucidation together with outstanding associated genes, solely based on transcript profiles. 1998 ACM Subject Classification J.3 Life and Medical Sciences Keywords and phrases Metabolic network, expression profile, metabolic function Digital Object Identifier 10.4230/OASIcs.GCB.2012.1 1 Background The comprehensive study of the cell’s metabolism would include measuring metabolite concentrations, reaction fluxes, and enzyme activities on a large scale. Measuring fluxes is the most difficult part in this, for a recent assessment of techniques, see [31]. Although mass spectrometry allows to assess metabolite concentrations in a more comprehensive way, the larger the set of potential metabolites, the more difficult [8].
    [Show full text]
  • The Kyoto Encyclopedia of Genes and Genomes (KEGG)
    Kyoto Encyclopedia of Genes and Genome Minoru Kanehisa Institute for Chemical Research, Kyoto University HFSPO Workshop, Strasbourg, November 18, 2016 The KEGG Databases Category Database Content PATHWAY KEGG pathway maps Systems information BRITE BRITE functional hierarchies MODULE KEGG modules KO (KEGG ORTHOLOGY) KO groups for functional orthologs Genomic information GENOME KEGG organisms, viruses and addendum GENES / SSDB Genes and proteins / sequence similarity COMPOUND Chemical compounds GLYCAN Glycans Chemical information REACTION / RCLASS Reactions / reaction classes ENZYME Enzyme nomenclature DISEASE Human diseases DRUG / DGROUP Drugs / drug groups Health information ENVIRON Health-related substances (KEGG MEDICUS) JAPIC Japanese drug labels DailyMed FDA drug labels 12 manually curated original DBs 3 DBs taken from outside sources and given original annotations (GENOME, GENES, ENZYME) 1 computationally generated DB (SSDB) 2 outside DBs (JAPIC, DailyMed) KEGG is widely used for functional interpretation and practical application of genome sequences and other high-throughput data KO PATHWAY GENOME BRITE DISEASE GENES MODULE DRUG Genome Molecular High-level Practical Metagenome functions functions applications Transcriptome etc. Metabolome Glycome etc. COMPOUND GLYCAN REACTION Funding Annual budget Period Funding source (USD) 1995-2010 Supported by 10+ grants from Ministry of Education, >2 M Japan Society for Promotion of Science (JSPS) and Japan Science and Technology Agency (JST) 2011-2013 Supported by National Bioscience Database Center 0.8 M (NBDC) of JST 2014-2016 Supported by NBDC 0.5 M 2017- ? 1995 KEGG website made freely available 1997 KEGG FTP site made freely available 2011 Plea to support KEGG KEGG FTP academic subscription introduced 1998 First commercial licensing Contingency Plan 1999 Pathway Solutions Inc.
    [Show full text]
  • Distinctive Regulatory Architectures of Germline-Active and Somatic Genes in C
    Downloaded from genome.cshlp.org on October 7, 2021 - Published by Cold Spring Harbor Laboratory Press Research Distinctive regulatory architectures of germline-active and somatic genes in C. elegans Jacques Serizay, Yan Dong, Jürgen Jänes, Michael Chesney, Chiara Cerrato, and Julie Ahringer The Gurdon Institute and Department of Genetics, University of Cambridge, CB2 1QN Cambridge, United Kingdom RNA profiling has provided increasingly detailed knowledge of gene expression patterns, yet the different regulatory ar- chitectures that drive them are not well understood. To address this, we profiled and compared transcriptional and regu- latory element activities across five tissues of Caenorhabditis elegans, covering ∼90% of cells. We find that the majority of promoters and enhancers have tissue-specific accessibility, and we discover regulatory grammars associated with ubiquitous, germline, and somatic tissue–specific gene expression patterns. In addition, we find that germline-active and soma-specific promoters have distinct features. Germline-active promoters have well-positioned +1 and −1 nucleosomes associated with a periodic 10-bp WW signal (W = A/T). Somatic tissue–specific promoters lack positioned nucleosomes and this signal, have wide nucleosome-depleted regions, and are more enriched for core promoter elements, which largely differ between tissues. We observe the 10-bp periodic WW signal at ubiquitous promoters in other animals, suggesting it is an ancient conserved signal. Our results show fundamental differences in regulatory architectures of germline and somatic tissue–specific genes, uncover regulatory rules for generating diverse gene expression patterns, and provide a tissue-specific resource for future studies. [Supplemental material is available for this article.] Cell type–specific transcription regulation underlies production tissues are achieved and whether expression is governed by dis- of the myriad of different cells generated during development.
    [Show full text]
  • Jawaharlal Nehru University, New Delhi
    Jawaharlal Nehru University, New Delhi Evaluative Report of the Department (science) Name of the School/Spl. Center Pages 1 School of Life Sciences 1-45 2. School of Biotechnology 46-100 3 School of Computer and systems Sciences 101-120 4 School of Computational and Integrative Sciences 121-141 5. School of Physical Sciences 142-160 6 School of Environmental Sciences 161-197 7 Spl. Center for Molecular Medicine 198-256 8 Spl Center for Nano Sciences 257-268 Evaluation Report of School of Life Sciences In the past century, biology, with inputs from other disciplines, has made tremendous progress in terms of advancement of knowledge, development of technology and its applications. As a consequence, in the past fifty years, there has been a paradigm shift in our interpreting the life process. In the process, modern biology had acquired a truly interdisciplinary character in which all streams of sciences have made monumental contributions. Because of such rapid emergence as a premier subject of teaching and research; a necessity to restructure classical teachings in biology was recognised by the academics worldwide. In tune with such trends, the academic leadership of Jawaharlal Nehru University conceptualised the School of Life Sciences as an interdisciplinary research/teaching programme unifying various facets of biology while reflecting essential commonality regarding structure, function and evolution of biomolecules. The School was established in 1973 and since offering integrated teaching and research at M. Sc/ Ph.D level in various sub-disciplines in life sciences. Since inception, it remained dedicated to its core objectives and evolved to be one of the top such institutions in India and perhaps in South East Asia.
    [Show full text]
  • High-Throughput, Pooled Sequencing Identifies Mutations in NUBPL And
    ARTICLES High-throughput, pooled sequencing identifies mutations in NUBPL and FOXRED1 in human complex I deficiency Sarah E Calvo1–3,10, Elena J Tucker4,5,10, Alison G Compton4,10, Denise M Kirby4, Gabriel Crawford3, Noel P Burtt3, Manuel Rivas1,3, Candace Guiducci3, Damien L Bruno4, Olga A Goldberger1,2, Michelle C Redman3, Esko Wiltshire6,7, Callum J Wilson8, David Altshuler1,3,9, Stacey B Gabriel3, Mark J Daly1,3, David R Thorburn4,5 & Vamsi K Mootha1–3 Discovering the molecular basis of mitochondrial respiratory chain disease is challenging given the large number of both mitochondrial and nuclear genes that are involved. We report a strategy of focused candidate gene prediction, high-throughput sequencing and experimental validation to uncover the molecular basis of mitochondrial complex I disorders. We created seven pools of DNA from a cohort of 103 cases and 42 healthy controls and then performed deep sequencing of 103 candidate genes to identify 151 rare variants that were predicted to affect protein function. We established genetic diagnoses in 13 of 60 previously unsolved cases using confirmatory experiments, including cDNA complementation to show that mutations in NUBPL and FOXRED1 can cause complex I deficiency. Our study illustrates how large-scale sequencing, coupled with functional prediction and experimental validation, can be used to identify causal mutations in individual cases. Complex I of the mitochondrial respiratory chain is a large ~1-MDa ­assembly factors are probably required, as suggested by the 20 factors macromolecular machine composed of 45 protein subunits encoded necessary for assembly of the smaller complex IV9 and by cohort by both the nuclear and mitochondrial (mtDNA) genomes.
    [Show full text]
  • Semantic Web
    SEMANTIC WEB: REVOLUTIONIZING KNOWLEDGE DISCOVERY IN THE LIFE SCIENCES SEMANTIC WEB: REVOLUTIONIZING KNOWLEDGE DISCOVERY IN THE LIFE SCIENCES Edited by Christopher J. O. Baker1 and Kei-Hoi Cheung2 1Knowledge Discovery Department, Institute for Infocomm Research, Singapore, Singapore; 2Center for Medical Informatics, Yale University School of Medicine, New Haven, CT, USA Kluwer Academic Publishers Boston/Dordrecht/London Contents PART I: Database and Literature Integration Semantic web approach to database integration in the life sciences KEI-HOI CHEUNG, ANDREW K. SMITH, KEVIN Y. L. YIP, CHRISTOPHER J. O. BAKER AND MARK B. GERSTEIN Querying Semantic Web Contents: A case study LOIC ROYER, BENEDIKT LINSE, THOMAS WÄCHTER, TIM FURCH, FRANCOIS BRY, AND MICHAEL SCHROEDER Knowledge Acquisition from the Biomedical Literature LYNETTE HIRSCHMAN, WILLIAM HAYES AND ALFONSO VALENCIA PART II: Ontologies in the Life Sciences Biological Ontologies PATRICK LAMBRIX, HE TAN, VAIDA JAKONIENE, AND LENA STRÖMBÄCK Clinical Ontologies YVES LUSSIER AND OLIVIER BODENREIDER Ontology Engineering For Biological Applications vi Revolutionizing knowledge discovery in the life sciences LARISA N. SOLDATOVA AND ROSS D. KING The Evaluation of Ontologies: Toward Improved Semantic Interoperability LEO OBRST, WERNER CEUSTERS, INDERJEET MANI, STEVE RAY AND BARRY SMITH OWL for the Novice JEFF PAN PART III: Ontology Visualization Techniques for Ontology Visualization XIAOSHU WANG AND JONAS ALMEIDA On Vizualization of OWL Ontologies SERGUEI KRIVOV, FERDINANDO VILLA, RICHARD WILLIAMS, AND XINDONG WU PART IV: Ontologies in Action Applying OWL Reasoning to Genomics: A Case Study KATY WOLSTENCROFT, ROBERT STEVENS AND VOLKER HAARSLEV Can Semantic Web Technologies enable Translational Medicine? VIPUL KASHYAP, TONYA HONGSERMEIER AND SAMUEL J. ARONSON Ontology Design for Biomedical Text Mining RENÉ WITTE, THOMAS KAPPLER, AND CHRISTOPHER J.
    [Show full text]
  • Differential Expression Profile Prioritization of Positional Candidate Glaucoma Genes the GLC1C Locus
    LABORATORY SCIENCES Differential Expression Profile Prioritization of Positional Candidate Glaucoma Genes The GLC1C Locus Frank W. Rozsa, PhD; Kathleen M. Scott, BS; Hemant Pawar, PhD; John R. Samples, MD; Mary K. Wirtz, PhD; Julia E. Richards, PhD Objectives: To develop and apply a model for priori- est because of moderate expression and changes in tization of candidate glaucoma genes. expression. Transcription factor ZBTB38 emerges as an interesting candidate gene because of the overall expres- Methods: This Affymetrix GeneChip (Affymetrix, Santa sion level, differential expression, and function. Clara, Calif) study of gene expression in primary cul- ture human trabecular meshwork cells uses a positional Conclusions: Only1geneintheGLC1C interval fits our differential expression profile model for prioritization of model for differential expression under multiple glau- candidate genes within the GLC1C genetic inclusion in- coma risk conditions. The use of multiple prioritization terval. models resulted in filtering 7 candidate genes of higher interest out of the 41 known genes in the region. Results: Sixteen genes were expressed under all condi- tions within the GLC1C interval. TMEM22 was the only Clinical Relevance: This study identified a small sub- gene within the interval with differential expression in set of genes that are most likely to harbor mutations that the same direction under both conditions tested. Two cause glaucoma linked to GLC1C. genes, ATP1B3 and COPB2, are of interest in the con- text of a protein-misfolding model for candidate selec- tion. SLC25A36, PCCB, and FNDC6 are of lesser inter- Arch Ophthalmol. 2007;125:117-127 IGH PREVALENCE AND PO- identification of additional GLC1C fami- tential for severe out- lies7,18-20 who provide optimal samples for come combine to make screening candidate genes for muta- adult-onset primary tions.7,18,20 The existence of 2 distinct open-angle glaucoma GLC1C haplotypes suggests that muta- (POAG) a significant public health prob- tions will not be limited to rare descen- H1 lem.
    [Show full text]
  • A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus
    Page 1 of 781 Diabetes A Computational Approach for Defining a Signature of β-Cell Golgi Stress in Diabetes Mellitus Robert N. Bone1,6,7, Olufunmilola Oyebamiji2, Sayali Talware2, Sharmila Selvaraj2, Preethi Krishnan3,6, Farooq Syed1,6,7, Huanmei Wu2, Carmella Evans-Molina 1,3,4,5,6,7,8* Departments of 1Pediatrics, 3Medicine, 4Anatomy, Cell Biology & Physiology, 5Biochemistry & Molecular Biology, the 6Center for Diabetes & Metabolic Diseases, and the 7Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202; 2Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202; 8Roudebush VA Medical Center, Indianapolis, IN 46202. *Corresponding Author(s): Carmella Evans-Molina, MD, PhD ([email protected]) Indiana University School of Medicine, 635 Barnhill Drive, MS 2031A, Indianapolis, IN 46202, Telephone: (317) 274-4145, Fax (317) 274-4107 Running Title: Golgi Stress Response in Diabetes Word Count: 4358 Number of Figures: 6 Keywords: Golgi apparatus stress, Islets, β cell, Type 1 diabetes, Type 2 diabetes 1 Diabetes Publish Ahead of Print, published online August 20, 2020 Diabetes Page 2 of 781 ABSTRACT The Golgi apparatus (GA) is an important site of insulin processing and granule maturation, but whether GA organelle dysfunction and GA stress are present in the diabetic β-cell has not been tested. We utilized an informatics-based approach to develop a transcriptional signature of β-cell GA stress using existing RNA sequencing and microarray datasets generated using human islets from donors with diabetes and islets where type 1(T1D) and type 2 diabetes (T2D) had been modeled ex vivo. To narrow our results to GA-specific genes, we applied a filter set of 1,030 genes accepted as GA associated.
    [Show full text]
  • Genome Informatics 4–8 September 2002, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
    Comparative and Functional Genomics Comp Funct Genom 2003; 4: 509–514. Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cfg.300 Feature Meeting Highlights: Genome Informatics 4–8 September 2002, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK Jo Wixon1* and Jennifer Ashurst2 1MRC UK HGMP-RC, Hinxton, Cambridge CB10 1SB, UK 2The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK *Correspondence to: Abstract Jo Wixon, MRC UK HGMP-RC, Hinxton, Cambridge CB10 We bring you the highlights of the second Joint Cold Spring Harbor Laboratory 1SB, UK. and Wellcome Trust ‘Genome Informatics’ Conference, organized by Ewan Birney, E-mail: [email protected] Suzanna Lewis and Lincoln Stein. There were sessions on in silico data discovery, comparative genomics, annotation pipelines, functional genomics and integrative biology. The conference included a keynote address by Sydney Brenner, who was awarded the 2002 Nobel Prize in Physiology or Medicine (jointly with John Sulston and H. Robert Horvitz) a month later. Copyright 2003 John Wiley & Sons, Ltd. In silico data discovery background set was genes which had a log ratio of ∼0 in liver. Their approach found 17 of 17 known In the first of two sessions on this topic, Naoya promoters with a specificity of 17/28. None of the Hata (Cold Spring Harbor Laboratory, USA) sites they identified was located downstream of a spoke about motif searching for tissue specific TSS and all showed an excess in the foreground promoters. The first step in the process is to sample compared to the background sample.
    [Show full text]
  • Download Download
    Supplementary Figure S1. Results of flow cytometry analysis, performed to estimate CD34 positivity, after immunomagnetic separation in two different experiments. As monoclonal antibody for labeling the sample, the fluorescein isothiocyanate (FITC)- conjugated mouse anti-human CD34 MoAb (Mylteni) was used. Briefly, cell samples were incubated in the presence of the indicated MoAbs, at the proper dilution, in PBS containing 5% FCS and 1% Fc receptor (FcR) blocking reagent (Miltenyi) for 30 min at 4 C. Cells were then washed twice, resuspended with PBS and analyzed by a Coulter Epics XL (Coulter Electronics Inc., Hialeah, FL, USA) flow cytometer. only use Non-commercial 1 Supplementary Table S1. Complete list of the datasets used in this study and their sources. GEO Total samples Geo selected GEO accession of used Platform Reference series in series samples samples GSM142565 GSM142566 GSM142567 GSM142568 GSE6146 HG-U133A 14 8 - GSM142569 GSM142571 GSM142572 GSM142574 GSM51391 GSM51392 GSE2666 HG-U133A 36 4 1 GSM51393 GSM51394 only GSM321583 GSE12803 HG-U133A 20 3 GSM321584 2 GSM321585 use Promyelocytes_1 Promyelocytes_2 Promyelocytes_3 Promyelocytes_4 HG-U133A 8 8 3 GSE64282 Promyelocytes_5 Promyelocytes_6 Promyelocytes_7 Promyelocytes_8 Non-commercial 2 Supplementary Table S2. Chromosomal regions up-regulated in CD34+ samples as identified by the LAP procedure with the two-class statistics coded in the PREDA R package and an FDR threshold of 0.5. Functional enrichment analysis has been performed using DAVID (http://david.abcc.ncifcrf.gov/)
    [Show full text]
  • Gurdon Institute 20122011 PROSPECTUS / ANNUAL REPORT 20112010
    The Wellcome Trust/Cancer Research UK Gurdon Institute 20122011 PROSPECTUS / ANNUAL REPORT 20112010 Gurdon I N S T I T U T E PROSPECTUS 2012 ANNUAL REPORT 2011 http://www.gurdon.cam.ac.uk CONTENTS THE INSTITUTE IN 2011 INTRODUCTION........................................................................................................................................3 HISTORICAL BACKGROUND..........................................................................................................4 CENTRAL SUPPORT SERVICES....................................................................................................5 FUNDING.........................................................................................................................................................5 RETREAT............................................................................................................................................................5 RESEARCH GROUPS.........................................................................................................6 MEMBERS OF THE INSTITUTE................................................................................44 CATEGORIES OF APPOINTMENT..............................................................................44 POSTGRADUATE OPPORTUNITIES..........................................................................44 SENIOR GROUP LEADERS.............................................................................................44 GROUP LEADERS.......................................................................................................................48
    [Show full text]
  • Ontology-Based Methods for Analyzing Life Science Data
    Habilitation a` Diriger des Recherches pr´esent´ee par Olivier Dameron Ontology-based methods for analyzing life science data Soutenue publiquement le 11 janvier 2016 devant le jury compos´ede Anita Burgun Professeur, Universit´eRen´eDescartes Paris Examinatrice Marie-Dominique Devignes Charg´eede recherches CNRS, LORIA Nancy Examinatrice Michel Dumontier Associate professor, Stanford University USA Rapporteur Christine Froidevaux Professeur, Universit´eParis Sud Rapporteure Fabien Gandon Directeur de recherches, Inria Sophia-Antipolis Rapporteur Anne Siegel Directrice de recherches CNRS, IRISA Rennes Examinatrice Alexandre Termier Professeur, Universit´ede Rennes 1 Examinateur 2 Contents 1 Introduction 9 1.1 Context ......................................... 10 1.2 Challenges . 11 1.3 Summary of the contributions . 14 1.4 Organization of the manuscript . 18 2 Reasoning based on hierarchies 21 2.1 Principle......................................... 21 2.1.1 RDF for describing data . 21 2.1.2 RDFS for describing types . 24 2.1.3 RDFS entailments . 26 2.1.4 Typical uses of RDFS entailments in life science . 26 2.1.5 Synthesis . 30 2.2 Case study: integrating diseases and pathways . 31 2.2.1 Context . 31 2.2.2 Objective . 32 2.2.3 Linking pathways and diseases using GO, KO and SNOMED-CT . 32 2.2.4 Querying associated diseases and pathways . 33 2.3 Methodology: Web services composition . 39 2.3.1 Context . 39 2.3.2 Objective . 40 2.3.3 Semantic compatibility of services parameters . 40 2.3.4 Algorithm for pairing services parameters . 40 2.4 Application: ontology-based query expansion with GO2PUB . 43 2.4.1 Context . 43 2.4.2 Objective .
    [Show full text]