A Tool for Searching Putative Factors Regulating Gene

bioRxiv preprint doi: https://doi.org/10.1101/262923; this version posted February 9, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

TFmapper: A tool for searching putative factors regulating gene expression using ChIP-seq data

Jianming Zeng 1, Gang Li 1*
1 Faculty of Health Sciences, University of Macau, Macau, China

Email: Zeng Jianming [email protected]
Gang Li [email protected]

* To whom correspondence should be addressed: [email protected]

Abstract

Background: Next-generation sequencing coupled to chromatin immunoprecipitation (ChIP-seq), DNase I hypersensitivity (DNase-seq) and the transposase-accessible chromatin assay (ATAC-seq) has generated enormous amounts of data, markedly improving our understanding of the transcriptional and epigenetic control of gene expression. To take advantage of the availability of such datasets and provide clues on what factors, including transcription factors, epigenetic regulators and histone modifications, potentially regulate the expression of a gene of interest, a tool for simultaneous queries of multiple datasets using symbols or genomic coordinates as search terms is needed.

Results: In this study, we annotated the peaks of thousands of ChIP-seq datasets generated by the ENCODE project, and of ChIP-seq/DNase-seq/ATAC-seq datasets deposited in Gene Expression Omnibus and curated by CistromeDB. We built a MySQL database called TFmapper containing the annotations and associated metadata, allowing users without bioinformatics expertise to search across thousands of datasets through a web interface and identify factors targeting a genomic region or gene of interest in a specified sample. Users can also visualize multiple peaks in genome browsers and download the corresponding sequences.

Conclusion: TFmapper will help users explore the vast amount of publicly available ChIP-seq/DNase-seq/ATAC-seq data, and perform integrative analyses to understand the regulation of a gene of interest. The web server is freely accessible at http://www.tfmapper.org/.

Keywords:
Chromatin immunoprecipitation; Next-generation sequencing; Regulation of gene expression

Background

Next-generation sequencing (NGS) coupled to chromatin immunoprecipitation (ChIP-seq), DNase I hypersensitivity (DNase-seq) and the transposase-accessible chromatin assay (ATAC-seq) provides powerful tools for studying gene regulation by factors in cis and trans, which include components of the basal transcriptional machinery, transcription factors, chromatin regulators, histone variants, histone modifications and others. A large amount of ChIP-seq, DNase-seq and ATAC-seq data has been accumulated by individual laboratories and large-scale collaborative projects, including the ENCODE (Encyclopedia of DNA Elements) Consortium [1], the Roadmap Epigenomics Mapping Consortium (Roadmap) [2] and the International Human Epigenome Consortium (IHEC) projects [3]. The datasets are usually publicly accessible through the Gene Expression Omnibus (GEO) of the National Center for Biotechnology Information (NCBI) [4], and the data portals of large-scale projects.

Despite the easy access, mining and interpreting the ChIP-seq/DNase-seq/ATAC-seq data is challenging for regular users, especially for bench biologists with limited bioinformatic expertise. Adding to the complexity, data quality and data analysis pipelines vary markedly, which hinders the direct use of the data in further analyses [5]. To address these challenges, multiple algorithms and metrics have been developed to evaluate data quality bioinformatically, such as the ENCODE quality metrics, NGS-QC, ChiLin and others [6-9]. Gronemeyer and colleagues developed NGS-QC and built a quality control (QC) indicator database for the largest collection of publicly available NGS datasets [6, 8], which provides a solid starting point for further analysis. CistromeDB, developed by Liu and colleagues, curated and processed a huge collection of human and mouse ChIP-seq and chromatin accessibility datasets from GEO with a standard analysis pipeline, ChiLin [9], and further evaluated the quality of each dataset using several scoring metrics [10]. High-quality processed ChIP-seq data generated by the ENCODE consortium, including histone modification, chromatin regulator and transcription factor binding data in a selected set of biological samples, are also available through its data portal [11]. ENCODE and CistromeDB provide access to the processed data and the corresponding metadata, including the sources and properties of biological samples, experimental protocols, the antibodies used and others, which offers opportunities for users to re-analyze the data and identify the genome-wide targets of a transcription regulator in different cell lines and tissues. Nonetheless, different questions are often asked, such as: Which transcription factors are responsible for the regulation of a gene of interest, and what is the epigenetic landscape of a gene of interest in a particular tissue or cell type? To address these questions, a tool for simultaneous queries of multiple datasets using symbols or genomic coordinates of target genes as search terms is needed.

In this study, we collected and annotated a large number of ChIP-seq, DNase-seq and ATAC-seq datasets, including the ENCODE datasets and the GEO datasets curated by CistromeDB [10].
We built a Structured Query Language (SQL) database containing the annotations and the associated metadata, allowing users to search across multiple datasets and identify the putative factors which target a specified gene or genomic locus based on actual NGS data. We also provide links for users to visualize the peaks and download the corresponding sequences. In addition, we included an example to demonstrate the utility of TFmapper.

Construction and content

Data collection

We downloaded the GEO datasets processed by Liu and colleagues from CistromeDB as BED (Browser Extensible Data) files, which include 6092 ChIP-seq datasets for trans-acting factors and 6068 ChIP-seq datasets for histone modifications generated from human samples; 4786 ChIP-seq datasets for trans-acting factors and 5002 ChIP-seq datasets for histone modifications generated from mouse samples; and 1371 DNase-seq/ATAC-seq datasets. For ENCODE datasets, we downloaded the conservative IDR (Irreproducible Discovery Rate)-thresholded peaks, which are called after combined analysis of two replicates for each ChIP-seq experiment.
The ENCODE datasets selected in this study include 955 transcription factor and 771 histone modification datasets for Homo sapiens, and 145 transcription factor and 1252 histone modification datasets for Mus musculus.

Data processing

We used HOMER [12] to annotate the peaks in the BED files against the newest reference genomes, GRCh38/hg38 for human and GRCm38/mm10 for mouse. HOMER assigns each peak to the nearest gene by calculating the distance between the middle of a peak and the transcription start site (TSS) of a gene. In addition, seven genomic features based on RefSeq annotations were assigned to the peaks, namely promoter (by default defined from -1 kb to +100 bp of the TSS), TTS region (transcription termination site, by default defined from -100 bp to
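
The nearest-gene rule and the promoter window described in this section can be illustrated with a minimal Python sketch. It is not HOMER's implementation and does not reproduce its full annotation logic. The `Gene` record, the `build_index` and `annotate_peak` helpers, and the gene symbols used in the usage note are hypothetical names introduced only for illustration. The sketch assumes one TSS per gene and treats upstream offsets as negative with respect to the gene's strand.

```python
from bisect import bisect_left
from dataclasses import dataclass

# Promoter window quoted in the text: -1 kb to +100 bp around the TSS.
PROMOTER_UPSTREAM = 1000
PROMOTER_DOWNSTREAM = 100


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene record with a single TSS (an assumption of this sketch)."""
    symbol: str
    chrom: str
    tss: int
    strand: str  # "+" or "-"


def build_index(genes):
    """Group genes by chromosome, each list sorted by TSS position."""
    index = {}
    for gene in sorted(genes, key=lambda g: g.tss):
        index.setdefault(gene.chrom, []).append(gene)
    return index


def annotate_peak(chrom, start, end, index):
    """Return (symbol, offset, is_promoter) for the gene whose TSS is nearest
    to the peak midpoint, or None if the chromosome has no genes.

    offset is strand-aware: negative values lie upstream of the TSS.
    """
    genes = index.get(chrom)
    if not genes:
        return None
    midpoint = (start + end) // 2
    positions = [g.tss for g in genes]
    i = bisect_left(positions, midpoint)
    # Only the TSSs immediately left and right of the midpoint can be nearest.
    candidates = genes[max(i - 1, 0): i + 1]
    nearest = min(candidates, key=lambda g: abs(g.tss - midpoint))
    delta = midpoint - nearest.tss
    offset = delta if nearest.strand == "+" else -delta
    is_promoter = -PROMOTER_UPSTREAM <= offset <= PROMOTER_DOWNSTREAM
    return nearest.symbol, offset, is_promoter
```

For example, with an index built from `Gene("GENE_A", "chr1", 10000, "+")` and `Gene("GENE_B", "chr1", 50000, "-")`, calling `annotate_peak("chr1", 9400, 9800, index)` assigns the peak to GENE_A and flags it as a promoter peak, because its midpoint lies 400 bp upstream of that TSS. Making the offset strand-aware matters for minus-strand genes, where "upstream" corresponds to larger genomic coordinates.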