Systematic Approach for Dissecting the Molecular Mechanisms Of

Total Page:16

File Type:pdf, Size:1020Kb

Systematic Approach for Dissecting the Molecular Mechanisms Of Systematic approach for dissecting the molecular mechanisms of transcriptional regulation in bacteria Nathan M. Belliveaua, Stephanie L. Barnesa, William T. Irelandb, Daniel L. Jonesc, Michael J. Sweredoskid, Annie Moradiand, Sonja Hessd,1, Justin B. Kinneye, and Rob Phillipsa,b,f,2 aDivision of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125; bDepartment of Physics, California Institute of Technology, Pasadena, CA 91125; cDepartment of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 751 24 Uppsala, Sweden; dProteome Exploration Laboratory, Beckman Institute, California Institute of Technology, Pasadena, CA 91125; eSimons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724; and fDepartment of Applied Physics, California Institute of Technology, Pasadena, CA 91125 Edited by Curtis G. Callan Jr., Princeton University, Princeton, NJ, and approved April 6, 2018 (received for review December 19, 2017) Gene regulation is one of the most ubiquitous processes in biol- constructed from known binding sites, and to search for new ogy. However, while the catalog of bacterial genomes continues transcriptional regulatory sequences (13–19). CRISPR assays to expand rapidly, we remain ignorant about how almost all of have also shown promise for identifying longer-range enhancer– the genes in these genomes are regulated. At present, character- promoter interactions in mammalian cells (20). However, no izing the molecular mechanisms by which individual regulatory approach for using massively parallel reporter technologies to sequences operate requires focused efforts using low-throughput decipher the functional mechanisms of previously uncharacter- methods. Here, we take a first step toward multipromoter dissec- ized regulatory sequences has yet been established. tion and show how a combination of massively parallel reporter Here, we take a first step toward quantitative, multipromoter assays, mass spectrometry, and information-theoretic modeling dissection and describe a systematic approach for identifying can be used to dissect multiple bacterial promoters in a systematic the functional architecture of previously uncharacterized bacte- way. We show this approach on both well-studied and previously rial promoters at nucleotide resolution using a combination of uncharacterized promoters in the enteric bacterium Escherichia genetic, functional, and biochemical measurements. A massively coli. In all cases, we recover nucleotide-resolution models of promoter mechanism. For some promoters, including previously BIOPHYSICS AND unannotated ones, the approach allowed us to further extract Significance COMPUTATIONAL BIOLOGY quantitative biophysical models describing input–output relation- ships. Given the generality of the approach presented here, it opens Organisms must constantly make regulatory decisions in up the possibility of quantitatively dissecting the mechanisms of response to a change in cellular state or environment. How- promoter function in E. coli and a wide range of other bacteria. ever, while the catalog of genomes expands rapidly, we remain ignorant about how the genes in these genomes are gene regulation j massively parallel reporter assay j quantitative models j regulated. Here, we show how a massively parallel reporter DNA affinity chromatography j mass spectrometry assay, Sort-Seq, and information-theoretic modeling can be used to identify regulatory sequences. We then use chro- he sequencing revolution has left in its wake an enor- matography and mass spectrometry to identify the regulatory Tmous challenge: the rapidly expanding catalog of sequenced proteins that bind these sequences. The approach results in genomes is far outpacing a sequence-level understanding of quantitative base pair-resolution models of promoter mecha- how the genes in these genomes are regulated. This ignorance nism and was shown in both well-characterized and unanno- extends from viruses to bacteria to archaea to eukaryotes. Even tated promoters in Escherichia coli. Given the generality of the in Escherichia coli, the model organism in which transcriptional approach, it opens up the possibility of quantitatively dissect- regulation is best understood, we still have no indication if or how ing the mechanisms of promoter function in a wide range of more than one-half of the genes are regulated (SI Appendix, Fig. bacteria. S1) [RegulonDB (1) or EcoCyc (2)]. In other model bacteria, Author contributions: N.M.B., D.L.J., and R.P. designed research; N.M.B., S.L.B., D.L.J., such as Bacillus subtilis, Caulobacter crescentus, Vibrio harveyii, M.J.S., A.M., S.H., and J.B.K. performed research; N.M.B., W.T.I., M.J.S., A.M., and J.B.K. or Pseudomonas aeruginosa, far fewer genes have established analyzed data; N.M.B., S.L.B., and R.P. provided conceptualization; N.M.B., S.L.B., W.T.I., regulatory mechanisms (3–5). D.L.J., and J.B.K. provided methodology; N.M.B. provided investigation; N.M.B., W.T.I., D.L.J., M.J.S., and J.B.K provided software; N.M.B., W.T.I., M.J.S., and J.B.K. provided New approaches are needed for studying regulatory architec- validation; S.H. and R.P. acquired funding; N.M.B., S.L.B., J.B.K., and R.P. wrote the paper. ture in these bacteria and others. Chromatin immunoprecipita- The authors declare no conflict of interest. tion (ChIP) and other high-throughput techniques are increas- ingly being used to study gene regulation in E. coli (6–11), but This article is a PNAS Direct Submission. these methods are incapable of revealing either the nucleotide- This open access article is distributed under Creative Commons Attribution- NonCommercial-NoDerivatives License 4.0 (CC BY-NC-ND). resolution location of all functional transcription factor binding Data deposition: Raw sequencing files have been deposited on the NCBI Sequence sites or the way in which interactions between DNA-bound Read Archive, https://www.ncbi.nlm.nih.gov/sra (accession no. SRP121362) and Sort- transcription factors and RNA polymerase (RNAP) modulate Seq regulatory sequencing data from Escherichia coli has been deposited in NCBI transcription. Although an arsenal of now classic genetic and BioSample, https://www.ncbi.nlm.nih.gov/biosample (accession no. SAMN07830099). biochemical methods has been developed for dissecting pro- Thermo RAW mass spectrometry files have been deposited in the jPOST Reposi- tory, https://repository.jpostdb.org (accession no. PXD007892). Files containing pro- moter function at individual bacterial promoters [reviewed in the cessed data and python code association with data analysis and plotting have work by Minchin and Busby (12)], these methods are not readily been deposited on GitHub and Zenodo (available at https://www.github.com/RPGroup- parallelized and often require purification of promoter-specific PBoC/sortseq belliveau and https://doi.org/10.5281/zenodo.1184169, respectively). regulatory proteins. 1 Present address: Antibody Discovery and Protein Engineering, MedImmune, Gaithers- In recent years, a variety of massively parallel reporter assays burg, MD 20878. have been developed for dissecting the functional architecture of 2 To whom correspondence should be addressed. Email: [email protected]. transcriptional regulatory sequences in bacteria, yeast, and meta- This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. zoans. These technologies have been used to infer biophysical 1073/pnas.1722055115/-/DCSupplemental. models of well-studied loci, to characterize synthetic promoters www.pnas.org/cgi/doi/10.1073/pnas.1722055115 PNAS Latest Articles j 1 of 10 Downloaded by guest on October 2, 2021 parallel reporter assay [Sort-Seq (13)] is performed on a pro- A moter in multiple growth conditions to identify functional tran- scription factor binding sites. DNA affinity chromatography and mass spectrometry (21, 22) are then used to identify the reg- ulatory proteins that recognize these sites. In this way, one is able to identify both the functional transcription factor binding sites and cognate transcription factors in previously unstudied promoters. Subsequent massively parallel assays are then per- formed in gene deletion strains to provide additional validation B of the identified regulators. The reporter data thus generated are also used to infer sequence-dependent quantitative models of transcriptional regulation. In what follows, we first illustrate the overarching logic of our approach through application to four A C previously annotated promoters: lacZYA, relBE, marRAB, and - G yebG. We then apply this strategy to the previously uncharacter- T ized promoters of purT, xylE, and dgoRKADT, showing the ability to go from regulatory ignorance to explicit quantitative models of a promoter’s input–output behavior. C Results To dissect how a promoter is regulated, we begin by perform- ing Sort-Seq (13). As shown in Fig. 1A, Sort-Seq works by first generating a library of cells, each of which contains a mutated promoter that drives expression of green fluorescent protein (GFP) from a low copy plasmid [5–10 copies per cell (23)] and provides a readout of transcriptional state. We use fluorescence- activated cell sorting (FACS) to sort cells into multiple bins gated Fig. 1. Overview of the approach to characterize transcriptional regulatory by their fluorescence level and then sequence the mutated plas- DNA using Sort-Seq and mass spectrometry. (A) Schematic of Sort-Seq. A mids from each bin. We found it sufficient
Recommended publications
  • Escherichia Coli: Protein Families and Binding Sites M
    Functional determinants of transcription factors in Escherichia coli: protein families and binding sites M. Madan Babu and Sarah A. Teichmann DNA-binding transcription factors regulate the expression A reasonable hypothesis would be to assume that of genes near to where they bind. These factors can be activators, repressors and dual regulators are more activators or repressors of transcription, or both. Thus, a closely related to the other proteins within each fundamental question is what determines whether a regulatory group than to proteins in other groups. This transcription factor acts as an activator or repressor? would mean that there would be protein families and Previous research into this question found that a protein's domain architectures that were characteristic either of regulatory function is determined by one or more of the activators, repressors or dual regulators. Prag et al. [10] following factors: protein–proteinprotein–protein contacts, position of the and Perez-Rueda et al. [11] reported evidence to support DNA-binding domain in the protein primary sequence, this by relating the position of helix-turn-helix motifs altered DNA structure, and the position of its binding site along the primary sequence to a protein's regulatory on the DNA relative to the transcription start site. Although function. They used sequence comparison and profile there are many aspects specific to different transcription methods to locate the helix-turn-helix motifs that factors, in this work we demonstrate that, in general, in the represent DBDs, and found that the helix-turn-helix prokaryote Escherichia coli, a transcription factor's protein tends to be at the N terminus for repressors and at the C family is not indicative of its regulatory function, but the terminus for activators.
    [Show full text]
  • Regulondb (Version 6.0): Gene Regulation Model Of
    D120–D124 Nucleic Acids Research, 2008, Vol. 36, Database issue doi:10.1093/nar/gkm994 RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation Socorro Gama-Castro1, Vero´ nica Jime´ nez-Jacinto1, Martı´n Peralta-Gil1, Alberto Santos-Zavaleta1,Mo´ nica I. Pen˜ aloza-Spinola1, Bruno Contreras-Moreira1, Juan Segura-Salazar1, Luis Mun˜ iz-Rascado1, Irma Martı´nez-Flores1, Heladia Salgado1,Ce´ sar Bonavides-Martı´nez1, Cei Abreu-Goodger1, Carlos Rodrı´guez-Penagos1, Juan Miranda-Rı´os2, Enrique Morett2, Enrique Merino2, Araceli M. Huerta1, Luis Trevin˜ o-Quintanilla1 and Julio Collado-Vides1,* Downloaded from 1Program of Computational Genomics, Centro de Ciencias Geno´ micas, Universidad Nacional Auto´ noma de Me´ xico, A.P. 565-A, Cuernavaca, Morelos 62100, Mexico and 2Instituto de Biotecnologı´a, Universidad Nacional Auto´ noma de Me´ xico, A.P. 510-3, Cuernavaca, Morelos 62100, Mexico http://nar.oxfordjournals.org/ Received September 14, 2007; Revised October 19, 2007; Accepted October 22, 2007 ABSTRACT with more than 4000 curation notes, can now be RegulonDB (http://regulondb.ccg.unam.mx/) is the searched with the Textpresso text mining engine. primary reference database offering curated knowl- edge of the transcriptional regulatory network at Carleton University on June 21, 2015 of Escherichia coli K12, currently the best-known INTRODUCTION electronically encoded database of the genetic A major current task for bioinformatics is that of repre- regulatory network of any free-living organism. senting biological information into an electronic and This paper summarizes the improvements, new computable form. These computational representations biology and new features available in version 6.0.
    [Show full text]
  • Description Data File Programa De Genómica Computacional / Centro De Ciencias Genómicas
    Description Data File Programa de Genómica Computacional / Centro de Ciencias Genómicas DESCRIPTION DATA FILE 1. GENERAL INFORMATION. Title: LeuO and H-NS Data Set Reference: Shimada T, Bridier A, Briandet R, Ishihama A. Novel roles of LeuO in transcription regulation of E. coli genome: antagonistic interplay with the universal silencer H-NS. Mol Microbiol. 2011 Oct;82(2):378-97. PMID: 21883529 Contact person for this data set: Questions concerning the content of the data set that are raised by users of RegulonDB will be forwarded to this person. We would appreciate receiving a copy of the response to the user, so we can keep track of taking care of user requests. Person: RegulonDB staff Email address: [email protected] 2. DATA SET DESCRIPTION. Summary: LeuO Data Set. LeuO, the regulator of the leucine biosynthesis operon of Escherichia coli, is involved in the regulation of as-yet-unspecified genes that affect the stress response and pathogenesis expression. LeuO is involved in an antagonistic interplay with the universal silencer H-NS. Genomic SELEX screening was performed to identify the whole set of LeuO and H-NS regulation targets. A total of 140 LeuO-binding peaks (183 regulatory interactions) were identified by SELEX methodology, of which as many as 133 (95%) were found to contain the binding site of H-NS. In addition, a DNA microarray assay was carried out to identify genes affected by deletion or overexpression of the leuO gene. A total of 35 regulatory interactions were found via SELEX as well as microarray analysis. Based on the behavior of the regulated genes in the microarray analysis, from which it was determined that LeuO acts as an activator for 18 of them, 13 as a repressor and 4 as a dual function; we deduced that LeuO acts as a dual regulator.
    [Show full text]
  • Protein-DNA Interactions II
    Protein-DNA Interactions II Bio 5488 Overview 1. Review of information content and weight matrices 2. Online PWM resources 3. Motif discovery: Greedy algorithm 4. Motif discovery: Gibbs sampling Information Content EcoR1 Random Rap1 GAATTC GCCTAC TGTATGGGTG GAATTC ACATTC TGTTCGGATT GAATTC TCATTC TGCATGGGTG GAATTC CGACTC TGTACAGGTG GAATTC GAATTC TGTATGGATG GAATTC ATATCG TGTTCGGGTT GAATTC GAAATG TGTATGGGTG Information Content Matrix of Frequencies A: 0.1 0.7 0.2 0.3 0.4 0.1 C: 0.1 0.1 0.1 0.3 0.2 0.1 G: 0.1 0.1 0.2 0.1 0.2 0.1 T: 0.7 0.1 0.5 0.3 0.2 0.7 aka Relative Entropy, Kullbach-Liebler Distance Obtaining a Weight Matrix Hertz and Stormo, Bioinformatics (1999) Online tools: PWMs JASPAR (jaspar.genereg.net) open-access curated, non-redundant, PWMs for multi-cellular eukaryotes hundreds of PWMs JASPAR_FAM - metamodels for structural families Online tools: PWMs AC M00213 XX ID F$RAP1_C XX DT 05.09.1995 (created); dbo. DT 30.11.1995 (updated); ewi. CO Copyright (C), Biobase GmbH. XX NA RAP1 XX DE yeast repressor/activator protein 1 XX BF T00715 RAP1; Species: yeast, Saccharomyces cerevisiae. XX PO A C G T 01 6.89 2.40 0.79 4.74 W 02 8.80 0.00 4.58 1.43 R 03 7.20 6.83 0.79 0.00 M 04 14.82 0.00 0.00 0.00 A TRANSFAC 05 0.00 14.82 0.00 0.00 C 06 0.00 14.82 0.00 0.00 C Free access w/ reduced functionality 07 0.00 14.82 0.00 0.00 C 08 14.82 0.00 0.00 0.00 A to professional version 09 2.13 0.67 2.87 9.15 T 10 11.92 0.76 1.49 0.64 A 11 0.76 14.06 0.00 0.00 C 12 11.45 3.36 0.00 0.00 A Public version dates to 2005 13 0.79 5.64 0.00 7.10 Y 14 1.64 7.18 1.61 4.39 Y XX BA total weight of sequences: 14.82 XX CC consind generated matrix (random_expectation: 0.03) XX // (www.gene-regulation.com/pub/databases.html/#transfac) Online tools: PWMs RegulonDB (regulondb.ccg.unam.mx) Bacterial regulons and operons in E.
    [Show full text]
  • A Database of Regulatory Networks in Gamma-Proteobacterial Genomes Abel D
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Open Repository and Bibliography - Luxembourg D98–D102 Nucleic Acids Research, 2005, Vol. 33, Database issue doi:10.1093/nar/gki054 TRACTOR_DB: a database of regulatory networks in gamma-proteobacterial genomes Abel D. Gonza´lez, Vladimir Espinosa, Ana T. Vasconcelos1, Ernesto Pe´rez-Rueda2 and Julio Collado-Vides3,* National Bioinformatics Center, Industria y San Jose´, Capitolio Nacional, CP. 10200, Habana Vieja, Habana, Cuba, 1National Laboratory for Scientific Computing, Avenue Getulio Vargas 333, Quitandinha, CEP 25651-075, Petropolis, Rio de Janeiro, Brazil, 2Depto. de Ingenierı`a Celular y Biocata´lisis, IBT-UNAM, Cuernavaca, Morelos, Mexico and 3Center of Genomics, UNAM, AP 565-A Cuernavaca, CP. 62100, Morelos, Mexico Received August 6, 2004; Revised and Accepted October 1, 2004 Downloaded from ABSTRACT out in this direction (1–3). The first step towards this goal is the recognition of all the genes regulated by a transcription factor Experimental data on the Escherichia coli (TF), i.e. its regulon. transcriptional regulatory system has been used in Computational approaches to recognizing the location of http://nar.oxfordjournals.org/ the past years to predict new regulatory elements regulatory sites in bacterial genomes include the use of weight (promoters, transcription factors (TFs), TFs’ binding matrices (4), phylogenetic footprinting (5), searching for sites and operons) within its genome. As more gen- statistical overrepresentation of oligonucleotides within a omes of gamma-proteobacteria are being sequenced, genome and clustering co-expressed genes in order to find the prediction of these elements in a growing number conserved patterns in their upstream regions (6,7), among of organisms has become more feasible, as a others (8,9).
    [Show full text]
  • Synthetic CRISPR-Cas Gene Activators for Transcriptional Reprogramming in Bacteria
    Corrected: Author correction ARTICLE DOI: 10.1038/s41467-018-04901-6 OPEN Synthetic CRISPR-Cas gene activators for transcriptional reprogramming in bacteria Chen Dong1, Jason Fontana2, Anika Patel1, James M. Carothers 2,3,4 & Jesse G. Zalatan 1,2,4 Methods to regulate gene expression programs in bacterial cells are limited by the absence of effective gene activators. To address this challenge, we have developed synthetic bacterial transcriptional activators in E. coli by linking activation domains to programmable CRISPR-Cas 1234567890():,; DNA binding domains. Effective gene activation requires target sites situated in a narrow region just upstream of the transcription start site, in sharp contrast to the relatively flexible target site requirements for gene activation in eukaryotic cells. Together with existing tools for CRISPRi gene repression, these bacterial activators enable programmable control over multiple genes with simultaneous activation and repression. Further, the entire gene expression program can be switched on by inducing expression of the CRISPR-Cas system. This work will provide a foundation for engineering synthetic bacterial cellular devices with applications including diagnostics, therapeutics, and industrial biosynthesis. 1 Department of Chemistry, University of Washington, Seattle, WA 98195, USA. 2 Molecular Engineering & Sciences Institute, University of Washington, Seattle, WA 98195, USA. 3 Department of Chemical Engineering, University of Washington, Seattle, WA 98195, USA. 4 Center for Synthetic Biology, University of Washington, Seattle, WA 98195, USA. Correspondence and requests for materials should be addressed to J.M.C. (email: [email protected]) or to J.G.Z. (email: [email protected]) NATURE COMMUNICATIONS | (2018) 9:2489 | DOI: 10.1038/s41467-018-04901-6 | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | DOI: 10.1038/s41467-018-04901-6 acteria are attractive targets for a wide variety of engi- in bacteria suggests that RpoZ may not be effective as a general Bneering applications.
    [Show full text]
  • A User-Friendly Tool for Improving Bacterial Genome Annotation Through Analysis of Transcription Control Signals
    SigmoID: a user-friendly tool for improving bacterial genome annotation through analysis of transcription control signals Yevgeny Nikolaichik and Aliaksandr U. Damienikan Department of Molecular Biology, Belarusian State University, Minsk, Belarus ABSTRACT The majority of bacterial genome annotations are currently automated and based on a `gene by gene' approach. Regulatory signals and operon structures are rarely taken into account which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application aiming at simplifying the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bac- terial genomes and providing assistance in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where Submitted 26 January 2016 it didn't fit with regulatory information allowed us to correct product and gene names Accepted 29 April 2016 for over 300 loci.
    [Show full text]
  • Data Resources and Mining Tools for Reconstructing Gene Regulatory Networks in Lactococcus Lactis
    Japanese Journal of Lactic Acid Bacteria Copyright © 2011, Japan Society for Lactic Acid Bacteria Review Data resources and mining tools for reconstructing gene regulatory networks in Lactococcus lactis Anne de Jong1,2,3), Jan Kok1,2,3) and Oscar P. Kuipers1,2,3)* 1) Department of Molecular Genetics, University of Groningen, Groningen Biomolecular Sciences and Biotechnology Institute, Nijenborgh 7, 9747 AG Groningen, the Netherlands 2) Top Institute Food and Nutrition, Wageningen, the Netherlands 3) The Netherlands Kluyver Centre for Genomics of Industrial Fermentations / NCSB, Delft, the Netherlands. Abstract DNA is the blueprint and template for tRNA, rRNA, sRNA and mRNA synthesis in all living organisms. Subsequently, the mRNA is translated to proteins a process in which other RNA types, compounds and proteins play important roles. The regulation of transcription and translation is a delicate equilibrium of thousands of factors interacting within an organism. These factors vary from compound- and protein concentrations, to their activity and stability to physical parameters like temperature and pH. Especially the transcription factors and transcription factor binding sites play a central role in the regulation of gene expression. For lactococci lactis several data sets on gene expression and regulation are available in databases or literature. Furthermore, the number of (in-) complete sequenced lactococci genomes is increasing, while the majority of the strains will only be studied in silico. This review focuses on the visualization and mining of the complex transcription and translation systems via interactive graphics and network reconstruction tools and the amalgamation of DNA microarray data with biological data that together lead to the reconstruction of gene regulatory networks (GRN).
    [Show full text]
  • Nucleic Acids Research, 2004, Vol
    Nucleic Acids Research, 2004, Vol. 32, Database issue D303±D306 DOI: 10.1093/nar/gkh140 RegulonDB (version 4.0): transcriptional regulation, operon organization and growth conditions in Escherichia coli K-12 Heladia Salgado, Socorro Gama-Castro, Agustino MartõÂnez-Antonio, Edgar DõÂaz-Peredo, Fabiola SaÂnchez-Solano, MartõÂn Peralta-Gil, Del®no Garcia-Alonso, Vero nica JimeÂnez-Jacinto, Alberto Santos-Zavaleta, CeÂsar Bonavides-MartõÂnez and Julio Collado-Vides* Program of Computational Genomics, CIFN, UNAM. A.P. 565-A Cuernavaca, Morelos 62100, Mexico Received September 12, 2003; Revised October 8, 2003; Accepted October 29, 2003 ABSTRACT the molecular biology of this cell. It may be that this is the cell for which we know more about the function of its genes, its RegulonDB is the primary database of the major metabolism and transcriptional regulation. This knowledge is international maintained curation of original litera- the foundation for the proposal within the International E.coli ture with experimental knowledge about the Alliance, to achieve in E.coli, as a long-term goal, the ®rst elements and interactions of the network of tran- whole-cell model (1). We contribute to this international effort scriptional regulation in Escherichia coli K-12. This with RegulonDB, the primary database of the major inter- includes mechanistic information about operon national maintained curation of original literature with organization and their decomposition into transcrip- experimental knowledge about the elements and interactions tion units
    [Show full text]
  • Dataset Description
    Promoter_from_454_Dataset.txt Programa de Genómica Computacional / Centro de Ciencias Genómicas DATASET DESCRIPTION DATASET: Promoter_from_454_Dataset.txt. Experimental determination of Transcription Star Sites (TSSs) with High Throughput Pyrosequencing Strategy (HTPS) and computational promoter prediction. Contact person for this dataset: Person: RegulonDB team Email address: [email protected] Type of dataset: Experimental TSS mapping and computational promoter prediction Reference: Alfredo Mendoza, Leticia Olvera, Maricela Olvera, Ricardo Grande, Veronica Jiménez-Jacinto, Blanca Taboada, Leticia Vega, Katy Juárez, Heladia Salgado, Araceli Huerta, Julio Collado-Vides and Enrique Morett. (2009). Genome-wide Identification of Transcription Start Sites, Promoters and Transcription Factor Binding Sites in E. coli . PLoS ONE. 4(10):e7526. Description: This file contains: High Throughput Pyrosequencing Strategy (HTPS) Data, with and without previously know TSSs. For many, promoter region was predicted in this work. Summary: This file has a collection of more than 1491 transcription start sites (TSSs) that have been experimentally determined with high precision in Escherichia coli using an unbiased High Throughput Pyrosequencing Strategy (HTPS). From this collection, about 148 corresponded to previously reported TSSs, which helped us to benchmark both our methodologies and the accuracy of the previous mapping experiments. The other ca 1300 TSSs mapped belong to about 900 different genes, many of them with no assigned function. Página 1 de 4 Promoter_from_454_Dataset.txt Programa de Genómica Computacional / Centro de Ciencias Genómicas Methods Version of programs: We use blast version BLASTN 2.2.18 For the σ38 -10 element a new matrix was constructed with the WCONSENSUS program (version 6d), utilizing the nucleotide sequences of 71 promoters of E.
    [Show full text]
  • Investigating Binding Site Ratios of Transcription Factors Center for Advanced Biotechnology and Medicine, Stock Lab Zeyue Li Rutgers University
    Investigating Binding Site Ratios of Transcription Factors Center for Advanced Biotechnology and Medicine, Stock Lab Zeyue Li Rutgers University Abstract Results Bacterial signal transduction is controlled by transcription factors (TFs), which bind to specific binding sites on bacterial chromosomes.3 Expression levels of TFs are often presumed to be in Protein Abundance Data from PaxDB TFBS Identified by High Throughput Methods great excess to their binding sites, leading to high ratios of TF concentrations to the numbers of TF binding sites. Previous studies have suggested a median ratio of 10.4 Here we used the updated binding data from regulonDB and multiple protein abundance databases to reexamine the ratios for E. coli TFs. We discovered that TFs are unlikely to be in great excess to their binding sites. Further, we believe that the binding site data from regulonDB is biased toward the intergenic promoter regions and there are more binding sites that have not been discovered, especially those within open reading frames. To better estimate the number of binding sites, we took a bioinformatics direction, using multiple sources, including high-throughput data, gSELEX data, and computational predictions using FIMO scan of the bacterial genome. Although different sources vary in their reports of the number of binding sites, all indicated at least two fold higher numbers of binding sites than those from regulonDB. Overall, we found that repressors tend to have a higher ratio compared to activators and dual function TFs, and Figure 1: The median ratio of TF concentration (PaxDB) to TFBS is 7.23 we found that the originally reported ratio is higher than what studies with newer technology Figure 7: Box plot of the ratio of high throughput data to regulonDB data list.
    [Show full text]
  • Determination and Dissection of DNA-Binding Specificity for The
    International Journal of Molecular Sciences Article Determination and Dissection of DNA-Binding Specificity for the Thermus thermophilus HB8 Transcriptional Regulator TTHB099 Kristi Moncja and Michael W. Van Dyke * Department of Chemistry and Biochemistry, Kennesaw State University, Kennesaw, GA 30144, USA; [email protected] * Correspondence: [email protected]; Tel.: +1-470-578-2793 Received: 14 October 2020; Accepted: 24 October 2020; Published: 26 October 2020 Abstract: Transcription factors (TFs) have been extensively researched in certain well-studied organisms, but far less so in others. Following the whole-genome sequencing of a new organism, TFs are typically identified through their homology with related proteins in other organisms. However, recent findings demonstrate that structurally similar TFs from distantly related bacteria are not usually evolutionary orthologs. Here we explore TTHB099, a cAMP receptor protein (CRP)-family TF from the extremophile Thermus thermophilus HB8. Using the in vitro iterative selection method Restriction Endonuclease Protection, Selection and Amplification (REPSA), we identified the preferred DNA-binding motif for TTHB099, 50–TGT(A/g)NBSYRSVN(T/c)ACA–30, and mapped potential binding sites and regulated genes within the T. thermophilus HB8 genome. Comparisons with expression profile data in TTHB099-deficient and wild type strains suggested that, unlike E. coli CRP (CRPEc), TTHB099 does not have a simple regulatory mechanism. However, we hypothesize that TTHB099 can be a dual-regulator similar to CRPEc. Keywords: bioinformatics; biolayer interferometry (BLI); electrophoretic mobility shift assay (EMSA); extremophile; protein-DNA binding; type IIS restriction endonuclease 1. Introduction Transcription factors (TFs) are DNA-binding proteins that allow for modulation of transcription initiation in response to intracellular and extracellular changes.
    [Show full text]