Determination and Dissection of DNA-Binding Specificity for The

Total Page:16

File Type:pdf, Size:1020Kb

Determination and Dissection of DNA-Binding Specificity for The International Journal of Molecular Sciences Article Determination and Dissection of DNA-Binding Specificity for the Thermus thermophilus HB8 Transcriptional Regulator TTHB099 Kristi Moncja and Michael W. Van Dyke * Department of Chemistry and Biochemistry, Kennesaw State University, Kennesaw, GA 30144, USA; [email protected] * Correspondence: [email protected]; Tel.: +1-470-578-2793 Received: 14 October 2020; Accepted: 24 October 2020; Published: 26 October 2020 Abstract: Transcription factors (TFs) have been extensively researched in certain well-studied organisms, but far less so in others. Following the whole-genome sequencing of a new organism, TFs are typically identified through their homology with related proteins in other organisms. However, recent findings demonstrate that structurally similar TFs from distantly related bacteria are not usually evolutionary orthologs. Here we explore TTHB099, a cAMP receptor protein (CRP)-family TF from the extremophile Thermus thermophilus HB8. Using the in vitro iterative selection method Restriction Endonuclease Protection, Selection and Amplification (REPSA), we identified the preferred DNA-binding motif for TTHB099, 50–TGT(A/g)NBSYRSVN(T/c)ACA–30, and mapped potential binding sites and regulated genes within the T. thermophilus HB8 genome. Comparisons with expression profile data in TTHB099-deficient and wild type strains suggested that, unlike E. coli CRP (CRPEc), TTHB099 does not have a simple regulatory mechanism. However, we hypothesize that TTHB099 can be a dual-regulator similar to CRPEc. Keywords: bioinformatics; biolayer interferometry (BLI); electrophoretic mobility shift assay (EMSA); extremophile; protein-DNA binding; type IIS restriction endonuclease 1. Introduction Transcription factors (TFs) are DNA-binding proteins that allow for modulation of transcription initiation in response to intracellular and extracellular changes. Over decades of research, there have been many advances in exploring the TFs regulatory mechanisms cells use to control their gene expression. However, technological innovations such as massively parallel sequencing and data sciences have expanded our interest in new model organisms and their adaptations. TFs are trans factors that bind to cis-regulatory elements, promoter or enhancer sequences known as TF binding sites (TFBSs). It has been reported that most of the bacterial TFBSs are found in the proximal region (about 100 to +20 bp from the transcription start site [TSS]) and distal regions (up to − 200 from TSS) [1–3]. Functionally, TFs are categorized into activators and suppressors, with a few of − them being dual-regulators [4]. Regarding the number of genes regulated, TFs are classified into local or global regulators [5]. Such characteristics make up the mechanism of transcription regulation and help identify novel TFs. Proteomic studies allow the grouping of TFs into families based on structural comparison studies. However, new findings have shown that structurally similar TFs from distantly related bacteria are not usually evolutionary orthologs [6]. A more comprehensive characterization of the TF regulatory network is achieved by identifying the TFBSs, the genes regulated, and the method of regulation. Advances in computational biology and data processing have given rise to inclusive databases that can Int. J. Mol. Sci. 2020, 21, 7929; doi:10.3390/ijms21217929 www.mdpi.com/journal/ijms Int.Int. J. Mol. Sci. 20202020, 2211, x 7929 FOR PEER REVIEW 22 ofof 1818 databases that can predict structure and function for TFs in new model organisms [7]. However, most predictof these structure databases and are function built from for experimental TFs in new model studies. organisms [7]. However, most of these databases are builtTo fromgain experimentalinsights into studies.transcriptional regulatory networks in extremophile organisms, our laboratoryTo gain has insights employed into transcriptional a novel biochemistry regulatory-based networks method, in extremophile Restriction organisms,Endonuclease, our laboratorySelection, hasProtection, employed and a novelAmplification biochemistry-based (REPSA), to method, characterize Restriction several Endonuclease, TFs in the extrem Selection,e therm Protection,ophilic andmodel Amplification organism Thermus (REPSA), thermophilus to characterize HB8. To severaldate, w TFse have in studied the extreme four tetracycline thermophilic repressor model organismprotein (TetRThermus) family thermophilus transcriptionalHB8. suppressors To date, we and have have studied successfully four tetracycline identified their repressor TFBSs protein [8–11]. (TetR)Commonly family, suppressors transcriptional bind DNA suppressors in the absence and have of small successfully-molecule identifiedmodulators/cofactors their TFBSs and [8 –with11]. Commonly,high-affinity. suppressors Contrary, numerous bind DNA transcriptional in the absence activators of small-molecule employ small modulators-molecule /modulatorscofactors and in withorder high-a to bindffi nity.DNA Contrary,, thus complicating numerous their transcriptional analysis in activators vitro. employ small-molecule modulators in orderIn this to bind study, DNA, we thus explore complicating the utility their of analysis REPSA in vitro.to identify and characterize a potential thermophilicIn this study,transcriptional we explore activator, the utility TTHB099. of REPSA Protein tosequence identify homology and characterize analysis indicates a potential that thermophilicTTHB099 is one transcriptional of the four cAMP activator, receptor TTHB099. protein Protein(CRP) family sequence members homology (TTHA1437, analysis TTHA1359, indicates thatTTHB099, TTHB099 and is TTHA1567) one of the four in T cAMP. thermophilus receptor HB8 protein and (CRP)should family bind memberspalindromic (TTHA1437, DNA sequences TTHA1359, as a TTHB099,homodimer and [12 TTHA1567)]. However, in despiteT. thermophilus having a HB8cAMP and binding should domain, bind palindromic it does not DNArequire sequences this cofactor as a homodimerto bind DNA. [12 Here,]. However, we identified despite havingthe preferred a cAMP DNA binding-binding domain, sequence it does for not TTHB099 require this as the cofactor 16-mer to bindmotif: DNA. 5′–TGT(A/g)n(t/c)c(t/c)(a/g)g(a/g)n(T/c)ACA Here, we identified the preferred DNA-binding–3′. Furthermore, sequence for TTHB099we used as binding the 16-mer kinetics motif: 5studies0–TGT(A and/g)n(t mRNA/c)c(t expression/c)(a/g)g(a/ g)n(Tdata to/c)ACA–3 validate0 .potential Furthermore, biological we used roles binding of TTHB099. kinetics studies and mRNA expression data to validate potential biological roles of TTHB099. 2. Results 2. Results 2.1. Preferred TTHB099-Binding Sequences Selected Via REPSA 2.1. Preferred TTHB099-Binding Sequences Selected Via REPSA REPSA was used to select for TTHB099-binding sites present in a large pool (~60 billion molecules)REPSA wasof synthesize used to selectd double for TTHB099-binding-stranded DNA sites. Our present selection in a large library pool (~60, ST2R24 billion, molecules)has been ofsuccessfully synthesized used double-stranded in four previous DNA. studies Our selection [8–11]. library, IRDye ST2R24,® 700 (IRD7) has been-labeled successfully library used DNA in fourwas ® previousincubated studies with purified [8–11]. IRDye TTHB099700 protein (IRD7)-labeled to permit library specific DNA binding, was incubated then challenged with purified by a TTHB099 type IIS proteinrestriction to permitendonuclease specific (IISRE) binding,. Sequence then challenged-specific binding by a type of IISTTHB099 restriction to a endonuclease subset of the (IISRE).library Sequence-specificprotected those oligonucleotides binding of TTHB099 from endonuclease to a subset of activity the library, thereby protected permitting those their oligonucleotides amplification fromby PCR. endonuclease Seven rounds activity, of binding, thereby permittingIISRE cleavage their, amplificationand PCR res byulted PCR. in Seventhe enrichment rounds of binding,of DNA IISREresistant cleavage, to IISRE and cleavage PCR resulted when in TTHB099 the enrichment was present of DNA (Figure resistant 1, Round to IISRE 7) cleavage. Note that when in TTHB099Round 4, wassubstantial present uncut (Figure DNA1, Round appeared 7). Noteon the that IISRE in Roundcontrol 4,lane substantial (–/F) as well uncut as DNAthe test appeared lane (+/F) on. This the IISREnonspecific control cleavage lane (–/ F)inhibiti as wellon ashas the been test observed lane (+/F). before This and nonspecific has been cleavage ascribed inhibition to the selection has been of observedFokI cleavage before-resistant and has sequences been ascribed [8,13]. to Thus, the selection subsequent of FokIrounds cleavage-resistant of REPSA were sequencesperformed [ 8with,13]. Thus,an alternative subsequent, albeit rounds less of efficient REPSA IISRE wereperformed (BpmI). with an alternative, albeit less efficient IISRE (BpmI). Round 1 Round 2 Round 3 Round 4 Round 5 Round 6 Round 7 –/– –/F +/F –/– –/F +/F –/– –/F +/F –/– –/F +/F –/– –/B +/B –/– –/B +/B –/– –/B +/B 099 / RE - T - X - D - P FigureFigure 1.1. Selection of TTHB099-bindingTTHB099-binding DNA sequences.sequences. Shown Shown are are IR fluorescencefluorescence images ofof restrictionrestriction endonucleaseendonuclease cleavage-protection cleavage-protection assays assays made made during during Rounds Rounds 1–7 of1–7 REPSA of REPSA selection selection with 50.6with nM 50.6 TTHB099 nM TTHB099 protein. protein The presence. The presence (+) or absence
Recommended publications
  • DNA Sequencing and Sorting: Identifying Genetic Variations
    BioMath DNA Sequencing and Sorting: Identifying Genetic Variations Student Edition Funded by the National Science Foundation, Proposal No. ESI-06-28091 This material was prepared with the support of the National Science Foundation. However, any opinions, findings, conclusions, and/or recommendations herein are those of the authors and do not necessarily reflect the views of the NSF. At the time of publishing, all included URLs were checked and active. We make every effort to make sure all links stay active, but we cannot make any guaranties that they will remain so. If you find a URL that is inactive, please inform us at [email protected]. DIMACS Published by COMAP, Inc. in conjunction with DIMACS, Rutgers University. ©2015 COMAP, Inc. Printed in the U.S.A. COMAP, Inc. 175 Middlesex Turnpike, Suite 3B Bedford, MA 01730 www.comap.com ISBN: 1 933223 71 5 Front Cover Photograph: EPA GULF BREEZE LABORATORY, PATHO-BIOLOGY LAB. LINDA SHARP ASSISTANT This work is in the public domain in the United States because it is a work prepared by an officer or employee of the United States Government as part of that person’s official duties. DNA Sequencing and Sorting: Identifying Genetic Variations Overview Each of the cells in your body contains a copy of your genetic inheritance, your DNA which has been passed down to you, one half from your biological mother and one half from your biological father. This DNA determines physical features, like eye color and hair color, and can determine susceptibility to medical conditions like hypertension, heart disease, diabetes, and cancer.
    [Show full text]
  • Escherichia Coli: Protein Families and Binding Sites M
    Functional determinants of transcription factors in Escherichia coli: protein families and binding sites M. Madan Babu and Sarah A. Teichmann DNA-binding transcription factors regulate the expression A reasonable hypothesis would be to assume that of genes near to where they bind. These factors can be activators, repressors and dual regulators are more activators or repressors of transcription, or both. Thus, a closely related to the other proteins within each fundamental question is what determines whether a regulatory group than to proteins in other groups. This transcription factor acts as an activator or repressor? would mean that there would be protein families and Previous research into this question found that a protein's domain architectures that were characteristic either of regulatory function is determined by one or more of the activators, repressors or dual regulators. Prag et al. [10] following factors: protein–proteinprotein–protein contacts, position of the and Perez-Rueda et al. [11] reported evidence to support DNA-binding domain in the protein primary sequence, this by relating the position of helix-turn-helix motifs altered DNA structure, and the position of its binding site along the primary sequence to a protein's regulatory on the DNA relative to the transcription start site. Although function. They used sequence comparison and profile there are many aspects specific to different transcription methods to locate the helix-turn-helix motifs that factors, in this work we demonstrate that, in general, in the represent DBDs, and found that the helix-turn-helix prokaryote Escherichia coli, a transcription factor's protein tends to be at the N terminus for repressors and at the C family is not indicative of its regulatory function, but the terminus for activators.
    [Show full text]
  • Regulondb (Version 6.0): Gene Regulation Model Of
    D120–D124 Nucleic Acids Research, 2008, Vol. 36, Database issue doi:10.1093/nar/gkm994 RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation Socorro Gama-Castro1, Vero´ nica Jime´ nez-Jacinto1, Martı´n Peralta-Gil1, Alberto Santos-Zavaleta1,Mo´ nica I. Pen˜ aloza-Spinola1, Bruno Contreras-Moreira1, Juan Segura-Salazar1, Luis Mun˜ iz-Rascado1, Irma Martı´nez-Flores1, Heladia Salgado1,Ce´ sar Bonavides-Martı´nez1, Cei Abreu-Goodger1, Carlos Rodrı´guez-Penagos1, Juan Miranda-Rı´os2, Enrique Morett2, Enrique Merino2, Araceli M. Huerta1, Luis Trevin˜ o-Quintanilla1 and Julio Collado-Vides1,* Downloaded from 1Program of Computational Genomics, Centro de Ciencias Geno´ micas, Universidad Nacional Auto´ noma de Me´ xico, A.P. 565-A, Cuernavaca, Morelos 62100, Mexico and 2Instituto de Biotecnologı´a, Universidad Nacional Auto´ noma de Me´ xico, A.P. 510-3, Cuernavaca, Morelos 62100, Mexico http://nar.oxfordjournals.org/ Received September 14, 2007; Revised October 19, 2007; Accepted October 22, 2007 ABSTRACT with more than 4000 curation notes, can now be RegulonDB (http://regulondb.ccg.unam.mx/) is the searched with the Textpresso text mining engine. primary reference database offering curated knowl- edge of the transcriptional regulatory network at Carleton University on June 21, 2015 of Escherichia coli K12, currently the best-known INTRODUCTION electronically encoded database of the genetic A major current task for bioinformatics is that of repre- regulatory network of any free-living organism. senting biological information into an electronic and This paper summarizes the improvements, new computable form. These computational representations biology and new features available in version 6.0.
    [Show full text]
  • Description Data File Programa De Genómica Computacional / Centro De Ciencias Genómicas
    Description Data File Programa de Genómica Computacional / Centro de Ciencias Genómicas DESCRIPTION DATA FILE 1. GENERAL INFORMATION. Title: LeuO and H-NS Data Set Reference: Shimada T, Bridier A, Briandet R, Ishihama A. Novel roles of LeuO in transcription regulation of E. coli genome: antagonistic interplay with the universal silencer H-NS. Mol Microbiol. 2011 Oct;82(2):378-97. PMID: 21883529 Contact person for this data set: Questions concerning the content of the data set that are raised by users of RegulonDB will be forwarded to this person. We would appreciate receiving a copy of the response to the user, so we can keep track of taking care of user requests. Person: RegulonDB staff Email address: [email protected] 2. DATA SET DESCRIPTION. Summary: LeuO Data Set. LeuO, the regulator of the leucine biosynthesis operon of Escherichia coli, is involved in the regulation of as-yet-unspecified genes that affect the stress response and pathogenesis expression. LeuO is involved in an antagonistic interplay with the universal silencer H-NS. Genomic SELEX screening was performed to identify the whole set of LeuO and H-NS regulation targets. A total of 140 LeuO-binding peaks (183 regulatory interactions) were identified by SELEX methodology, of which as many as 133 (95%) were found to contain the binding site of H-NS. In addition, a DNA microarray assay was carried out to identify genes affected by deletion or overexpression of the leuO gene. A total of 35 regulatory interactions were found via SELEX as well as microarray analysis. Based on the behavior of the regulated genes in the microarray analysis, from which it was determined that LeuO acts as an activator for 18 of them, 13 as a repressor and 4 as a dual function; we deduced that LeuO acts as a dual regulator.
    [Show full text]
  • A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection
    ChromatoGate: A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection Nikolaos Alachiotis Emmanouella Vogiatzi∗ Scientific Computing Group Institute of Marine Biology and Genetics HITS gGmbH HCMR Heidelberg, Germany Heraklion Crete, Greece [email protected] [email protected] Pavlos Pavlidis Alexandros Stamatakis Scientific Computing Group Scientific Computing Group HITS gGmbH HITS gGmbH Heidelberg, Germany Heidelberg, Germany [email protected] [email protected] ∗ Affiliated also with the Department of Genetics and Molecular Biology of the Democritian University of Thrace at Alexandroupolis, Greece. Corresponding author: Nikolaos Alachiotis Keywords: chromatograms, software, mis-calls Abstract Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translates the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Since chromatogram translation programs frequently introduce errors, a manual inspection of the generated sequence data is required. As sequence numbers and lengths increase, visual inspection and manual correction of chromatograms and corresponding sequences on a per-peak and per-nucleotide basis becomes an error-prone, time-consuming, and tedious process. Here, we introduce ChromatoGate (CG), an open-source software that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To provide users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability.
    [Show full text]
  • A Brief History of Sequence Logos
    Biostatistics and Biometrics Open Access Journal ISSN: 2573-2633 Mini-Review Biostat Biometrics Open Acc J Volume 6 Issue 3 - April 2018 Copyright © All rights are reserved by Kushal K Dey DOI: 10.19080/BBOAJ.2018.06.555690 A Brief History of Sequence Logos Kushal K Dey* Department of Statistics, University of Chicago, USA Submission: February 12, 2018; Published: April 25, 2018 *Corresponding author: Kushal K Dey, Department of Statistics, University of Chicago, 5747 S Ellis Ave, Chicago, IL 60637, USA. Tel: 312-709- 0680; Email: Abstract For nearly three decades, sequence logo plots have served as the standard tool for graphical representation of aligned DNA, RNA and protein sequences. Over the years, a large number of packages and web applications have been developed for generating these logo plots and using them handling and the overall scope of these plots in biological applications and beyond. Here I attempt to review some popular tools for generating sequenceto identify logos, conserved with a patterns focus on in how sequences these plots called have motifs. evolved Also, over over time time, since we their have origin seen anda considerable how I view theupgrade future in for the these look, plots. flexibility of data Keywords : Graphical representation; Sequence logo plots; Standard tool; Motifs; Biological applications; Flexibility of data; DNA sequence data; Python library; Interdependencies; PLogo; Depletion of symbols; Alphanumeric strings; Visualizes pairwise; Oligonucleotide RNA sequence data; Visualize succinctly; Predictive power; Initial attempts; Widespread; Stylistic configurations; Multiple sequence alignment; Introduction based comparisons and predictions. In the next section, we The seeds of the origin of sequence logos were planted in review the modeling frameworks and functionalities of some of early 1980s when researchers, equipped with large amounts of these tools [5].
    [Show full text]
  • Bioinformatics Manual Sequence Data Analysis with CLC Main Workbench
    Introduction to Molecular Biology and Bioinformatics Workshop Biosciences eastern and central Africa - International Livestock Research Institute Hub (BecA-ILRI Hub), Nairobi, Kenya May 5-16, 2014 Bioinformatics Manual Sequence data analysis with CLC Main Workbench Written and compiled by Joyce Njuguna, Mark Wamalwa, Rob Skilton 1 Getting started with CLC Main Workbench A. Quality control of sequenced data B. Assembling sequences C. Nucleotide and protein sequence manipulation D. BLAST sequence search E. Aligning sequences F. Building phylogenetic trees Welcome to CLC Main Workbench -- a user-friendly sequence analysis software package for analysing Sanger sequencing data and for supporting your daily bioinformatics work. Definition Sequence: the order of nucleotide bases [or amino acids] in a DNA [or protein] molecule DNA Sequencing: Biochemical methods used to determine the order of nucleotide bases, adenine(A), guanine(G),cytosine(C)and thymine(T) in a DNA strand TIP CLC Main Workbench includes an extensive Help function, which can be found in the Help menu of the program’s Menu bar. The Help can also be shown by pressing F1. The help topics are sorted in a table of contents and the topics can be searched. Also, it is recommended that you view the Online presentations where a product specialist from CLC bio demonstrates the software. This is a very easy way to get started using the program. Read more about online presentations here: http://clcbio.com/presentation. A. Getting Started with CLC Main Workbench 1. Download the trial version of CLC Main Workbench 6.8.1 from the following URL: http://www.clcbio.com/products/clc-main-workbench-direct-download/ 2.
    [Show full text]
  • A STAT Protein Domain That Determines DNA Sequence Recognition Suggests a Novel DNA-Binding Domain
    Downloaded from genesdev.cshlp.org on September 25, 2021 - Published by Cold Spring Harbor Laboratory Press A STAT protein domain that determines DNA sequence recognition suggests a novel DNA-binding domain Curt M. Horvath, Zilong Wen, and James E. Darnell Jr. Laboratory of Molecular Cell Biology, The Rockefeller University, New York, New York 10021 Statl and Stat3 are two members of the ligand-activated transcription factor family that serve the dual functions of signal transducers and activators of transcription. Whereas the two proteins select very similar (not identical) optimum binding sites from random oligonucleotides, differences in their binding affinity were readily apparent with natural STAT-binding sites. To take advantage of these different affinities, chimeric Statl:Stat3 molecules were used to locate the amino acids that could discriminate a general binding site from a specific binding site. The amino acids between residues -400 and -500 of these -750-amino-acid-long proteins determine the DNA-binding site specificity. Mutations within this region result in Stat proteins that are activated normally by tyrosine phosphorylation and that dimerize but have greatly reduced DNA-binding affinities. [Key Words: STAT proteins; DNA binding; site selection] Received January 6, 1995; revised version accepted March 2, 1995. The STAT (signal transducers and activators if transcrip- Whereas oligonucleotides representing these selected se- tion) proteins have the dual purpose of, first, signal trans- quences exhibited slight binding preferences, the con- duction from ligand-activated receptor kinase com- sensus sites overlapped sufficiently to be recognized by plexes, followed by nuclear translocation and DNA bind- both factors. However, by screening different natural ing to activate transcription (Darnell et al.
    [Show full text]
  • Protein-DNA Interactions II
    Protein-DNA Interactions II Bio 5488 Overview 1. Review of information content and weight matrices 2. Online PWM resources 3. Motif discovery: Greedy algorithm 4. Motif discovery: Gibbs sampling Information Content EcoR1 Random Rap1 GAATTC GCCTAC TGTATGGGTG GAATTC ACATTC TGTTCGGATT GAATTC TCATTC TGCATGGGTG GAATTC CGACTC TGTACAGGTG GAATTC GAATTC TGTATGGATG GAATTC ATATCG TGTTCGGGTT GAATTC GAAATG TGTATGGGTG Information Content Matrix of Frequencies A: 0.1 0.7 0.2 0.3 0.4 0.1 C: 0.1 0.1 0.1 0.3 0.2 0.1 G: 0.1 0.1 0.2 0.1 0.2 0.1 T: 0.7 0.1 0.5 0.3 0.2 0.7 aka Relative Entropy, Kullbach-Liebler Distance Obtaining a Weight Matrix Hertz and Stormo, Bioinformatics (1999) Online tools: PWMs JASPAR (jaspar.genereg.net) open-access curated, non-redundant, PWMs for multi-cellular eukaryotes hundreds of PWMs JASPAR_FAM - metamodels for structural families Online tools: PWMs AC M00213 XX ID F$RAP1_C XX DT 05.09.1995 (created); dbo. DT 30.11.1995 (updated); ewi. CO Copyright (C), Biobase GmbH. XX NA RAP1 XX DE yeast repressor/activator protein 1 XX BF T00715 RAP1; Species: yeast, Saccharomyces cerevisiae. XX PO A C G T 01 6.89 2.40 0.79 4.74 W 02 8.80 0.00 4.58 1.43 R 03 7.20 6.83 0.79 0.00 M 04 14.82 0.00 0.00 0.00 A TRANSFAC 05 0.00 14.82 0.00 0.00 C 06 0.00 14.82 0.00 0.00 C Free access w/ reduced functionality 07 0.00 14.82 0.00 0.00 C 08 14.82 0.00 0.00 0.00 A to professional version 09 2.13 0.67 2.87 9.15 T 10 11.92 0.76 1.49 0.64 A 11 0.76 14.06 0.00 0.00 C 12 11.45 3.36 0.00 0.00 A Public version dates to 2005 13 0.79 5.64 0.00 7.10 Y 14 1.64 7.18 1.61 4.39 Y XX BA total weight of sequences: 14.82 XX CC consind generated matrix (random_expectation: 0.03) XX // (www.gene-regulation.com/pub/databases.html/#transfac) Online tools: PWMs RegulonDB (regulondb.ccg.unam.mx) Bacterial regulons and operons in E.
    [Show full text]
  • A Database of Regulatory Networks in Gamma-Proteobacterial Genomes Abel D
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Open Repository and Bibliography - Luxembourg D98–D102 Nucleic Acids Research, 2005, Vol. 33, Database issue doi:10.1093/nar/gki054 TRACTOR_DB: a database of regulatory networks in gamma-proteobacterial genomes Abel D. Gonza´lez, Vladimir Espinosa, Ana T. Vasconcelos1, Ernesto Pe´rez-Rueda2 and Julio Collado-Vides3,* National Bioinformatics Center, Industria y San Jose´, Capitolio Nacional, CP. 10200, Habana Vieja, Habana, Cuba, 1National Laboratory for Scientific Computing, Avenue Getulio Vargas 333, Quitandinha, CEP 25651-075, Petropolis, Rio de Janeiro, Brazil, 2Depto. de Ingenierı`a Celular y Biocata´lisis, IBT-UNAM, Cuernavaca, Morelos, Mexico and 3Center of Genomics, UNAM, AP 565-A Cuernavaca, CP. 62100, Morelos, Mexico Received August 6, 2004; Revised and Accepted October 1, 2004 Downloaded from ABSTRACT out in this direction (1–3). The first step towards this goal is the recognition of all the genes regulated by a transcription factor Experimental data on the Escherichia coli (TF), i.e. its regulon. transcriptional regulatory system has been used in Computational approaches to recognizing the location of http://nar.oxfordjournals.org/ the past years to predict new regulatory elements regulatory sites in bacterial genomes include the use of weight (promoters, transcription factors (TFs), TFs’ binding matrices (4), phylogenetic footprinting (5), searching for sites and operons) within its genome. As more gen- statistical overrepresentation of oligonucleotides within a omes of gamma-proteobacteria are being sequenced, genome and clustering co-expressed genes in order to find the prediction of these elements in a growing number conserved patterns in their upstream regions (6,7), among of organisms has become more feasible, as a others (8,9).
    [Show full text]
  • Bioinformatics I Sanger Sequence Analysis
    Updated: October 2020 Lab 5: Bioinformatics I Sanger Sequence Analysis Project Guide The Wolbachia Project Page Contents 3 Activity at a Glance 4 Technical Overview 5-7 Activity: How to Analyze Sanger Sequences 8 DataBase Entry 9 Appendix I: Illustrated BLAST Alignment 10 Appendix II: Sanger Sequencing Quick Reference Content is made available under the Creative Commons Attribution-NonCommercial-No Derivatives International License. Contact ([email protected]) if you would like to make adaptations for distribution beyond the classroom. The Wolbachia Project: Discover the Microbes Within! was developed by a collaboration of scientists, educators, and outreach specialists. It is directed by the Bordenstein Lab at Vanderbilt University. https://www.vanderbilt.edu/wolbachiaproject 2 Activity at a Glance Goals To analyze and interpret the quality of Sanger sequences To generate a consensus DNA sequence for bioinformatics analyses Learning Objectives Upon completion of this activity, students will (i) understand the Sanger method of sequencing, also known as the chain-termination method; (ii) be able to interpret chromatograms; (iii) evaluate sequencing Quality Scores; and (iv) generate a consensus DNA sequence based on forward and reverse Sanger reactions. Prerequisite Skills While no computer programming skills are necessary to complete this work, prior exposure to personal computers and the Internet is assumed. Teaching Time: One class period Recommended Background Tutorials • DNA Learning Center Animation: Sanger Method of
    [Show full text]
  • Identification of Factors That Interact with the E IA-Inducible Adenovirus E3 Promoter
    Downloaded from genesdev.cshlp.org on September 30, 2021 - Published by Cold Spring Harbor Laboratory Press Identification of factors that interact with the E IA-inducible adenovirus E3 promoter Helen C. Hurst and Nicholas C. Jones Gene Regulation Group, Imperial Cancer Research Fund, London, England, WC2A 3PX We have investigated the E1A-inducible E3 promoter of adenovirus type 5 with respect to its ability to bind specific nuclear proteins. Four distinct nucleoprotein-binding sites were detected, located between positions - 7 to -33, -44 to -68, -81 to - 103, and - 154 to - 183, relative to the E3 cap site. These sites contain sequences previously shown to be functionally important for efficient E3 transcription. No major qualitative or quantitative differences were found in the binding pattern between nucleoprotein extracts prepared from uninfected or adenovirus-infected HeLa cells. Competition experiments suggest that the factors binding to the - 154 to - 183 and -81 to - 103 sites are the previously identified nucleoproteins, NFI and AP1, respectively. The factor binding to the -44 to -68 site, which we term ATF, also interacts with other E1A-inducible promoters and is very similar and probably identical to the factor that binds to the cAMP-responsive element of somatostatin. We have purified this factor, which is a protein of 43 kD in size. [Key Words: Trans-activation; gel retardation; footprinting; nuclear factors] Received June 12, 1987; revised version accepted October 5, 1987. The control of the rate of transcription initiation is an (Borrelli et al. 1984; Velcich and Ziff 1985), whereas the important element in the regulation of eukaryotic gene protein of 243 amino acids, although it represses very expression.
    [Show full text]