Identification of a Consensus DNA-Binding Site for The
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Locating Mammalian Transcription Factor Binding Sites: a Survey of Computational and Experimental Techniques
Downloaded from genome.cshlp.org on September 26, 2021 - Published by Cold Spring Harbor Laboratory Press Review Locating mammalian transcription factor binding sites: A survey of computational and experimental techniques Laura Elnitski,1,4 Victor X. Jin,2 Peggy J. Farnham,2 and Steven J.M. Jones3 1Genomic Functional Analysis Section, National Human Genome Research Institute, National Institutes of Health, Rockville, Maryland 20878, USA; 2Genome and Biomedical Sciences Facility, University of California–Davis, Davis, California 95616-8645, USA; 3Genome Sciences Centre, British Columbia Cancer Research Centre, Vancouver, British Columbia, Canada V5Z-4E6 Fields such as genomics and systems biology are built on the synergism between computational and experimental techniques. This type of synergism is especially important in accomplishing goals like identifying all functional transcription factor binding sites in vertebrate genomes. Precise detection of these elements is a prerequisite to deciphering the complex regulatory networks that direct tissue specific and lineage specific patterns of gene expression. This review summarizes approaches for in silico, in vitro, and in vivo identification of transcription factor binding sites. A variety of techniques useful for localized- and high-throughput analyses are discussed here, with emphasis on aspects of data generation and verification. [Supplemental material is available online at www.genome.org.] One documented goal of the National Human Genome Research and other regulatory elements is mediated by DNA/protein in- Institute (NHGRI) is the identification of all functional noncod- teractions. Thus, one major step in the characterization of the ing elements in the human genome (ENCODE Project Consor- functional elements of the human genome is the identification tium 2004). -
Specificity Landscapes Unmask Submaximal Binding Site Preferences of Transcription Factors
Specificity landscapes unmask submaximal binding site preferences of transcription factors Devesh Bhimsariaa,b,1, José A. Rodríguez-Martíneza,2, Junkun Panc, Daniel Rostonc, Elif Nihal Korkmazc, Qiang Cuic,3, Parameswaran Ramanathanb, and Aseem Z. Ansaria,d,4 aDepartment of Biochemistry, University of Wisconsin–Madison, Madison, WI 53706; bDepartment of Electrical and Computer Engineering, University of Wisconsin–Madison, Madison, WI 53706; cDepartment of Chemistry, University of Wisconsin–Madison, Madison, WI 53706; and dThe Genome Center of Wisconsin, University of Wisconsin–Madison, Madison, WI 53706 Edited by Michael Levine, Princeton University, Princeton, NJ, and approved September 24, 2018 (received for review July 13, 2018) We have developed Differential Specificity and Energy Landscape Here, we report the development of Differential Specificity (DiSEL) analysis to comprehensively compare DNA–protein inter- and Energy Landscapes (DiSEL) to compare experimental actomes (DPIs) obtained by high-throughput experimental plat- platforms, computational methods, and interactomes of TFs, forms and cutting edge computational methods. While high-affinity especially those factors that bind identical consensus motifs. Our DNA binding sites are identified by most methods, DiSEL uncovered results reveal that (i) most high-throughput experimental plat- nuanced sequence preferences displayed by homologous transcription forms reliably identify high-affinity motifs but yield less reliable factors. Pairwise analysis of 726 DPIs uncovered homolog-specific dif- information on submaximal sites; (ii) with few exceptions, com- ferences at moderate- to low-affinity binding sites (submaximal sites). putational methods model DPIs with a focus on high-affinity DiSEL analysis of variants of 41 transcription factors revealed that sites; (iii) submaximal sites improve the annotation of biologi- many disease-causing mutations result in allele-specific changes in cally relevant binding sites across genomes; (iv) among members binding site preferences. -
DNA Sequencing and Sorting: Identifying Genetic Variations
BioMath DNA Sequencing and Sorting: Identifying Genetic Variations Student Edition Funded by the National Science Foundation, Proposal No. ESI-06-28091 This material was prepared with the support of the National Science Foundation. However, any opinions, findings, conclusions, and/or recommendations herein are those of the authors and do not necessarily reflect the views of the NSF. At the time of publishing, all included URLs were checked and active. We make every effort to make sure all links stay active, but we cannot make any guaranties that they will remain so. If you find a URL that is inactive, please inform us at [email protected]. DIMACS Published by COMAP, Inc. in conjunction with DIMACS, Rutgers University. ©2015 COMAP, Inc. Printed in the U.S.A. COMAP, Inc. 175 Middlesex Turnpike, Suite 3B Bedford, MA 01730 www.comap.com ISBN: 1 933223 71 5 Front Cover Photograph: EPA GULF BREEZE LABORATORY, PATHO-BIOLOGY LAB. LINDA SHARP ASSISTANT This work is in the public domain in the United States because it is a work prepared by an officer or employee of the United States Government as part of that person’s official duties. DNA Sequencing and Sorting: Identifying Genetic Variations Overview Each of the cells in your body contains a copy of your genetic inheritance, your DNA which has been passed down to you, one half from your biological mother and one half from your biological father. This DNA determines physical features, like eye color and hair color, and can determine susceptibility to medical conditions like hypertension, heart disease, diabetes, and cancer. -
DNA Binding Site of the Growth Factor-Inducible Protein Zif268
Proc. Natl. Acad. Sci. USA Vol. 86, pp. 8737-8741, November 1989 Biochemistry DNA binding site of the growth factor-inducible protein Zif268 ("zinc fingers"/transcription factor/cell growth) BARBARA CHRISTY AND DANIEL NATHANS* Howard Hughes Medical Institute and the Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205 Contributed by Daniel Nathans, August 31, 1989 ABSTRACT Zif268, a zinc finger protein whose mRNA is The recombinant plasmid was transfected into E. coli strain rapidly activated in cells exposed to growth factors or other BL21(DE3) and the transfected cells were grown as described signaling agents, is thought to play a role in regulating the (22) to an optical density of about 0.6 before induction of T7 genetic program induced by extracellular ligands. We report RNA polymerase with 0.4 mM isopropyl B-D-thiogalacto- that Zif268 has one of the characteristics of a transcriptional pyranoside (IPTG). Bacterial extracts used for DNA binding regulator, namely, sequence-specific binding to DNA. Zif268 were prepared as described (23). After solubilization of synthesized in Escherichia coli bound to two sites upstream of proteins with 4 M urea the preparation was dialyzed against the zif268 gene and to sites in the promoter regions of other 20 mM Tris, pH 7.7/50 mM KCI/10 mM MgCl2/i mM genes. The nucleotide sequences responsible for binding were EDTA/10 ,uM ZnSO4/20% (vol/vol) glycerol/1 mM dithio- defined by DNase I footprinting, by methylation interference threitol/0.2 mM phenylmethylsulfonyl fluoride/1 mM so- experiments, and by use of synthetic oligonucleotides. -
Direct Inhibition of the DNA-Binding Activity of POU Transcription Factors
Published online 25 April 2008 Nucleic Acids Research, 2008, Vol. 36, No. 10 3341–3353 doi:10.1093/nar/gkn208 Direct inhibition of the DNA-binding activity of POU transcription factors Pit-1 and Brn-3 by selective binding of a phenyl-furan-benzimidazole dication Paul Peixoto1,2, Yang Liu3, Sabine Depauw1,2, Marie-Paule Hildebrand1,2,4, David W. Boykin3, Christian Bailly1,2, W. David Wilson3 and Marie-He´ le` ne David-Cordonnier1,2,* Downloaded from https://academic.oup.com/nar/article/36/10/3341/2410535 by guest on 23 September 2021 1INSERM U-837, Team 4-‘Molecular and cellular targeting for cancer treatment’, Jean-Pierre Aubert Research Center, Institut de Recherches sur le Cancer de Lille, Place de Verdun, F-59045 Lille, 2IMPRT-IFR114, Lille, France, 3Department of Chemistry, Georgia State University, Atlanta, GA, USA and 4Institut de Recherches sur le Cancer de Lille - IRCL, Lille, France Received February 5, 2008; Revised and Accepted April 8, 2008 ABSTRACT control of transcription factors activity by those The development of small molecules to control gene compounds. expression could be the spearhead of future- targeted therapeutic approaches in multiple pathol- ogies. Among heterocyclic dications developed with INTRODUCTION this aim, a phenyl-furan-benzimidazole dication The aim of exerting precise control over the expression DB293 binds AT-rich sites as a monomer and level of specified genes using a small molecule drug is an ’ 5 -ATGA sequence as a stacked dimer, both in the objective with major consequences for many therapeutic minor groove. Here, we used a protein/DNA array applications including cancer, chronic inflammatory dis- approach to evaluate the ability of DB293 to specif- orders, neuro-degenerative or cardiovascular diseases ically inhibit transcription factors DNA-binding in a (1–4). -
Preferences in a Trait Decision Determined by Transcription Factor Variants
Preferences in a trait decision determined by transcription factor variants Michael W. Dorritya,b, Josh T. Cuperusa,c, Jolie A. Carlislea, Stanley Fieldsa,c,d,1, and Christine Queitscha,1 aDepartment of Genome Sciences, University of Washington, Seattle, WA 98195; bDepartment of Biology, University of Washington, Seattle, WA 98195; cHoward Hughes Medical Institute, University of Washington, Seattle, WA 98195; and dDepartment of Medicine, University of Washington, Seattle, WA 98195 Contributed by Stanley Fields, June 27, 2018 (sent for review April 9, 2018; reviewed by Leah E. Cowen and Andrew W. Murray) Few mechanisms are known that explain how transcription factors mating, Ste12 can bind at pheromone-responsive genes as a can adjust phenotypic outputs to accommodate differing environ- homodimer or with the cofactors Mcm1 or Matα1 (10–13). The ments. In Saccharomyces cerevisiae, the decision to mate or invade consensus DNA-binding site of Ste12 is TGAAAC, known as the relies on environmental cues that converge on a shared transcription pheromone response element (PRE) (11). For invasion, Ste12 factor, Ste12. Specificity toward invasion occurs via Ste12 binding and its cofactor Tec1 are both required to activate genes that me- cooperatively with the cofactor Tec1. Here, we determine the range diate filamentation (8, 14, 15). Some of these genes contain a of phenotypic outputs (mating vs. invasion) of thousands of DNA- Ste12 binding site near a Tec1 consensus sequence of GAATGT, an binding domain variants in Ste12 to understand how preference for organization for heterodimeric binding known as a filamentation invasion may arise. We find that single amino acid changes in the response element (FRE). -
A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection
ChromatoGate: A Tool for Detecting Base Mis-Calls in Multiple Sequence Alignments by Semi-Automatic Chromatogram Inspection Nikolaos Alachiotis Emmanouella Vogiatzi∗ Scientific Computing Group Institute of Marine Biology and Genetics HITS gGmbH HCMR Heidelberg, Germany Heraklion Crete, Greece [email protected] [email protected] Pavlos Pavlidis Alexandros Stamatakis Scientific Computing Group Scientific Computing Group HITS gGmbH HITS gGmbH Heidelberg, Germany Heidelberg, Germany [email protected] [email protected] ∗ Affiliated also with the Department of Genetics and Molecular Biology of the Democritian University of Thrace at Alexandroupolis, Greece. Corresponding author: Nikolaos Alachiotis Keywords: chromatograms, software, mis-calls Abstract Automated DNA sequencers generate chromatograms that contain raw sequencing data. They also generate data that translates the chromatograms into molecular sequences of A, C, G, T, or N (undetermined) characters. Since chromatogram translation programs frequently introduce errors, a manual inspection of the generated sequence data is required. As sequence numbers and lengths increase, visual inspection and manual correction of chromatograms and corresponding sequences on a per-peak and per-nucleotide basis becomes an error-prone, time-consuming, and tedious process. Here, we introduce ChromatoGate (CG), an open-source software that accelerates and partially automates the inspection of chromatograms and the detection of sequencing errors for bidirectional sequencing runs. To provide users full control over the error correction process, a fully automated error correction algorithm has not been implemented. Initially, the program scans a given multiple sequence alignment (MSA) for potential sequencing errors, assuming that each polymorphic site in the alignment may be attributed to a sequencing error with a certain probability. -
A Brief History of Sequence Logos
Biostatistics and Biometrics Open Access Journal ISSN: 2573-2633 Mini-Review Biostat Biometrics Open Acc J Volume 6 Issue 3 - April 2018 Copyright © All rights are reserved by Kushal K Dey DOI: 10.19080/BBOAJ.2018.06.555690 A Brief History of Sequence Logos Kushal K Dey* Department of Statistics, University of Chicago, USA Submission: February 12, 2018; Published: April 25, 2018 *Corresponding author: Kushal K Dey, Department of Statistics, University of Chicago, 5747 S Ellis Ave, Chicago, IL 60637, USA. Tel: 312-709- 0680; Email: Abstract For nearly three decades, sequence logo plots have served as the standard tool for graphical representation of aligned DNA, RNA and protein sequences. Over the years, a large number of packages and web applications have been developed for generating these logo plots and using them handling and the overall scope of these plots in biological applications and beyond. Here I attempt to review some popular tools for generating sequenceto identify logos, conserved with a patterns focus on in how sequences these plots called have motifs. evolved Also, over over time time, since we their have origin seen anda considerable how I view theupgrade future in for the these look, plots. flexibility of data Keywords : Graphical representation; Sequence logo plots; Standard tool; Motifs; Biological applications; Flexibility of data; DNA sequence data; Python library; Interdependencies; PLogo; Depletion of symbols; Alphanumeric strings; Visualizes pairwise; Oligonucleotide RNA sequence data; Visualize succinctly; Predictive power; Initial attempts; Widespread; Stylistic configurations; Multiple sequence alignment; Introduction based comparisons and predictions. In the next section, we The seeds of the origin of sequence logos were planted in review the modeling frameworks and functionalities of some of early 1980s when researchers, equipped with large amounts of these tools [5]. -
Bioinformatics Manual Sequence Data Analysis with CLC Main Workbench
Introduction to Molecular Biology and Bioinformatics Workshop Biosciences eastern and central Africa - International Livestock Research Institute Hub (BecA-ILRI Hub), Nairobi, Kenya May 5-16, 2014 Bioinformatics Manual Sequence data analysis with CLC Main Workbench Written and compiled by Joyce Njuguna, Mark Wamalwa, Rob Skilton 1 Getting started with CLC Main Workbench A. Quality control of sequenced data B. Assembling sequences C. Nucleotide and protein sequence manipulation D. BLAST sequence search E. Aligning sequences F. Building phylogenetic trees Welcome to CLC Main Workbench -- a user-friendly sequence analysis software package for analysing Sanger sequencing data and for supporting your daily bioinformatics work. Definition Sequence: the order of nucleotide bases [or amino acids] in a DNA [or protein] molecule DNA Sequencing: Biochemical methods used to determine the order of nucleotide bases, adenine(A), guanine(G),cytosine(C)and thymine(T) in a DNA strand TIP CLC Main Workbench includes an extensive Help function, which can be found in the Help menu of the program’s Menu bar. The Help can also be shown by pressing F1. The help topics are sorted in a table of contents and the topics can be searched. Also, it is recommended that you view the Online presentations where a product specialist from CLC bio demonstrates the software. This is a very easy way to get started using the program. Read more about online presentations here: http://clcbio.com/presentation. A. Getting Started with CLC Main Workbench 1. Download the trial version of CLC Main Workbench 6.8.1 from the following URL: http://www.clcbio.com/products/clc-main-workbench-direct-download/ 2. -
A STAT Protein Domain That Determines DNA Sequence Recognition Suggests a Novel DNA-Binding Domain
Downloaded from genesdev.cshlp.org on September 25, 2021 - Published by Cold Spring Harbor Laboratory Press A STAT protein domain that determines DNA sequence recognition suggests a novel DNA-binding domain Curt M. Horvath, Zilong Wen, and James E. Darnell Jr. Laboratory of Molecular Cell Biology, The Rockefeller University, New York, New York 10021 Statl and Stat3 are two members of the ligand-activated transcription factor family that serve the dual functions of signal transducers and activators of transcription. Whereas the two proteins select very similar (not identical) optimum binding sites from random oligonucleotides, differences in their binding affinity were readily apparent with natural STAT-binding sites. To take advantage of these different affinities, chimeric Statl:Stat3 molecules were used to locate the amino acids that could discriminate a general binding site from a specific binding site. The amino acids between residues -400 and -500 of these -750-amino-acid-long proteins determine the DNA-binding site specificity. Mutations within this region result in Stat proteins that are activated normally by tyrosine phosphorylation and that dimerize but have greatly reduced DNA-binding affinities. [Key Words: STAT proteins; DNA binding; site selection] Received January 6, 1995; revised version accepted March 2, 1995. The STAT (signal transducers and activators if transcrip- Whereas oligonucleotides representing these selected se- tion) proteins have the dual purpose of, first, signal trans- quences exhibited slight binding preferences, the con- duction from ligand-activated receptor kinase com- sensus sites overlapped sufficiently to be recognized by plexes, followed by nuclear translocation and DNA bind- both factors. However, by screening different natural ing to activate transcription (Darnell et al. -
Bioinformatics I Sanger Sequence Analysis
Updated: October 2020 Lab 5: Bioinformatics I Sanger Sequence Analysis Project Guide The Wolbachia Project Page Contents 3 Activity at a Glance 4 Technical Overview 5-7 Activity: How to Analyze Sanger Sequences 8 DataBase Entry 9 Appendix I: Illustrated BLAST Alignment 10 Appendix II: Sanger Sequencing Quick Reference Content is made available under the Creative Commons Attribution-NonCommercial-No Derivatives International License. Contact ([email protected]) if you would like to make adaptations for distribution beyond the classroom. The Wolbachia Project: Discover the Microbes Within! was developed by a collaboration of scientists, educators, and outreach specialists. It is directed by the Bordenstein Lab at Vanderbilt University. https://www.vanderbilt.edu/wolbachiaproject 2 Activity at a Glance Goals To analyze and interpret the quality of Sanger sequences To generate a consensus DNA sequence for bioinformatics analyses Learning Objectives Upon completion of this activity, students will (i) understand the Sanger method of sequencing, also known as the chain-termination method; (ii) be able to interpret chromatograms; (iii) evaluate sequencing Quality Scores; and (iv) generate a consensus DNA sequence based on forward and reverse Sanger reactions. Prerequisite Skills While no computer programming skills are necessary to complete this work, prior exposure to personal computers and the Internet is assumed. Teaching Time: One class period Recommended Background Tutorials • DNA Learning Center Animation: Sanger Method of -
On the Prediction of DNA-Binding Preferences of C2H2-ZF Domains Using Structural Models: Application on Human CTCF
bioRxiv preprint doi: https://doi.org/10.1101/859934; this version posted November 29, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-NC-ND 4.0 International license. On the prediction of DNA-binding preferences of C2H2-ZF domains using structural models: application on human CTCF. Alberto Meseguer1,*, Filip Årman1,*, Oriol Fornes2, Ruben Molina1, Jaume Bonet3, Narcis Fernandez-Fuentes4,5, Baldo Oliva1. 1 Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, University Pompeu Fabra, Barcelona 08005, Catalonia, Spain, 2 Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Department of Medical Genetics, University of British Columbia, Vancouver, BC V5Z 4H4, Canada, 3 Laboratory of Protein Design & Immunoengineering, School of Engineering, Ecole Polytechnique Federale de Lausanne, Lausanne 1015, Vaud, Switzerland, 4 Department of Biosciences, U Science Tech, Universitat de Vic-Universitat Central de Catalunya, Vic 08500, Catalonia, Spain, 5IBERS, Institute of Biological, Environmental and Rural Science, Aberystwyth University U.K *Both authors contributed equally to the paper as first authors. ABSTRACT Cis2-His2 zinc finger (C2H2-ZF) proteins are the largest family of transcription factors in human and higher metazoans. However, the DNA-binding preferences of many members of this family remain unknown. We have developed a computational method to predict these DNA-binding preferences. We combine information from crystal structures composed by C2H2-ZF domains and from bacterial one-hybrid experiments to compute scores for protein-DNA binding based on statistical potentials.