Simultaneous Characterization of Sense and Antisense Genomic


Nucleic Acids Research Advance Access published November 17, 2015
Nucleic Acids Research, 2015
doi: 10.1093/nar/gkv1184

Simultaneous characterization of sense and antisense genomic processes by the double-stranded hidden Markov model

Julia Glas1,†, Sebastian Dümcke2,3,†, Benedikt Zacher1,2,†, Don Poron3, Julien Gagneur1 and Achim Tresch1,2,3,*

1Gene Center Munich and Department of Biochemistry, Ludwig-Maximilians-Universität München, Feodor-Lynen-Straße 25, 81377 Munich, Germany, 2Department of Plant Breeding and Genetics, Max Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, 50829 Cologne, Germany and 3Institute for Genetics, University of Cologne, Zülpicher Str. 47b, 50674 Cologne, Germany

Received June 19, 2015; Revised October 16, 2015; Accepted October 24, 2015

*To whom correspondence should be addressed. Tel: +49 89 21807634; Fax: +44 89 218076797; Email: [email protected]
†These authors contributed equally to the work as the first authors.

© The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

Hidden Markov models (HMMs) have been extensively used to dissect the genome into functionally distinct regions using data such as RNA expression or DNA binding measurements. It is a challenge to disentangle processes occurring on complementary strands of the same genomic region. We present the double-stranded HMM (dsHMM), a model for the strand-specific analysis of genomic processes. We applied dsHMM to yeast using strand-specific transcription data, nucleosome data, and protein binding data for a set of 11 factors associated with the regulation of transcription. The resulting annotation recovers the mRNA transcription cycle (initiation, elongation, termination) while correctly predicting strand-specificity and directionality of the transcription process. We find that pre-initiation complex formation is an essentially undirected process, giving rise to a large number of bidirectional promoters and to pervasive antisense transcription. Notably, 12% of all transcriptionally active positions showed simultaneous activity on both strands. Furthermore, dsHMM reveals that antisense transcription is specifically suppressed by Nrd1, a yeast termination factor.

INTRODUCTION

The rapidly growing amount of heterogeneous data generated by experimental high-throughput techniques makes integrative data analysis an essential part of molecular biology. One major purpose is the creation of comprehensive views on high dimensional data which are impossible to obtain manually. To that end, clustering methods group data points into a finite number of distinct 'groups' or 'states', according to some measure of similarity. Our present work considers the analysis of data from several experiments whose output can be aligned to a genomic sequence, such as RNA expression, ChIP or DNA methylation data. The main purpose is to cluster genomic positions into 'states', which ideally correspond to distinct biological functions. Hidden Markov models (HMMs) have become the method of choice, since they additionally account for the dependency of consecutive observations introduced by the sequential structure of the data.

HMMs were successfully used for dissecting the genome into 'chromatin states' (1) or 'transcription states' (2). Recently, HMMs were employed to infer distinct genomic states from genome-wide ChIP data in human (1,3–8), fly (9,10), Arabidopsis (11) and worm (12,13). However, the drawback of the HMMs used in these applications is their inability to integrate strand-specific (e.g. RNA expression) with non-strand-specific (e.g. ChIP) data, which limits the analysis to either only non-strand-specific or only single-stranded data. This limitation was first addressed in (14). There, the hidden Markov chain is replaced by a dynamic Bayesian network (DBN). This allows the modeling of strand-specific data and the introduction of structured states, i.e. hierarchical labels for each position. Other approaches employed reversible HMMs (15), which were further extended by the bidirectional hidden Markov model (2). Still, all of these models are unable to account for overlapping processes that might occur on both DNA strands. This situation is frequently encountered in compact genomes. For example, cryptic unstable transcripts (CUTs) and stable uncharacterized transcripts (SUTs) often overlap with annotated features (16). Neil et al. (17) showed that by far most CUTs in yeast are antisense CUTs, i.e. CUTs that are transcribed from between tandem features in antisense direction.

In the present work we introduce double-stranded hidden Markov models (dsHMMs), which explicitly model the forward and reverse DNA strand by two Markov chains running in opposite directions. Therefore dsHMMs are able to disentangle the two strands at every single position of the genome. dsHMMs capture the biology of directed genomic processes such as transcription, which often occur at the same position on both strands. We illustrate the use of dsHMMs on a data set comprised of strand-specific expression, nucleosome occupancy and ChIP-chip data of 11 factors involved in yeast transcription. We present the first strand-specific map of transcription states in yeast.
Our results confirm the role of Nrd1 in a non-canonical pathway of transcription termination, which predominately occurs in antisense direction of stable gene transcripts and is mainly involved in the termination of CUTs (16,18).

MATERIALS AND METHODS

Definition of the dsHMM

Let $\mathcal{P}$ be a set of genome-wide experiments ('tracks') which give rise to a sequence of observables $O = (o_1, \ldots, o_T)$, $o_t \in \mathbb{R}^{\mathcal{P}}$, where $o_t^j$ contains the measurement value of track $j$ at position $t$. The observables are emitted by two independent, homogeneous Markov chains of hidden variables running in opposite direction, the forward chain $S^+ = (s_1^+, \ldots, s_T^+)$ and the reverse chain $S^- = (s_1^-, \ldots, s_T^-)$. The idea is that among a finite set of biological processes $\mathcal{D}$, each state $s_t^+ \in \mathcal{D}$ (respectively $s_t^- \in \mathcal{D}$) indicates which process is taking place on the forward (respectively reverse) DNA strand. The emission distribution of an observation $o_t$ is conditionally independent of all other observations, given the hidden state pair $(s_t^+, s_t^-)$. A graphical specification of the double-stranded hidden Markov model (dsHMM) is given in Figure 1A. According to our assumptions, the joint likelihood of a dsHMM factors into

$$P(O, S^+, S^-) = P(O \mid S^+, S^-) \cdot P(S^+) \cdot P(S^-) \qquad (1)$$

$$P(O \mid S^+, S^-) = \prod_{t=1}^{T} P(o_t \mid s_t^+, s_t^-) \qquad (2)$$

$$P(S^+) = P(s_1^+) \prod_{t=2}^{T} P(s_t^+ \mid s_{t-1}^+), \qquad P(S^-) = P(s_T^-) \prod_{t=2}^{T} P(s_{t-1}^- \mid s_t^-) \qquad (3)$$

It is natural to assume that state transitions happening in forward direction on the forward strand have the same probability as state transitions on the reverse strand happening in reverse direction, i.e.

$$P(s_t^+ = j \mid s_{t-1}^+ = i) = P(s_{t-1}^- = j \mid s_t^- = i) = a_{ij}, \quad i, j \in \mathcal{D} \qquad (4)$$

for some transition matrix $A = (a_{ij}) \in \mathbb{R}^{\mathcal{D} \times \mathcal{D}}$. By the same reasoning, we assume

$$P(s_1^+ = i) = P(s_T^- = i) = \pi_i, \quad i \in \mathcal{D} \qquad (5)$$

for some initial state distribution $\pi \in \mathbb{R}^{\mathcal{D}}$. For convenience, we require that $A$ is ergodic, and that $\pi$ is its unique steady state distribution. A dsHMM can then be transformed into a standard HMM with state space $\mathcal{D}^2 = \mathcal{D} \times \mathcal{D}$, with state sequence $S = (s_1, \ldots, s_T)$, $s_t = (s_t^+, s_t^-) \in \mathcal{D}^2$. The transition matrix $B = (b_{rs}) \in \mathbb{R}^{\mathcal{D}^2 \times \mathcal{D}^2}$ becomes

$$b_{rs} = a_{r^+ s^+} \, a_{s^- r^-} \, \pi_{s^-} \, \pi_{r^-}^{-1}, \quad r = (r^+, r^-),\ s = (s^+, s^-) \in \mathcal{D}^2 \qquad (6)$$

and the initial state distribution $\tau \in \mathbb{R}^{\mathcal{D}^2}$ is $\tau_s = \pi_{s^+} \pi_{s^-}$ (see Supplementary Data Part I Section 1 for details). The dsHMM is a reversible HMM (see Supplementary Data Part I Section 1 Remark 3). The transformation of the dsHMM into a standard HMM will allow us to apply well-known, efficient techniques for HMM learning, namely the Forward-Backward, Viterbi and Baum–Welch algorithms (see Supplementary Data Part I Sections 3–5).

Figure 1. (A) Graphical representation of a dsHMM showing two hidden state chains $\{s_1^+, \ldots, s_T^+\}$ and $\{s_1^-, \ldots, s_T^-\}$ which run in opposite direction (white circles). Each state pair $(s_t^+, s_t^-)$ emits an observation $o_t$, $t = 1, \ldots, T$ (gray circles). (B) Viterbi paths inferred from a synthetic data set using three different HMM models. The top panel shows a simulated ChIP experiment (purple track) and an RNA-Seq experiment with forward (orange track) and reverse strand (brown track) expression data for a genomic region containing partly overlapping genes (arrows in the middle panel) located on both DNA strands. The bottom panel shows the Viterbi paths obtained from the standard HMM, the bdHMM and the dsHMM with three hidden states: an intergenic state (gray) and two gene-specific states (red and green). For the bdHMM, equivalent forward and reverse states are indicated by the same color, positioned either above (forward direction) or below (reverse direction) the undirected gray states.

It remains to find a sparse parametrization of the emission distributions $\psi_s(o) = P(o \mid s)$, $s \in \mathcal{D}^2$. Let the observation space $\mathbb{R}^{\mathcal{P}}$ be the Cartesian product of the strand-unspecific observations $\mathbb{R}^B$, and the forward- respectively [...]

[...] expression data. The selected proteins cover most of the factors involved in the mRNA transcription cycle. They include initiation factors, different types of elongation factors as well as termination factors.
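
The transformation of a dsHMM into a standard HMM on the product state space, as defined by the transition matrix B and the initial distribution τ in the Methods, can be illustrated numerically. The following is a minimal sketch and not the authors' implementation: the function names `stationary_distribution` and `product_hmm`, the use of NumPy, and the 3-state toy matrix are assumptions made purely for illustration. Pair states (r+, r-) are flattened to the index r+ * n + r-, which is also an arbitrary choice.

```python
import numpy as np


def stationary_distribution(A):
    """Solve pi @ A = pi with sum(pi) = 1 for an ergodic row-stochastic A."""
    n = A.shape[0]
    # Stack the stationarity equations (A^T - I) pi = 0 with the constraint sum(pi) = 1.
    M = np.vstack([A.T - np.eye(n), np.ones((1, n))])
    b = np.concatenate([np.zeros(n), [1.0]])
    pi, *_ = np.linalg.lstsq(M, b, rcond=None)
    return pi


def product_hmm(A, pi):
    """Return (B, tau) for the product state space D x D.

    B[(r+, r-), (s+, s-)] = a[r+, s+] * a[s-, r-] * pi[s-] / pi[r-]
    tau[(s+, s-)]         = pi[s+] * pi[s-]
    """
    # Reverse-strand chain read in forward genomic direction (time reversal of A):
    # A_rev[r, s] = a[s, r] * pi[s] / pi[r]
    A_rev = (A.T * pi[None, :]) / pi[:, None]
    # The Kronecker product gives exactly the flattened pair indexing r+ * n + r-.
    B = np.kron(A, A_rev)
    tau = np.kron(pi, pi)
    return B, tau


# Hypothetical 3-state toy chain (cyclic 0 -> 1 -> 2 -> 0 with self-loops).
A = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 0.8, 0.2],
    [0.3, 0.0, 0.7],
])
pi = stationary_distribution(A)
B, tau = product_hmm(A, pi)

print("pi  =", np.round(pi, 4))
print("B is row-stochastic:", np.allclose(B.sum(axis=1), 1.0))
print("tau is stationary for B:", np.allclose(tau @ B, tau))
```

Under these assumptions B has |D|² rows and columns, so standard Forward-Backward, Viterbi or Baum–Welch code can operate on it directly, at the cost of a state space that grows quadratically in the number of single-strand states.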