Biosynthesis and Function of Modified Bases in Bacteria and Their Viruses
This is an open access article published under a Creative Commons Non-Commercial No Derivative Works (CC-BY-NC-ND) Attribution License, which permits copying and redistribution of the article, and creation of adaptations, all for non-commercial purposes.

Review, pubs.acs.org/CR
Chem. Rev. 2016, 116, 12655−12687. DOI: 10.1021/acs.chemrev.6b00114
Special Issue: Genome Modifying Mechanisms
Received: February 12, 2016. Published: June 20, 2016
© 2016 American Chemical Society

Peter Weigele, Chemical Biology, New England Biolabs, Ipswich, Massachusetts 01938, United States
Elisabeth A. Raleigh*, Research, New England Biolabs, Ipswich, Massachusetts 01938, United States

ABSTRACT: Naturally occurring modification of the canonical A, G, C, and T bases can be found in the DNA of cellular organisms and viruses from all domains of life. Bacterial viruses (bacteriophages) are a particularly rich but still underexploited source of such modified variant nucleotides. The modifications conserve the coding and base-pairing functions of DNA, but add regulatory and protective functions. In prokaryotes, modified bases appear primarily to be part of an arms race between bacteriophages (and other genomic parasites) and their hosts, although, as in eukaryotes, some modifications have been adapted to convey epigenetic information. The first half of this review catalogs the identification and diversity of DNA modifications found in bacteria and bacteriophages. What is known about the biogenesis, context, and function of these modifications is also described. The second part of the review places these DNA modifications in the context of the arms race between bacteria and bacteriophages. It focuses particularly on the defense and counter-defense strategies that turn on direct recognition of the presence of a modified base. Where modification has been shown to affect other DNA transactions, such as expression and chromosome segregation, that is summarized, with reference to recent reviews.

CONTENTS
1. Introduction
  1.1. Early Observations of Modified Bases in Prokaryote and Viral DNA
  1.2. Detection and Analysis of Modified Nucleobases
2. Modified Nucleobases Produced by DNA Methyltransferases
  2.1. Protective and Regulatory Functions of DNA-MTs
  2.2. DNA Methyltransferase Structure and Function
  2.3. DNA Methyltransferase Mechanism
3. Modified Nucleobases Found in Bacteriophages
  3.1. Modified Purines in Phages
    3.1.1. N6-Carbamoyl-methyladenine
    3.1.2. 2-Aminoadenine
    3.1.3. 7-Methylguanine
    3.1.4. Deoxyarchaeosine
  3.2. Phage Modified Pyrimidines
    3.2.1. Deoxyuracil (dU)
    3.2.2. 5-Hydroxymethyldeoxyuracil
    3.2.3. Hypermodified Thymidines
    3.2.4. 5-Dihydroxypentauracil
    3.2.5. 5-Methylcytosine
    3.2.6. hm5C and Glucosyl-hm5C of T-Even Phages
    3.2.7. 5-Hydroxycytosine
4. Central Role of Deoxypyrimidine Nucleotide Monophosphate (Hydroxy) Methyltransferases in Generating Modified Pyrimidines
  4.1. Enzymatic Pyrimidine C5 Modification: U versus C, Methyl versus Hydroxymethyl
  4.2. Phylogenetic and Functional Clustering of dYMP (Hydroxy)methyltransferases
  4.3. Phages with Potentially Undiscovered hm5dC Modifications
5. Arms Race: The Biology of Modification and Restriction
  5.1. Modification Protects from Restriction and Causes Sensitivity to It
    5.1.1. RM Type Summary
    5.1.2. Other Reviews: Perspectives on RM Systems
    5.1.3. Interaction of RM Types with Biological Base Modifications
  5.2. Role of Modifications in Virulent Phage Life Cycle
  5.3. Role of Modifications in Temperate Phage Life Cycle
  5.4. Modification Facilitating Migration of RM Systems
    5.4.1. Lysogenic Conversion: EcoP1I and EcoGIII
    5.4.2. Replacement Cassettes DpnI/DpnII/DpnIII
  5.5. Orphan Modifying Enzymes
    5.5.1. Lineage-Conserved Orphan Methyltransferases
    5.5.2. Eroded RM Systems
    5.5.3. Migratory Orphans: Prophage and Plasmid Orphan Modifying Enzymes
  5.6. Restricting Modified DNA: DNA Binding Domain Fusions
    5.6.1. McrBC: DUF3578 DBD-Translocase Fusion, PD(D/E)XK Separate Protein
    5.6.2. MspJI Family: SRA DBD-Mrr-Cat fusion
    5.6.3. PvuRts1I Family: PD(D/E)XK-SRA DBD Fusion
    5.6.4. Sco5333: SRA DBD-HNH Fusion
    5.6.5. EcoKMrr: Mrr-N DBD-Mrr-Cat Fusion
    5.6.6. EcoKMcrA: EcoMcrA-N DBD-HNH Fusion
    5.6.7. ScoA3McrA: ScoA3McrA(N) DBD-HNH
    5.6.8. GmrSD Family: ParB/Srx DBD-HNH Fusion
    5.6.9. SauUSI PLDc-Helicase-DUF3427 DBD Fusion
    5.6.10. DpnI: PD(D/E)XK-Winged Helix DBD
    5.6.11. GlaI Family, Unidentified Domains
  5.7. Restricting Modified DNA with DNA Repair Enzymes
    5.7.1. Repair Glycosylase UDG
    5.7.2. Repair Nuclease Nfi (Endonuclease V)
  5.8. Inhibition of RE Action
6. Future Directions
Associated Content
  Supporting Information
Author Information
  Corresponding Author
  Notes
  Biographies
Acknowledgments
Abbreviations
References

1. INTRODUCTION

DNA is more than just combinations of A, G, C, and T. Naturally occurring variations of the canonical nucleotides can be found in the DNA of cellular organisms and viruses from all domains of life. A variety of chemical groups can be biologically appended to the nucleobase portion of a nucleotide, ranging from simple methyl groups in cellular organisms and their viruses, to amino acids, polyamines, monosaccharides, and disaccharides as found in viruses of bacteria. These modifications do not alter the specificity of base pairing; rather they are interpreted by cells, viruses, and mobile DNAs in a context-specific fashion through the interaction of cellular and viral encoded proteins with the modified DNA to distinguish self from nonself, protect DNA from being degraded, and/or control gene regulation. In eukaryotes, DNA modification can profoundly influence gene expression in certain contexts. Because such modifications to the DNA can alter the phenotypic expression of a genome without altering the genotype, per se, the biological information carried by DNA modification (as well as histone modification and other cellular phenomena) is, by convention, referred to as the organism’s “epigenome.” In prokaryotes and their viruses, modified nucleobases appear primarily to be part of an arms race between viruses (and other genomic parasites) and their hosts, though epigenetic functions of modified DNA in bacteria are also beginning to be elucidated.

1.1. Early Observations of Modified Bases in Prokaryote and Viral DNA

The first modified nucleoside, 5-methylcytosine (m5C), was observed in 1925 by Johnson and Coghill in the DNA of Mycobacterium tuberculosis.1 m5C was not confirmed in bacteria again until 1965 by Doskočil and Šormová,2,3 though it had been found in bacteriophage λ.4,5 N6-Methyladenine (m6A) was shown by Dunn and Smith in 1955 to be a minor component of bacterial and bacteriophage DNAs.6,7 N4-Methylcytosine was first shown in 1983 to be a minor component in Bacillus DNA8 and then later shown to be widespread among thermophilic9 and mesophilic bacteria.10 Structures of these nucleobases are shown in Figure 1. Wyatt and Cohen observed complete substitution of cytosine by 5-hydroxymethylcytosine (hm5C) in the T-even bacteriophages,11 and a subset of these bases were subsequently shown to be glucosylated.12

Figure 1. Methylated bases found in bacteria and their viruses.

An understanding of a biological role for methylated bases in bacteria did not emerge until decades later. In the late 1960s and early 1970s, Werner Arber and others noticed that bacteriophages could retain a “memory” of their most recent host that could profoundly affect their ability to infect closely related bacterial strains.13 The basis of this memory was shown to be the methylation of specific DNA sequences by a host-encoded enzyme.14 The methylation of the DNA was essential to the viability of the phage in subsequent rounds of infection on the same host. This phenomenon of host-controlled modification led to the discovery of restriction enzymes, for which Werner Arber, Daniel Nathans, and Hamilton O. Smith shared a Nobel Prize in 1978.

1.2. Detection and Analysis of Modified Nucleobases

A variety of techniques have been used historically for the detection and characterization of modified nucleotides. DNA samples first had to be decomposed to individual nucleotides, usually by harsh chemical hydrolysis, such as boiling in hydrochloric or formic acid.15 Gentler, more physiological methods employed mixtures of enzymes such as DNase I from bovine pancreas, snake venom phosphodiesterase, and S1 nuclease, and where dephosphorylation was required, “zinc activated” bacterial alkaline phosphatase (BAP) was used.16,17 Chemically or enzymatically prepared nucleotide/nucleoside mixtures could then be applied to various separation methods. The pioneering method, paper chromatography, was succeeded by thin-layer chromatography (TLC) with derivatized cellulose on a glass support. This offered improved resolution and flexibility, based on the same principles.18 Mobilities and positions of nucleotides could be visualized by chemical stains applied to plates postseparation, or, in the case of metabolically […]

[…] as the nucleotide under investigation (e.g., same retention time, same mass, same reactive side groups), the identity of the modification can be supported.

More recently, single molecule real-time sequencing technologies (SMRT) have been employed for the detection of modified nucleotides at base resolution in bacterial genomic DNA.29,30 This “SMRT” sequencing technology optically monitors the kinetics of fluorescently tagged nucleotide incorporation by an engineered DNA polymerase as it progresses along a template strand.