Triple-Helical Nucleic Acids

Springer
New York Berlin Heidelberg Barcelona Budapest Hong Kong London Milan Paris Santa Clara Singapore Tokyo

Valery N. Soyfer
Vladimir N. Potaman

With 121 Figures

Valery N. Soyfer, Ph.D.
Laboratory of Molecular Genetics
Department of Biology
George Mason University
Fairfax, VA 22030-4444
USA

Vladimir N. Potaman, Ph.D.
Center for Genome Research
Institute for Biosciences and Technology
Texas A&M University
Houston, TX 77030
USA

The cover illustration depicts a triple-helical model of DNA, derived from Arnott fiber DNA coordinates. Teresa Larsen, of The Scripps Research Institute, built the model using NAB by Thomas J. Macke, and rendered the computer graphic image using custom software by David S. Goodsell. © 1995, T. Larsen, TSRI.

Library of Congress Cataloging in Publication Data
Soyfer, Valerii.
Triple-helical nucleic acids / Valery N. Soyfer & Vladimir N. Potaman.
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-1-4612-8454-3  e-ISBN-13: 978-1-4612-3972-7
DOI: 10.1007/978-1-4612-3972-7
1. DNA. I. Potaman, Vladimir N. II. Title.
QP624.S69 1995  574.87'3282-dc20  95-12214

Printed on acid-free paper.

© 1996 Springer-Verlag New York, Inc. Softcover reprint of the hardcover 1st edition 1996

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the former are not especially identified, is not to be taken as a sign that such names, as understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.

Acquiring editor: Robert C. Garber. Production coordinated by Chernow Editorial Services, Inc., and managed by Terry Kornak; manufacturing supervised by Joe Quatela. Typeset by Best-set Typesetter Ltd., Hong Kong.

9 8 7 6 5 4 3 2 1

To Nina, Marina, and Vladimir Soyfer, and Olga and Gosha Potaman

Preface

Alexander Rich, Gary Felsenfeld, and David Davies published the first observation of triple-helical structures in nucleic acids in 1957. Great changes in the field occurred in the mid-1980s. For us personally, the pioneering thoughts and suggestions of Dr. Maxim Frank-Kamenetskii had particular importance and greatly influenced our interest in triplexes. He presented the first model of H DNA and began his efforts to understand triplex structures, especially their physical chemistry and the role of DNA supercoiling. We were inspired by Maxim, with whom we have collaborated for several years.

Ten years have flown by like one day in our lives, and the science of triplexes has changed dramatically during this decade. Because of outstanding work by Claude Helene, Robert Wells, Jeremy Lee, Valentin Vlassov, and many others, the field of DNA triplexes has become a center of interest for specialists in nucleic acids specifically, and, more widely, molecular biologists. Although the biological role of triplexes is not yet clear, their importance in many key processes of life is now obvious. We understand that without a definite clarification of their biological role, any description of triplexes is incomplete. However, the enormous volume of information that is now available motivates us to provide a systematic description of this field in book form.

This book discusses the structure and stability of triplexes, the factors involved in their appearance, the methods used for their investigation, and, of course, the current understanding of their biological role. New information discovered in the search for the potential role of triplexes in antisense regulation of gene expression, as well as numerous attempts to apply triplexes in gene therapy and other medical applications, have widened hopes for the great potential presented by triplex studies. Although none of these new ideas has yet resulted in therapeutic use, there is widespread anticipation that triplex nucleic acids will enjoy numerous applications.

Many of our friends and colleagues helped make this book possible. Without their encouragement, discussions, and generosity in sending us copies of reprints, photos, and unpublished materials, this book could not have been prepared so quickly. Drs. Valery Ivanov from the Moscow Institute of Molecular Biology, Valentin Vlassov from the Novosibirsk Institute of Bioorganic Chemistry, Sergei Mirkin from the University of Chicago, and Oleg Voloshin from the National Institutes of Health were the first reviewers of the manuscript; it was they who first encouraged us to try to find a publisher. Since then, the moral support of Dr. Robert Garber, Senior Editor at Springer-Verlag, and his kind openness in considering our manuscript, cannot be overestimated. His continuing support helped us avoid mistakes and expedite the publication of this book.

We are very happy to express our deep gratitude and appreciation to Alex Rich, Gary Felsenfeld, Maxim Frank-Kamenetskii, Vadim Demidov, Lyudmila Shlyakhtenko, B. Montgomery Pettitt, Albino Bacolla, Paul Chastain, and Igor Panyutin for discussions that resulted in a better understanding of different aspects of triplex science. We are especially indebted to Richard R. Sinden, who read the entire manuscript and made numerous suggestions for changes, to Jan Klysik, who read several chapters and made valuable suggestions, and to Adam Jaworski, who offered insightful criticisms of Chapter 6.

Contents

Preface
Introduction

1. The Discovery of Triple-Stranded Nucleic Acids
   Basic Physicochemical Properties of Nucleic Acids
   Bases, Nucleosides, Nucleotides, Polynucleotides
   Base Pairs and Double-Stranded Nucleic Acids
   Some Methods of Investigating Nucleic Acids
   Circular and Superhelical DNA
   Physicochemical Studies of Model Triple-Stranded Structures
   The Discovery of Nucleic Acid Triplexes
   Physicochemical Studies of Model Triplexes
   Early Hypotheses About the Biological Roles of Triple-Stranded Nucleic Acids
   Studies of Nuclease S1 Susceptibility of Specific DNA Sequences
   Experimental Evidence That the S1-Sensitive PyPu Tracts in Supercoiled DNA Form Intramolecular Triplexes
   Intermolecular Triplexes Between DNA and Oligonucleotides
   Current Fields of Interest in Investigation of Triplexes

2. Methods of Triplex Study
   Physical Methods for Triplex Study
   Spectral Methods
   Differential Scanning Calorimetry
   Equilibrium Sedimentation
   Electrophoretic Techniques
   Immunological Methods
   Affinity Methods
   Affinity Chromatography
   Filter-Binding Assay
   Electron Microscopy
   X-Ray Analysis
   A Short Overview of Experimental Physical Methods for Triplex Studies
   Theoretical Descriptions of the Triplex Structures
   Enzymatic and Chemical Probing of Triplex Structures
   Analysis of Modifications Using Maxam-Gilbert Sequencing Gel Analysis
   Primer Extension Analysis
   Enzymatic Methods
   Single-Strand-Specific Nucleases
   DNase I Footprinting
   Inhibition of Restriction Nuclease Action
   Chemical Methods
   Unbound Agents
   Photofootprinting
   A Short Overview of Chemical and Enzymatic Probes
   Site-Directed Agents
   Nuclease-Like Oligonucleotides
   Photoactive Groups Attached to Oligonucleotides
   A Short Overview of Site-Directed Agents
   Conclusion

3. General Features of Triplex Structures
   Basic Types of Triplexes
   Nucleotide Sequence Requirements
   Intermolecular and Intramolecular Triplexes
   Molecular Details of Triplex Structure
   Major-Groove Location of the Third Strand
   Base Triads in Nucleic Acids
   Orientation of the Third Strand
   Overall Conformations of Triplexes
   Stabilization Common to Intramolecular and Intermolecular Triplexes
   Factors That Destabilize Triplexes
   Specific Features of Intramolecular Triplexes (H and H* Forms)
   Specific Features of Intermolecular Triplexes
   DNA-RNA Triplexes
   Kinetics of Triplex Formation
   Thermodynamics of Triplexes
   New Variants of Triplex Structure
   Conclusion

4. Triplex Recognition
   Extension of Triplex Recognition Schemes
   Natural Bases in Unusual Triads
   Triplexes with Base and Nucleoside Analogs in the Third Strand
   Abasic Sites in the Third Strands
   Base Analog in the Duplex Part of the Triplex
   Alternate Strand Triplex Formation
   Modified Oligomer Backbones
   Modifications in the Sugar-Phosphate Backbone
   Conjugated Oligonucleotides
   Protein-DNA Interactions and Triplex Formation
   Triplex DNA-Drug Interactions
   Groove-Binders
   Intercalators
   Conclusion

5. The Forces Participating in Triplex Stabilization
   Triplex Stabilizing Factors
   Reduction of Interstrand Repulsion
   pH Stabilization
   Length-Dependence
   Differential Effect of Divalent Cations
   The Hydration State of Nucleic Acids
   Hydrophobic Substituents in the Third Strand
   Possible Interactions Which Favor and Stabilize Triplexes
   Electrostatic Forces
   Stacking Interactions
   Hoogsteen Hydrogen Bonds
   Hoogsteen Hydrogen Bond Enhancement
   Hydration Forces
   Contribution of Hydrophobicity
   Interrelation of Different Triplex-Stabilizing Contributions
   Conclusion

6. In Vivo Significance of Triple-Stranded Nucleic Acid Structures
   In Vivo Existence of Triplexes
   Search for Triplexes in the Cell
   Factors That Could Be Responsible for Triplex Formation In Vivo
   Possible Biological Roles of Triplexes
   Possible Regulation of Transcription
   Possible Regulation of Replication
   Possible Triplex-Mediated Chromosome Folding
   Structural Role at Chromosome Ends
   Recombination
   Possible Role in Mutational Processes
   Do PyPu Tracts Play a Role in RNA Splicing?
   Elements of Triple-Stranded Structure in RNA
   Other Roles of the PyPu Tracts
   Coding of Charged or Hydrophobic Amino Acid Clusters
   Can the PyPu Tracts Exclude Nucleosomes from Certain Gene Regions?
   Conclusion

7. Possible Spheres of Application of Intermolecular Triplexes
   Applications of Intermolecular Triplex Methodology
   Extraction and Purification of the Specific Nucleotide Sequences
   Affinity Chromatography
   Quantitation of Polymerase Chain Reaction Products
   Nonenzymatic Ligation of Double-Helical DNA Mediated by Triple Helix Formation
   Triplex-Mediated Inhibition of Viral DNA Integration
   Site-Directed Mutagenesis
   Detection of Mutations in Homopurine DNA Sequences
   Mapping of Genomic DNA
   Control of Gene Expression
   Conclusion

References

Index

Introduction

Interest in triple-helical nucleic acids has been stimulated by the recognition of their potential biological roles and genetic applications. DNA triplexes can be formed in natural homopurine-homopyrimidine (PuPy) sequences, which represent up to 1% of eukaryotic genomes. Although direct evidence of participation of triplexes in biological processes has not yet been obtained, a growing body of data suggests that triplexes can be involved in regulation of DNA replication, transcription, recombination, and development. Interest in DNA triplexes has been further enhanced by the first findings that triplex-like structures can exist in vivo. Appropriately designed third-strand oligonucleotides that hybridize to targeted duplex domains can be used to control gene expression, serve as artificial endonucleases in genome mapping strategies, extract and purify the specific duplex DNA, and so forth. Triplex regulation of DNA functions seems very promising because of the demonstrated ability of oligonucleotides to penetrate cell walls via liposome- and receptor-mediated endocytosis, or to be taken up by cells directly. It is important to emphasize that full-length oligomers may persist in the cell for at least several hours after being taken up.

In the course of transcription, a local wave of supercoiling that could promote triplex formation may develop behind the moving RNA polymerase complex. The PuPy tracts necessary for transcription and capable of forming triplexes have been found in the 5' flanking regions of some eukaryotic genes, which offers hope that the role of triplexes in this particular process will be elucidated soon. It has been suggested that triplex-like secondary structures in DNA are involved in the ordered transcriptional switch from γ-globin synthesis in fetal erythroid cells to β-globin synthesis just before birth. The triplex structure causes specific termination of DNA polymerization in vitro and may participate in the regulation of DNA replication in vivo. A triple-stranded structure is also presumed to be an intermediate in DNA recombination.

Triple-strand formation also represents the basis for numerous site-specific manipulations with duplex DNA. Appropriately designed third-strand oligonucleotides that hybridize to targeted duplex domains might be used to control gene expression, serve as artificial endonucleases in genome mapping strategies, modulate the sequence specificity of DNA-binding drugs, selectively alter the sites of protein activity, provide nonenzymatic ligation of double-helical DNA by alternate-strand triple helix formation, quantitate polymerase chain reaction products, and provide physical genome mapping by electron microscopy. Triplex regulation of DNA expression and its interactions with a range of molecules seems very promising.

All of these findings have occurred within the last five years, and have opened a new and rapidly growing field of research. At the same time, it would be naive to see only successes in the understanding of triplexes and overlook numerous difficulties in fulfilling some very high initial expectations. Not all of the early hopes have been fulfilled, but expectations in the field of triplexes are very high. Many laboratories are involved in research on the nature of triplexes and their potential biological role, as well as in identifying applications of triplexes to biotechnology.

Different aspects of triple-stranded structures have been discussed in several reviews (Wells et al., 1988; Frank-Kamenetskii, 1990, 1992; Helene, 1991, 1992; Palecek, 1991; Yagil, 1991; Cheng and Pettitt, 1992; Strobel and Dervan, 1992; Sun and Helene, 1993; Thuong and Helene, 1993; Mirkin and Frank-Kamenetskii, 1994; Radhakrishnan and Patel, 1994d). However, the amount of new material in the field is growing fast. In this book we systematically describe the properties of triplexes, the methods of investigation, possible triplex-stabilizing interactions, triplex recognition schemes, potential biological roles of triplexes, and genetic and pharmacological applications of triplex methodology. We discuss in more detail the issues not reviewed previously, and briefly outline the better known material.