
PART I. NUCLEIC ACID SITE-SELECTIVE BINDING STUDIES OF ISOMERS OF DIHYDRODIOXIN-MASKED ORTHO-QUINONES AS POTENTIAL ANTITUMOR DRUGS. PART II. THE ROLE OF NON-WATSON-CRICK BASE PAIRS IN STABILIZING A RECURRENT RNA MOTIF.

Emil F. Khisamutdinov

A Dissertation

Submitted to the Graduate College of Bowling Green State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

August 2012

Committee:

Neocles B. Leontis, Advisor

Tracy L. Huziak-Clark, Graduate Faculty Representative

R. Marshall Wilson

Zhaohui Xu

© 2012

Emil F. Khisamutdinov

All Rights Reserved

ABSTRACT

Neocles B. Leontis, Advisor

This work comprises two parts:

The first focuses on pyrene dihydrodioxins (PDHDs) as potential antitumor drugs. These compounds are effective DNA intercalating agents that act as masked ortho-quinones and exist in two enantiomeric forms. The reactive quinone can be released by irradiation with visible light, which is especially useful in photodynamic therapy. The binding of PDHD enantiomers to herring sperm DNA and to short synthetic oligomers was studied by spectroscopic and biochemical methods. Both PDHD enantiomers bind to double-helical DNA with high affinity. PDHD II has a slightly larger binding constant (Kb = 2.3 ± 0.8 × 10⁵ M⁻¹) than PDHD I (Kb = 1.6 ± 0.15 × 10⁵ M⁻¹). Upon addition of DNA to racemic PDHD (PDHD rac), the CD spectra change dramatically; these results, together with UV titration experiments, indicate that DHDs intercalate into the DNA double helix. The ability of DHDs to stabilize DNA was confirmed by UV-melting studies. DHD II raises the melting temperature of 10 bps DNA by 15 °C and, more surprisingly, provides a ∆Tm of almost 29 °C for 12-mer DNA.

This type of DNA binding is unique in that it combines the hydrophobic bonding characteristic of many aromatic hydrocarbons with ionic bonding. It reduces the ionic repulsion between the negatively charged phosphate backbones, inhibiting the separation of the two duplex strands. In addition, this type of duplex stabilizer is photochemically active in the oxidation of ds-DNA, as shown by our φX 174 plasmid DNA photocleavage assay. Studies of the sequence preference of PDHDs showed that they are highly selective for guanine residues.

The second project investigates the roles of non-Watson-Crick base pairs in stabilizing RNA structural motifs, using the sarcin-ricin motif as an example. In particular, we ask whether particular base combinations are selected in structurally conserved motifs depending on the temperature regime to which an organism has adapted.

At present, the Protein Data Bank (www.pdb.org) contains atomic-resolution ribosomal RNA structures from three bacteria, one archaeon and two eukaryal organisms. To identify relevant sequence substitutions to study, we examined sequence variations of the sarcin-ricin motif in 3D structures and in carefully selected, aligned 16S rRNA sequences representing a range of phylogenetic and ecological groups (Sweeny B. data Ms. Thesis, BGSU 2011). Only isosteric base pair variations were found. Moreover, the trans-Hoogsteen/sugar (tHS) AG base pair is the most frequently occurring in thermophilic and hyperthermophilic 16S sarcin-ricin (S/R) motif sequences.

Thermodynamic analysis of S/R motif (SRM) variations found in organisms occupying different temperature niches, as well as those present in 3D structures, indicated that a single non-Watson-Crick (non-WC) mutation contributes considerably to the overall stability of the motif. Almost all of the S/R motif variations studied showed a dependence on Mg ions. Given the different stabilities of the motifs, we studied their solution structures using the single-strand-specific nucleases T1 and A and the helix-specific nuclease V1. None of the isosteric or neutral base pair substitutions disrupts the overall conformation.

We demonstrated that the composition of non-canonical interactions in sarcin-ricin RNA motifs is very important. By changing nucleotides at certain non-canonical positions, we can control the stability, and perhaps the function, of large RNAs.

ACKNOWLEDGMENTS

Pursuing a Ph.D. degree requires the support of so many people that it is difficult to express in a few words my appreciation for the help and support I received.

Foremost, I owe special thanks to my research advisor, Dr. Neocles Basil Leontis, for his constant and generous support throughout the years I have been working in his group. The freedom he gave me in pursuing my research, combined with his endless wealth of knowledge and ideas, allowed me to develop the necessary independence in approaching scientific matters. I also thank the members of my group for maintaining a friendly atmosphere and for their help throughout my PhD studies.

I would like to acknowledge Dr. R. Marshall Wilson for his collaboration and fruitful discussions. It was his contribution that made my time in graduate school rewarding and enjoyable. I would like to thank Dr. H. Peter Lu and Dr. Michael Y. Ogawa of the Center for Photochemical Sciences for providing necessary instrumentation.

Most importantly, I would like to thank my wife for her support throughout my Ph.D. years.


To my lovely wife Alya

To my parents Farit and Victoriya

For their love and support


TABLE OF CONTENTS

Page

PART I. NUCLEIC ACID SITE-SELECTIVE BINDING STUDIES

OF ISOMERS OF DIHYDRODIOXIN-MASKED ORTHO-QUINONES

AS POTENTIAL ANTITUMOR DRUGS

CHAPTER I. INTRODUCTION ...... 1

1.1 DNA characteristics ...... 1

1.2 DNA-ligand binding ...... 7

1.3 DNA photocleavage and photodynamic therapy ...... 9

CHAPTER II. MATERIALS AND METHODS ...... 11

2.1 Experimental approaches ...... 11

2.2 Materials ...... 13

2.3 UV-Vis absorption spectroscopy ...... 14

2.4 Optical melting experiments ...... 15

2.5 Circular dichroism (CD) spectropolarimetry ...... 15

2.6 ΦX 174 plasmid DNA photocleavage assay ...... 16

2.7 Sequence selectivity of a DNA oligonucleotide ...... 19

2.8 DNase I footprinting assay ...... 19

2.9 Quantum yield of quinone release calculations ...... 23

CHAPTER III. RESULTS AND DISCUSSION ...... 24

3.1 Background ...... 24

3.2 Theoretical PDHD/DNA binding studies ...... 26

3.3 UV-vis spectroscopy studies ...... 29

3.4 Optical melting studies ...... 34

3.5 Circular Dichroism (CD) spectropolarimetry ...... 38

3.6 ΦX 174 plasmid DNA photocleavage assay ...... 40

3.7 Effect of oxygen on ds cleavage efficiency ...... 42

3.8 DNA specific damage by PDHD enantiomers ...... 44

3.9 Binding specificity by DNase I footprinting assay ...... 47

3.10 Proposed cleavage mechanism ...... 50

3.11 Conclusion ...... 54

3.12 References ...... 55

PART II. THE ROLE OF NON-WATSON-CRICK BASE PAIRS IN STABILIZING A

RECURRENT 3D MOTIF

CHAPTER IV. SIGNIFICANCE OF RNA IN BIOLOGY ...... 60

4.1 Distinctive features of non-coding RNAs ...... 60

4.2 RNA structural motifs: building blocks of modular ...... 62

4.3 Search, extraction and analysis of RNA 3D motifs ...... 64

4.4 Factors stabilizing RNA 3D motifs...... 65

CHAPTER V. EXPERIMENTAL METHODS FOR STUDYING RNA MOTIF

THERMODYNAMICS AND STRUCTURE ...... 71

5.1 Overview of RNA synthesis and purification ...... 71

5.2 RNA design ...... 72

5.3 Preparation of DNA template ...... 72

5.3.1 Purification of DNA products on rechargeable QIAGEN columns ...... 74

5.4 In vitro transcription using T7 RNA polymerase ...... 77

5.5 RNA purification ...... 77

5.5.1 Purification on denaturing PAGE ...... 77

5.5.2 RNA purification on TLC plates ...... 79

5.6 RNA labeling ...... 81

5.7 Structural probing experiment ...... 84

5.8 UV-melting experiment ...... 88

5.8.1 Sample preparation ...... 89

5.8.2 Data analysis ...... 91

5.9 RNA concentration and melting temperature ...... 95

5.10 UV-melting of RNA duplexes with sarcin-ricin motif ...... 95

CHAPTER VI. RESULTS AND DISCUSSIONS ...... 97

6.1 Background information ...... 97

6.2 Searching for sarcin-ricin motifs in the 3D database ...... 100

6.3 Base pair isostericity in S/R motifs ...... 105

6.4 Thermodynamic study of RNA duplexes containing sarcin-ricin motifs ...... 108

6.5 UV-melting of duplexes containing S/R motif ...... 112

6.6 Structural probing of S/R motifs ...... 124

6.7 Sequence variation in S/R motifs observed in 16S rRNA

sequences from Eubacteria with known optimal growth temperatures ...... 133

6.8 Conclusion ...... 137

6.9 References ...... 139

LIST OF TABLES

Tables Page

1.1.1 Structural characteristics for B-, A- and Z-DNA ...... 3

3.3.1 Calculated apparent binding constants from UV-vis titration data ...... 33

3.4.1 PDHD-induced DNA duplex thermal stabilization data (∆TM)

for resolved enantiomers ...... 37

3.6.1 Statistical efficiency of single-strand n1 and double-strand n2 breaks ...... 42

3.7.1 Statistical efficiency of single-strand and double-strand break

formations without oxygen ...... 44

3.10.1 Quantum yield of quinone photorelease ...... 50

4.1.1 Diversity of non-coding RNAs in living cells ...... 61

4.4.1 RNA geometric base pair families ...... 68

5.7.1 Common chemical and enzymatic probes used in structured RNA ...... 85

5.8.2.1 Two-state analysis of nucleic acids transitions ...... 93

6.2.1 Representative PDB structure files containing S/R motifs ...... 101

6.3.1 RNA base pair variations observed in 3D structures of S/R motifs ...... 106

6.4.1 Effect of flanking base pairs on stability of S/R motif ...... 111

6.5.1 Thermodynamic parameters for S/R motif duplex formation at

10 mM MgCl2,100 mM NaCl; and at 1M NaCl ...... 114

6.5.2 Effect of Mg2+ ions in stabilization of S/R motif duplex ...... 115


A1 UV-melting curves and thermodynamic parameters for RNA duplexes ...... 146

A2 Autoradiograms of 15% denaturing gel and cleavage pattern of RNA

hairpins by nucleases ...... 153

A3 Phylogenetic survey of 16S rRNA sarcin-ricin motifs ...... 156


LIST OF FIGURES

Figures Page

1.1.1 Simplified view of the DNA organization in the chromosome ...... 2

1.1.2 Double helical structure of B-DNA showing the major and minor grooves ...... 4

1.1.3 Examples of DNA binding compounds ...... 6

1.2.1 Illustration of ionic, groove and intercalative binding ...... 8

2.1.1 Illustration of double helix DNA deformation by an intercalator...... 12

2.6.1 Illustration of supercoiled plasmid DNA conversion to circular and

linear forms ...... 16

2.8.1 Schematic illustration of DNase I footprinting experiment ...... 22

3.1.1 Structural units of PDHD photonuclease ...... 24

3.2.1 Geometrical conformations of PDHD 1 ...... 27

3.2.2 Binding of PDHD 1 enantiomers to a DNA decamer in

an aqueous environment ...... 28

3.3.1 The absorption spectra of 10 μM pure PDHD enantiomers and

PDHD/DNA complexes in cacodylate buffer ...... 31

3.3.2 Absorption titration curve constructed at 343nm...... 34

3.4.1 Thermal denaturation of 10 µM 10 bps DNA alone (leftmost curve) and

in the presence of increasing concentrations of PDHDs

1 µM, 5 µM, 10 µM, 15 µM and 20 µM ...... 36

3.4.2 First derivative melting curves for the melting transitions of 10 µM 10bps DNA

in the absence and presence of saturated 20 µM amount of PDHD enantiomers ...... 37

3.4.3 Differential UV melting curves of DNA dodecamer duplex ...... 38

3.5.1 CD spectra of 10 bps DNA in the absence and presence of PDHD racemate ...... 39

3.6.1 Photochemical cleavage of supercoiled ΦX 174 plasmid DNA ...... 41

3.7.1 Photochemical cleavage of supercoiled φX 174 plasmid DNA no oxygen ...... 43

3.8.1 Autoradiography of 20% denaturing polyacrylamide gel of 5’-end

32P-ATP labeled 22-mer DNA fragments ...... 46

3.9.1 Combined autoradiography experimental data for PDHD rac

specificity to double stranded 38-mer DNA ...... 49

4.2.1 Illustration of secondary structure of RNA molecule ...... 63

5.3.1.1 Illustration of use of rechargeable QIAGEN column ...... 76

5.5.2.1 Image of RNA sequences resolved by preparative 20x20 cm TLC plate ...... 80

5.6.1 Image of 10% denaturing PAGE of radiolabeled RNAs ...... 84

5.8.2.1 Representative UV-melting curve of a RNA duplex ...... 91

5.8.2.2 UV-melting curve represented as derivative plot dA260 / dT to obtain TM ...... 92

5.8.2.3 Fractions of RNA duplex (F) calculated from baselines ...... 93

5.8.2.4 Van’t Hoff plot of data shown in Fig. 5.8.2.3 ...... 94

6.2.1 Different representations of the S/R motif from helix 95 of

H. marismortui 23S rRNA ...... 100

6.2.2 2D diagrams of S/R motif from different organisms found in

3D rRNA data base ...... 104

6.3.1 Isosteric and near isosteric relationships between basepairs

frequently observed in S/R motif ...... 107

6.4.1 Sequences and schematic structures of the RNA sarcin-ricin

motifs used in thermal denaturation studies...... 108

6.5.1 Melting temperature of reference duplex as a function of [Mg2+] ...... 116

6.5.2 Representative absorbance vs temperature profiles of 5 RNA duplexes melted in

sodium cacodylate buffer pH 6.94, 10 mM MgCl2 and 0.1 M NaCl ...... 116

6.5.3 Examples of BPh interactions between upper tHS and triplet U ...... 121

6.5.4 Comparison of structures of S/R motifs varying the bulged base ...... 123

6.6.1 Representative structures of RNA molecules used in structure

probing experiments...... 126

6.6.2 Summary of the nuclease digestion experiments ...... 130

6.6.3 Autoradiogram and summary of chemical probing analysis ...... 132

6.7.1 Survey of non-canonical base pairs of S/R motifs in 16S rRNA of Eubacteria ..... 136


PART I. NUCLEIC ACID SITE-SELECTIVE BINDING STUDIES OF ISOMERS OF

DIHYDRODIOXIN-MASKED ORTHO-QUINONES AS POTENTIAL ANTITUMOR

DRUGS

CHAPTER I. INTRODUCTION

1.1 DNA characteristics

In the past decade, research involving the direct modification of oligonucleotides and DNA has increased dramatically and opened innumerable new research fields. Whether this research involves working with small oligonucleotide strands, medium-sized plasmids or entire genetic sequences of organisms, all of these tasks require the specific modification and alteration of the DNA strands to suit the desired application. To understand some of the challenges involved in research with DNA, a closer look at this macromolecule is necessary. Figure 1.1.1 shows a DNA double strand (a), which forms a double helix (b) and then undergoes multiple steps of supercoiling (c, d) until it finally adopts the shape of a chromosome (e). Histone protein octamers (c) lead to the formation of a hollow structure (d), which undergoes further steps of supercoiling.

Oligonucleotides and plasmids are double-stranded structures in which the individual single strands are annealed by hydrogen bonding. Each of the polymeric single strands is made from monomeric nucleotides, which consist of deoxyribose phosphate and one of four nucleobases attached to the 1’-position of the deoxyribose sugar. The pyrimidine bases cytosine (C) and thymine (T), together with the purine bases adenine (A) and guanine (G), are the building blocks responsible for encoding all genetic information. The deoxyribose phosphates form the polymeric sugar-phosphate backbone of an individual DNA strand. When two of these single strands are brought together, they anneal via Watson-Crick hydrogen bonds between the nucleobases of the opposite strands, resulting in the formation of a double-stranded helix in which only A-T and G-C base pairs are formed.[1] The geometry of the hydrogen bonds directs the complementary strand to bind in an antiparallel fashion to the original one, where the “direction” of a strand is defined by the orientation of the 3’ and 5’ positions of the deoxyribose sugar within the backbone polymer.
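The pairing and antiparallel rules described above can be illustrated with a minimal Python sketch. The function names and the example 12-mer are chosen here for illustration only and are not part of any experimental protocol in this work.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, G pairs with C.
WC_PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the base-by-base Watson-Crick partner of a strand.

    If the input is read 5'->3', the output is read 3'->5' (antiparallel).
    """
    return "".join(WC_PARTNER[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the complementary strand written in the usual 5'->3' direction."""
    return complement(strand)[::-1]


top = "GTTAGTATATGG"            # illustrative 12-mer, written 5'->3'
print(complement(top))          # partner strand read 3'->5'
print(reverse_complement(top))  # same partner strand read 5'->3'
```

Writing the partner strand 3'->5' directly under the original makes the A-T and G-C pairs line up column by column, whereas the 5'->3' form is the one normally used when ordering or reporting a sequence.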

Figure 1.1.1. Simplified view of the DNA organization in the chromosome.[2]

The double-stranded helix can adopt various geometrical shapes, the most common conformations being B-, A-, and Z-DNA (Table 1.1.1). B-DNA, which is the most frequently occurring conformation in nature, is a right-handed duplex with 10.4 bp per helical turn and 3.45 Å distance between the stacked bases. A-DNA is also right-handed, but is wider, having 11 bp per helical turn. Z-DNA is a left-handed helix with 12 bp per helical turn. Transitions between the B, A and Z conformations can be induced by, e.g., salt concentration and solvent. Other conformations are also known, such as triple helices and three-way and four-way junctions. Human telomeric DNA can adopt a four-stranded structure stabilized by potassium ions in the center of the helix. The separation of the double-stranded structure, called denaturation, can be achieved by either chemical or thermal treatment. If denaturation is achieved thermally, the temperature at which the DNA denatures is referred to as the melting temperature (TM) of the DNA.

Table 1.1.1. Structural characteristics for B-, A- and Z-DNA

Geometry attribute      A-form             B-form             Z-form
Helix sense             right-handed       right-handed       left-handed
Repeating unit          1 bp               1 bp               2 bp
Rotation/bp             33.6°              35.9°              60°/2
Mean bp/turn            10.7               10.4               12
Rise/bp along axis      2.3 Å              3.4 Å              3.8 Å
Pitch/turn of helix     24.6 Å             35.4 Å             45.6 Å
Diameter                26 Å               20 Å               18 Å
Sugar pucker            C3´-endo           C2´-endo           C: C2´-endo, G: C2´-exo
CD signals in nm (a)    +275 L / -235 S    +280 M / -240 M    +280 S / -240 M

(a) L = large, M = medium, S = small (mdeg).
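Several rows of Table 1.1.1 are related by simple geometry: the pitch of the helix is the rise per base pair multiplied by the number of base pairs per turn, and the mean rotation per base pair is 360° divided by the number of base pairs per turn. The short Python sketch below, written for illustration with values copied from the table, recomputes these quantities. A twist derived this way need not coincide exactly with a tabulated rotation, since tabulated values may come from different measurements.

```python
# (rise per base pair in Å, mean base pairs per turn), taken from Table 1.1.1
HELIX_FORMS = {
    "A": (2.3, 10.7),
    "B": (3.4, 10.4),
    "Z": (3.8, 12.0),
}


def pitch(rise_per_bp: float, bp_per_turn: float) -> float:
    """Axial length of one full helical turn, in Å."""
    return rise_per_bp * bp_per_turn


def mean_twist(bp_per_turn: float) -> float:
    """Average rotation per base pair, in degrees, over one 360-degree turn."""
    return 360.0 / bp_per_turn


for form, (rise, bp_turn) in HELIX_FORMS.items():
    print(f"{form}-DNA: pitch = {pitch(rise, bp_turn):.1f} Å, "
          f"mean twist = {mean_twist(bp_turn):.1f}°/bp")
```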


The double-helical structure of B-DNA has a very characteristic shape, which is responsible for many of the reactions and interactions of molecules with it. The spacing between the two phosphate backbones along the helix is not uniform, which gives rise to the wider “major groove” and the narrower “minor groove” (Figure 1.1.2). It can be clearly seen how the Watson-Crick base pairs form a ladder-like stack of parallel planes (Figure 1.1.2 (a)). While these stacked pairs of purine and pyrimidine bases create a hydrophobic center in the helix, the hydrophilic sugar-phosphate backbone points towards the outside of the helix, as shown in Figure 1.1.2 (b). This structural organization leaves the hydrophilic sugar-phosphate backbone as the primary accessible surface, while the hydrophobic core is accessible only inside the grooves.

Figure 1.1.2. Double helical structure of B-DNA showing the major and minor grooves.

A logical consequence of research with DNA is the need to cut the strand at very specific sites, for example to insert or remove specific genes. The task of cleaving a DNA sequence specifically has usually been accomplished using restriction endonucleases. Each of these restriction enzymes very specifically targets and then cleaves the strand at a known base pair sequence. While this method achieves very high site specificity in strand cleavage, it has disadvantages. Each sequence of base pairs to be addressed requires a different enzyme. Despite the large number of known endonucleases, it is impossible to find one for each conceivable base pair combination of interest; as a consequence, a plethora of new target sites will need to be addressed. The ability to selectively remove, modify, or disable genes may lead to new methods of treatment for disease-carrying genomes. What appear to be standard genetic engineering procedures, such as isolating or inserting short sequences at a very specific point in an oligonucleotide strand, would be the first to benefit from a versatile sequence-specific DNA-cleaving agent. While these techniques currently rely on the use of endonucleases, the development of new DNA-cleaving agents with inherent sequence-recognizing units is crucial.

The most widely known examples of synthetic DNA cleaving agents are shown in Figure 1.1.3. These include ruthenium and osmium complexes (1),[3-5] pyrrole-imidazole-polyamide chains (2),[6, 7] anthraquinones (3),[8, 9] bleomycins (5),[10, 11] and enediynes (7).[12]

Particularly intriguing is the class of phototriggered DNA cleaving agents, reviewed by Armitage.[13, 14] These papers demonstrate the versatility and advantages of photochemical processes within the wide field of cleaving agents.

Figure 1.1.3. Examples of DNA binding compounds. 1) Ruthenium and osmium complexes, 2) pyrrole-imidazole-polyamide chain, 3) anthraquinone, 4) ethidium bromide, 5) bleomycin, 6) Hoechst 33258, 7) enediyne.

1.2 DNA-ligand binding

Considering the very different molecular structures of these compounds, it is obvious that they will interact in many different ways with the DNA strand. Generally, there are three well known models that describe the binding of small molecules to the DNA double helix: electrostatic or surface binding, groove (minor and major) binding, and intercalative binding

(Figure 1.2.1). Electrostatic binding interactions between cationic species and the negatively charged DNA phosphate backbone usually occur along the exterior of the helix. The binding is often non-specific and is difficult to observe directly. Typically only indirect binding is detectable, for example as a change in the backbone configuration.

Minor groove binders are typically elongated structures with a curvature that matches that of the minor groove. The ligand fits between the narrow walls of the groove and is stabilized by hydrogen bonds and van der Waals interactions. The minor groove also has a certain flexibility that allows it to accommodate ligands that do not fit perfectly. Major groove binders exploit the numerous possibilities for specific hydrogen bonds with donors and acceptors on the nucleobases, which provide the basis for both complex stabilization and sequence specificity. Many proteins bind to DNA in the major groove.

Stacking interactions between nucleobases and aromatic ligands are important in defining the third type of binding mode, known as intercalative binding, which occurs when a planar, heteroaromatic moiety slides between the DNA base pairs and binds perpendicular to the helix axis. This requires some unwinding of the helix to create a space between the base pairs. The intercalation is stabilized by stacking interactions, i.e. π-π interactions between the aromatic rings. Substituents placed in the grooves or on the DNA surface can also give additional stabilization of the intercalated structure.[15, 16] Intercalators have found widespread application in the study of DNA.[17-20]

In addition, there are a number of multivalent DNA-binding molecules that contain an intercalating polycyclic ring and a carbohydrate or peptide moiety that binds to the minor groove of the duplex. The non-intercalating portion can form several hydrogen bonds or electrostatic interactions with DNA, thus contributing to the stability of the complex. The detection and characterization of multivalent modes of binding may result in the formulation of better rules for rational drug design. Improved drug binding affinity and the ability to discriminate larger DNA sequences may allow us to target unique genome sites. [21, 22]

Figure 1.2.1. Illustration of ionic (electrostatic), groove and intercalative binding of small molecules to the DNA double helix.


1.3 DNA photocleavage and photodynamic therapy

Light’s therapeutic properties have been known for many years. However, photodynamic therapy (PDT) was developed only in the last century. At present, PDT is being tested in the clinic for use in oncology, to treat cancers of the head and neck, brain, lung, pancreas, breast, prostate and skin. In PDT, irradiation with visible light activates a photosensitizer drug. Upon irradiation, the photosensitizer generates reactive oxygen species, which react rapidly with mutant DNA, initiating oxidative damage and eventually cell death. Much less is known, however, about drug-DNA targeted damage, which would improve the effect of PDT. While several potential PDT drugs are currently in clinical trials, Photofrin has already been approved by the FDA for treating lung cancer, the most common fatal cancer in men (28%) and women (26%) [American Cancer Society].

DNA intercalators have attracted particular attention due to their antitumoral activity and ability to manipulate DNA. They introduce significant structural modifications into the DNA double helix and may inhibit DNA replication as well as gene transcription.[23, 24] These unique properties have many important applications, for example, as artificial nucleases and in gene-targeted chemotherapy.[25-27] Importantly, not all intercalators display therapeutic properties. They may also damage the DNA and/or the organism. The structural modifications caused by binding of intercalators to DNA may lead to the retardation or inhibition of transcription and replication, and may even cause mutagenesis. The genotoxicity of non-covalent interactions has already been shown.[25, 28] In spite of this, it is desirable to perform controlled mutation.

Photochemically activated molecules represent a special group of DNA intercalation agents that are capable of cleaving double-stranded DNA. Numerous DNA-binding drugs that serve as cleaving agents are already in use in molecular biology. These include metal complexes, organometallic complexes, and organic compounds such as enediynes, nitro-substituted aromatics, halogen-containing compounds, riboflavins, naphthalimide derivatives and certain anthraquinone derivatives.[29-34] Although there are several possible cleavage mechanisms and types of damage that they cause, all photochemical cleavers offer advantages over regular DNA cleaving agents. These advantages include the binding of the molecule to the DNA prior to activation and the ability to control the photoactivation both spatially and temporally. Light also serves as a selective reagent: photocleaving agents are usually activated at wavelengths longer than 300 nm, the spectral region where nucleic acids and most proteins do not absorb light. This combination of sequence-specific DNA binding with controlled photochemical activation of the cleaving agent could lead to a powerful and versatile tool in medicine and .

Using this approach, the research group led by Prof. Wilson has previously reported the site-specific cleavage of a nucleic acid target by a phenanthrene-based DHD-DNA conjugate[35] and has extended these ideas to include several different phenanthrene-based DHDs. In this part, the binding of a new, highly reactive pyrene dihydrodioxin derivative to DNA, as well as its photocleaving properties, is described in detail.


CHAPTER II. MATERIALS AND METHODS

2.1 Experimental approaches

Many anticancer drugs (e.g. anthracyclines, mitoxantrone, dactinomycin) interact with DNA through intercalation without disturbing the overall stacking pattern due to Watson–Crick hydrogen bonding. Since typical intercalating agents contain three or four fused rings that absorb light in the UV–visible region, they are usually known as chromophores. Besides the chromophore, other substituents in the intercalator molecule may strongly influence the binding mechanism, the geometry of the ligand–DNA complex, and the sequence selectivity.

The intercalation process starts with the transfer of the intercalating molecule from an aqueous environment to the hydrophobic space between two adjacent DNA base pairs.[36] This process is thermodynamically favored because of the positive entropy contribution associated with disruption of the organized shell of water molecules around the ligand (hydrophobic effect). In order to accommodate the ligand, DNA must undergo a conformational change involving an increase in the vertical separation between the base pairs to create a cavity for the incoming intercalator. The double helix is thereby partially unwound, which leads to distortions of the sugar–phosphate backbone and changes in the twist angle between successive base pairs (Figure 2.1.1).[37] Once the drug has been sandwiched between the DNA base pairs, the stability of the complex is optimized by a number of non-covalent interactions, including van der Waals and π-stacking interactions,[38] reduction of coulombic repulsion between the DNA phosphate groups associated with the increased distance between the bases because of helix unwinding, ionic interactions between positively charged groups of the ligand and DNA phosphate groups, and hydrogen bonding.

Figure 2.1.1. Illustration of double helix DNA deformation by an intercalator.

DNA intercalators are less sequence selective than minor groove binding agents and, in contrast with them, show a preference for G-C regions. This selectivity is mainly due to complementary hydrophobic or electrostatic interactions of substituents attached to the chromophore with the major or minor groove. DNA intercalation is also governed by the nearest-neighbour exclusion principle, which states that the neighbouring sites on each side of a bound intercalator remain empty; that is, intercalators bind, at most, between alternate base pairs.[39] This is an example of a negative cooperative effect, whereby binding to one site induces a conformational change that hampers binding to the adjacent site.
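The nearest-neighbour exclusion principle sets a simple upper bound on how many intercalators a short linear duplex can carry. The following Python sketch is an idealized count, assuming every gap between adjacent base pairs is an equivalent potential site and ignoring end effects and sequence preference; the function names are illustrative. It compares a closed-form bound with a brute-force enumeration.

```python
from itertools import product


def max_intercalators(n_base_pairs: int) -> int:
    """Upper bound on bound intercalators under nearest-neighbour exclusion.

    A linear duplex with n base pairs has n - 1 gaps between adjacent pairs.
    If no two adjacent gaps may be occupied, at most every second gap is
    filled, i.e. ceil((n - 1) / 2).
    """
    sites = max(n_base_pairs - 1, 0)
    return (sites + 1) // 2


def brute_force_max(n_base_pairs: int) -> int:
    """Check the closed form by enumerating every occupancy pattern."""
    sites = max(n_base_pairs - 1, 0)
    best = 0
    for pattern in product((0, 1), repeat=sites):
        # Reject patterns in which two neighbouring gaps are both occupied.
        if any(a and b for a, b in zip(pattern, pattern[1:])):
            continue
        best = max(best, sum(pattern))
    return best


for n in (2, 10, 11, 12):
    print(n, max_intercalators(n), brute_force_max(n))
```

Under these assumptions a duplex of 11 base pairs (10 base pair steps) could hold at most five intercalators, which gives a rough sense of the ligand-to-duplex ratio at which saturation might be expected.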

The general methodology for studying DNA-ligand interactions includes, but is not limited to, structural analysis (X-ray, NMR), spectroscopy (ESI MS, UV-Vis, ITC, CD spectropolarimetry), and biochemical techniques (gel shift assay, photocleavage assay, DNase footprinting). The results of these studies are only the first step in a series of events that could eventually lead to in vivo application. Structural changes induced in DNA by intercalation interfere with the recognition and function of DNA-associated proteins such as polymerases, transcription factors, DNA repair systems and, especially, topoisomerases. In this chapter, the general protocols for DNA intercalation studies are described, with emphasis on UV-melting and DNA photocleavage assay experiments.

2.2 Materials

All chemicals were obtained from commercial suppliers, were of analytical grade, and were used without further purification unless otherwise noted. Detailed descriptions of the synthesis of enantiomeric DHDs and the separation of enantiomers on a chiral HPLC column are available in the supplemental data. DHD stock solutions (100 µM) were prepared in CB buffer (10 mM sodium cacodylate trihydrate, 100 mM NaCl, 0.5 mM EDTA, pH = 7.1) using the empirically determined extinction coefficient ε343 = 77,459 M⁻¹ cm⁻¹.

Synthetic DNA oligonucleotides:

10 base pair step (10 bps) DNA (5’-ATCGACCAAGC-3’/3’-TAGCTGGTTCG-5’),

12-mer DNA (5’-GTTAGTATATGG-3’/3’-CAATCATATACC-5’),

22-mer DNA:

(5’-ATCGACCAAGCTAGCTGGTTCG-3’/3’-TAGCTGGTTCGATCGACCAAGC-5’),

38-mer DNA:

5’-AGTCTATTGGTTGCTTTGTTGATTGTTTATTTACTTAT-3’

5’-ATAAGTAAATAAACAATCAACAAAGCAACCAATAGACT-3’

were employed in this study. These oligomers were purchased from Integrated DNA Technologies (www.idtdna.com) and purified on 20% denaturing PAGE in the presence of 8 M urea, using a standard crush-and-soak protocol to extract DNA from gel slices.[40] DNA strands were annealed at a concentration of 100 µM of each strand in CB buffer by heating to 100 °C and slowly cooling to room temperature. The concentration of purified stock solutions was determined from the absorbance at 260 nm using extinction coefficients calculated by the supplier: ε260 = 211100 M⁻¹ cm⁻¹ (10 bps DNA), 244600 M⁻¹ cm⁻¹ (12-mer DNA), 419600 M⁻¹ cm⁻¹ (22-mer DNA), and 762200 M⁻¹ cm⁻¹ (38-mer DNA) at 25 °C.

Commercially prepared herring sperm DNA (hs DNA) was obtained from Sigma-Aldrich Co. (www.sigmaaldrich.com), dissolved in double-deionized water (dd H2O) containing 50 mM NaCl, and dialyzed for 24 h at 4 °C against a buffered solution (pH 7.1, sodium cacodylate). The removal of protein was monitored using the ratio Abs (260 nm) / Abs (280 nm).[41] A stock solution of hs DNA (1 mol/L in nucleotide phosphates) was prepared by directly dissolving DNA in dd H2O and quantified using ε260 = 6600 M⁻¹ cm⁻¹, as described in the literature.[22]

ΦX174 plasmid RF I 5386 basepair DNA (> 90% supercoiled form, 1 mg/ml) was purchased

from New England BioLabs Inc. (www.neb.com) cat # N3021L.

2.3 UV-Vis absorption spectroscopy.

UV-Vis spectra and melting curves were measured using a Cary 100 Bio UV-Visible spectrophotometer (Varian Inc.) equipped with a thermoelectrically controlled cell holder and a cuvette with a 1.00 cm path length. UV-Vis titrations were performed to evaluate the binding constants. Experiments were conducted at 20 °C in CB buffer (pH 7.1) by manually injecting 1-10 µL aliquots of hs DNA or oligonucleotide DNA solution into 1 mL of a 10 µM DHD sample in the same buffer. The resulting DNA concentration ranged from 0 to 225 µM. After each injection, the absorption spectrum was recorded between 200 and 500 nm.

2.4 Optical melting experiment.

To evaluate the effect of the bound DHD enantiomers on the thermal stability of the DNA

duplexes, UV melting experiments were conducted in the presence and absence of each isomer.

Absorbance vs. temperature profiles were measured at 260 nm on a Cary 100 Bio UV-Visible spectrophotometer (Varian Inc.) equipped with a thermoelectrically controlled cell holder and a 1.00 cm path-length cuvette. Samples were heated at a rate of 0.5 °C/min. DNA melting temperatures (TM), determined by UV melting in the presence and absence of saturating amounts of the DHD enantiomers, were used to calculate the affinity of the enantiomers for the 10 bps DNA.

2.5 Circular Dichroism (CD) spectropolarimetry.

CD measurements were performed in an AVIV model 62A DS spectropolarimeter (Aviv

Associates, Lakewood, NJ) equipped with a thermoelectrically controlled cell holder and a

cuvette with a path length of 0.1 cm. The wavelengths studied ranged from 240 to 400 nm. Scans

of DNA duplexes (~50 µM) were recorded at 1nm intervals. A scan of the buffer alone was

recorded to provide a baseline. DNA duplex was titrated by the addition of a concentrated DHD

solution to cover DNA:DHD ratios over the range of 1:0 to 1:2.

2.6. ΦX 174 plasmid DNA photocleavage assay.

The DNA cleaving ability of restriction enzymes and synthetic molecules is commonly tested using a supercoiled plasmid relaxation assay.[42-44] ΦX174 is a small icosahedral E. coli bacterial virus or bacteriophage that has a certain historical significance. The DNA molecule of the phage ΦX174 is a 5386 bases long single stranded circle. Only during the replication phase within the cell does the double stranded form of the molecule arise; therefore, the supercoiled, circular duplex DNA is referred to as a replicative form (RF) I. The supercoiled form is achieved when a circular double stranded DNA molecule is further twisted. In the event of a single stranded cleavage the supercoiled form becomes relaxed, resulting in an open circular double stranded φX174, RF II. If the second strand is nicked within close proximity (~16 bp) of the single stranded damage, linearization of the plasmid occurs, RF III (Figure 2.6.1). All three forms of the plasmid can be separated by agarose gel electrophoresis. After the quantification of

DNA bands, the relative cleavage efficiencies of DNA cleaving agents can be assessed.

Figure 2.6.1 Illustration of supercoiled plasmid DNA conversion to circular and linear forms.

Linearization of the plasmid can occur in two ways. In the event of random, single stranded cuts, the cleaving agent leaves the area of the initial scission and creates cuts somewhere else on the DNA. In the event of a double stranded scission, the cut on the opposite strand has to occur within 8 nucleotides of an initial strand cut. In general, the DNA cleaving agents that create double stranded breaks are more sought after, as chemotherapeutic agents. The DNA repair systems are less prone to repair double stranded cuts and as a result cell death occurs.[45]

The Freifelder–Trumbo equation describes the number of double-stranded cuts expected from random single-stranded cleavage. The result is expressed as the ratio of the average number of single-strand cuts per molecule, n1, to the average number of double-strand cuts per molecule, n2.[46] The statistical test of Povirk et al. shows how to calculate n1 and n2 from the fractions of linear and supercoiled DNA after the cleavage event. The resulting n1/n2 values are compared with the ratio expected from random single-stranded cuts.[47] The test has been applied to DNA cleaving agents such as copper-based transition metal complexes, bleomycin, and dicerium complexes.

Procedure

Supercoiled plasmid DNA stock solution (1 µg/µL) was diluted in CB buffer to a final concentration of 0.03 µg/µL. The samples were prepared by adding enantiomerically pure DHD to plasmid DNA in the dark to achieve a final ratio of 1 mol DHD per 50 bp DNA. Samples were rapidly mixed and incubated at ambient temperature for 30 min. DHD/DNA complexes were photolyzed in Eppendorf tubes placed on ice at a distance of 20 cm from a He/Cd CW laser system (56 series, Melles Griot Co.) producing 442 nm light with 78 mW of power. For air-free samples, 100 µL vials containing the sample were out-gassed on ice for 10 min with ultrapure argon via a sterile 16-gauge needle before irradiation. Irradiation was carried out for 0 to 420 s at 30 s intervals. ΦX174 supercoiled plasmid DNA was used as a control and irradiated for 420 s. Irradiated samples were subsequently mixed with 0.4 volume of loading dye (30% glycerol, 1X TAE and 0.1% bromophenol blue), vortexed and loaded on 1% agarose gels to resolve the conversion of supercoiled plasmid (RFI) to the circular (RFII) or linear form (RFIII).[48] The gel was run at 50 V for 3 hours in 1X TAE buffer (40 mM Tris HCl, 20 mM acetic acid, and 1 mM EDTA, pH 8) and then stained for 30 min in GelRed from Biotium Inc. (6 µL of 10,000X solution in water diluted in ~300-350 mL of 1X TAE buffer).

The relative quantities of the supercoiled, nicked, and linear DNA were calculated by integrating the area of each band using the ImageJ software (http://rsbweb.nih.gov/ij/).

Single-strand breaks (n1) were quantitated from the fraction of the supercoiled replicative form of φX174 remaining after irradiation according to[49]

$$\mathrm{RFI} = \exp[-(n_1 + n_2)] \qquad (2)$$

where the average number of double-strand breaks per molecule (n2) was obtained using the following equation:

$$n_2 = \frac{1}{\dfrac{\mathrm{RFI} + \mathrm{RFII} + \mathrm{RFIII}}{\mathrm{RFIII}} - 1} \qquad (3)$$

where RFI, RFII and RFIII are the supercoiled, circular/nicked and linear forms of plasmid DNA, respectively.
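As an illustration of equations (2) and (3), the following Python sketch converts integrated band areas into n1 and n2. It assumes that equation (3) is read with the sum of all three forms divided by RFIII, and that RFI in equation (2) is the supercoiled fraction of the total. The function name and the band areas are hypothetical.

```python
import math


def strand_breaks(rf1, rf2, rf3):
    """Return (n1, n2) from band areas of the supercoiled (rf1),
    nicked/circular (rf2) and linear (rf3) plasmid forms.

    Assumed reading of equation (3): n2 = 1 / (total / rf3 - 1).
    Equation (2), f_I = exp(-(n1 + n2)), is then solved for n1.
    """
    total = rf1 + rf2 + rf3
    if rf1 <= 0:
        raise ValueError("no supercoiled DNA left; n1 cannot be estimated")
    # No linear band means no detectable double-strand breaks.
    n2 = 0.0 if rf3 == 0 else 1.0 / (total / rf3 - 1.0)
    n1 = -math.log(rf1 / total) - n2
    return n1, n2


# Hypothetical band areas (arbitrary densitometry units).
n1, n2 = strand_breaks(50.0, 40.0, 10.0)
print(f"n1 = {n1:.3f}, n2 = {n2:.3f}, n1/n2 = {n1 / n2:.1f}")
```

Only the relative band areas matter, so the densitometry units cancel.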

2.7 Sequence selectivity of a DNA oligonucleotide.

The 22-mer and 38-mer deoxyoligonucleotides were 32P-5’-end-labeled with T4 polynucleotide kinase (New England Biolabs Inc.) and [γ-32P] ATP (PerkinElmer Co.) according to the described procedure.[50] The end-labeled DNA strands were hybridized to the complementary strands (2.5 µM strand concentration) in 10 mM sodium cacodylate buffer, pH 7.0, in the presence of 100 mM NaCl.

The 22-mer DNA duplex was incubated with DHD enantiomers (50 µM) at 37 °C for 1 hour. Aliquots of 40 µL were subjected to DNA photocleavage for 10 min using the 442 nm He-Cd laser as described above. After piperidine treatment (94 °C, 20 min), the samples were resuspended in formamide loading dye (50% formamide, 10 mM EDTA, 0.01% bromophenol blue), loaded onto a 20% denaturing polyacrylamide sequencing gel, and electrophoresed at 1800 V for approximately 3 hours. DNA fragments were visualized using the Storm phosphorimager (Amersham, Storm 860).

The 38-mer DNA duplex was incubated with DHD for 1 h at 37 °C, irradiated at 442 nm, and loaded directly onto a 15% polyacrylamide gel for electrophoresis. The piperidine treatment step was not applied.

2.8 DNase I footprinting assay.

Footprinting is essentially a protection assay, in which cleavage of DNA is inhibited at discrete locations by the sequence specific binding of a ligand or protein. In this technique, a

DNA fragment of known sequence and length (~100-200 bp), which has been selectively radiolabeled at one end of one strand, is lightly digested by a suitable endonucleolytic probe in the presence and absence of the drug under investigation. The cleavage agent is prevented from

cutting around the drug-binding sites so that, when the products of reaction are separated on a

denaturing polyacrylamide gel and exposed to autoradiography, the position of the ligand can be

seen as a gap in the otherwise continuous ladder of bands (Figure 2.8.1). Since the bands are

visualized by autoradiography, only the shortest of these species bearing the radioactive label

will be visualized. The conditions of the cleavage reaction are adjusted so that, on average, each

DNA fragment is cut no more than once. As a result, each of the bands on the autoradiograph is

produced by a single cleavage event, i.e., single-hit kinetics. If an excessive amount of cleavage

agent is used, then labeled products can arise from more than one cleavage event, biasing the

distribution of fragments toward short products. In general, the extent of cleavage is adjusted so

that between 60 and 90% of the radiolabeled DNA remains uncut, though longer fragments

require greater amounts of digestion to produce suitable band intensities.

DNase I is a monomeric glycoprotein of MW 30,400 Da. It is a double-strand-specific endonuclease that introduces single-strand nicks in the phosphodiester backbone, cleaving the 3'-phosphate bond. Single-stranded DNA is degraded at least four orders of magnitude more slowly.[51, 52] The enzyme requires divalent cations and shows optimal activity in the presence of calcium and magnesium.[45] Although it cuts all phosphodiester bonds and does not possess any simple sequence dependency, its cleavage pattern is very uneven and is thought to reflect variations in DNA structure.[53] In particular, An·Tn tracts and GC-rich regions are poor substrates for the enzyme. The most important factors affecting its cleavage are thought to be minor groove width and DNA flexibility.[54, 55]

DNase I footprinting has been successfully employed for identifying or confirming the preferred DNA binding sites of several ligands, including actinomycin,[56] mithramycin,[57] quinoxaline,[58, 59] and daunomycin.[60]

Procedure

The footprinting experiment was performed using the 38-mer DNA. The 32P-5’-end-labeled 38-bp DNA (2 µL, 2.5 µM) was combined with 4 µL of DHD rac solutions of different concentrations. Both the DNA and the ligand were dissolved in 10 mM CB buffer to a final volume of 10 µL. Individual reactions were performed in 1.5 mL Eppendorf tubes with increasing concentrations of DHD rac, so that the final DHD concentration covered the range of 0.01-100 µM. The mixtures were incubated at room temperature for 1 hour to reach equilibrium. The cleavage reaction was initiated by adding 2 µL of DNase I (Promega Co.) (0.05 U/mL) and incubating for 8 min at room temperature. The reaction was stopped by transferring the tubes to dry ice, followed by lyophilization in a SpeedVac for 30 min. The dry products were then dissolved in 5 µL of formamide loading dye, loaded onto a 15% denaturing polyacrylamide sequencing gel, and electrophoresed at 1700 V for approximately 2 hours. DNA fragments were visualized using a phosphor imaging screen and the Storm phosphorimager (Amersham, Storm 860).

Figure 2.8.1. Schematic illustration of the DNase I footprinting experiment. Panel A: DNA double strand after DNase I digestion; all nucleotides are visible. Panel B: DNA complex with a drug; protected nucleotides are not visible on denaturing PAGE.

2.9 Quantum yield of quinone release calculations.

DHD racemate was used to calculate the quantum yield of quinone release by the comparative method with irradiation at 442 nm, using the following equation:

$$\Phi = \frac{\Delta A_{343}\, V}{\varepsilon_{343}\, l\, \tau\, q\, \left(1 - 10^{-\mathrm{Abs}_{442}}\right)} \qquad (4)$$

where ΔA343 is the absorbance change at 343 nm after irradiation, V is the volume of the irradiated sample, ε343 is the extinction coefficient of DHD at 343 nm, l is the path length, τ is the irradiation time, q is the photon flux, and Abs442 is the absorbance of the sample at 442 nm in the 5 cm cuvette.

The photon flux was calculated using potassium ferrioxalate (K3[Fe(C2O4)3]) as an actinometer, following the standard procedure.[61]
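The quantum yield expression (4) can be evaluated with a few lines of Python. The sketch below is only an illustration: the function and argument names are hypothetical, the numerical inputs are invented, and it assumes that V is given in litres, the photon flux q in einstein per second, and that l is the path length of the cuvette in which ΔA343 was measured.

```python
def quantum_yield(delta_a343, volume_l, eps343, path_cm,
                  time_s, flux_einstein_per_s, abs442):
    """Evaluate equation (4): moles of DHD converted per einstein absorbed."""
    # Numerator: moles of DHD consumed, from the absorbance drop at 343 nm.
    moles_converted = delta_a343 * volume_l / (eps343 * path_cm)
    # Denominator: photons absorbed = time * flux * absorbed fraction.
    fraction_absorbed = 1.0 - 10.0 ** (-abs442)
    photons_absorbed = time_s * flux_einstein_per_s * fraction_absorbed
    return moles_converted / photons_absorbed


# Invented example values, not data from this work.
phi = quantum_yield(
    delta_a343=0.10,           # absorbance change at 343 nm
    volume_l=0.003,            # 3 mL irradiated sample
    eps343=77_459,             # M^-1 cm^-1, value quoted in section 2.2
    path_cm=1.0,               # cuvette used for the absorbance reading
    time_s=60.0,               # irradiation time
    flux_einstein_per_s=1e-8,  # from actinometry (hypothetical)
    abs442=0.05,               # sample absorbance at 442 nm
)
print(f"quantum yield ≈ {phi:.3f}")
```

Because ΔA343 was linear in irradiation time for both samples, the same function could be applied to the slope of that plot by passing the slope as delta_a343 and setting time_s to 1.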

Two experiments were performed, one in the presence and one in the absence of oxygen. Two independent solutions of DHD, 1.72 × 10⁻⁵ M (presence of oxygen) and 1.36 × 10⁻⁵ M (absence of oxygen), were irradiated for three different times in sodium cacodylate buffer. The change in the steady-state absorption spectra was determined by measuring the decrease of the absorption band at 343 nm relative to control (non-irradiated) samples. The plot of absorbance change vs. irradiation time was linear for both samples.

A Ti:sapphire amplified laser system (Hurricane, Spectra Physics) was used as the irradiation source. The Hurricane output is divided by a 50/50 beam splitter into two parts. One of the beams is sent to a TOPAS-C optical parametric amplifier (Light Conversion, Ltd.), which generates the 442 nm excitation pulses that pass through the sample. Samples were irradiated in a 5 cm cuvette. Absorbance spectra were taken using a Varian Cary 50 UV/Vis spectrophotometer in a 1 cm cuvette. All measurements were performed at 22 °C.

CHAPTER III. RESULTS AND DISCUSSION

3.1 Background

The major criteria for designing an effective and versatile nuclease are site specificity, ease of

handling and the ability to be phototriggered precisely at the desired time. One of the approaches to

this problem would be the development of synthetic photonucleases. These compounds can be

designed to recognize and bind selectively to a specific target DNA sequence which makes them

potentially powerful tools for applications requiring site-specific modifications of DNA.

An effective family of intercalation agents is the Pyrene DiHydroDioxins (PDHDs), which are photoreactive masked ortho-quinones. PDHDs have shown significant promise in this capacity.[35] PDHD photonucleases consist of three structural units. These are depicted in Figure 3.1.1 and include the Masked Cleaving Unit (MCU), i.e., the DHD ring system from which pyrenequinone is released upon excitation with UV light, and the Masking Unit (MU). These materials are easily prepared in a classical organic photoreaction known as the Schönberg-Mustafa reaction (Scheme 3.1.1).[62]

Figure 3.1.1 Structural units of PDHD photonuclease. Masked Cleaving Unit (MCU), Masking Unit (MU), Sequence Recognizing Unit (SRU).

Scheme 3.1.1 Schönberg-Mustafa reaction: The forward reaction occurs with visible light (425- 514 nm), while the reverse reaction occurs with UV light (350 nm).

Thus, visible light irradiation of ortho-quinones of polycyclic aromatic hydrocarbons in the

presence of olefins leads to dihydrodioxin formation. This reaction is reversed by irradiation with

short wavelength (UV light) that reforms the original ortho-quinone and olefin. [63]

This olefin-derived unit not only serves the purpose of masking the quinone, but also may be used to attach the Sequence-Recognizing Unit (SRU) and to introduce chirality to the nuclease. An oligonucleotide is one example of a possible SRU.

In previous studies, this quinone releasing property of dihydrodioxins has been investigated as a

possible tool for DNA cleavage.[35, 64, 65] This discovery makes masked ortho-quinone

derivatives promising photocleaving systems, since these molecules can be derivatized with a

variety of substituents. The introduction of different substituents on the masking unit can

dramatically affect the binding mode and cleavage efficiency in these photocleaving agents.

However, little is known about PDHD-DNA binding properties and its sequence specificity, as

well as the behavior of quinone photorelease in the presence of DNA.

A new compound, dimethylpyridinium dihydrodioxin tetrafluoroborate salt (PDHD 1), has been synthesized (Scheme 3.1.2) in an effort to study these problems. This compound exists in two enantiomeric forms (SS, RR) and has been found to have exceptional DNA binding and double-strand cleaving properties. The enantiomers are differentiated based on their retention times on a chiral HPLC column (see supplementary information section) as follows: PDHD I (first fraction eluted) and PDHD II (second fraction eluted). Below, we describe and compare their unusual properties.

Scheme 3.1.2 Synthesis of dipyridinium pyrene dihydrodioxin (PDHD) 1.

3.2 Theoretical PDHD/DNA binding studies

PDHD 1 might exist in either a diequatorial or a diaxial conformation (Figure 3.2.1 A). Apparently, the diaxial conformation is easily accessed owing to the significant steric and electrostatic repulsion between the two proximate positively charged pyridinium rings in the diequatorial conformation. This diaxial conformation would seem to be ideally suited for binding in the major groove of duplex DNA. In this conformation the pyrene ring could intercalate in the base-pair stack while the positively charged pyridinium “arms” are held in close proximity to the negatively charged phosphate backbone strands of the DNA. While it is not certain which enantiomer, SS or RR, would fit most effectively into the major groove of the DNA, extensive molecular mechanics calculations[66-68] indicate that the SS enantiomer usually forms more stable complexes at various sites in duplex B-DNA than does the RR enantiomer.

Figure 3.2.1. Geometrical conformations of PDHD 1. A) Diequatorial and diaxial conformations. B) Binding of PDHD enantiomers (SS and RR) in the major groove of duplex DNA.

An example of this type of comparison is shown in Figure 3.2.2. In this example, the diaxial conformation of the SS enantiomer binds as shown in Figure 3.2.2 B, whereas the RR enantiomer adopts a diequatorial conformation, which leads to a much less stable complex. This is the usual outcome of these comparisons. The SS enantiomer seems to cause far less distortion of the DNA duplex and seems to form much more stable complexes than does the RR enantiomer. However, it must be noted that many different types of complexes have been found in these calculations, depending upon the base-pair sequence and the proximity of the pyrene intercalation site to the end of the duplex strand.

Figure 3.2.2. Binding of PDHD 1 enantiomers to a DNA decamer in an aqueous environment.

Complexes have even been calculated in which both pyridinium rings are bonded to the same DNA strand. Extensive further theoretical studies will be required to determine the optimum binding base-pair sequence. What can be said at this point is that a single PDHD binds to a three to four base-pair length of duplex DNA. Thus, while a DNA duplex can bind more than one PDHD, these theoretical considerations indicate that the duplex becomes saturated when it has bound (number of DNA base pairs)/(3 to 4) PDHDs. This limit is consistent with the experimental binding studies, vide infra. Finally, this PDHD/DNA binding model is unusual in that it consists of coordinated intercalative and electrostatic binding, which should greatly stabilize the DNA duplexes to which it is applied, and this model will help the reader interpret the results related in the remainder of this paper.

3.3 UV-Vis spectroscopy studies.

A fixed volume (1 mL) of 10 µM PDHD isomer was titrated by incremental addition of 1 µL of 0.5 mM 10 bps DNA (5’-ATCGACCAAGC-3’/3’-TAGCTGGTTCG-5’) or 1 µL of hs DNA solution directly into the cuvette until no further change in DHD absorbance was observed. Titrations were performed in sodium cacodylate buffer, pH 7.1, at 20 °C. The absorbance spectra were corrected for small dilution effects so that they refer to a constant concentration of PDHD in the measured solution. Final concentrations of DNA varied from 0 µM (free DHD) to 28 µM (fully saturated) (Figure 3.3.1).
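A dilution correction of the kind mentioned above can be sketched in Python as follows. The simple volume-ratio correction shown here is an assumption about how such a correction is usually made, not a description of the exact procedure used in this work, and the function names are hypothetical.

```python
def dilution_corrected(a_obs, v0_ul, v_added_ul):
    """Scale an observed absorbance back to the initial sample volume v0."""
    return a_obs * (v0_ul + v_added_ul) / v0_ul


def titrant_concentration_uM(stock_uM, v0_ul, v_added_ul):
    """Concentration of added DNA after v_added µL of stock went into v0 µL."""
    return stock_uM * v_added_ul / (v0_ul + v_added_ul)


# 1 mL of PDHD solution titrated with 0.5 mM (500 µM) DNA, as in the text.
V0 = 1000.0
for added in (0, 10, 30, 60):
    dna = titrant_concentration_uM(500.0, V0, added)
    factor = dilution_corrected(1.0, V0, added)
    print(f"{added:3d} µL added: [DNA] = {dna:5.1f} µM, "
          f"correction factor = {factor:.3f}")
```

For 1 µL additions into 1 mL the correction stays small, which fits the description of the dilution effects as small.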

The UV-vis spectra of the PDHD isomers are characterized by four transitions: lower-energy absorption bands in the 310-350 nm region and higher-energy bands in the 200-290 nm region. Generally, bathochromic and hypochromic effects are observed in the absorption spectra of small molecules when they intercalate into DNA.[69] In addition, the magnitude of the hypochromism is correlated with the strength of the intercalative binding interaction. From Figure 3.3.1 D and E, it can be seen that the addition of DNA induces a hypochromic effect (> 40% in all cases) and a bathochromic shift (> 10 nm in all cases) in the spectra, which indicates that the enantiomers intercalate into the DNA helix.

Figure 3.3.1. The absorption spectra of 10 µM pure PDHD enantiomers and PDHD/DNA complexes in cacodylate buffer (10 mM sodium cacodylate, 100 mM NaCl, 0.5 mM EDTA, pH 7.1) at 20 °C. A is PDHD I titrated with 10 bps DNA; B is PDHD II titrated with 10 bps DNA; C is PDHD racemate titrated with 10 bps DNA (panel annotations: hypochromism at 343 nm; 0 µM and 30 µM 10 bps DNA). D and E are the UV-Vis curves of PDHD I and PDHD II, respectively, with the maxima at 343 nm shifting to 356 nm with increasing concentrations of hs DNA (25, 50, 75, 100, 125, 150, 175, 200 and 225 µM of DNA base pairs). Insets are the half-reciprocal plots of [DNA]/∆εap as a function of DNA concentration as determined from the absorption spectral data. The solid line represents the best linear fit of the data, with a regression coefficient of > 0.995.

Further evidence for PDHD/DNA complex formation is the presence of a characteristic isosbestic point at 295 nm. These results suggest that PDHD interacts with DNA and that the binding mode is classical intercalation.

To compare the binding strengths of PDHD enantiomers, the intrinsic binding constant (Kb)

was calculated according to the previously reported procedures.[70, 71] When the ratio of bound

PDHD to DNA base pairs is relatively low, the intrinsic binding constant Kb may be determined

from a double-reciprocal plot of the changes in the apparent extinction coefficient of PDHD and

DNA concentration as follows:

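$$\frac{1}{\Delta\varepsilon_{ap}} = \frac{1}{\Delta\varepsilon\, K_b\, [\mathrm{DNA}]} + \frac{1}{\Delta\varepsilon} \qquad (5)$$

where ∆εap = |εap – εf| and ∆ε = |εb – εf|. The apparent extinction coefficient, εap, is obtained by calculating Aobs/[PDHD]; εb and εf are the extinction coefficients of the fully bound and free forms of PDHD, respectively. Multiplying by [DNA] puts the equation in the half-reciprocal form:

$$\frac{[\mathrm{DNA}]}{\Delta\varepsilon_{ap}} = \frac{[\mathrm{DNA}]}{\Delta\varepsilon} + \frac{1}{\Delta\varepsilon\, K_b} \qquad (6)$$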

A plot of [DNA]/∆εap versus [DNA] will have a slope of 1/∆ε and a y-intercept equal to

1/(∆εKb). Kb is then given by the ratio of the slope to intercept. Since a double-reciprocal plot

gives excessive weight to data points obtained at low amount of DNA, the half-reciprocal plot

should generally be more accurate. The change in the absorbance of PDHD enantiomers with

increasing DNA concentration was used to construct the half-reciprocal plot as shown in Figure

3.3.1 insets. The resulting intrinsic binding constants for PDHDs are listed in Table 3.3.1.
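The half-reciprocal analysis of equation (6) amounts to a straight-line fit of [DNA]/∆εap against [DNA], with Kb taken as slope/intercept. The Python sketch below illustrates this with synthetic, noise-free data generated from equation (5); the values of Kb and ∆ε used to generate the data are arbitrary choices for illustration and are not results from this work, and the function names are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def half_reciprocal_kb(dna_molar, delta_eps_ap):
    """Fit equation (6) and return (Kb, delta_eps)."""
    ys = [c / d for c, d in zip(dna_molar, delta_eps_ap)]
    slope, intercept = linear_fit(dna_molar, ys)
    # slope = 1/delta_eps and intercept = 1/(delta_eps * Kb)
    return slope / intercept, 1.0 / slope


# Synthetic data: arbitrary "true" parameters, DNA from 25 to 225 µM.
KB_TRUE = 2.0e5      # M^-1 (illustrative)
D_EPS_TRUE = 3.0e4   # M^-1 cm^-1 (illustrative)
dna = [c * 1e-6 for c in (25, 50, 75, 100, 125, 150, 175, 200, 225)]
# Rearranged equation (5): d_eps_ap = d_eps * Kb*[DNA] / (1 + Kb*[DNA])
d_eps_ap = [D_EPS_TRUE * KB_TRUE * c / (1.0 + KB_TRUE * c) for c in dna]

kb, d_eps = half_reciprocal_kb(dna, d_eps_ap)
print(f"Kb = {kb:.3e} M^-1, delta_eps = {d_eps:.3e} M^-1 cm^-1")
```

With real titration data the points scatter about the line, and the quality of the fit (for example the regression coefficient quoted in Figure 3.3.1) indicates how well the simple binding model holds.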

Table 3.3.1 Calculated apparent binding constants from UV-vis titration data. Errors of Kb are from three independent UV-titration experiments.

DNA            Enantiomers   Apparent Kb (M⁻¹)    Apparent binding site size n (bp)
10 bps         PDHD rac      (7.1 ± 0.2) × 10⁴    4.2
10 bps         PDHD I        (1.8 ± 0.1) × 10⁵    3.3
10 bps         PDHD II       (3.2 ± 0.1) × 10⁵    3.1
herring sperm  PDHD rac      No data              No data
herring sperm  PDHD I        (1.6 ± 0.2) × 10⁵    No data
herring sperm  PDHD II       (2.3 ± 0.2) × 10⁵    No data

Binding data suggest that the PDHD enantiomers bind to both the short 10 bps DNA and hs DNA with high affinity. This binding strength is comparable with that observed for known intercalators.[72-74] PDHD II has a slightly larger binding constant than PDHD I and the racemic mixture. Although the racemic mixture is expected to have a binding constant equal to the average of those of the resolved enantiomers, the actual constant for PDHD rac is lower (Table 3.3.1). This is probably due to negative cooperativity of PDHD rac binding, i.e., the Hill coefficient n is less than 1, as has been observed for most intercalators (e.g., ethidium bromide).[75]

The apparent binding site size for DHD was determined from the UV-Vis titration experiment. Absorbance at 343 nm was monitored as a function of the DNA/DHD molar ratio. In all cases the absorbance at 343 nm reaches a plateau, from which the stoichiometry and apparent binding site size (napp) are estimated (Figure 3.3.2).

Figure 3.3.2 Absorption titration curves constructed at 343 nm for PDHD I, PDHD II and PDHD rac. DHD I has a ratio r of 0.3 moles of DNA/DHD, napp = 0.3 × 11 bp DNA = 3.3. DHD II has a ratio r of 0.28 moles of DNA/DHD, napp = 0.28 × 11 bp DNA = 3.1. DHD rac has a ratio r of 0.38 moles of DNA/DHD, napp = 0.38 × 11 bp DNA = 4.18.
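The site-size arithmetic in the caption above is simply the plateau ratio r (moles of duplex per mole of DHD) multiplied by the number of base pairs in the duplex. A minimal Python sketch, with a hypothetical function name and the r values taken from the caption:

```python
BASE_PAIRS = 11  # the "10 bps" duplex has 11 base pairs (10 base-pair steps)


def apparent_site_size(r_dna_per_dhd, base_pairs=BASE_PAIRS):
    """napp = r * (base pairs per duplex): base pairs occupied per bound DHD."""
    return r_dna_per_dhd * base_pairs


plateau_ratios = {"PDHD I": 0.30, "PDHD II": 0.28, "PDHD rac": 0.38}
site_sizes = {name: apparent_site_size(r) for name, r in plateau_ratios.items()}
for name, napp in site_sizes.items():
    print(f"{name}: napp = {napp:.2f} bp")
```

These values fall within the three to four base pairs per PDHD suggested by the modeling in section 3.2.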

3.4 Optical melting studies.

If the temperature of a solution containing a helical nucleic acid is raised sufficiently, strand

separation, or "melting," occurs. The temperature that marks the midpoint of the melting process

is called the melting temperature (or TM). At the TM, half of the nucleic acid exists in the helical

state and the other half exists in the single-stranded state and the two species are in equilibrium:

Helical nucleic acid ↔ Single stranded nucleic acid

Numerous small molecules including metal ions are known to bind directly to nucleic acids

and, in many cases, these binding interactions have been shown to stabilize the structure of

helical nucleic acids.[76] This has a direct effect on the TM. For this reason, studies of the effect

of potential drugs on the TM values of nucleic acids are performed when evaluating

ligand/nucleic acid interactions.[77]

Melting transitions can be detected by UV absorbance, circular dichroism (CD), NMR,

viscosity, electrophoresis, or calorimetry.[78] However, UV absorbance is by far the most commonly used technique because of the method's simplicity, sensitivity, and reproducibility. The advantages and disadvantages of TM determinations made by UV absorbance are as follows:

Advantages:

1. Spectrophotometers are economical and are commonly found in research laboratories

2. UV absorbance measurements are simple, sensitive, and reproducible

3. Only small amounts of nucleic acid and drug are required

4. No spectral signal from the drug is necessary

5. The method is reliable for screening, as well as for rank ordering of drugs within a family

of related drugs

6. A single TM value can provide an estimate of the association constant between drug and a

helical nucleic acid

Disadvantages:

1. Drugs might have to be stable at temperatures as high as 95-100°C.

2. Drugs must be soluble in buffers that are optically transparent.

3. Binding interactions of drugs are compared at the Tm, not at standard (25°C) or

physiological (37°C) temperatures.

4. TM values do not give specific information about the structure of the nucleic acid -

ligand complex nor about the kinetics of the nucleic acid/drug interaction.

Ligands that bind more tightly to double-stranded DNA than to the single-stranded state induce an increase in the melting temperature of the host duplex. Figure 3.4.1 shows the measured PDHD-induced changes in the thermal stability of the 10 bps DNA. Melting experiments were performed in sodium cacodylate buffer, pH 7.1, in the presence of 100 mM NaCl.

Figure 3.4.1. Thermal denaturation of 10 µM 10 bps DNA alone (leftmost curve) and in the presence of increasing concentrations of PDHDs 1 µM, 5 µM, 10 µM, 15 µM and 20 µM. Saturation (r) occurs when the ratio of PDHD I to DNA reaches 1.5; PDHD II to DNA reaches 2; PDHD rac to DNA reaches 1.

In the absence of enantiomers, the TM of the 10 bps DNA was measured to be 50.22 ± 0.3 °C. When the [PDHD]/[DNA] ratio was sufficient to saturate the DNA duplex, the TM increased to 61.7 ± 0.3, 63.49 ± 0.4 and 65.53 ± 0.7 °C with PDHD rac, PDHD I and PDHD II, respectively (Figure 3.4.2). Table 3.4.1 lists the resulting melting temperature shifts induced by the PDHD enantiomers.

Figure 3.4.2. First derivative melting curves for the melting transitions of 10 µM 10bps DNA in the absence and presence of saturated 20 µM amount of PDHD enantiomers.

Table 3.4.1. PDHD-induced DNA duplex thermal stabilization data (∆TM) for resolved enantiomers.

DNA      Enantiomers   ∆TM (°C) a
10 bps   DHD rac       11.48 ± 0.3
10 bps   DHD I         13.27 ± 0.4
10 bps   DHD II        15.31 ± 0.3

a ∆TM values were derived from UV melting profiles of 10 µM 10 bps DNA in the absence (Tm°) and presence of a saturating amount of PDHD (Tm). Each Tm value is an average derived from three experiments, with the indicated errors corresponding to the average deviation from the mean.
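The first-derivative approach used to locate TM (Figure 3.4.2) and the resulting ∆TM can be sketched in Python as follows. The melting curves here are synthetic sigmoids generated for illustration only; the curve shape, the transition width and the midpoints are assumptions, and the function names are hypothetical.

```python
import math


def melting_temperature(temps, absorbances):
    """Estimate Tm as the midpoint of the interval with the steepest dA/dT."""
    best_slope, tm = float("-inf"), None
    for i in range(len(temps) - 1):
        slope = (absorbances[i + 1] - absorbances[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope, tm = slope, 0.5 * (temps[i] + temps[i + 1])
    return tm


def synthetic_melting_curve(tm, width=3.0, low=0.80, high=1.00):
    """Sigmoidal A260 vs. temperature curve sampled every 0.5 °C (20-90 °C)."""
    temps = [20.0 + 0.5 * i for i in range(141)]
    absorb = [low + (high - low) / (1.0 + math.exp(-(t - tm) / width))
              for t in temps]
    return temps, absorb


# Illustrative midpoints for a free duplex and a ligand-saturated duplex.
t_free, a_free = synthetic_melting_curve(50.2)
t_bound, a_bound = synthetic_melting_curve(65.6)

tm_free = melting_temperature(t_free, a_free)
tm_bound = melting_temperature(t_bound, a_bound)
delta_tm = tm_bound - tm_free
print(f"Tm(free) = {tm_free:.2f} °C, Tm(bound) = {tm_bound:.2f} °C, "
      f"delta Tm = {delta_tm:.2f} °C")
```

The resolution of this estimate is limited by the temperature step; real data would normally be smoothed before differentiation.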

Both PDHD enantiomers bind to DNA and strongly stabilize the duplexes of small oligomers, as shown in Figures 3.4.1 and 3.4.2. PDHD II is substantially more effective in enhancing the duplex stability, with ∆Tm = 15.3 °C on a 2:1 molar basis. Inspection of the data obtained by UV titration and UV-melting analysis reveals that the apparent binding affinities of the PDHDs for 10 bps DNA and hs DNA follow the hierarchy: PDHD II > PDHD I > PDHD rac.

Figure 3.4.3. Differential UV melting curves of the duplex dodecamer 5’-GTTAGTATATGG-3’/3’-CAATCATATACC-5’. In the absence of PDHD, TM = 39.4 °C; with enantiomer PDHD I, TM = 61.2 °C; and with enantiomer PDHD II, TM = 68.1 °C. Complexes were melted at 1:1 stoichiometric ratios.

Notably, in some cases addition of PDHD to a duplex, in particular the 12-mer DNA 5’-GTTAGTATATGG-3’/3’-CAATCATATACC-5’, produced even more pronounced duplex stabilization (Figure 3.4.3). The greater ∆Tm of the 12-mer DNA compared with the 10 bps DNA is probably due to the greater PDHD binding specificity of this oligomer.

3.5 Circular Dichroism (CD) spectropolarimetry.

Intercalation of aromatic molecules into double-stranded DNA induces large chirality changes and consequently significantly affects the CD spectra.[79, 80] The CD changes are useful for determining the mobility and orientation of an intercalated compound in double-helical DNA. The uncomplexed 10 bps DNA displayed a positive CD signal at 280 nm and a negative CD signal at 250 nm, which are typical features of B-DNA.[81] Upon addition of racemic PDHD, the CD spectrum of the DNA undergoes significant changes: the positive band decreased gradually, whereas the negative signal increased to a positive value (Figure 3.5.1).

Figure 3.5.1. CD spectra of 10 bps DNA in the absence (black curve) and presence (colored) of PDHD racemate.

Such changes were likely to result from structural alterations induced by intercalation of

PDHD enantiomers into the double-helical structure. Classical intercalation enhances the base stacking and stabilizes helicity by modifying the intensities of both bands. In contrast, simple groove binding and the electrostatic interactions of small molecules show less or no perturbative effect on the base stacking and helicity bands.[82]

3.6 ΦX 174 plasmid DNA photocleavage assay. The ability of PDHD to cleave DNA upon irradiation was investigated using a supercoiled plasmid relaxation assay. The supercoiled form (RFI) of ΦX174 plasmid DNA was used in this study. The DNA cleavage efficiencies of each PDHD enantiomers and the racemate were determined using He-Cd CW laser light at 442 nm. Conversion of supercoiled ΦX174 DNA into relaxed circular and linear forms (RF II and RF III) by visible light irradiation was detected by

1% agarose gel electrophoresis. The relative amounts of the three DNA forms were determined using densitometric analysis of the gel electrophoresis bands at different irradiation times.

Irradiation of ΦX174 plasmid in the presence of PDHD enantiomers generates linear DNA before all of the supercoiled DNA is converted to the relaxed circular form (Figure 3.6.1, lanes 1–5, 10–14 and 18–21). Under conditions where all three DNA forms are present at the same time, one can gain more thorough insight into the nature of the cleavage using the statistical test of Povirk.[47] This test assumes a Poisson distribution of strand breaks and calculates the average number of ss-breaks (n1) and ds-breaks (n2) per DNA molecule (Table 3.6.1). In all cases of photochemical plasmid cleavage, the n1/n2 ratio decreases with irradiation time and all three forms of DNA (supercoiled (RF I), nicked circular (RF II), and linear (RF III)) are present at the same time, suggesting that ss and ds events are kinetically independent and that relaxed-to-linear conversion occurs at a much slower rate than the initial scission. However, at 30–180 s of irradiation the n1/n2 values are significantly smaller than the ratio expected for strictly random single-strand cleavage events: for ΦX174 (5386 base pairs), about 653 random single-strand cuts would be required for a single linearization to occur. This expectation can be calculated for typical experimental n1 values according to the Freifelder–Trumbo relation:

$$n_2 = \frac{n_1^{2}\,(2h + 1)}{4L} \qquad (7)$$

where h is the maximum number of unbroken base pairs between single-strand breaks in opposite strands that produces a linear form (h = 16), and L is the number of phosphodiester bonds per DNA strand (L = 5386 for ΦX174 plasmid DNA).

Following this assumption, the theoretical n1/n2 ratio, which from Eq. 7 equals 4L/[(2h + 1) n1], can be calculated:

if n1 = 0.5, n1/n2 ≈ 1306

if n1 = 1, n1/n2 ≈ 653

if n1 = 1.5, n1/n2 ≈ 435
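The arithmetic behind Eq. 7 can be checked with a short script. This is a minimal sketch, not part of the original analysis: the function names are hypothetical, and h = 16 and L = 5386 are the values quoted in the text. It reports both 1/n2 and n1/n2 for each n1, since the two quantities differ by a factor of n1 and coincide only at n1 = 1.

```python
# Freifelder-Trumbo estimate of double-strand breaks that arise from
# randomly placed, independent single-strand breaks (Eq. 7).

H = 16      # max. unbroken base pairs between nicks on opposite strands (from the text)
L = 5386    # phosphodiester bonds per strand of PhiX174 (from the text)


def random_ds_breaks(n1, h=H, length=L):
    """Expected ds-breaks per molecule: n2 = n1^2 (2h + 1) / (4L)."""
    return n1 ** 2 * (2 * h + 1) / (4 * length)


def random_ratio(n1, h=H, length=L):
    """n1/n2 expected for purely random single-strand cleavage."""
    return n1 / random_ds_breaks(n1, h, length)


if __name__ == "__main__":
    for n1 in (0.5, 1.0, 1.5):
        n2 = random_ds_breaks(n1)
        print(f"n1 = {n1:3.1f}  n2 = {n2:.5f}  "
              f"1/n2 = {1 / n2:7.1f}  n1/n2 = {random_ratio(n1):7.1f}")
```

Any experimental n1/n2 value far below these figures points to ds-breaks that are not produced by coincidental pairs of independent nicks.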

Comparing the n1/n2 ranges of the PDHD enantiomers, one can see that these values decrease with increasing irradiation time. PDHD I has noticeably smaller n1/n2 values, which indicates that it linearizes the supercoiled form of DNA more efficiently than PDHD II. These data indicate that double-strand cleavage is particularly favorable with the PDHD enantiomers and racemate. Thus, once released, the quinone seems to remain strongly associated with the DNA, and it can inflict multiple lesions in the DNA backbone in the region of its release.

Figure 3.6.1. Photochemical cleavage of supercoiled ΦX174 plasmid DNA (0.033 µg/µl) with PDHD enantiomers and racemic mixture (1 µM of each) in cacodylate buffer (10 mM sodium cacodylate, 100 mM NaCl, 0.5 mM EDTA, pH 7.1). Each sample was irradiated with 442 nm He/Cd CW laser light at 18 °C. Samples were then analyzed on a 1% agarose gel. Lanes 1–8 are DNA-PDHD I complex with an increasing irradiation time of 0–420 s; lanes 9–16 are DNA-PDHD II complex with an increasing irradiation time of 0–420 s; lanes 17–24 are DNA-PDHD rac with an increasing irradiation time of 0–420 s; lane 25 is DNA irradiated for 420 s, and lane 26 is DNA that was not irradiated.
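To put the concentrations in this caption on a per-plasmid scale, the following sketch converts the DNA mass concentration into molar base-pair and plasmid concentrations. It assumes an average mass of 650 g/mol per base pair, a common textbook approximation that is not stated in this chapter; the function and key names are hypothetical.

```python
AVG_BP_MASS = 650.0   # g/mol per base pair; textbook approximation (assumption)
PLASMID_BP = 5386     # PhiX174 length in base pairs (from the text)


def loading(dna_g_per_l, ligand_molar, plasmid_bp=PLASMID_BP, bp_mass=AVG_BP_MASS):
    """Return molar DNA concentrations and ligand-to-DNA ratios."""
    bp_molar = dna_g_per_l / bp_mass          # mol of base pairs per litre
    plasmid_molar = bp_molar / plasmid_bp     # mol of plasmid molecules per litre
    return {
        "bp_molar": bp_molar,
        "plasmid_molar": plasmid_molar,
        "ligand_per_plasmid": ligand_molar / plasmid_molar,
        "bp_per_ligand": bp_molar / ligand_molar,
    }


if __name__ == "__main__":
    # 0.033 ug/ul equals 0.033 g/l; PDHD at 1 uM
    for key, value in loading(0.033, 1e-6).items():
        print(f"{key}: {value:.3g}")
```

Under that assumption the mixture holds roughly 100 PDHD molecules per plasmid, or about one PDHD per 50 base pairs.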

Table 3.6.1. Statistical efficiency of single-strand n1 and double-strand n2 breaks caused by PDHD enantiomers and racemic mixture as a function of irradiation time in the presence of oxygen.

Irradiation    PDHD I                     PDHD II                    PDHD rac
time (s)       n1      n2      n1/n2      n1      n2      n1/n2      n1      n2      n1/n2
0              0.27    0.00    -          0.27    0.00    -          0.16    0.00    -
30             2.13    0.04    51.12      0.77    0.01    95.02      0.74    0.01    81.90
60             2.33    0.06    40.80      1.87    0.03    72.99      1.25    0.02    68.06
90             3.48    0.10    36.52      2.33    0.04    56.00      2.90    0.10    28.92
120            3.63    0.10    35.37      3.76    0.10    38.06      4.44    0.17    26.15
180            4.45    0.14    32.31      4.09    0.11    36.80      4.61    0.22    21.00
300            4.68    0.15    31.31      6.72    0.19    35.27      5.06    0.23    21.59
420            5.62    0.19    29.50      6.70    0.20    32.73      6.48    0.43    15.12

n1, ss-breaks per DNA molecule; n2, ds-breaks per DNA molecule.
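The n1 and n2 values in Table 3.6.1 come from the Povirk test applied to the gel densitometry. As a rough illustration of how such values can be extracted, the sketch below assumes the commonly quoted Poisson expressions f(III) = n2·exp(−n2) for the linear fraction and f(I) = exp[−(n1 + n2)] for the supercoiled fraction. These expressions, the function name `povirk_breaks` and the example fractions are assumptions for illustration and are not taken from this chapter.

```python
import math


def povirk_breaks(f_supercoiled, f_linear):
    """Return (n1, n2) from the fractions of form I and form III DNA.

    Assumes f_III = n2 * exp(-n2) and f_I = exp(-(n1 + n2)).
    """
    if not 0.0 < f_supercoiled <= 1.0:
        raise ValueError("supercoiled fraction must lie in (0, 1]")
    if not 0.0 <= f_linear < 1.0 / math.e:
        raise ValueError("linear fraction outside the range covered by this sketch")
    # n2 * exp(-n2) rises monotonically on [0, 1], so bisection on that
    # interval finds the physically relevant (small) root.
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(-mid) < f_linear:
            lo = mid
        else:
            hi = mid
    n2 = 0.5 * (lo + hi)
    n1 = -math.log(f_supercoiled) - n2
    return n1, n2


if __name__ == "__main__":
    # Hypothetical densitometry fractions, for illustration only
    n1, n2 = povirk_breaks(f_supercoiled=0.20, f_linear=0.05)
    print(f"n1 = {n1:.2f}, n2 = {n2:.2f}, n1/n2 = {n1 / n2:.1f}")
```

The nicked circular fraction does not enter explicitly because it is fixed by f(II) = 1 − f(I) − f(III).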

3.7 Effect of oxygen on ds cleavage efficiency

To investigate the role of molecular oxygen in the formation of reactive oxygen species (ROS) and their impact on the efficiency of DNA ds-cuts, we performed the same ΦX174 cleavage experiments under an inert atmosphere: oxygen was eliminated from the reaction vessel by bubbling pure argon gas through the PDHD-DNA mixture, which was then irradiated with He/Cd laser light at 442 nm. The resulting agarose gel electrophoresis analysis is shown in Figure 3.7.1. The n1/n2 values for all PDHDs were surprisingly low (Table 3.7.1) in comparison with the previous aerobic data (Table 3.6.1). PDHD II showed much better DNA ds-break efficiency: its n1/n2 range is 5.2–29.4, with less than 10% of supercoiled DNA remaining after 180 s of irradiation.

Comparable n1/n2 values have been observed for iron bleomycin and lysine-enediyne conjugates.[34, 49] In the previous experiments in the presence of molecular oxygen, however, the opposite was observed: PDHD II was not as efficient at breaking ds-DNA as PDHD I and PDHD rac.

One possible explanation is that, in the presence of oxygen, the PDHDs may generate ROS such as hydroxyl radicals, which react with the sugar backbone and readily cleave DNA strands. Thus, if molecular oxygen (O2) is limited in the reaction mixture, the production of ROS is also limited or eliminated, and cleavage of the DNA depends solely on the photochemistry of the released pyrene quinone, which should produce more localized ds cuts than extremely mobile ROS such as hydroxyl radicals.

Figure 3.7.1. Photochemical cleavage of supercoiled ΦX174 plasmid DNA (0.033 µg/µl) with PDHD enantiomers and racemic mixture (1 µM of each) in cacodylate buffer (10 mM sodium cacodylate, 100 mM NaCl, 0.5 mM EDTA, pH 7.1) in the absence of oxygen. Each sample was irradiated with 442 nm He/Cd CW laser light while the temperature was maintained at 18 °C; samples were then analyzed on a 1% agarose gel. Lanes 1–8 are DNA-PDHD I complex with an increasing irradiation time from 0–420 s; lanes 9–16 are DNA-PDHD II complex with an increasing irradiation time from 0–420 s; lanes 17–24 are DNA-PDHD rac with an increasing irradiation time from 0–420 s; lane 25 is DNA alone irradiated for 420 s, and lane 26 is DNA alone with no irradiation.


Table 3.7.1. Statistical efficiency of single-strand and double-strand break formations without oxygen by PDHD enantiomers and the racemic mixture as a function of irradiation time.

Irradiation    PDHD I (no oxygen)         PDHD II (no oxygen)        PDHD rac (no oxygen)
time (s)       n1      n2      n1/n2      n1      n2      n1/n2      n1      n2      n1/n2
0              0.29    0.00    -          0.28    0.00    -          0.25    0.00    -
30             0.60    0.01    45.50      0.45    0.02    29.44      0.52    0.01    39.71
60             0.99    0.03    34.47      0.88    0.03    27.64      0.94    0.03    30.29
90             1.34    0.05    26.65      1.16    0.05    22.00      1.29    0.05    24.07
120            1.51    0.10    15.89      1.70    0.11    15.33      1.68    0.08    20.40
180            1.97    0.23    8.41       2.12    0.23    9.03       1.92    0.14    14.08
300            2.66    0.33    7.99       2.40    0.41    5.89       2.29    0.24    9.49
420            3.46    0.45    7.71       2.94    0.56    5.23       2.82    0.40    7.04

n1, ss-breaks per DNA molecule; n2, ds-breaks per DNA molecule.

In summary, the experimentally determined n1/n2 ratio for the PDHDs in the absence of oxygen is 5.2–45.5, which is dramatically lower than the value of ~653 expected for random cleavage events. These data indicate that photoactivated PDHDs inflict highly non-random DNA cleavage. This could be due to several mechanisms. The DNA-cleaving species could be catalytically renewed, or the PDHDs may preferentially bind to a specific DNA region where damage could occur more extensively. It is also possible that intercalated PDHD undergoes charge transfer, causing damage at guanine-rich regions.

3.8 DNA specific damage by PDHD enantiomers.

Sequence selectivity of the DNA base damage induced by PDHDs was investigated using 22-mer and 38-mer oligodeoxyribonucleotide duplexes. Each duplex was 32P-5’-end-labeled and incubated with PDHD enantiomers in 10 mM sodium cacodylate buffer (pH 7.0) at 37 °C for 1 hour. Aliquots were exposed to 442 nm radiation from the He/Cd laser system as described for the plasmid photocleavage assay. The damaged nucleotide sites were detected as cleavage bands, generated with or without hot piperidine treatment, and analyzed by 20% denaturing polyacrylamide gel electrophoresis (PAGE). The resulting autoradiograms of double-stranded DNA fragments treated with PDHD isomers plus 442 nm light are shown in Figure 3.8.1, left panel.

Treatment of the 22-mer DNA with PDHD enantiomers followed by irradiation and piperidine treatment afforded strong cleavage bands at all DNA base pairs with no specificity, as shown in Figure 3.8.1, lanes 2, 4 and 6. The intensities of the cleaved bases differ for PDHD II, PDHD I and PDHD rac; the irradiated PDHD II/DNA complex exhibited more significant DNA base damage than did the irradiated PDHD I/DNA and PDHD rac/DNA complexes. Since piperidine breaks the DNA at a sugar with a modified base or a sugar without a base,[83] it is suggested that base alteration and/or liberation occurred upon exposure to 442 nm light in the presence of PDHDs.

Interestingly, simple incubation of PDHDs with double-stranded DNA followed by piperidine treatment leads to purine (A and G) specific base damage (Figure 3.8.1, right panel). This suggests that purine bases are modified upon treatment with PDHD even when no light is applied. The damaged bases migrate in the same way as the products of the GA-specific Maxam-Gilbert reaction, in which DNA was methylated using dimethylsulfate prior to addition of 1 M piperidine at 94 °C.


Figure 3.8.1. Autoradiography of a 20% denaturing polyacrylamide gel of 5’-end 32P-ATP-labeled 22-mer DNA fragments. Conditions for these cleavage reactions are aerobic and are described in Materials and Methods. On the left: lanes 1, 3, and 5 are PDHD II:DNA, PDHD I:DNA, and PDHD rac:DNA complexes (50 µM PDHDs : 2.5 µM DNA in 10 mM CB buffer, pH = 7.0), respectively, irradiated for 10 min with no piperidine treatment; lanes 2, 4, and 6 are PDHD II:DNA, PDHD I:DNA, and PDHD rac:DNA complexes (50 µM PDHDs : 2.5 µM DNA in 10 mM CB buffer, pH = 7.0), respectively, irradiated for 10 min; lane DNA is the 22-mer DNA control in the absence of PDHDs after 10 min irradiation; lane Gs is the Maxam-Gilbert sequencing reaction specific for guanines.[84] On the right: 22-mer DNA was incubated with PDHD enantiomers without exposure to light. Lanes PDHD I, PDHD II and PDHD rac are the enantiomer complexes with the DNA; GA rxn is the sequencing reaction specific for purines.[84]


To observe the direct strand cleavage of the 22-mer and 38-mer DNA, the hot piperidine step was omitted (Figure 3.8.1, lanes 1, 3 and 5, and Figure 3.9.1, lanes 3 and 5). As can be seen, all PDHD enantiomers as well as the racemic mixture cleave DNA at guanine-specific positions. Comparison of the 5 min and 10 min irradiation doses in Figure 3.9.1 suggests that longer irradiation produces more cleavage at G positions. Additionally, the intensities of the cleaved Gs differ: nucleotides G9 and G10 are the most accessible to PDHD-induced cleavage upon irradiation, while nucleotides G13, G18, G21 and G25 show the same, lower level of cleavage (Figure 3.9.1, lanes 4 and 5). The cleaved DNA fragments comigrate with those obtained by the Maxam-Gilbert purine-specific sequencing reaction, indicating that the products have approximately the same electrophoretic mobility.

One possible explanation of the DNA cleavage is the generation of singlet oxygen by the PDHD enantiomers. Singlet oxygen is capable of cleaving DNA strands and is generated by photonucleases such as anthraquinone and its derivatives.[85] Another possibility is the formation of superoxide. Superoxide by itself is not responsible for DNA cleavage, but two molecules of superoxide can disproportionate to form ground-state oxygen and hydrogen peroxide. In the well-known Fenton reaction, hydrogen peroxide can undergo a one-electron reduction by trace metals in solution to yield a hydroxyl radical and hydroxide. Being highly reactive, the hydroxyl radical has been shown to cleave DNA effectively.[86]

3.9 Binding specificity by DNase I footprinting assay

The PDHD binding specificity was examined using DNA cleavage-protection mapping, better known as the footprinting assay. This technique has gained widespread popularity in pharmacology as a simple and reliable method for identifying binding sites for small molecules that interact with nucleic acids.[87, 88] The results of a typical DNase I footprinting experiment using the 32P-5’-end-labeled 38-mer synthetic DNA are shown in Figure 3.9.1. The cleavage reaction was performed in the absence and presence of increasing concentrations of PDHD rac. The products of a Maxam–Gilbert sequencing reaction on the same radiolabeled DNA molecules were separated by electrophoresis in parallel to allow identification of the sequences protected from cleavage by the nuclease. A region corresponding to G9 and G10 that is protected from digestion by the enzyme is clearly visible on the gel. At the higher PDHD concentration of 10 µM (about 1 PDHD per DNA base pair), the DNA seems to change its 3D conformation, allowing thymines T15 and T16 to be digested by DNase I (Figure 3.9.1, lane 9). This structural change of DNA upon intercalative binding was confirmed by circular dichroism spectropolarimetry (see above). In the presence of 100 µM PDHD, the DNA became fully protected and no cleavage pattern was observed. Dose-dependent enhancements of cleavage rates relative to the PDHD-free control lane occur at all base pairs, even at the lower PDHD concentrations. The inhibition of cleavage is particularly strong at the 5’-GG site. As expected, this high-affinity site contains a GG step, to which the drug presumably binds tightly. The footprint extends over 3 base pairs, suggesting the presence of overlapping binding sites.


Figure 3.9.1. Combined autoradiography data for the specificity of PDHD rac toward double-stranded 38-mer DNA. Image of a 15% denaturing polyacrylamide gel. Lanes: 1, DNA ladder (Maxam-Gilbert sequence cleavage at purine bases); 2, 38-mer DNA treated with DNase I; 3, control DNA irradiated for 10 min; 4, PDHD/DNA at a 3 to 1 ratio irradiated for 5 min; 5, PDHD/DNA at a 3 to 1 ratio irradiated for 10 min. Lanes 6, 7, 8, 9 and 10 correspond to the DNase I cleavage assay in the presence of different PDHD rac concentrations.

3.10 Proposed cleavage mechanism

Para-quinones are known to damage DNA (23, 25, 27, 46, 66). Ortho-quinones, although they have not been studied extensively in this capacity, are significantly more reactive than their para-substituted isomers. Since ortho-quinones are extremely reactive, their effective delivery to the DNA target site requires a masking group, and the quinone should be released only upon delivery to the targeted DNA site. The relatively unreactive PDHDs fulfill these criteria well, since they can be photochemically triggered to release highly reactive ortho-quinones on demand.

As shown by the UV-Vis titration, UV-melting and CD experiments, the PDHDs bind to duplex DNA by intercalation of the PDHD moiety into the π-stack of the nucleotides and possibly by electrostatic interaction between the positively charged pyridinium rings and the negatively charged DNA phosphate backbones. Such interactions hold the PDHDs at a designated site on the DNA. Upon irradiation with visible light (442 nm), the PDHDs are photolyzed and the parent ortho-quinone is likely to be released (Scheme 3.10.1, upper panel). The calculated quantum yields for quinone photorelease from PDHD rac upon irradiation with visible light are shown in Table 3.10.1.

Table 3.10.1 Quantum yield of quinone photorelease.

PDHD irradiation at 442 nm          Quantum yield (Φ)
PDHD rac in presence of oxygen      1.703 × 10⁻⁴
PDHD rac in absence of oxygen       1.968 × 10⁻⁴
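To give these quantum yields a physical scale, the sketch below converts them into the average number of absorbed 442 nm photons per released quinone and the corresponding absorbed energy. It assumes the usual definition of quantum yield (release events per absorbed photon); the function names are hypothetical and the measurement procedure itself is not reproduced here.

```python
PLANCK = 6.62607015e-34      # J s
LIGHT_SPEED = 2.99792458e8   # m/s


def photon_energy_joule(wavelength_nm):
    """Energy of a single photon, E = h c / lambda."""
    return PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)


def photons_per_release(quantum_yield):
    """Average absorbed photons per release event (1 / quantum yield)."""
    return 1.0 / quantum_yield


def energy_per_release_joule(quantum_yield, wavelength_nm=442.0):
    """Average absorbed energy per release event."""
    return photons_per_release(quantum_yield) * photon_energy_joule(wavelength_nm)


if __name__ == "__main__":
    for label, phi in (("with oxygen", 1.703e-4), ("without oxygen", 1.968e-4)):
        print(f"{label}: {photons_per_release(phi):.0f} photons, "
              f"{energy_per_release_joule(phi):.2e} J per quinone released")
```

On this reading, several thousand photons are absorbed on average for each quinone released, with a slightly smaller number in the absence of oxygen.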

Upon absorption of a second photon, the quinone is expected to induce DNA damage at the site of its release. The proposed mechanism for this reaction is summarized in Scheme 3.10.1, lower panel. The catalytic cycle of quinone reduction and oxidation reported by Bolton (66) can be translated to ortho-quinones. Hydrogen abstraction by the excited quinone would yield hydroquinone as an intermediate, and electron transfer would result in the formation of a semiquinone radical anion. Either the hydroquinone or the semiquinone radical anion may be re-oxidized to the quinone by molecular oxygen, O2. This step can give rise to reactive oxygen species (ROS) such as peroxides or hydroxyl radicals, which cause irreversible DNA damage.

Scheme 3.10.1. Upper panel: proposed events of PDHD ortho-quinone release under 442 nm light (hν, 442 nm; Φoxy = 1.703 × 10⁻⁴, Φno = 1.968 × 10⁻⁴). Lower panel: proposed DNA damage caused by the excited ortho-quinone.


Alternative mechanism

Binding by intercalation may be suitable for electron transfer from a guanine base to the excited ortho-quinone. The oxidative degradation at the 5’ site of the 5’-GG-3’ sequence may be explained as shown in Scheme 3.10.2. The radical ion pair resulting from forward electron transfer from a single guanine to the photoexcited ortho-quinone intermediate easily reverts to the ground state by back electron transfer within the complex (1st step). However, with consecutive guanine residues, another electron transfer occurs from the adjacent guanine to the guanine interacting with the ortho-quinone, resulting in a much lower rate of back electron transfer and subsequently in highly efficient damage of the 5’-located guanine (2nd step). In this mechanism, electron transfer from the 5’ guanine to the 3’ guanine is expected to occur more readily than that from the 3’ guanine to the 5’ guanine.

Although we have no direct evidence that would support this electron-transfer mechanism, Sugiyama et al. have proposed that an intramolecular electron transfer from a purine at the 5’-side to an adjacent uracilyl-5-yl radical may occur in a specially oriented complex formed in the duplex.[89] However, there remains a possibility that the specific cleavage is a consequence of more selective intercalation of PDHD at the 5’-GG-3’ base pair step, compared with a single guanine residue. This may be supported by the report that the interaction of lumiflavin with poly(dG)-poly(dC) was the strongest among the polynucleotides investigated (e.g. poly(dG), poly(dC), poly[d(A-T)]/poly[d(A-T)] DNA).[90] The guanine cation radical formed could react with molecular oxygen and water molecules to form various oxidized products (Scheme 3.10.2, steps 3 and 4), including 8-hydroxydeoxyguanosine, as has been shown for riboflavin.[91] It has been suggested from experiments with ionizing radiation that guanine cation radicals react with water to produce 8-hydroxydeoxyguanosine.[92]

Scheme 3.10.2. Alternative mechanism of double stranded DNA damage by PDHD.


3.11 Conclusion

In this work, the binding properties of PDHD to herring sperm DNA as well as to short synthetic oligomers have been studied by spectroscopic methods. Both PDHD enantiomers bind to double-helical DNA with high affinity. Remarkable hypochromic and bathochromic effects of the enantiomers in the presence of increasing amounts of DNA have been observed in the absorption spectra. PDHD II has a slightly larger binding constant (Kb = 2.3 ± 0.8 × 10⁵ M⁻¹) than PDHD I (Kb = 1.6 ± 0.15 × 10⁵ M⁻¹). Upon addition of DNA to PDHD rac, the CD spectra change dramatically: these results, together with the UV titration experiments, reveal that PDHDs intercalate into the DNA double helix. The ability of PDHDs to stabilize DNA was confirmed by UV-melting studies. PDHD II raises the melting temperature of the 10 bp DNA by 15 °C and, more surprisingly, it provides a ∆Tm of almost 29 °C for the 12-mer DNA. This type of DNA binding is unique in that it implements both the hydrophobic bonding characteristic of many aromatic hydrocarbons and ionic bonding. It reduces the ionic repulsion of the negatively charged phosphate backbones, inhibiting the separation of the two duplex chains.

Studies conducted to determine whether PDHDs have any base sequence preference showed that PDHDs are highly selective for guanine residues. At this point, it can be stated that this type of duplex stabilizer, or “clamp”, should find utility in stabilizing many types of DNA or RNA nanostructures that are currently under active investigation. In addition, this type of duplex stabilizer is also photochemically active in the oxidation of ds-DNA, as can be concluded from our ΦX174 plasmid DNA photocleavage assay and the DNA base damage experiments.


3.12 References

1. Watson, J.D. and F.H. Crick, Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature, 1953. 171(4356): p. 737-8.
2. Miralles, F., Compositional Properties and Thermal Adaptation of SRP-RNA in Bacteria and Archaea. J Mol Evol, 2010.
3. Williams, T.T., et al., Effects of the photooxidant on DNA-mediated charge transport. J Am Chem Soc, 2004. 126(26): p. 8148-58.
4. Augustyn, K.E., E.D. Stemp, and J.K. Barton, Charge separation in a ruthenium-quencher conjugate bound to DNA. Inorg Chem, 2007. 46(22): p. 9337-50.
5. Kurbanyan, K., et al., DNA-protein cross-linking via guanine oxidation: dependence upon protein and photosensitizer. Biochemistry, 2003. 42(34): p. 10269-81.
6. Marini, N.J., et al., DNA binding hairpin polyamides with antifungal activity. Chem Biol, 2003. 10(7): p. 635-44.
7. Farkas, M.E., et al., DNA sequence selectivity of hairpin polyamide turn units. Bioorg Med Chem Lett, 2009. 19(14): p. 3919-23.
8. Qiao, C., et al., Study of interactions of anthraquinones with DNA using ethidium bromide as a fluorescence probe. Spectrochim Acta A Mol Biomol Spectrosc, 2008. 70(1): p. 136-43.
9. Yan, M., et al., Nephrotoxicity study of total rhubarb anthraquinones on Sprague Dawley rats using DNA microarrays. J Ethnopharmacol, 2006. 107(2): p. 308-11.
10. Laffon, B., et al., The organic selenium compound selenomethionine modulates bleomycin-induced DNA damage and repair in human leukocytes. Biol Trace Elem Res, 2010. 133(1): p. 12-9.
11. Fung, H. and B. Demple, Distinct roles of Ape1 protein in the repair of DNA damage induced by ionizing radiation or bleomycin. J Biol Chem, 2011. 286(7): p. 4968-77.
12. Jones, L.H., et al., Conversion of enediynes into quinones by catalysis and in aqueous buffers: implications for an alternative enediyne therapeutic mechanism. J Am Chem Soc, 2001. 123(15): p. 3607-8.
13. Armitage, B., Photocleavage of Nucleic Acids. Chem Rev, 1998. 98(3): p. 1171-1200.
14. Kan, Y., B. Armitage, and G.B. Schuster, Selective stabilization of triplex DNA by anthraquinone derivatives. Biochemistry, 1997. 36(6): p. 1461-6.
15. Searle, M.S., et al., NMR studies of the interaction of the nogalamycin with the hexadeoxyribonucleotide duplex d(5'-GCATGC)2. Biochemistry, 1988. 27(12): p. 4340-9.
16. Searle, M.S., et al., Anthracycline antibiotic arugomycin binds in both grooves of the DNA helix simultaneously: an NMR and molecular modelling study. Nucleic Acids Res, 1991. 19(11): p. 2897-906.
17. Barcelo, F., D. Capo, and J. Portugal, Thermodynamic characterization of the multivalent binding of chartreusin to DNA. Nucleic Acids Res, 2002. 30(20): p. 4567-73.
18. Hou, M.H., et al., Crystal structure of actinomycin D bound to the CTG triplet repeat sequences linked to neurological diseases. Nucleic Acids Res, 2002. 30(22): p. 4910-7.
19. Leng, F., J.B. Chaires, and M.J. Waring, Energetics of echinomycin binding to DNA. Nucleic Acids Res, 2003. 31(21): p. 6191-7.

20. John N. Lisgarten, M.C., Jose Portugal, Colin W. Wright and Juan Aymami, The antimalarial and cytotoxic drug cryptolepine intercalates into DNA at cytosine-cytosine sites. Nature Structural Biology, 2002. 9(1): p. 57-60.
21. Goodisman, J., et al., Site-specific binding constants for actinomycin D on DNA determined from footprinting studies. Biochemistry, 1992. 31(4): p. 1046-58.
22. Rodriguez, M. and A.J. Bard, Electrochemical studies of the interaction of metal chelates with DNA. 4. Voltammetric and electrogenerated chemiluminescent studies of the interaction of tris(2,2'-bipyridine)osmium(II) with DNA. Anal Chem, 1990. 62(24): p. 2658-62.
23. Hendry, L.B., et al., intercalation with double stranded DNA: implications for normal gene regulation and for predicting the biological efficacy and genotoxicity of drugs and other chemicals. Mutat Res, 2007. 623(1-2): p. 53-71.
24. Shahabuddin, M.S., et al., Intercalative pyrimido[4',5':4,5]thieno(2,3-b)quinolines induce apoptosis in leukemic cells: a comparative study of methoxy and morpholino substitution. Invest New Drugs, 2011. 29(5): p. 873-82.
25. Ferguson, L.R. and W.A. Denny, Genotoxicity of non-covalent interactions: DNA intercalators. Mutat Res, 2007. 623(1-2): p. 14-23.
26. Patel, M.N., et al., DNA cleavage, binding and intercalation studies of drug-based oxovanadium(IV) complexes. J Enzyme Inhib Med Chem, 2009. 24(3): p. 715-21.
27. Barone, G., et al., Intercalation of daunomycin into stacked DNA base pairs. DFT study of an anticancer drug. J Biomol Struct Dyn, 2008. 26(1): p. 115-30.
28. Li, H.H., J. Aubrecht, and A.J. Fornace, Jr., Toxicogenomics: overview and potential applications for the study of non-covalent DNA interacting chemicals. Mutat Res, 2007. 623(1-2): p. 98-108.
29. Chowdhury, N., et al., N,O-diacyl-4-benzoyl-N-phenylhydroxylamines as photoinduced DNA cleaving agents. Bioorg Med Chem Lett, 2010. 20(18): p. 5414-7.
30. van der Steen, S., et al., Novel heteronuclear ruthenium-copper coordination compounds as efficient DNA-cleaving agents. Chem Commun (Camb), 2010. 46(20): p. 3568-70.
31. Fan, G.J., et al., Hydroxycinnamic acids as DNA-cleaving agents in the presence of Cu(II) ions: mechanism, structure-activity relationship, and biological implications. Chemistry, 2009. 15(46): p. 12889-99.
32. Ozalp-Yaman, S., et al., Platinated copper(3-clip-phen) complexes as effective DNA-cleaving and cytotoxic agents. Chemistry, 2008. 14(11): p. 3418-26.
33. Toshima, K., et al., Molecular design and evaluation of quinoxaline-carbohydrate hybrids as novel and efficient photo-induced GG-selective DNA cleaving agents. Chem Commun (Camb), 2002(3): p. 212-3.
34. Natrajan, A.H., S. M, Molecular Aspects of Anticancer Drug-DNA Interactions. CRC Press: Boca Raton, FL, 1994. 2(4): p. 197.
35. Bendinskas, K.G., et al., Sequence-specific photomodification of DNA by an oligonucleotide-phenanthrodihydrodioxin conjugate. Bioconjug Chem, 1998. 9(5): p. 555-63.
36. Hamel, E., Antimitotic natural products and their interactions with tubulin. Med Res Rev, 1996. 16(2): p. 207-31.
37. Wood, K.W., W.D. Cornwell, and J.R. Jackson, Past and future of the mitotic spindle as an oncology target. Curr Opin Pharmacol, 2001. 1(4): p. 370-7.

38. Kuppens, I.E., Current state of the art of new tubulin inhibitors in the clinic. Curr Clin Pharmacol, 2006. 1(1): p. 57-70.
39. Jordan, M.A. and L. Wilson, Microtubules as a target for anticancer drugs. Nat Rev Cancer, 2004. 4(4): p. 253-65.
40. Maxam, A.M. and W. Gilbert, 1977. 74: p. 560.
41. Zhong, W.Y., et al., [Interaction between balofloxacin and DNA and the influence of Mg2+ on the interaction]. Yao Xue Xue Bao, 2005. 40(7): p. 663-7.
42. Samuels, D.S. and C.F. Garon, inhibits growth and induces relaxation of supercoiled plasmids in Borrelia burgdorferi, the Lyme disease agent. Antimicrob Agents Chemother, 1993. 37(1): p. 46-50.
43. Madhavaiah, C. and S. Verma, Plasmid relaxation induced by copper metalated diglycine conjugates under heterogeneous reaction conditions. Bioorg Med Chem Lett, 2003. 13(5): p. 923-6.
44. Sengupta, S., et al., ATP independent type IB topoisomerase of Leishmania donovani is stimulated by ATP: an insight into the functional mechanism. Nucleic Acids Res, 2011. 39(8): p. 3295-309.
45. Detmer, I.C., F.V. Pamatong, and J.R. Bocarsly, Effects in Metal Complex Mediated Double-Strand Cleavage of DNA: Reactivity and Binding Studies with Model Substrates. Inorg Chem, 1997. 36(17): p. 3676-3682.
46. Freifelder, D., Single-strand breaks in bacterial DNA associated with thymine starvation. J Mol Biol, 1969. 45(1): p. 1-7.
47. Povirk, L.F., et al., DNA double-strand breaks and alkali-labile bonds produced by bleomycin. Nucleic Acids Res, 1977. 4(10): p. 3573-80.
48. Yang, W.Y., et al., C-lysine conjugates: pH-controlled light-activated reagents for efficient double-stranded DNA cleavage with implications for cancer therapy. J Am Chem Soc, 2009. 131(32): p. 11458-70.
49. Alabugin, S.V.K.a.I.V., Lysine–enediyne conjugates as photochemically triggered DNA doublestrand cleavage agents. Chemical Communications, 2005: p. 1444-1446.
50. Maniatis, T.F., E. F.; Sambrook, J. Molecular Cloning. Cold Spring Harbor Laboratory, 1982.
51. Simon, M., H.C. Chang, and M. Laskowski, Sr., Action of pancreatic deoxyribonuclease I on crab d(A-T) polymer. Biochim Biophys Acta, 1971. 232(3): p. 462-71.
52. Cardew, A.S. and K.R. Fox, DNase I footprinting. Methods Mol Biol, 2010. 613: p. 153-72.
53. Lomonossoff, G.P., P.J. Butler, and A. Klug, Sequence-dependent variation in the conformation of DNA. J Mol Biol, 1981. 149(4): p. 745-60.
54. Fox, K.R. and M.J. Waring, DNA structural variations produced by actinomycin and distamycin as revealed by DNAase I footprinting. Nucleic Acids Res, 1984. 12(24): p. 9271-85.
55. Hogan, M.E., M.W. Roberson, and R.H. Austin, DNA flexibility variation may dominate DNase I cleavage. Proc Natl Acad Sci U S A, 1989. 86(23): p. 9273-7.
56. Lane, M.J., J.C. Dabrowiak, and J.N. Vournakis, Sequence specificity of actinomycin D and Netropsin binding to pBR322 DNA analyzed by protection from DNase I. Proc Natl Acad Sci U S A, 1983. 80(11): p. 3260-4.
57. Cons, B.M. and K.R. Fox, Footprinting studies of sequence recognition by mithramycin. Anticancer Drug Des, 1990. 5(1): p. 93-7.

58. Bailly, C., et al., Recognition elements that determine affinity and sequence-specific binding to DNA of 2QN, a biosynthetic bis-quinoline analogue of echinomycin. Anticancer Drug Des, 1999. 14(3): p. 291-303.
59. Low, C.M., H.R. Drew, and M.J. Waring, Sequence-specific binding of echinomycin to DNA: evidence for conformational changes affecting flanking sequences. Nucleic Acids Res, 1984. 12(12): p. 4865-79.
60. Chaires, J.B., et al., Site and sequence specificity of the daunomycin-DNA interaction. Biochemistry, 1987. 26(25): p. 8227-36.
61. Phillips, D.J., Journal of Chemical Education, 1971. 48(3): p. 198-200.
62. Schonberg, A. and M. Ahmed, Photochemical reactions in sunlight; reactions with phenanthraquinone, 9-arylxanthens, and diphenyl triketone. J Chem Soc, 1947: p. 997-1000.
63. Mack, E.T., et al., Thermal and photochemistry of a pyrene dihydrodioxin (PDHD) and its radical cation: a photoactivated masking group for ortho-quinones. J Am Chem Soc, 2004. 126(47): p. 15324-5.
64. Mack, E.T., et al., DNA photocleavage and biological activity of a pyrene dihydrodioxin. Bioorg Med Chem Lett, 2005. 15(8): p. 2173-6.
65. Wilson, R.M.S., K. A.; Harsch, A.; Keller, S.; Schlemm, D. J., U.S. Patent 6,018,058, January 25, 2000.
66. Spiegel, K. and A. Magistrato, Modeling anticancer drug-DNA interactions via mixed QM/MM simulations. Org Biomol Chem, 2006. 4(13): p. 2507-17.
67. Spackova, N., et al., Molecular dynamics simulations and thermodynamics analysis of DNA-drug complexes. Minor groove binding between 4',6-diamidino-2-phenylindole and DNA duplexes in solution. J Am Chem Soc, 2003. 125(7): p. 1759-69.
68. Harris, S.A., et al., Cooperativity in drug-DNA recognition: a molecular dynamics study. J Am Chem Soc, 2001. 123(50): p. 12658-63.
69. E.C. Long, J.K.B., On demonstrating DNA intercalation. Acc. Chem. Res., 1990. 23: p. 271-273.
70. A.M. Pyle, J.P.R., R. Meshoyrer, C.V. Kumar, N.J. Turro, J.K. Barton, Mixed-ligand complexes of ruthenium(II): factors governing binding to DNA. J. Am. Chem. Soc., 1989. 111: p. 3051-3058.
71. A. Wolfe, G.H.S., T. Meehan, Polycyclic aromatic hydrocarbons physically intercalate into duplex regions of denatured DNA. Biochemistry, 1987. 26: p. 6392-6396.
72. Neto, B.A. and A.A. Lapis, Recent developments in the chemistry of deoxyribonucleic acid (DNA) intercalators: principles, design, synthesis, applications and trends. Molecules, 2009. 14(5): p. 1725-46.
73. Sinha, R., et al., The binding of DNA intercalating and non-intercalating compounds to A-form and protonated form of poly(rC).poly(rG): spectroscopic and viscometric study. Bioorg Med Chem, 2006. 14(3): p. 800-14.
74. Chaires, J.B., Energetics of drug-DNA interactions. Biopolymers, 1997. 44(3): p. 201-15.
75. Chan Yong Lee, H.-W.R., and Thong-Sung Ko, Binding Features of Ethidium Bromide and Their Effects on Nuclease Susceptibility of Calf Thymus DNA in Presence of Spermine. Bull. Korean Chem. Soc., 2001. 22(1).
76. Schroeder, S.J. and D.H. Turner, Optical melting measurements of nucleic acid thermodynamics. Methods Enzymol, 2009. 468: p. 371-87.

77. Wilson, W.D., et al., Evaluation of drug-nucleic acid interactions by thermal melting curves. Methods Mol Biol, 1997. 90: p. 219-40. 78. Crothers, D.M., Statistical thermodynamics of nucleic acid melting transitions with coupled binding equilibria. Biopolymers, 1971. 10(11): p. 2147-60. 79. Fasman G, D., Circular dichroism and conformational analysis of biomolecules. New York: Plenum Press, 1996: p. 413-432. 80. Kapuscinski J, D.Z., Interactions of acridine orange with double stranded nucleic acids. Spectral and affinity studies. J Biomol Struct Dyn 1987. 5: p. 127-142. 81. Mauffret O, M.M., Lanson M, Armier J, Fermandjian S., Conformational variations in d(TGACGTCA) and its reverse sequence d(ACTGCAGT): a joint circular dichroism and nuclear magnetic resonance study. Biochem Biophys Res Commun, 1989. 165: p. 602- 606. 82. A, R., Circular and linear dichroism of drug-DNA systems. Methods Mol Biol., 2010. 613: p. 37-54. 83. Maxam, A.M. and W. Gilbert, Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol, 1980. 65(1): p. 499-560. 84. Maxam AM, G.W., A new method for sequencing DNA. Proc Natl Acad Sci U S A, 1977. 74(2): p. 560-564. 85. Bruce Armitage, C.Y., Chelladurai Devadoss, Gary B. Schuster, Cationic Anthraquinone Derivatives as Catalytic DNA Photonucleases: Mechanisms for DNA Damage and Quinone Recycling. J. Am. Chem. SOC, 1994. 116: p. 9847-9859. 86. Gwen E. Shafer, M.A.P., Thomas D. Tullius, Use of the hydroxyl radical and gel electrophoresis to study DNA structure. ELECTROPHORESIS, 1989. 10(5-6): p. 397- 404. 87. Fox, K.R., DNase I footprinting. In Drug–DNA Interaction Protocols. Methods in Molecular Biology. Humana Press, Totowa, NJ,, 1997. 90: p. 1-22. 88. Fox, K.R.a.W., M. J., High-resolution footprinting studies of drug–DNA complexes using chemical and enzymic probes. Methods Enzymology, 2001. 340: p. 412–430. 89. 
Hiroshi Sugiyama, Y.T., Isao Saito, Highly sequence-selective photoreaction of 5- bromouracil-containing deoxyhexanucleotides. J. Am. Chem. Soc., 1990. 112(18): p. 6720-6721. 90. Kuratomi, K. and Y. Kobayashi, Studies on the interactions between DNA and flavins. Biochim Biophys Acta, 1977. 476(3): p. 207-17. 91. Ito, K., et al., 8-Hydroxydeoxyguanosine formation at the 5' site of 5'-GG-3' sequences in double-stranded DNA by UV radiation with riboflavin. J Biol Chem, 1993. 268(18): p. 13221-7. 92. P, C.J.a.V., in Bioorpanic Photochemistry (Morrison, H. ed.). John Wiley &-Sons, New York, 1990. 1: p. 1-172.


PART II. THE ROLE OF NON-WATSON-CRICK BASE PAIRS IN STABILIZING

RIBOSOMAL RNA SARCIN-RICIN MOTIF.

CHAPTER IV. SIGNIFICANCE OF RNA IN BIOLOGY

4.1 Distinctive features of non-coding RNAs

Since the discovery of DNA function, scientists have learned that RNA does much more than simply play a role in protein synthesis. The expectation that more complex organisms would have a greater number of genes has turned out to be wrong. It is now established that humans and mice have approximately the same number of protein-coding genes as the microscopic roundworm Caenorhabditis elegans.[1] Two recent and unexpected findings explain this paradox: first, biological complexity generally correlates with the proportion of the genome that is non-protein-coding;[1] and second, while only 2% of the mammalian genome encodes mRNAs, the vast majority is transcribed, largely as long and short ncRNAs.[2-4]

Various types of RNA have been found to catalyze or suppress biochemical reactions much as enzymes do.[5-7] In addition, many RNAs have complex regulatory roles in cells.[8] They assist in many essential functions, which are still being enumerated and defined. As a group, these RNAs are frequently referred to as small regulatory RNAs (sRNAs) or non-coding RNAs (ncRNAs), and, in eukaryotes, they are extremely diverse (Table 4.1.1). Together, these various regulatory RNAs exert their effects through a combination of complementary base pairing, complexing with proteins, and their own enzymatic activities. This makes RNA molecules key elements in both normal cellular processes and disease states.

Table 4.1.1. Diversity of non-coding RNAs in living cells.

Non-coding RNA | Length (nt) | Functions | Ref.
Long non-coding RNA (lncRNA) | >200 | Epigenetic regulation by sequence-specific tethers for protein complexes and specifying subcellular compartments or localization | [9]
Small interfering RNAs (siRNAs) | ~20-21 | Form complexes with Argonaute proteins and are involved in gene regulation, transposon control and viral defence | [10, 11]
microRNAs (miRNAs) | ~22 | Associate with Argonaute proteins and are primarily involved in post-transcriptional gene regulation | [12]
PIWI-interacting RNAs (piRNAs) | ~26-30 | Associate with PIWI-clade Argonaute proteins and regulate transposon activity and chromatin state | [10]
Small nucleolar RNAs (snoRNAs) | vary | Guide rRNA methylation and pseudouridylation; gene-regulatory roles | [13]
Sno-derived RNAs (sdRNAs) | vary | Function as miRNAs in regulation of translation | [14]
tRNA-derived RNAs | vary | Induce translational repression | [15]
microRNA-offset RNAs (moRNAs) | ~20 | Function is unknown. Derived from the regions adjacent to pre-miRNAs. | [16]
RNA enzymes (ribozymes) | vary | Enzymatic activity | [17, 18]
RNA aptamers | vary | Ability to recognize specific ligands (organic compounds, nucleotides, or peptides) through the formation of binding pockets | [19, 20]
Riboswitches | vary | Bind small molecules and control gene expression | [21]

One of the intriguing features of RNA is the possibility of producing self-assembled RNA nanoparticles in vivo. Unlike DNA (carrier of genetic information) and proteins (catalysts of biochemical reactions), RNA combines the functions of both macromolecules and also performs a wide variety of other distinct roles within the living cell.[22-25] The versatile functionality of RNA can be attributed to its unique structure; the foundation of RNA is very similar to that of DNA, differing only in the hydroxyl group at the 2' position of the ribose and the replacement of the thymine base in DNA with uracil. RNA possesses not only Watson-Crick base pairing but also non-canonical base pairing, which promotes folding into rigid structural motifs distinct from the structure of DNA. Non-canonical pairing facilitates loop-receptor interactions, supports the formation of structural motifs and allows the creation of synthetic ribozymes. In principle, an 80-nucleotide RNA can display up to 4^80 different structures with unique shapes involving non-canonical interactions. This large pool of RNA molecules with various structural conformations would ease the search for viable partners in particle assemblies, substrate binding, building architectures, and manufacturing.
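The size of the pool available for an RNA of a given length can be estimated from the four possible bases at each position. The sketch below is only an illustration: the function name is hypothetical, and the count is the number of distinct sequences, which is an upper bound on the number of distinct molecules and not a count of folded structures.

```python
def sequence_space(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of a given length (4 bases for RNA)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return alphabet_size ** length


if __name__ == "__main__":
    n = sequence_space(80)
    # Python integers are exact; format in scientific notation for reading.
    print(f"80-nt RNA: {n:.3e} possible sequences")
```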

4.2. RNA structural motifs: building blocks of modular biomolecules

RNA structure, like protein structure, is folded hierarchically in time and space to form specific 3D structures necessary for molecular function, and is described at the sequence (primary structure), secondary, tertiary and quaternary levels. As is the case for proteins, the global structures of RNA molecules change more slowly than their sequences or secondary structures, which allows us to analyze and understand them.

Initially, RNA motifs were identified and described at the sequence level as commonly occurring short sequences in functional RNAs, such as transfer RNA (tRNA) or ribosomal RNA (rRNA).[26] The variation in these sequences, within or between RNAs, is often represented by an expanded alphabet describing a consensus sequence (e.g. GNRA, where N is any nucleotide and R is a purine). These conserved sequences are often given in the context of the predicted base-pairing structure (e.g. tetraloops) surrounding them.
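A consensus written in the expanded alphabet can be turned into a pattern and searched for in a sequence. The following is a minimal sketch, assuming the standard IUPAC nucleotide codes; the function names are illustrative and are not taken from any program used in this work. A sequence match alone does not show that the segment actually forms a tetraloop.

```python
import re

# Standard IUPAC nucleotide codes, written for RNA (U instead of T).
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU",
    "K": "GU", "M": "AC", "B": "CGU", "D": "AGU",
    "H": "ACU", "V": "ACG", "N": "ACGU",
}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus such as 'GNRA' into a regular expression."""
    parts = []
    for code in consensus.upper():
        bases = IUPAC_RNA[code]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def find_consensus(sequence: str, consensus: str):
    """Return (start, match) for every occurrence, overlapping ones included."""
    # A lookahead lets the search report matches that overlap each other.
    pattern = re.compile(f"(?=({consensus_to_regex(consensus)}))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    print(find_consensus("GGGCGAAAGCCC", "GNRA"))
```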

RNA secondary structure.

The next level of RNA structure is the Watson-Crick helices, or secondary structure, which identifies both the canonically base-paired regions (e.g. A-U, U-A, C-G, G-C as well as “wobble” G-U, U-G) and the non-paired regions (loops). These canonical base pairs are the components of the helical portions of RNA, and they have a regular structure and hydrogen bonding pattern. The helices are generally short (< 10–15 WC pairs) because they are interrupted or terminated at their ends by nominally unpaired stretches of sequence that are called hairpin, internal, or multi-helix junction loops, pseudoknots and bulges (Figure 4.2.1).

Bases in loops are usually depicted in secondary structures as not forming basepairs.

However, canonical base pairs account for only approximately half of the nucleotides found in RNA[27].
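The distinction between canonical pairs and everything else can be expressed as a small lookup. The sketch below classifies a pair by base identity only and computes the fraction of nucleotides in canonical (Watson-Crick or wobble) pairs for a given pair list; the function names and the example hairpin are hypothetical. Base identity does not capture pairing geometry, so an A-U pair formed through other edges would be misclassified here.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def classify_pair(base1: str, base2: str) -> str:
    """Classify a pair by base identity: 'Watson-Crick', 'wobble' or 'other'."""
    pair = (base1.upper(), base2.upper())
    if pair in WATSON_CRICK:
        return "Watson-Crick"
    if pair in WOBBLE:
        return "wobble"
    return "other"


def canonical_fraction(sequence: str, pairs) -> float:
    """Fraction of nucleotides involved in Watson-Crick or wobble pairs.

    pairs is a list of 0-based (i, j) index tuples.
    """
    seq = sequence.upper()
    in_canonical = set()
    for i, j in pairs:
        if classify_pair(seq[i], seq[j]) != "other":
            in_canonical.update((i, j))
    return len(in_canonical) / len(seq)


if __name__ == "__main__":
    # Hypothetical 10-nt hairpin: three-pair stem closing a four-nucleotide loop.
    print(canonical_fraction("GGCGAAAGCU", [(0, 9), (1, 8), (2, 7)]))
```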

Figure 4.2.1. Illustration of secondary structure of RNA molecule.

RNA secondary structures can be predicted by various computational methods that may or may not include pseudoknots, but these generally do not include non-canonical pairing or 3D information such as details of base stacking, backbone hydrogen bonding and tertiary interactions.[28, 29] Various biochemical and biophysical experimental methods can also be used to infer secondary structure.[30-32]

RNA tertiary structure.

RNA tertiary structure describes the overall 3D conformation of a single molecule as determined by crystallography, NMR or modeling methods. The 3D structure is determined by long-range intramolecular interactions such as pseudoknots, ribose zippers, kissing hairpin loops, tetraloop–tetraloop receptor interactions, coaxial helices and other as yet uncharacterized modes of interaction, and can be mediated by intermolecular interactions with ligands, including metals and small molecules, or with other macromolecules (DNA, RNA or protein).

Some of these interactions can be identified computationally (e.g. covariation analysis) or by biochemical/biophysical experiments, but in general, 3D structural information from crystallography or NMR spectroscopy is necessary for a complete description. This structural information is deposited in the Nucleic Acid Database (NDB) and the RCSB Protein Data Bank (PDB), which grow rapidly each year. 3D RNA data show that most “loops” in 2D representations in fact form specific 3D motifs, characterized by non-Watson–Crick base-pairing, base-stacking and base-phosphate interactions between loop nucleotides.[33]

4.3 Search, extraction and analysis of RNA 3D motifs.

To extract useful information from RNA 3D motifs, analysis of known RNA structures from NMR or X-ray crystallographic data is necessary. However, the database of RNA X-ray structures is growing extremely fast, which makes it practically impossible to analyze them manually unless the research interest is focused on one particular case. Modern bioinformatics tools are therefore available to search for, extract and compare RNA motifs.[34-38] These tools help to identify the core nucleotides of each motif, their interaction network and variability, and to build more realistic RNA three-dimensional models.

Our group has recently developed an online version of the “Find RNA 3D” (FR3D) program, called WebFR3D, which allows searches for local structural motifs as well as composite motifs in RNA (http://rna.bgsu.edu/WebFR3D).[39] The main advantage of this program is that searches can be based on a combination of symbolic and geometric specifications, which is more reliable, provides control over the search and returns a larger number of candidates than other programs. The search output provides instances of the motif in various RNAs, the interaction network and a structural alignment via a user-friendly visualization. WebFR3D analysis gives information on the interaction network, such as base-pairing, base-stacking and base-phosphate interactions, and a list of all RNA motifs found in a particular PDB file. Details of how to conduct the searches are not provided here; a user's manual can be found online at http://rna.bgsu.edu/WebFR3D.

4.4 Factors stabilizing RNA 3D motifs

Microorganisms that grow at temperatures above 50 °C or below 20 °C are called extremophiles. Those growing at moderate temperatures (20-50 °C) are mesophiles. Some hyperthermophiles can tolerate temperatures exceeding 100 °C and some extreme psychrophiles can survive at temperatures below 0 °C. All macromolecules (RNA, DNA, proteins) of such organisms must preserve their functionality in the temperature range in which the species lives. Much research has been conducted to elucidate the mechanisms of bacterial adaptation to temperature extremes. For RNA molecules, the focus has been on the composition of Watson-Crick base pairs, as G-C base pairs are more stable than A-U pairs.[40-42] rRNA from thermophiles has thus been found to have a higher G+C content. In contrast, no such simple relationship exists for genomic DNA; the average G+C content varies widely from approximately 30% to more than 60% in bacteria, irrespective of the optimal growth temperature.[43] However, the G+C content of ribosomal 5S, 23S and 16S RNA has been shown to be proportional to the optimal growth temperature in bacteria.[42, 44] The same positive correlation of increasing G+C content with optimal growth temperature is found in stem regions of the RNA component of the signal recognition particle among archaea.[45]
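The G+C content referred to above is simply the fraction of G and C residues in a sequence. A minimal sketch follows; the function name is illustrative, and characters other than A, C, G, U and T (gaps, ambiguity codes) are ignored by assumption.

```python
def gc_content(sequence: str) -> float:
    """Fraction of G and C among the A/C/G/U/T residues of a sequence."""
    residues = [base for base in sequence.upper() if base in "ACGUT"]
    if not residues:
        raise ValueError("sequence contains no nucleotides")
    gc = sum(1 for base in residues if base in "GC")
    return gc / len(residues)


if __name__ == "__main__":
    # Hypothetical stem sequence, used only to show the call.
    print(f"{100 * gc_content('GGCCAAUU'):.1f}% G+C")
```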

Chemical stability (resistance to strand cleavage) is also important for survival at high growth temperatures. The presence of monovalent metal ions (Na+ and K+) reduces the spontaneous degradation of RNA strands, while Mg2+ ions favor hydrolytic degradation. On the other hand, Mg2+ acts as a counterion that shields the highly negative phosphate backbone of nucleic acids and thereby aids folding.[46, 47] Mg2+ ions are known to bind strongly to specific binding pockets in folded RNA, some of which depend on conserved modified nucleotides.[48] Hence, Mg2+ is believed to stabilize the conformation of RNA against thermal denaturation, but it can also catalyze strand cleavage at specific binding sites.

Non-cyclic aliphatic cations containing two or more protonated amino nitrogen atoms (linear or branched polyamines) can also stabilize complex RNA structures by neutralizing the negative charges of phosphates.[49] Interestingly, some unusually long and branched polyamines are found only in hyperthermophilic bacteria and archaea and not in mesophilic ones.[50]

Stabilization of cellular RNA structures (rRNA and tRNA) by covalent modification of bases has been frequently reported.[51-54] At present, more than 100 different modified nucleosides have been observed (http://medlib.med.utah.edu/RNAmods). For example, the non-planar conformation of the modified nucleotide dihydrouridine (D) prevents it from stacking on neighboring bases and so increases backbone flexibility. Interestingly, dihydrouridine is totally absent in tRNA from hyperthermophiles, while it is abundant in psychrophilic bacteria.[55, 56] This is consistent with the need to increase chain flexibility at low temperatures while minimizing it at higher temperatures to avoid denaturation. On the other hand, a remarkable number of doubly modified nucleosides, combining covalent modification of the base with 2'-O-ribose methylation (ac4Cm, m2Gm, m5Cm etc.), have been discovered only in tRNAs of hyperthermophilic archaea.[57-59] The presence of such nucleosides gives tRNAs exceptional conformational rigidity, which may increase the local and global stabilization of the nucleic acid.

Another factor, and the most crucial one for RNA stabilization and function, is folding to a compact geometry through non-canonical base pairing. In a survey of the 3D structures of rRNA in the 70S ribosomes of E. coli and T. thermophilus and the 50S subunit of H. marismortui, only ~59% of bases form standard WC base pairs and about 21% do not form base pairs at all.[60] Thus, the loops contain a significant fraction of the nucleotides of structured RNA molecules, and most of these nucleotides interact with other nucleotides, proteins or ligands; these interactions are necessary for a complete description. Various RNA loops forming 3D motifs are classified according to their structures and functions and are discussed in the literature.[34, 61, 62]

Tertiary structures are often stabilized by non-canonical (non-Watson-Crick) nucleobase interactions. There are 29 possible base pairs that share at least two hydrogen bonds.[63, 64] These fall into 12 basic geometric families (Table 4.4.1).[65, 66] The relationships of these base pair families have been described in terms of “isostericity matrices” 40, which reflect the degree of structural similarity between non-canonical base pairs.

Table 4.4.1. RNA geometric base pair families

Number | Glycosidic bond orientation | Interacting edges
1 | Cis | Watson-Crick/Watson-Crick
2 | Trans | Watson-Crick/Watson-Crick
3 | Cis | Watson-Crick/Hoogsteen
4 | Trans | Watson-Crick/Hoogsteen
5 | Cis | Watson-Crick/Sugar Edge
6 | Trans | Watson-Crick/Sugar Edge
7 | Cis | Hoogsteen/Hoogsteen
8 | Trans | Hoogsteen/Hoogsteen
9 | Cis | Hoogsteen/Sugar Edge
10 | Trans | Hoogsteen/Sugar Edge
11 | Cis | Sugar Edge/Sugar Edge
12 | Trans | Sugar Edge/Sugar Edge
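The twelve families of Table 4.4.1 arise from combining the three interacting edges pairwise (six unordered combinations) with the two glycosidic bond orientations. The sketch below enumerates them in the same order as the table; the function names are illustrative, and the three-letter abbreviations (cWW, tHS and so on) follow the style of the "cWW" notation used in the text.

```python
from itertools import combinations_with_replacement

EDGES = ("Watson-Crick", "Hoogsteen", "Sugar Edge")
ORIENTATIONS = ("Cis", "Trans")


def base_pair_families():
    """List (orientation, edge1, edge2) for the 12 geometric families."""
    families = []
    # 3 edges taken 2 at a time with repetition give 6 edge combinations.
    for edge1, edge2 in combinations_with_replacement(EDGES, 2):
        for orientation in ORIENTATIONS:
            families.append((orientation, edge1, edge2))
    return families


def abbreviation(family) -> str:
    """Short name such as 'cWW' or 'tHS' for a family tuple."""
    orientation, edge1, edge2 = family
    return orientation[0].lower() + edge1[0] + edge2[0]


if __name__ == "__main__":
    for number, family in enumerate(base_pair_families(), start=1):
        print(number, abbreviation(family), *family)
```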

Non-canonical base pairs are absolutely essential for building up the complex three-dimensional architectures of large RNAs. The cWW (“wobble”) G-U base pair, the most common non-canonical base pair in RNA, was first recognized by Crick in 1966 in the context of tRNA-mRNA anticodon-codon interactions.[67] Highly conserved G-U base pairs at specific positions in rRNA have been shown, by comparison of ribosomal 3D structures and aligned sequences, to form conserved tertiary interactions.[68] The RNA non-cWW base pairs show an astonishing variability of base-pairing combinations.[33, 69, 70] They structure internal, hairpin and junction loops and mediate the long-range tertiary interactions needed to fold RNA molecules into their biologically active structures.[71-74] Non-WC base pairs also support the formation of some pseudoknots, including RNA loop-loop interactions (kissing complexes), cross-strand purine stacking interactions, water and metal bridges, binding sites for proteins, local hydrophobic pockets, and junctions consisting of three or more RNA helical regions. Moreover, recent studies of RNA structure at atomic resolution have revealed additional recurrent folding motifs such as the kink-turn, GNRA hairpin loop, A-minor motif, V-turn, bulged-G motif and sarcin-ricin motif.[75] All these types of tertiary interactions and motifs probably act together to allow compaction of the RNA conformation and to modulate the flexibility or rigidity of macromolecules according to their functional needs, optimizing growth in the environment colonized by an organism.

Taken together, the stability of a given RNA structure depends on its 3D conformation and on the chemical and molecular interactions in solution. The RNA architecture thus exhibits varying degrees of stability, depending upon factors such as sequence, the presence of metal ions and non-canonical base pairing. Theoretically, it is possible to predict the equilibrium shape of RNA from thermodynamic principles. In practice, our limited knowledge of the details of RNA interactions and energetics prevents this. Nevertheless, the average sensitivity (the percentage of known base pairs that are correctly predicted) of free energy minimization has been reported to be as high as 72.8 ± 9.4% for a diverse database of sequences having fewer than 800 nt.[27] Thus, the major goal of thermodynamic studies of RNA is to provide a knowledge base able to predict the structures of functional RNA sequences. Many papers cover this subject.[29, 76-78] In the following chapters, thermodynamic studies of RNA sarcin-ricin motifs with a variety of non-canonical base pair substitutions are presented.
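Sensitivity, as defined above, is the percentage of known base pairs that also appear in the prediction. A minimal sketch of that calculation is given below; the function name and the example pair lists are hypothetical, and pairs are compared by exact match, which is a simplifying assumption (published benchmarks may use more permissive matching rules).

```python
def sensitivity(known_pairs, predicted_pairs) -> float:
    """Percentage of known base pairs that are present in the prediction."""
    # Normalize each pair so that (i, j) and (j, i) count as the same pair.
    known = {tuple(sorted(pair)) for pair in known_pairs}
    predicted = {tuple(sorted(pair)) for pair in predicted_pairs}
    if not known:
        raise ValueError("no known base pairs supplied")
    return 100.0 * len(known & predicted) / len(known)


if __name__ == "__main__":
    known = [(0, 9), (1, 8), (2, 7), (3, 6)]
    predicted = [(1, 8), (7, 2), (4, 5)]
    print(f"sensitivity = {sensitivity(known, predicted):.1f}%")
```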


CHAPTER V. EXPERIMENTAL METHODS FOR STUDYING RNA MOTIF

THERMODYNAMICS AND STRUCTURE.

5.1 Overview of RNA synthesis and purification

Short RNA oligonucleotides (13 and 12 nt) for UV-melting experiments were purchased from Integrated DNA Technologies, Inc. (http://www.idtdna.com/site) at the 100 nmol synthesis scale. The oligonucleotides were purified by preparative thin-layer chromatography (TLC) on 20 cm × 20 cm plates (n-propanol : ammonium hydroxide : water, 55:35:10). RNAs were eluted from silica with 1 mL of double-deionized water (dd H2O), dialyzed overnight at 4 °C against 1 L dd H2O, lyophilized and redissolved in 100 µL dd H2O.

RNAs of 62 nt for structural probing were prepared by run-off transcription of PCR-amplified DNA templates as previously described. Synthetic DNA molecules coding for the antisense sequence of the desired RNA were purchased from IDT DNA (www.idtdna.com) and amplified by PCR using primers containing the T7 RNA polymerase promoter. PCR products were purified using a home-made purification system analogous to the commercial QIAquick PCR purification kit (Qiagen Sciences, Maryland 20874). RNA molecules were prepared by in vitro transcription using T7 RNA polymerase (either “home-made” or purchased from Takara Bio Inc., http://www.takara-bio.com) and purified by denaturing polyacrylamide gel electrophoresis (PAGE) (15% acrylamide, 8 M urea). The RNA was eluted from gel slices overnight at 4 °C into buffer containing 300 mM NaOAc, 10 mM Tris pH 8.0 and 0.5 mM EDTA, ethanol precipitated, rinsed twice with 80% ethanol, dried and dissolved in dd H2O. A complete description of the RNA synthesis steps is given below.


5.2 RNA design

RNA motif design was initially based on results obtained with WebFR3D software[39] or on manual searches of the 3D structures of the corresponding motifs using Swiss-PDB-viewer. Mfold[76] (www.mfold.bioinfo.rpi.edu) was then used to check the secondary structure of each RNA candidate and to avoid self-complementarity prior to ordering. Terminal G–C base pairs were chosen to prevent end fraying during melting. The duplexes were also designed to have a melting temperature between 40 °C and 60 °C and minimal formation of hairpin structures or misaligned duplexes. In addition, for TLC purification, RNA oligonucleotides were designed to avoid poly-guanosine sequences, as these are hard to elute from silica. Detailed explanations of the design techniques can be found in the results sections of the following chapter. RNA sequences were converted into DNA sequences, and the corresponding forward and reverse primers were designed using the following link: http://rna.bgsu.edu/oldwebsite/rnatodna.html. Sequence information for the RNA molecules used in this study is provided in the Appendix.
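Two of the design steps above lend themselves to a short sketch: flagging sequences that violate the stated rules (G or C at the ends, no poly-G runs) and converting an RNA sequence into a DNA template carrying a T7 promoter. This is not the code behind the conversion page cited above. The function names, the poly-G threshold of three residues, and the use of the widely used minimal T7 promoter sequence are all assumptions made for illustration.

```python
import re

# Assumption: commonly used minimal T7 promoter (positions -17 to -1).
# Transcription starts at the next base, so the RNA should begin with G.
T7_PROMOTER = "TAATACGACTCACTATA"


def rna_to_dna(rna: str) -> str:
    """Replace U with T to obtain the sense-strand DNA sequence."""
    return rna.upper().replace("U", "T")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def transcription_template(rna: str):
    """Return (sense, antisense) DNA strands of a promoter + RNA template."""
    sense = T7_PROMOTER + rna_to_dna(rna)
    return sense, reverse_complement(sense)


def design_warnings(rna: str, max_g_run: int = 3):
    """Flag violations of the design rules; max_g_run is a hypothetical limit."""
    seq = rna.upper()
    if not seq:
        raise ValueError("empty sequence")
    warnings = []
    if seq[0] not in "GC" or seq[-1] not in "GC":
        warnings.append("terminal residue is not G or C")
    if re.search("G{%d,}" % (max_g_run + 1), seq):
        warnings.append("poly-G run longer than %d residues" % max_g_run)
    return warnings


if __name__ == "__main__":
    sense, antisense = transcription_template("GGCAUCG")  # made-up sequence
    print(sense, antisense, design_warnings("GGCAUCG"))
```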

5.3 Preparation of DNA template

PCR is an acronym for “Polymerase Chain Reaction”, an enzymatic reaction aimed at amplifying sequences of DNA. PCR is made possible by the thermostability of the enzyme Taq DNA polymerase from the bacterium Thermus aquaticus.

“Hot start” PCR is an enhanced PCR method in which the DNA template and the primers are heated to 94 °C for 4 minutes before the enzyme is added. This generates cleaner PCR products by reducing non-specific annealing of the primers to the DNA template.[79] For a total volume of 100 μl, the following components are combined:

1) dd H2O – 82 µL

2) 10X PCR buffer (100mM TRIS-HCl pH=9.0, TRITON X-100 1%, 500mM KCl, 250 mM

MgCl2) – 10 µL

3) 10 mM dNTPs – 5 µL

4) 0.2 mM DNA template – 1µL

5) 100mM Forward primer – 1µL

6) 100mM Reverse primer – 1µL

7) TAQ after “hot start” – 5 µL

All the PCR reaction components are combined before the Taq DNA polymerase is added. Just prior to starting the cycling process, the enzyme is added to catalyze the reaction.

Each cycle of PCR contains three steps:

1-Denaturation at 94°C: At this temperature the double strand melts and opens to form single-stranded DNA; Taq DNA polymerase activity is completely stopped at this temperature. Note that the temperature used for each step must be optimized for the particular template being amplified.

2-Annealing at 52°C: The thermocycler lowers the temperature quickly to the appropriate annealing temperature, in our case 52°C. At this temperature, the primers can anneal to the template. The forward and reverse primers anneal to their complementary DNA sequences by forming Watson-Crick base pairs. However, at this temperature, non-specific binding can also occur.

3-Extension at 72°C: This temperature is optimal for polymerase activity. Primers that may have bound non-specifically dissociate at 72°C, while primers matching their complementary sequence exactly remain bound. Taq DNA polymerase binds and initiates extension of the chain by adding dNTPs one at a time, complementary to the bases of the DNA template. As the number of cycles increases, the DNA is amplified enormously. After 20 cycles, a double-stranded molecule of DNA is amplified 2^20 = 1,048,576 times.
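The amplification figure follows from doubling the number of copies in every cycle. The sketch below reproduces that arithmetic; the function name and the `efficiency` parameter (1.0 means perfect doubling) are illustrative assumptions, and real reactions fall short of the ideal as reagents are depleted.

```python
def pcr_copies(cycles: int, initial_copies: float = 1, efficiency: float = 1.0) -> float:
    """Ideal number of copies after a given number of PCR cycles.

    efficiency = 1.0 corresponds to perfect doubling in each cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1 + efficiency) ** cycles


if __name__ == "__main__":
    print(f"{pcr_copies(20):,.0f} copies after 20 cycles")
```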

5.3.1 Purification of DNA products on rechargeable QIAGEN columns

Here we describe the preparation and use of inexpensive recharged spin columns, which are at least as efficient in the purification of PCR products as commercially available kits. These recharged columns can substitute for the commercial columns in the QIAquick kit, which has traditionally been used in our laboratory.

Recharging of used commercial columns from Qiagen. Silica membranes were cut out from GF/F borosilicate glass microfiber filters (Whatman, England, Cat# 1825110) with a 7-mm wad punch. GF/F has the largest DNA-binding capacity, 2.8 µg DNA/mg of membrane.[80] The used bearing disc, membrane and plastic gasket ring were pushed out of the column with a spatula or an unfolded paper clip. The plastic parts were washed by and water and rinsed with alcohol. The parts were then dried at 120 °C. New membranes were inserted and fixed in place by the gasket ring (Figure 5.3.1.1).

Buffers compositions.

Absorption buffer (AB):
1. 5 M guanidine isothiocyanate (GuSCN)
2. 50 mM MES, pH 5.5
3. 20 mM EDTA, pH 8.0
4. 0.5% Triton X-100
5. 0.05% Cresol Red

Washing buffer (WB):
1. 80% ethanol
2. 10 mM TRIS-HCl, pH 7.5

Elution buffer (EB):
1. 10 mM TRIS-HCl, pH 8.5

Cresol red is included in AB and NIB buffers as a pH indicator dye. It is yellow at pH < 7.5, which is optimal for DNA binding to silica. Buffer AB includes GuSCN, an active chaotropic agent, so AB washes out some impurities that could be stuck on the membrane.

EDTA is included to remove RNA, irreversibly denatured DNA, and single-stranded fragments of genomic DNA from the silica membrane.[81] The 80% ethanol in WB is used to wash away the chaotropic salt, and the TRIS buffer is an important component because ethanol alone denatures DNA.

Purification protocol:

1. Five volumes of buffer AB are added to one volume of the PCR reaction and mixed.
2. The column is placed in a provided 2 ml collection tube. To bind the DNA, the sample is applied to the column and centrifuged for 30-60 seconds at 12,000 rpm.
3. The flow-through is discarded and the column is reattached to the collection tube.
4. For washing, 0.75 ml of buffer WB is added to the column and centrifuged for 30-60 seconds.
5. The flow-through is discarded and the column is reattached to the collection tube, then centrifuged for an additional 1 min at maximum speed.
6. Important: residual ethanol from buffer WB will not be completely removed unless the flow-through is discarded before the additional centrifugation.
7. The column is then placed in a clean 1.5 ml Eppendorf tube.
8. To elute the DNA, 50 μL of buffer EB or dd H2O (pH 8.5) is added to the center of the membrane and the column is centrifuged for 1 minute. Alternatively, for increased DNA concentration, 30 μL of elution buffer is added to the center of the membrane, allowed to stand for 1 min, and then centrifuged.

Figure 5.3.1.1. Illustration of the use of a rechargeable QIAGEN column. (A) Image of the GF/F glass microfiber filter used to recharge the column. (B) Illustration of the parts of the column. (C) 1% agarose gel images of PCR products before and after purification, comparing the commercial and recharged columns.


5.4 In vitro transcription using T7 RNA polymerase.

In vitro transcription requires the following: a purified linear DNA template containing a promoter sequence, ribonucleotide triphosphates (ATP, UTP, GTP and CTP), a buffering system that includes spermidine, DTT and magnesium ions, and an RNA polymerase. Also added are the following proteins: RNasin (to inhibit any contaminating RNase) and inorganic pyrophosphatase, which hydrolyzes the pyrophosphate (PPi) produced by the reaction, whose accumulation in solution might inhibit the T7 RNA polymerase. The most commonly used RNA polymerase is T7 RNA polymerase. The T7 promoter sequence is incorporated into the template at the 5' end of the forward primer. These bases become double-stranded during the initial PCR cycles. The transcription reactions in our lab are usually carried out with T7 RNA polymerase produced in house and purified by Ni-ion affinity chromatography of the His-tagged protein, as explained in the T7 RNA polymerase purification section. After 4 hours of incubation at 37 ºC, 1 μl of RNase-free DNase solution is added to the mixture, mixed and incubated again at 37 ºC for 30 minutes. The DNase hydrolyzes the DNA completely, leaving the RNA intact. The next step is to purify the RNA by electrophoresis on a denaturing polyacrylamide gel containing 8 M urea.

5.5 RNA purification

5.5.1 Purification on denaturing PAGE

RNA was purified by 15% denaturing PAGE (29:1 acrylamide:bisacrylamide) containing 8 M urea. Two glass plates, one of which is half an inch longer than the other, were cleaned with ethanol and dried. The larger glass plate was laid flat on the bench and two Teflon spacers were arranged along the sides. The smaller plate was then laid on the spacers. The plates were clamped together using binder clips and taped with 3M electrical insulating tape to prevent any leakage. After addition of APS and TEMED, the solution of 8 M urea and 15% acrylamide was poured and a comb was inserted between the plates. The comb determines the shape, width and spacing of the lanes. The transcription reactions were mixed in a 1:1 ratio with loading buffer (8 M urea, 25% glycerol, 0.01% xylene cyanol and 0.01% bromophenol blue).

The urea gels were run in 0.5X TBE running buffer at < 50 milliamperes for 2-3 hours, or just until the bromophenol blue ran completely off the gel. The rest of the procedure is summarized in the following steps:

1. The plates are opened carefully using a spatula and the upper plate is removed completely, leaving the gel on the lower plate.
2. The gel is covered with a sheet of Saran wrap (Saran, Dow Chemical).
3. The plate is turned upside down and the plastic wrap is carefully and slowly pulled down, dragging the gel with it, until the entire gel is removed from the glass surface. The gel is then covered with a second sheet of Saran wrap.
4. To visualize the RNA bands, the gel is UV-shadowed using a UV lamp and a fluorescent TLC plate.
5. The bands are marked, cut out using a razor blade and placed carefully in 2 ml Eppendorf tubes.
6. 0.6 ml of elution buffer, or crush and soak buffer (40 ml total volume: 5 ml 10x TBE; 15 ml 1 M NaOAc; 30 ml dd H2O), is added to each 2 ml tube.
7. The tubes are soaked overnight on a rotator at 4 ºC in the cold room.

To precipitate the RNA, the following procedure is followed:

1. The crush and soak buffer is removed using sterile Pasteur pipettes; small gel pieces should be avoided as much as possible.
2. 2.5 volumes of 98% ethanol are added to one volume of crush and soak buffer.
3. The tubes are cooled for 10 minutes on crushed dry ice.
4. The tubes are centrifuged at high speed for 5 min (~14,000 rpm, in the cold room).
5. The supernatant is removed carefully. A small white pellet can be seen, depending on the RNA yield.
6. The pellet is then washed twice with 0.5 ml of cold 80% ethanol.
7. The tubes are dried under vacuum using a Speed Vac pump for < 20 min.
8. The pure RNA is finally dissolved in the desired volume of water or buffer.

5.5.2 RNA purification on TLC plates

Small RNA sequences (< 15 nt) can be purified simply on TLC plates. Glass-backed cellulose thin-layer plates (20 × 20 cm, 0.1-mm layer) from Merck (cat. no. 105730-001) were used. All TLCs were run at room temperature (~20 °C) in closed chromatographic tanks. The mobile phase, 55:35:10 n-propanol:ammonia:dd H2O, was poured into the tank to a height of 4 to 5 mm at the bottom of the tank. The tank is kept under a well-ventilated hood, and the solvent is poured in advance to ensure good saturation with solvent vapors. The origin of the deposit is marked at one corner (1.5–2 cm from each edge) of the plate with a soft pencil, and the information needed for plate identification is written at the opposite corner. An aliquot of the RNA is then spotted on the lane marked in pencil. The sample must be applied approximately 0.5–1.0 μL at a time to give spots not more than 3 to 4 mm in diameter, and is allowed to dry between successive applications.

RNA separation is usually developed overnight (15–18 h). Once the solvent reaches the top of the plate, some extra migration time with the mobile phase (1 or 2 h) is allowed to maximize resolution of RNAs of unwanted length along the migration lane. Two TLC plates can be run in parallel in the same chromatographic tank. After migration the plates are withdrawn from the tank and dried thoroughly under a fume hood for several hours.

To visualize the RNA bands, the plate was shadowed using a UV lamp (Figure 5.5.2.1); the bands were scraped off with a razor blade and the scraped silica beads were placed in 2 ml Eppendorf tubes. 1 ml of dd H2O (or the desired buffer) was added and the mixture was thoroughly vortexed to dissolve the RNA. The mixture was then centrifuged at 14,000 rpm and the supernatant containing the RNA was collected.

RNA samples purified by TLC in this way are free of salts and can be used for UV-melting experiments, where in most cases the presence of metal ions affects the melting temperature of nucleic acids.

Figure 5.5.2.1. Image of RNA sequences resolved by preparative 20x20 cm TLC plate.

5.6 RNA labeling

Standard precautions should be observed

when using radioactive isotopes!!!

Overview. Experiments involving small (nanomolar) amounts of nucleic acids require sensitive detection methods. Generally, RNA is labeled either with the radioactive isotope 32P (radiolabeling) or with a fluorophore (fluorescence labeling). RNA can be labeled at the 3’ or the 5’ end with 32P phosphate. For enzymatically prepared RNA, 3’-end labeling using cytidine 3',5'-bisphosphate (pCp) is preferred because 5’-end labeling with polynucleotide kinase requires prior dephosphorylation of the 5’ end.

Synthetically prepared nucleic acids, however, do not contain phosphate groups at either end and can be labeled directly at the 5’ end using [γ-32P] ATP. The labeling is carried out by T4 polynucleotide kinase (T4 PNK), an enzyme that catalyzes the transfer and exchange of Pi from the γ position of ATP to the 5´-hydroxyl terminus of polynucleotides (double- and single-stranded DNA and RNA) and nucleoside 3´-monophosphates.

In this study we labeled RNA at the 5’ end. The procedure is detailed below:

Materials

1. RNA ~ 10 pmoles.

2. Phosphatase buffer (10× as supplied by manufacturer).

3. 500 mM Tris–HCl, pH 8.5 and 1 mM EDTA.

4. Calf intestinal alkaline phosphatase (CIP): 1 U/μL (New England Biolabs).

5. Phenol saturated with 10 mM Tris–HCl buffer, pH 7.9 (Sigma-Aldrich).

6. 24:1 chloroform: isoamyl alcohol (CHISAM).

7. T4 polynucleotide kinase (T4 PNK): 10 U/μL (New England BioLabs).

8. Kinase buffer (5×): 25 mM MgCl2, 125 mM CHES (pH 9.0 at 20°C), and 15 mM DTT.

9. [γ-32P] ATP: 6000 Ci/mmole total ATP on reference date (Perkin-Elmer).

Procedure

1. Calculate the concentration of the RNA transcription product by measuring the absorbance at

260 nm (A260)

2. Combine the following in a 1.5-mL microfuge tube to perform the dephosphorylation reaction, and incubate at 50°C for 30 min:

a. 10 pmol RNA

b. 2 μL 10× dephosphorylation buffer

c. 2 μL CIP

d. dH2O to 20 μL

3. After incubation, add 172 μL dd H2O and 8 μL 5 M NaCl.

4. Remove the CIP by phenol/chloroform extraction as follows. Combine equal parts of Tris-buffered phenol and CHISAM. Then add 200 μL of phenol/CHISAM to the dephosphorylation reaction and vortex. Separate the phases by centrifugation and transfer the aqueous phase to a new microfuge tube. Add 200 μL of CHISAM to the aqueous layer and vortex. Separate the layers by centrifugation and transfer the aqueous phase to a new microfuge tube.

5. Precipitate RNA from the dephosphorylation reaction with ethanol and pellet by centrifugation.

6. Decant the supernatant and dry the pellet. Resuspend in 20 μL dH2O. Employ this dephosphorylated RNA in the 5´ 32P end-labeling reaction.

7. Combine the following in a 1.5 mL microfuge tube for the 5´ 32P end-labeling reaction, and incubate at 37°C for 30–60 min:

a. 10 pmol dephosphorylated RNA

b. 4 μL 5× kinase buffer

c. 6 μL [γ-32P] ATP (1.6 μM)

d. 2.5 μL T4 PNK

e. dd H2O to 20 μL

8. Purify the 5´ 32P end-labeled RNA by denaturing PAGE as follows:

Add 20 μL of 2× gel-loading buffer (10 M urea, 0.1% bromophenol blue) and load the sample onto a denaturing 10% polyacrylamide gel (1.5 mm thick). Run the gel until the band of interest is midway down the plate. Use the location of the bromophenol blue to estimate the location of the 5´ 32P end-labeled RNA according to the expected molecular weight of the product. Separate the glass plates, removing the top plate and covering the gel with plastic wrap. Cut 5 small pieces of Whatman paper and apply one drop of leftover radioactive material (mixed with a little dye) to each. These small pieces of paper serve as locators. They are placed under a plastic wrap, above the gel, at different positions to map the RNA. Then place a PhosphorImager screen on top of the gel, covering the entire gel area including the Whatman markers, for around 20 sec. After scanning, print the obtained image (Figure 5.6.1) and place it underneath the bottom glass plate. Finally, cut out the RNA bands and proceed to the precipitation step (see above).

Figure 5.6.1. Image of 10% denaturing PAGE of radiolabeled RNAs.

5.7 Structural probing experiment

Overview. Structure probing of RNA uses biochemical techniques to obtain reactivity patterns from which the molecular structure can be inferred; it supports experimental analysis of molecular structure and function and informs the development of small molecules for further biological research. An RNA of interest is treated with a reagent that modifies or cleaves the structured RNA in some specific way (Table 5.7.1). The reagent can be a small organic molecule, a metal ion, or an RNase enzyme. The experiment is performed such that reaction with the RNA is relatively limited and any two modification events are uncorrelated. Modification results either in cleavage of the RNA or in formation of a covalent chemical adduct between the RNA and the probe molecule. Enzymatic and chemical cleavage is usually detected by resolving end-labeled RNA fragments by size on denaturing (sequencing) PAGE. In this particular study, RNA hairpins were probed using three types of RNases, T1, A and V1, as well as diethylpyrocarbonate (DEPC).

Table 5.7.1. Common chemical and enzymatic probes used in structured RNA

| Probes* | Target | Detection: direct | Detection: RT |
|---|---|---|---|
| DMS | A (N1) | - | + |
| | C (N3) | s | + |
| | G (N7) | s | s |
| DEPC | A (N7) | s | + |
| Kethoxal | G (N1 - N7) | + | + |
| CMCT | G (N1) | - | + |
| | U (N3) | - | + |
| Pb (II) acetate | specific binding sites, dynamic regions | + | + |
| T1 RNase | unpaired G | + | + |
| A RNase | unpaired C, U | + | + |
| V1 RNase | paired or stacked nucleotides | + | + |
| SHAPE | flexible regions | -/+ | -/+ |
| In-line | ssRNA degradation over time | + | + |

* DMS – dimethylsulfate; DEPC – diethylpyrocarbonate; Kethoxal – β-ethoxy-α-ketobutyraldehyde; CMCT – 1-cyclohexyl-3-(2-morpholinoethyl)carbodiimide metho-p-toluenesulfonate. Detection method: (direct) detection of cleavages on end-labeled RNA molecule; (RT) detection by primer extension with reverse transcriptase. (+) the corresponding detection method can be used and (-) cannot; (s) a chemical treatment is necessary to cleave the ribose-phosphate chain prior to detection.

Materials

Enzymatic probing

1. RNAse reaction buffer (2×): 100 mM Tris–HCl pH 8.3, 200 mM KCl.

2. 5’-end 32P-labeled RNA ~ 100,000 cpm/1 µl.

3. 100mM MgCl2

4. Colorless gel-loading solution (2×): 10 M urea and 1.5 mM EDTA (pH 8.0 at 20°C).

5. Na2CO3 buffer (10×): 0.5 M Na2CO3 (pH 9.0 at 23°C) and 10 mM EDTA.

6. Sodium citrate buffer (10×): 0.25 M sodium citrate (pH 5.0 at 23°C).

7. RNases T1, A and V1 (Life Technologies Co.)

DEPC

1. tRNA 10mg/ml

2. 5’-end 32P-labeled RNA ~ 100,000 cpm/1 µl

3. 2× CB buffer (100 mM Sodium Cacodylate pH = 6.94, 200mM KCl, 1mM EDTA)

4. 50 mM solution of MgCl2

5. DEPC (Sigma Aldrich)

6. A>G reaction buffer (50 mM NaOAc; EDTA 1mM pH=8.0)

7. Aniline/Acetate buffer pH = 4.5 (1M Aniline; glacial acetic acid to obtain pH 4.5)


Procedure

Limited enzymatic digestion

Prior to digestion, labeled hairpins (1 µL, 2 pmoles, ~100,000 cpm) together with 1 µL tRNA (10 mg/mL) were heated to 94 °C in 1 µL of dd H2O for 2 min (denaturing step) and snap-cooled on ice. 5 µL of 2× reaction buffer were then added and the resulting mixtures were incubated at 30 °C for 10 min (annealing step). Finally, 1 µL of 50 mM MgCl2 was added and incubation was continued for 30 min (assembling step). The RNAs were digested for 1 min at 30 °C using the following nucleases: T1 (1 U/µL), A (0.011 ng/µL) and V1 (0.0005 U/µL). The digestion reactions were stopped by adding 10 µL of colorless gel-loading solution (10 M urea, 1.5 mM EDTA). Without excessive delay, the digestion products were analyzed on 8 M urea, 15% sequencing PAGE.

Sequencing ladders for locating positions of cleavage products were generated by digestion of control samples with alkali (pH 9.2, cleavage at all positions) and with RNase T1 under denaturing conditions (cleavage of all G’s).

DEPC modification and cleavage of RNA

RNA chemical modification by DEPC was performed under native conditions as follows: labeled RNA hairpins (1 µL, ~100,000 cpm) were mixed with 1 µL of tRNA (10 mg/mL) in 38 µL of dd H2O. The mixture was heated for 2 min at 94 °C and immediately snap-cooled on ice. 50 µL of 2× CB buffer were added and the resulting solutions were incubated for 10 min at 30 °C. Then, 10 µL of 50 mM MgCl2 solution were added and the mixture was left to incubate for an additional 30 min at 30 °C. The RNA base modification was performed by adding 15 µL of DEPC (Sigma-Aldrich) to each RNA sample and incubating for 30 min at 20 °C with mixing every 10 min.

For the semidenaturing conditions: the above procedure was applied with no Mg2+ ions in reaction vessel.

Modified RNAs were then precipitated with ethanol and aniline cleavage was performed. Briefly, ethanol-precipitated RNAs were dried in a SpeedVac, and each pellet was resuspended in 20 µL aniline-acetate buffer (1 M aniline, 13.9 N glacial acetic acid, pH = 4.5). The cleavage reaction was performed by incubating the mixture for 20 min at 60 °C in the dark. Samples were frozen on dry ice and then lyophilized in a SpeedVac. Pellets were resuspended in 50 µL of dd H2O, frozen, and dried in a SpeedVac. Finally, cleaved RNA products were resuspended in 6 µL of loading dye solution (10 M urea, Tris-HCl pH 7.4, 1 mM EDTA, 0.1% (w/v) bromophenol blue) and analyzed on 8 M urea, 15% sequencing PAGE.

5.8 UV-melting experiment

Several experimental methods can provide thermodynamic parameters for RNA secondary and tertiary structures. Perhaps the most common method is thermal denaturation, or UV-melting. The easiest way to denature a nucleic acid is to heat it. As a result, the ordered regions of stacked base pairs are disrupted and the UV absorbance increases (hyperchromicity), so that the intrinsic hyperchromicity of the nucleotides is observed. The origin of hyperchromicity is the electronic interaction between neighboring stacked bases: when bases stack or pair, the intrinsic electric dipole transition moments tend toward zero. The resulting profile of absorbance vs. temperature is called a melting curve, due to its similarity in appearance to a phase transition.

To perform a UV-monitored thermal denaturation experiment, one needs a UV spectrophotometer with a temperature-controlled cell changer. Such instruments are commercially available (Beckman Coulter, Cary, Jasco, and Shimadzu) and relatively inexpensive.

5.8.1 Sample preparation

The concentration of RNA is an important consideration. Ideally, absorbance values measured at high temperature (unfolded state, ~90 °C) should fall between 0.5 and 1.0. Typical percentage hyperchromicity, defined as $(A_U - A_F)/A_F$, where $A_F$ and $A_U$ correspond to the absorbance of the folded and unfolded states, respectively, at temperatures near the melting transition, is 15-20% but can vary depending upon wavelength. The Beer-Lambert law is used to determine oligonucleotide concentration and to ensure that absorbance values will be optimal for melting,

$$A = \varepsilon \, c \, L \qquad (8)$$

where $A$ is the absorbance (a function of temperature), $\varepsilon$ is the extinction coefficient (a function of temperature), $c$ is the concentration of the sample and $L$ is the path length of the cell, usually 1 cm.

The extinction coefficient, which depends on base-base interactions, is the parameter in Eq. 8 that changes with melting of the nucleic acid. Because the extinction coefficient is dominated by base-base interactions, a nearest-neighbor model can be used to predict its value with good accuracy. Average extinction coefficients at 260 nm for the purines and pyrimidines are $1.3 \times 10^{4}$ and $0.8 \times 10^{4}$ M$^{-1}$cm$^{-1}$, respectively[82]. As such, extinction coefficients of oligonucleotides are often $10^{5}$ and those of functional RNAs $10^{6}$ or $10^{7}$ M$^{-1}$cm$^{-1}$. The magnitude of these values leads to optimal absorbance for melts with sample concentrations of just a few micromolar for a typical oligonucleotide and nanomolar for large RNAs, with cuvette volumes as low as 70 µL. Thus, one advantage of UV melting is its sensitivity, especially when compared to ITC and NMR.
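As a worked illustration of Eq. 8, the sketch below computes the strand concentration needed to reach a target absorbance. The extinction coefficient and target absorbance are hypothetical values chosen from the ranges quoted above, and the function name is an assumption made for this example only.

```python
def concentration_for_absorbance(target_absorbance, extinction_coeff, path_length_cm=1.0):
    """Beer-Lambert law solved for concentration: c = A / (epsilon * L), in mol/L."""
    if extinction_coeff <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return target_absorbance / (extinction_coeff * path_length_cm)

# Hypothetical oligonucleotide with epsilon(260 nm) = 1e5 M^-1 cm^-1,
# aiming for A260 = 0.8 in the unfolded state in a 1 cm cell.
conc_molar = concentration_for_absorbance(0.8, 1.0e5)
print(f"required concentration: {conc_molar * 1e6:.1f} uM")
```

For a large RNA with an extinction coefficient of $10^{7}$ M$^{-1}$cm$^{-1}$ the same call returns a concentration in the nanomolar range, in line with the text above.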

Buffer choice is system dependent. There are two main criteria for a buffer system: first, it should not absorb UV light, and second, its pKa should not vary with temperature. For melting of oligonucleotides that form only secondary structure, divalent ions are typically omitted, and sodium cacodylate and phosphate are good buffer choices given their pKa near neutrality and simple structure[83, 84]. Phosphate, however, should be avoided for Mg2+-containing buffers, as Mg3(PO4)2 forms at high temperatures and reveals itself as strong scattering of the signal.

A common buffer is sodium cacodylate (10 or 20 mM sodium cacodylate, 100 mM NaCl and 0.5 mM Na2EDTA, pH 7.0). Tris and HEPES buffers do not satisfy the criteria above. The small amount of EDTA is included to bind traces of polyvalent metals that might otherwise catalyze hydrolysis of the sugar-phosphate backbone, especially at the higher temperatures of a melt. In cases where higher-order folding of RNA is being studied, Mg2+ should be added, though it should be noted that these melts are generally only carried out from low to high temperature due to RNA hydrolysis.

Once all calculations and preparation of an RNA sample have been completed, it is ready to be heated. Before loading into the UV cell, the analyte solution may need to be degassed, either by bubbling an inert gas through it or by applying a vacuum, to prevent formation of microscopic air bubbles during heating.

Last, calibration of the instrument is necessary and should be done regularly by placing a small thermocouple inside the cell and reading the temperature with an electronic thermometer. Another very helpful calibration is to periodically measure the melting of a standard sample with well-established parameters, such as tRNA[85].

5.8.2 Data analysis.

The common way to determine thermodynamic parameters from a UV-melting experiment is to apply van't Hoff analysis to the data. Detailed analysis and interpretation of data collected by the UV-melting method can be found elsewhere[86].

Briefly, by assuming a linear temperature dependence for the absorbance of both single-stranded (AD) and double-stranded (AS) RNA, the position of the melting curve between the two baselines indicates the fraction of single-stranded (unfolded) molecules in equilibrium at any given temperature (Figure 5.8.2.1). In a graphical determination of TM, the bisector between both baselines intersects the absorbance curve at the midpoint of the transition, where ∆G = 0.

Figure 5.8.2.1. Representative UV-melting curve of a RNA duplex annealed in 10mM sodium cacodylate, 1M NaCl and 0.5mM EDTA, pH=7.1. The Tm is defined by bisector between lower AD and upper AS baselines.

The melting transition data can also be analyzed by the derivative method, dA260/dT (Figure 5.8.2.2), where the maximum of the peak corresponds to the TM of the RNA.


Figure 5.8.2.2. UV-melting curve represented as derivative plot dA260 / dT to obtain TM.

An alternative and more accurate way of determining TM uses equation (9), which relates the absorbance (or some other property) to a profile of the fraction of folded molecules (F) vs. temperature.

$$F = \frac{A_D - A_{260}}{A_D - A_S} \qquad (9)$$

where $F$ is the fraction of RNA duplexes, and $A_D$ and $A_S$ are the baseline values of the property for the single-stranded and double-stranded species, respectively. Tm is the temperature where F = 0.5.
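The baseline and derivative approaches can be sketched numerically as follows. This is a minimal illustration, not the analysis used in this work (MeltWin was used for the actual fits): the melting curve is synthetic, generated from a hypothetical unimolecular two-state transition with Tm = 50 °C and ΔH° = -60 kcal/mol, and all function names and baseline windows are assumptions. The upper (unfolded, single-stranded) and lower (folded, double-stranded) baselines are obtained by linear fits to the two ends of the curve and then inserted into Eq. 9.

```python
import numpy as np

R = 1.987e-3  # gas constant, kcal/(mol K)

def simulate_melt(temps_c, tm_c=50.0, dh=-60.0):
    """Synthetic two-state (unimolecular) melting curve with sloping baselines."""
    t_k = temps_c + 273.15
    tm_k = tm_c + 273.15
    k_fold = np.exp(-dh / R * (1.0 / t_k - 1.0 / tm_k))  # K = 1 at Tm
    frac_folded = k_fold / (1.0 + k_fold)
    lower = 0.50 + 0.0005 * temps_c  # folded (double-stranded) baseline
    upper = 0.60 + 0.0005 * temps_c  # unfolded (single-stranded) baseline
    return frac_folded * lower + (1.0 - frac_folded) * upper

def fraction_folded(temps_c, a260, low_max_c=20.0, high_min_c=80.0):
    """Fit linear baselines to both ends of the curve and apply Eq. 9."""
    low = temps_c <= low_max_c
    high = temps_c >= high_min_c
    folded_base = np.polyval(np.polyfit(temps_c[low], a260[low], 1), temps_c)
    unfolded_base = np.polyval(np.polyfit(temps_c[high], a260[high], 1), temps_c)
    return (unfolded_base - a260) / (unfolded_base - folded_base)

def tm_from_fraction(temps_c, frac):
    """Linearly interpolate the temperature at which F crosses 0.5.

    Assumes the curve starts folded (F > 0.5) and ends unfolded (F < 0.5).
    """
    i = int(np.argmax(frac < 0.5))  # first point below 0.5
    t0, t1, f0, f1 = temps_c[i - 1], temps_c[i], frac[i - 1], frac[i]
    return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)

def tm_from_derivative(temps_c, a260):
    """Temperature at the maximum of dA260/dT."""
    return temps_c[int(np.argmax(np.gradient(a260, temps_c)))]

temps = np.linspace(10.0, 90.0, 161)  # 0.5 degree steps
absorbance = simulate_melt(temps)
tm_baseline = tm_from_fraction(temps, fraction_folded(temps, absorbance))
tm_derivative = tm_from_derivative(temps, absorbance)
print(f"Tm (baselines): {tm_baseline:.1f} C, Tm (derivative): {tm_derivative:.1f} C")
```

With real data the choice of baseline windows is the critical step; the derivative estimate is limited by the temperature step and by noise in the absorbance readings.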

Once Tm is found, the thermodynamic parameters (∆H°, ∆G°, ∆S°) can be derived for the transition by translating the absorbance melting curve into the concentrations of single-stranded and double-stranded species (Figure 5.8.2.3). This is done most simply by assuming a two-state model and using the corresponding expression from Table 5.8.2.1 to calculate the equilibrium (binding) constant.


Figure 5.8.2.3. Fractions of RNA duplex (F) calculated from baselines of transition in Fig 25. The Tm is temperature where F = 0.5 or 50%.

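Table 5.8.2.1. Two-state analysis of nucleic acids transitions

| Reaction type | $K_{eq}$ | $T_M$ concentration dependence |
|---|---|---|
| Monomolecular, S ↔ H | $K_{eq} = [H]/[S] = F/(1-F)$ | Does not depend on $C_t$ |
| Bimolecular (self-complementary), 2S ↔ D | $K_{eq} = [D]/[S]^{2} = F/[2(1-F)^{2} C_t]$ | $1/T_M = (R/\Delta H^{\circ})\ln C_t + \Delta S^{\circ}/\Delta H^{\circ}$ |
| Bimolecular (non-self-complementary), SA + SB ↔ D | $K_{eq} = [D]/([S_A][S_B]) = 2F/[(1-F)^{2} C_t]$ | $1/T_M = (R/\Delta H^{\circ})\ln C_t + [\Delta S^{\circ} - R\ln 4]/\Delta H^{\circ}$ |
| Trimolecular (identical strands), 3S ↔ T | $K_{eq} = [T]/[S]^{3} = F/[3 C_t^{2}(1-F)^{3}]$ | $1/T_M = (2R/\Delta H^{\circ})\ln C_t + [\Delta S^{\circ} - R\ln(4/3)]/\Delta H^{\circ}$ |
| Trimolecular (nonidentical strands), SA + SB + SC ↔ T | $K_{eq} = [T]/([S_A][S_B][S_C]) = 9F/[C_t^{2}(1-F)^{3}]$ | $1/T_M = (2R/\Delta H^{\circ})\ln C_t + [\Delta S^{\circ} - 2R\ln 6]/\Delta H^{\circ}$ |
| Tetramolecular (identical strands), 4S ↔ Q | $K_{eq} = [Q]/[S]^{4} = F/[4 C_t^{3}(1-F)^{4}]$ | $1/T_M = (3R/\Delta H^{\circ})\ln C_t + [\Delta S^{\circ} - R\ln 2]/\Delta H^{\circ}$ |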
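To make the use of Table 5.8.2.1 concrete, the sketch below evaluates the equilibrium constant from the fraction F and the total strand concentration Ct for the three cases most relevant to hairpins and duplexes. The function name and the `kind` labels are illustrative assumptions, not part of any published software.

```python
def keq_two_state(frac, ct=None, kind="monomolecular"):
    """Equilibrium constant from fraction folded, using the two-state expressions of Table 5.8.2.1.

    frac : fraction of strands in the folded (hairpin or duplex) state, 0 < frac < 1
    ct   : total strand concentration in mol/L (required for bimolecular cases)
    kind : "monomolecular", "self" (2S <-> D) or "non-self" (SA + SB <-> D)
    """
    if not 0.0 < frac < 1.0:
        raise ValueError("frac must lie strictly between 0 and 1")
    if kind == "monomolecular":
        return frac / (1.0 - frac)
    if ct is None or ct <= 0:
        raise ValueError("a positive total strand concentration is required")
    if kind == "self":
        return frac / (2.0 * (1.0 - frac) ** 2 * ct)
    if kind == "non-self":
        return 2.0 * frac / ((1.0 - frac) ** 2 * ct)
    raise ValueError(f"unknown reaction type: {kind}")

# At the melting temperature (F = 0.5) with Ct = 10 uM:
k_hairpin = keq_two_state(0.5)
k_self = keq_two_state(0.5, ct=1.0e-5, kind="self")
k_nonself = keq_two_state(0.5, ct=1.0e-5, kind="non-self")
print(k_hairpin, k_self, k_nonself)
```

At F = 0.5 these expressions reduce to 1, 1/Ct and 4/Ct, respectively, which is where the ln 4 term in the non-self-complementary Tm expression comes from.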

Next, the natural log of Ka is plotted versus the inverse temperature 1/T, using the following equation, to determine the thermodynamic parameters of a nucleic acid.

$$\Delta G^{\circ} = -RT \ln K_a = \Delta H^{\circ} - T\Delta S^{\circ} \qquad (10)$$

where $\Delta G^{\circ}$ is the standard Gibbs free energy, $\Delta H^{\circ}$ is the standard enthalpy, $\Delta S^{\circ}$ is the standard entropy and $R$ is the molar gas constant.

By rearranging, one obtains the van't Hoff expression:

$$\ln K_a = -\frac{\Delta H^{\circ}}{R}\cdot\frac{1}{T} + \frac{\Delta S^{\circ}}{R} \qquad (11)$$

which allows ΔH° to be determined from the slope and ΔS° from the intercept with the ordinate (Figure 5.8.2.4). If ΔH° is independent of temperature, the plot should be linear. A non-linear van't Hoff plot can result from several factors: temperature dependence of ΔH°, poor choice of baselines, or a non-two-state transition[87].

However, this method depends on several assumptions: 1) the baselines have been determined correctly; 2) a two-state model is valid; 3) the system is at equilibrium at all temperatures; and 4) ΔH° is temperature-independent, and therefore ΔCp = 0.
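The van't Hoff analysis of Eq. 11 amounts to a straight-line fit of ln Ka against 1/T. The sketch below is a minimal illustration under the assumptions just listed; the data are synthetic, generated from hypothetical values ΔH° = -60 kcal/mol and ΔS° = -170 cal/(mol K), and the function name is an assumption.

```python
import numpy as np

R = 1.987  # gas constant, cal/(mol K)

def vant_hoff_fit(temps_k, ln_k):
    """Fit ln K = -(dH/R) * (1/T) + dS/R and return (dH in cal/mol, dS in cal/(mol K))."""
    slope, intercept = np.polyfit(1.0 / np.asarray(temps_k), np.asarray(ln_k), 1)
    return -slope * R, intercept * R

# Synthetic, noise-free data from hypothetical parameters.
true_dh, true_ds = -60000.0, -170.0
temps_k = np.linspace(300.0, 330.0, 16)
ln_k = -true_dh / (R * temps_k) + true_ds / R

dh_fit, ds_fit = vant_hoff_fit(temps_k, ln_k)
dg37 = dh_fit - 310.15 * ds_fit  # Eq. 10 evaluated at 37 degrees C
print(f"dH = {dh_fit / 1000:.1f} kcal/mol, dS = {ds_fit:.1f} cal/(mol K), "
      f"dG(37 C) = {dg37 / 1000:.2f} kcal/mol")
```

With experimental data, only points in the transition region (for example 0.15 < F < 0.85) are normally included in the fit, because ln Ka is poorly determined near the baselines.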

Figure 5.8.2.4. Van’t Hoff plot of data shown in Fig. 5.8.2.3.

5.9. RNA concentration and melting temperature

Ideally, thermodynamic data should be calculated from replicate experiments at more than one concentration. A very easy and effective method to obtain thermodynamic data is from the concentration dependence of TM. The relevant equations relating TM and total strand concentration Ct are listed in Table 5.8.2.1. For intramolecular annealing only, the TM is independent of RNA concentration. In all other cases involving secondary and tertiary interactions between molecules, an increase in RNA concentration will result in an increase in TM with no change in percent hypochromicity. Therefore, a concentration dependence of the melting profile is a clear indicator of dimerization. Data from a series of melting curves at different RNA concentrations are plotted as 1/Tm versus ln Ct. Any precise determination supposes that Ct can be varied over a large range (µM to mM) while the absorbance remains within the linearity limits of the instrument.
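The 1/Tm versus ln Ct analysis can be sketched as a linear fit using the non-self-complementary expression from Table 5.8.2.1. The melting temperatures below are synthetic, computed from hypothetical ΔH° and ΔS° values over the 5 to 25 µM range, and the function name is an assumption for this example.

```python
import numpy as np

R = 1.987  # gas constant, cal/(mol K)

def fit_inverse_tm(ct_molar, tm_kelvin):
    """Fit 1/Tm = (R/dH) ln Ct + (dS - R ln 4)/dH for a non-self-complementary duplex.

    Returns (dH in cal/mol, dS in cal/(mol K)).
    """
    slope, intercept = np.polyfit(np.log(ct_molar), 1.0 / np.asarray(tm_kelvin), 1)
    dh = R / slope
    ds = intercept * dh + R * np.log(4.0)
    return dh, ds

# Synthetic melting temperatures from hypothetical parameters.
true_dh, true_ds = -60000.0, -170.0
ct = np.array([5e-6, 10e-6, 15e-6, 20e-6, 25e-6])
tm = true_dh / (R * np.log(ct) + true_ds - R * np.log(4.0))

dh_fit, ds_fit = fit_inverse_tm(ct, tm)
print(f"dH = {dh_fit / 1000:.1f} kcal/mol, dS = {ds_fit:.1f} cal/(mol K)")
```

Because the slope equals R/ΔH°, a narrow concentration range makes the enthalpy estimate sensitive to small errors in Tm, which is why a wide range of Ct is recommended above.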

5.10 UV-melting of RNA duplexes with sarcin-ricin motif

Thermodynamic parameters for SRM formation were measured using two buffer systems:

1) 1.0 M NaCl, 10 mM sodium cacodylate, and 0.5 mM Na2EDTA at pH 6.94

2) 50 mM MgCl2, 100 mM NaCl, 10 mM sodium cacodylate, and 0.5 mM Na2EDTA at pH 6.94.

Strand concentrations were determined from high-temperature (90 °C) absorbance at 260 nm using extinction coefficients provided by the supplier (IDTDNA.COM). Absorbance versus temperature curves were measured at 260 nm with a heating or cooling rate of 1.0 °C/min on a CARY-100 BIO spectrophotometer. Each duplex was melted at least 3 times at a total oligomer concentration of 10 µM. To confirm that duplex melting is concentration dependent and follows bimolecular assembly, several molecules were melted at concentrations spanning a 5-fold range between 5 µM and 25 µM.

Absorbance versus temperature profiles were fit to a two-state model using the nonlinear least-squares program MeltWin version 3[88]. Thermodynamic parameters for duplex formation were obtained by two methods: (1) enthalpy and entropy changes from the fits of the individual melting curves, and (2) plots of the reciprocal melting temperature, $T_M^{-1}$, versus log Ct, which gave enthalpy and entropy changes using Eq. 11. Parameters derived from the two methods agreed within 10%, consistent with the two-state model.


CHAPTER VI. RESULTS AND DISCUSSIONS

6.1. Background information.

Compared to double stranded DNA, the interaction patterns of bases in structured RNA

molecules are considerably more complex. Atomic resolution structures of large RNAs have

revealed that a large fraction of interacting bases do not form Watson-Crick (WC) base pairs. In fact, the standard cis-Watson−Crick/Watson-Crick (cWW) AU and GC base pairs account for approximately 70% of base pairs in structured RNAs including the small and large ribosomal

RNA subunits [60]. Most of the rest of the bases of structured RNAs form non-Watson-Crick

(non-WC) base pairs which account for about 1/3 of all base pairs. The non-WC base pairs occur primarily in hairpin, internal or junction loops. They serve to structure these loops and to expose functional groups, especially those on the Watson-Crick edges, to form specific binding sites for proteins, small molecule ligands, metals, or other RNAs [65].

Previous work on RNA adaptation to different environments has focused on the ratio of GC to AU Watson-Crick basepairs [89] as GC base pairs are generally more stable than AU pairs.

These authors found that in ribosomal RNA (rRNA) the relationship between the proportion of

GC base combinations at Watson-Crick base paired positions and the optimal growth temperatures is very weak up to ~ 60°C optimal growth temperature. Above this temperature, there is a roughly linear relationship between the number of GC pairs and the optimal temperature, as GC base pairs generally confer more stability to Watson-Crick helices than AU

pairs. The same positive correlation of GC content with optimal growth temperature was found

in helices of the RNA component of the signal recognition particle among Archaea[90]. In

contrast, no such simple relationship exists for genomic DNA; the average GC content varies

widely from approximately 30% to more than 60% in bacteria, irrespective of the optimal growth

temperature[91]. Many loop regions are important for the function and overall structure of RNA molecules. Therefore, we suspect that the sequences of loop regions may also adapt to temperature.

In previous work, we compared the 3D structures of the rRNAs of the distantly related bacteria E. coli (mesophile) and T. thermophilus (thermophile) by aligning the structurally conserved base-paired positions in the 5S, 16S, and 23S rRNAs [60]. We aligned all non-Watson-Crick (non-WC) as well as Watson-Crick (WC) base pairs, and found that 1) the geometric families of aligned base pairs were nearly 100% conserved and 2) almost all base substitutions resulted in isosteric or near-isosteric base pairs. This result strongly supported the hypothesis that base pair isostericity is fundamental for understanding the rules of sequence transformation during RNA evolution. Isostericity indicates which base pairs can potentially substitute for each other while preserving the 3D structure of the motif. However, nothing is known about how isosteric base substitutions affect the stabilities of well-structured RNA 3D motifs, or whether there is selection for particular base combinations depending on the optimal growth temperature of the organism. These related questions are addressed in this paper.

Thermodynamic studies of RNA internal loops have taken a “bottom up” approach which attempts to cover the sequence space of internal loops starting systematically with small loops

(1x1, 1x2, 2x2, 1x3, and 2x3)[92]. Such an approach quickly confronts the problem of combinatorial explosion. It is simply not possible to study all possible sequence variants of internal loops of size 4x4 or greater, without high throughput methodology.

In this work we use a different approach. We chose a recurrent, well-structured and relatively large internal loop (5x4 nucleotides not counting the closing bases, or 6x5 with closing bases) for detailed study. We chose the sarcin-ricin (S/R) motif as the focus of a detailed study of phylogenetic and ecological sequence variation and of thermodynamic measurements for these reasons:

1) The motif is recurrent and widespread in structured RNA molecules; 2) it is highly conserved, and different sequence compositions can give essentially the same 3D structure of the S/R motif, as will be shown below; 3) the S/R motif is modular and autonomous: it occurs independently in different molecules and in different locations in the same molecule[93].

We began by cataloguing the sequence variations observed in atomic-resolution 3D

structures, and then we examined sequence variations of the motif at corresponding positions in

aligned sequences. We collected these data to choose sequence variants for thermodynamic

characterization. We also studied sequences containing single mutations that are predicted to

disrupt the motif, guided by base pair isostericity matrices[94].

Thermodynamic analysis of S/R motif variations found in organisms occupying different temperature niches and present in 3D structures indicates that a single change in a non-WC pair can contribute considerably to the overall stability of the motif in the presence of Mg2+ ions. Given the different stabilities of the motifs, we studied their solution structures using the single-strand-specific nucleases T1 and A and the helix-specific nuclease V1. We found that isosteric or neutral base pair substitutions do not disrupt the overall conformation. To determine the accessibility of adenine N7 positions within the sequence variants, we performed chemical probing experiments using diethylpyrocarbonate (DEPC) under native and semi-denaturing conditions.


6.2. Searching for sarcin-ricin motifs in the 3D database

The standard S/R motif is usually represented in RNA 2D diagrams as an asymmetrical 5x4 internal loop (Figure 6.2.1, left). The 3D structure (see Figure 6.2.1 in the middle) shows that all bases are involved in non-WC base pairing and form five base pairs. The essential features of the

3D structure are depicted using the Leontis-Westhof base pair classification[61] as shown in the rightmost panel in Figure 6.2.1.

Figure 6.2.1. Different representations of the S/R motif from helix- 95 of H. marismortui 23S rRNA. Left, 2D structure of 5x4 nt asymmetric internal loop, 3D stereo view from PDB ID 1S72 showing non-canonical base pairs; Right - 2D diagram with annotation of all 5 non-canonical interactions in structure of S/R motif [65].

To examine the sequence variation in the S/R motifs found in atomic resolution 3D structures, we carried out mixed symbolic and geometric searches using WebFR3D [39]. We searched a non-redundant set of RNA 3D structures with resolution better than 4 Å available as of 02/18/2012 [95], using a geometric discrepancy of 0.8 Å/nucleotide as the similarity cutoff [34]. The non-redundant set contained ribosomal structures from 6 different organisms, which allowed us to compare sequence variation between two related organisms, the mesophile D. radiodurans and the thermophile T. thermophilus, that have adapted to different thermal environments, as well as between two phylogenetically diverged organisms that have adapted to similar thermal ranges, the mesophiles E. coli and D. radiodurans.

We identified 48 instances of S/R and S/R-like motifs (Figure 6.2.2) in 15 distinct PDB files listed in Table 6.2.1. Based on our results, five recurrent S/R motifs are found in conserved positions in the large ribosomal subunit (LSU) of each of the six organisms and three in the small ribosomal subunit (SSU). Throughout this paper we will refer to the S/R motif located in H95 of 23S rRNA from H. marismortui as the prototype (indicated by the red rectangle in Figure 6.2.2 A, top row).

Table 6.2.1. Representative PDB structure files containing S/R motifs.

| Organisms | Temperature preference | RNA type | PDB ID | References |
|---|---|---|---|---|
| Archaeon | | | | |
| H. marismortui | mesophile | 50S (LSU) | 1S72 | [96] |
| Bacteria | | | | |
| E. coli | mesophile | 50S (LSU) | 2AW4 | [97] |
| | | 30S (SSU) | 2AW7 | |
| T. thermophilus | thermophile | 50S (LSU) | 3UXR | [98] |
| | | 30S (SSU) | 3UXS | |
| D. radiodurans | mesophile | 50S (LSU) | 2ZJR | [99] |
| B. subtilis | mesophile | Ribonuclease P | 1NBS | [100] |
| T. maritima | thermophile | Lysine riboswitch | 3DIR | [101] |
| Eukarya | | | | |
| Tetr. thermophila | thermophile | 60S (LSU) | 4A1B | [102] |
| S. cerevisiae | mesophile | 60S (LSU) | 3U5H | [103] |
| X. laevis | mesophile | 5S rRNA | 1UN6 | [104] |
| RAT | mesophile | S/R Domain from 28S rRNA | 1Q96 | [105] |


The order and the families of the base pairs in the prototype S/R motif are conserved, but its sequence varies. For example, in most eubacterial organisms the tHH base pair is AC, while in most Archaea and Eukarya it is AA.

In other occurrences of the S/R motif, more sequence variations are seen, especially, in the lower tHS (see Figure 6.2.1 leftmost panel for the location of lower tHS) base pair, where we observe seven base combinations in LSU, SSU and 5S rRNA 3D structures (AG, AA, UC, CC,

AC, GG and UA).

There are some unusual nucleotides in the S/R motif. For example, in H11 of S. cerevisiae there is an unexpected bulged A in place of a G, which does not participate in the triplet interaction with the U and A bases. In addition, a tHH GA base pair, which is unusual for the S/R motif, is observed in a junction from domain II of H. marismortui 23S rRNA.

The structure of loop E in eukaryal and archaeal 5S ribosomal RNA is similar to that of the

S/R motif. Figure 6.2.2 B shows 2D diagrams of loop E structures from S. cerevisiae and H. marismortui, which share the same nucleotides and interactions, while the structure from Tetr. thermophila surprisingly forms a cWW UG base pair in place of the lower tHS.

Within the 16S rRNAs, S/R-like motifs occur in three places: H27; the 3-way junction (3WJ) of H28, 89 and 43; and H17 (Figure 6.2.2 D). These motifs differ from the prototypical structure in that the

H27 S/R motif contains a conserved bulged C, the 3WJ S/R motif is missing one tSH base pair and the S/R motif from H17 shares only tHS AG and tWH UA base pairs with the prototype.

Nevertheless, the overall geometry of these motifs is similar to the prototypical structure, and we consider them conserved.

[Figure 6.2.2, panels A-D. Panel A: S/R motifs from 23S and 28S rRNA; columns 3UXR, 2ZJR, 2AW4, 1S72, 4A1B, 3U5H; rows Helix 95 (prototype motif), Central Core Junction, Helix 13 (Domain I), Helix 11 (Domain I), Multihelix Junction in Domain II (H37, 39, 45). Panel B: eukaryal and archaeal 5S rRNA S/R motif (loop E). Panel C: S/R motif in other structures. Panel D: 16S/18S rRNA S/R and S/R-like motifs; Helix 27, Helix 17, 3WJ (H28, 89, 43).]

Figure 6.2.2. 2D diagrams of S/R motifs from different organisms found in the rRNA 3D structure database. A) S/R motifs from the large ribosomal subunit; the PDB code of each structure is given at the top. Each row shows the S/R motif at a particular location in the rRNA, with the name and nucleotide numbers given above each diagram; each column corresponds to the structure of a particular organism. B) S/R motif diagrams from archaeal and eukaryal 5S rRNA; PDB codes are given in parentheses. C) S/R motif diagrams from other structures. D) S/R motif diagrams from 16S rRNA 3D structures of E. coli and T. thermophilus and 18S rRNA of S. cerevisiae.

6.3. Base pair isostericity in S/R motifs.

Two base pairs are said to be isosteric when they meet three criteria [60]: (1) the C1′–C1′ distances are the same; (2) the paired bases are related by the same rotations in 3D space; and (3) H-bonds are formed between equivalent base positions. Thus, isosteric (neutral) base substitutions in S/R motifs do not disrupt the 3D geometry of the motif, while non-isosteric mutations have a significant impact on the structure. Table 6.3.1 lists all sequence variants for each non-WC base pair in S/R motifs observed in 3D structures, grouped by isosteric family. Figure 6.3.1 shows the isosteric and near-isosteric tHS and tHH base pairs that appear frequently in S/R motifs.

Guided by the structural analysis, we suspect that some of the observed sequence variation in

non-WC base pairs in the S/R motifs is due to thermal adaptation. For example, Figure 6.2.2

demonstrates that in all thermophilic structures (T. thermophilus, T. maritima) the upper tHS

base pair is always AG, whereas in mesophilic groups it may be AG or AA (e.g. E. coli H27 16S

rRNA and B. subtilis RNase P). The lower tSH base pair accommodates multiple sequence

combinations; however, it is not known to what extent, if any, these variations affect the stability

of the S/R motif. For example, there are two N-H…N bonds in the tSH GA base pair but only one

N-H…O=C H-bond in the isosteric tSH UC pair. Does this mean that motifs in which UC

substitutes for GA at this position are much less stable, and would this stability correlate with

the organism’s optimal growth temperature? The present study aims to address these questions.

Table 6.3.1. RNA base pair variations observed in 3D structures of S/R motifs

Upper Lower tHS A C G U tHS A C G U

A 2 36 A 3 13 1 C C 1 8 10 2 G G U U

tWH A C G U cHS A C G U A A

C C G G

U 42 U 40

tHH A C G U cWW A C G U A 35 3 1 A 12

C 1 C 1 2 22

G 1 G 20 3

U U 5 1 1 15

a White boxes in each matrix indicate base combinations that do not form that type of base pair. Blue boxes indicate the i1 group of isosteric pairs within the family. Green boxes indicate the i2 group. Violet, orange and red boxes indicate the i3, i4 and i5 groups, respectively. Light green boxes indicate the i6 group. Yellow boxes indicate combined i1/i2 isosteric pairs within the family. Grey boxes are modeled interactions not yet observed in high-resolution X-ray structures [66].

Figure 6.3.1. Isosteric and near-isosteric relationships between base pairs frequently observed in the S/R motif. The structures shown are actual instances from experimental data obtained from the PDB, representing exemplars of all annotated base pairs of the given base combination and geometric family (see ref. [66]). A) trans-Hoogsteen/sugar edge base pair combinations. B) trans-Hoogsteen/Hoogsteen base pair combinations.

6.4. Thermodynamic study of RNA duplexes containing sarcin-ricin motifs

Having analyzed sequence variability in the S/R motifs found in the 3D structural database, we designed several RNA duplexes (Figure 6.4.1) that incorporate the key features of the S/R motifs identified in the previous step. Next, we describe the design of these duplexes and a series of

UV-melting experiments intended to assess their stability.

Figure 6.4.1. Sequences and schematic structures of the RNA sarcin-ricin motifs used in thermal denaturation studies. The name above each duplex gives, in order: the nucleotide mutation(s) relative to the prototype molecule (red rectangle); the rRNA and helix number in which the sequence occurs (e.g. 23S-H13); the organism in which the sequence occurs (e.g. E. coli); and the temperature class of the organism according to its optimal growth temperature (e.g. mesophile). Base pair symbols follow the Leontis-Westhof nomenclature [65].

Nomenclature and labeling. For consistency, we numbered each nucleotide in the S/R motif

sequences in the 5’ to 3’ direction including the flanking base pair as shown in Figure 6.4.1 (the

red rectangle). Nucleotides differing from the prototypical sequence are colored red in all

figures. For example, the prototype duplex has no red colored nucleotides, while the other

duplexes contain one, two, three or even four red coloured bases, which are mutations or

substitutions compared to the prototype.

The duplexes are named and labeled according to:

a) substitution(s) relative to the prototype duplex. For instance, the U2C duplex has a

mutation of the nucleotide number 2, which is a U in the prototype molecule, to a C. The

mutation is coloured in red (Figure 6.4.1).

b) Sequence appearance in rRNA with indication of helix number and organism’s name, e.g.

23S-H13 E. coli.

c) Temperature class (psychrophiles, mesophiles, thermophiles, hyperthermophiles)
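
The substitution part of a duplex name (item a) can be read mechanically. The following is a minimal sketch, not the procedure used in this work: it parses names such as U2G/C12A and applies them to a 13-nucleotide reference string. The reference sequence is an assumption assembled from the positions named in the text (C1, U2, A3, G4, U5, A6, G7, C8, G9, A11, C12, G13); the base at position 10 is a guess, and the function names are hypothetical.

```python
import re

# Illustrative 13-nt numbering of the prototype (strand 1 = positions 1-7,
# strand 2 = positions 8-13). Assumption: built from positions named in the
# text; the base at position 10 is a guess and may differ from Figure 6.4.1.
PROTOTYPE = "CUAGUAGCGAACG"

SUBSTITUTION = re.compile(r"^([ACGU])(\d+)([ACGU])$")


def apply_substitutions(name, reference=PROTOTYPE):
    """Return the sequence described by a name such as 'U2G/C12A'."""
    bases = list(reference)
    for token in name.split("/"):
        match = SUBSTITUTION.match(token)
        if match is None:
            raise ValueError(f"cannot parse substitution {token!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(bases):
            raise ValueError(f"{token}: position outside the reference")
        # The first letter of each token must agree with the reference base.
        if bases[position - 1] != old:
            raise ValueError(
                f"{token}: reference has {bases[position - 1]} at position {position}"
            )
        bases[position - 1] = new
    return "".join(bases)


def changed_positions(name, reference=PROTOTYPE):
    """List the 1-based positions that differ from the reference."""
    mutant = apply_substitutions(name, reference)
    return [i + 1 for i, (a, b) in enumerate(zip(reference, mutant)) if a != b]


if __name__ == "__main__":
    for duplex in ("U2G/C12A", "U2G/A6C/G9U/C12A"):
        print(duplex, apply_substitutions(duplex), changed_positions(duplex))
```

Checking the first letter of each token against the reference catches mislabeled names early, for example a name that cites the wrong prototype base.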

RNA duplex design. We analyzed all S/R motif instances found in the 3D structural database

(Figure 6.2.2) using WebFR3D [39] and identified 14 unique S/R motif sequences. These 4x5

internal loops were inserted in RNA duplexes surrounded by four canonical base pairs. To

prevent sequence end fraying prior to melting we used GC as terminal base pairs of the duplexes.

The sequences were checked with Mfold (www.mfold.bioinfo.rpi.edu, [76]) to avoid the formation of self-complementary secondary structures, internal hairpin structures or misaligned duplexes. We used duplexes rather than hairpins for three reasons: the hairpin loop region is expected to interact with divalent ions, which is undesirable; base substitutions are easier and cheaper to design in duplex RNAs; and data analysis for duplexes is more robust.

We also added two control molecules which were not observed in 3D structures. These

controls contain base substitutions intended to preclude the formation of S/R motif by disrupting

one of the two tHS base pairs. For example, in the U2G molecule U2, which forms a tSH base

pair with C12, is substituted by G. The resulting GC should not form a tSH pair since GC is never

observed forming this type of base pair in any 3D structure. On the other hand, the GC

combination can form a canonical and stable Watson-Crick base pair; however this is expected to

disrupt the formation of the S/R motif. Likewise, the A6C substitution is expected to disrupt the

formation of the other tHS base pair and thus the whole motif by producing a CG between

positions 6 and 9 of the motif.

All 16 designed RNA duplexes (14 observed in 3D structures and 2 unobserved controls) are

shown in Figure 6.4.1, with the different S/R motif sequences indicated by red rectangles.

Selecting flanking base pairs for the S/R motifs. The flanking Watson-Crick base pairs of S/R

motifs can vary. In addition to the canonical GC and AU flanking pairs, in 3D structures we also

see non-canonical GU, CC, UC, UU and CA Watson-Crick base pairs (Figure 6.2.2 and summarised in Table 6.3.1). Performing thermodynamic studies on all S/R motif variants with different flanking pairs would be challenging because of the large number of structures that would have to be analyzed. Instead, we decided to investigate the effect of different flanking pairs on one duplex (the U2G/C12A molecule). We kept the G7C8 closing base pair constant and

tested four different variants of the other closing base pair (C1G13, G1C13, U1G13, and U1A13).

The results of the UV-melting study are shown in Table 6.4.1. Except for the wobble U1G13

flanking pair all variants showed a good transition curve. For duplex design we chose the C1G13 cWW base pair for two reasons: 1) this variant has a high melting temperature (37.3 ºC); and 2)

on TLC plates it is easier to purify RNA molecules with fewer guanines on one strand.

Table 6.4.1. Effect of flanking base pairs on stability of S/R motif a

RNA duplexes with different flanking base pairs (one variant is marked "unable to fit")

TM (°C)               37.3           39.9            30.5            30.8
−∆G37° (kcal/mol)     6.6 ± 0.1      7.0 ± 0.4       5.6 ± 0.7       5.6 ± 0.7
−∆H° (kcal/mol)       52.0 ± 2.3     53.1 ± 2.9      43.7 ± 1.4      45.4 ± 3.5
−∆S° (e.u.)           146.5 ± 7.3    148.6 ± 12.8    122.8 ± 18.7    128.4 ± 15.2

a RNA duplexes were melted in 10 mM Na cacodylate buffer, pH 6.94, 0.5 mM EDTA, 100 mM NaCl, 10 mM MgCl2. Thermodynamic parameters were calculated at 100 µM total RNA concentration.

6.5 UV-melting of duplexes containing S/R motif.

UV-melting experiments on S/R motif RNA duplexes were performed with absorbance changes monitored at 260 nm. Some RNA melting curves were also monitored at 280 nm and gave similar results. Transitions in the presence of Mg2+ were not reversible over two temperature cycles because Mg2+ catalyzes hydrolysis of the RNA backbone [106]. The single strands alone did not exhibit melting transitions in their absorbance versus temperature profiles. Table 6.5.1 shows the thermodynamic parameters of duplex formation, obtained by averaging the fits of each independent melting curve to the two-state model at 100 µM Ct, with the assumption that the extinction coefficients of the single strands depend linearly on temperature. Additionally, we generated a van’t Hoff plot of TM^-1 versus log(Ct/4) for the U2G/C12A duplex and found good agreement between the two methods, with a difference of less than 10% (see Appendix A2).
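
The two-state relations behind these parameters can be written out as a short sketch. It is an illustration under stated assumptions, not the fitting software used here: it assumes a non-self-complementary duplex, for which 1/TM = (R/ΔH°) ln(Ct/4) + ΔS°/ΔH°, uses the natural logarithm, and takes R = 1.987 cal/(mol K). The function names are hypothetical. The numbers in the usage section are the −ΔH° = 52.0 kcal/mol and −ΔS° = 146.5 e.u. entries of Table 6.4.1.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K); assumed value


def delta_g(dh_kcal, ds_eu, temp_c=37.0):
    """Free energy (kcal/mol) from dH (kcal/mol) and dS (e.u. = cal/(mol*K))."""
    return dh_kcal - (temp_c + 273.15) * ds_eu / 1000.0


def melting_temperature(dh_kcal, ds_eu, ct_molar):
    """Two-state TM (deg C) of a non-self-complementary duplex at total strand
    concentration ct_molar: TM = dH / (dS + R ln(Ct/4))."""
    tm_kelvin = dh_kcal / (ds_eu / 1000.0 + R_KCAL * math.log(ct_molar / 4.0))
    return tm_kelvin - 273.15


def vant_hoff_fit(ct_values, tm_celsius):
    """Least-squares line through 1/TM versus ln(Ct/4).
    The slope is R/dH and the intercept is dS/dH; returns (dH kcal/mol, dS e.u.)."""
    x = [math.log(ct / 4.0) for ct in ct_values]
    y = [1.0 / (tm + 273.15) for tm in tm_celsius]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    dh_kcal = R_KCAL / slope
    ds_eu = intercept * dh_kcal * 1000.0
    return dh_kcal, ds_eu


if __name__ == "__main__":
    dh, ds = -52.0, -146.5  # Table 6.4.1 entries, written with their signs
    print(round(delta_g(dh, ds), 2))                      # about -6.56 kcal/mol
    print(round(melting_temperature(dh, ds, 100e-6), 1))  # about 37.2 deg C
    # Synthetic concentration series, used only to show the fit round-trips.
    cts = [10e-6, 30e-6, 100e-6, 300e-6]
    tms = [melting_temperature(dh, ds, ct) for ct in cts]
    print(vant_hoff_fit(cts, tms))
```

With these inputs the sketch gives values close to the −6.6 kcal/mol and 37.3 °C reported in Table 6.4.1, which is the kind of agreement between curve fitting and the van’t Hoff analysis that the text describes.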

Normally, RNA oligomers are melted in the presence of 1 M NaCl to compensate for the absence of divalent ions. However, a defined conformation of the RNA bases prior to melting is crucial, and this often depends on the addition of divalent ions. Serra et al. [107] showed that the stability of RNA structural motifs depends on Mg2+ ions; in particular, an increase in melting temperature in the presence of 50 mM MgCl2 was demonstrated for eukaryotic loop E. Moreover, Brownian dynamics simulations have suggested three potential but delocalized metal-binding sites in the S/R motif [108]. Therefore, to investigate the role of Mg2+ in the stability of S/R motif RNA duplexes, we performed UV-melting studies in two buffer compositions:

1) 10 mM sodium cacodylate buffer, pH 6.94, 0.5 mM EDTA, with 0.1 M NaCl and 10 mM

MgCl2.

2) 10 mM sodium cacodylate buffer, pH 6.94, 0.5 mM EDTA and 1 M NaCl.

The differences between the thermodynamic parameters of the RNA duplexes in 10 mM Mg2+ and

in 1 M NaCl are summarized in Table 6.5.2.

To find the optimal Mg2+ concentration for S/R motif formation, we titrated the U2G/C12A RNA duplex with increasing concentrations of Mg2+ and measured the duplex’s melting temperature (TM). Figure 6.5.1 shows that the TM curve reaches its plateau at 10 mM MgCl2, and a further increase in Mg2+ concentration does not significantly affect the TM. Thus, we used 10 mM MgCl2 in all experiments involving magnesium.

Typical thermal denaturation curves of five representative RNA duplexes containing S/R

motif sequences, recorded at 10 mM MgCl2, are shown in Figure 6.5.2.

Table 6.5.1. Thermodynamic parameters for S/R motif duplex formation at 10 mM MgCl2, 100 mM NaCl; and at 1 M NaCl.

RNA DUPLEXES

(a) 10 mM MgCl2, 100 mM NaCl
TM (°C)                 72.7    70.6    73.3    69.7    69.9    66.0    65.5    67.1    64.8    77.9    74.2    58.1    54.4    44.5    69.7    66.5
                       ± 0.1   ± 0.2   ± 0.1   ± 0.1   ± 0.3   ± 0.2   ± 0.3   ± 0.4   ± 0.1   ± 0.4   ± 0.3   ± 0.4   ± 0.2   ± 0.5   ± 0.2   ± 0.6
−∆G37° (kcal/mol)       10.9    16.6    15.6    16.5    11.5    12.2    12.5    10.6    11.5    15.9    15.3     9.1    10.9     7.9    12.4    12.9
                       ± 0.1   ± 0.2   ± 0.5   ± 0.3   ± 0.1   ± 0.2   ± 0.1   ± 0.2   ± 0.1   ± 0.1   ± 0.4   ± 0.3   ± 0.1   ± 0.1   ± 0.4   ± 0.4
−∆S° (e.u.)            260.5   247.7   253.8   130.4   151.1   131.9   187.4   114.2   158.3   208.9   187.0   100.5   234.1   155.5   150.5   194.7
                       ± 6.4  ± 16.5   ± 7.9   ± 4.3   ± 6.3   ± 1.2   ± 5.1   ± 8.5   ± 3.4   ± 3.7  ± 13.7  ± 17.3   ± 0.9   ± 4.3  ± 16.5   ± 4.6
−∆H° (kcal/mol)         97.4    92.4    95.2    51.9   59.06   51.87    70.6    46.2    60.6    80.7    72.3    40.3    83.6    56.1    58.8    73.3
                       ± 2.2   ± 5.6   ± 2.7   ± 1.4   ± 2.2   ± 0.4   ± 1.7   ± 2.9   ± 1.1   ± 1.3   ± 4.6   ± 5.6   ± 0.2   ± 1.2   ± 5.5   ± 4.9

(b) 1 M NaCl
TM (°C)                 67.3    62.9    65.4    61.7    67.7    60.2    64.9    64.4    61.8    70.3    62.4    59.4    54.6    42.9    71.3    70.9
                       ± 0.8   ± 0.1   ± 1.0   ± 1.1   ± 0.4   ± 0.8   ± 0.2   ± 0.9   ± 0.5   ± 0.8   ± 1.1   ± 0.7   ± 0.4   ± 0.5   ± 0.1   ± 0.9
−∆G37° (kcal/mol)       12.8    11.8    13.1     9.6    10.4     9.1    10.5     8.9     8.9    12.5     9.6     9.9    11.3     7.6    13.8    15.1
                       ± 0.2  ± 0.01   ± 0.4   ± 0.2   ± 0.1   ± 0.1   ± 0.1   ± 0.2   ± 0.1   ± 0.3   ± 0.2   ± 0.2   ± 0.2   ± 0.1   ± 0.2   ± 1.3
−∆S° (e.u.)            186.2  180.24   208.7   102.7   104.2    91.4   120.3    64.4    75.6   156.9    98.1   131.7   248.9   158.2   191.1   230.4
                      ± 11.7   ± 2.9   ± 3.3  ± 14.9   ± 1.1   ± 3.2   ± 4.1   ± 7.1   ± 2.2   ± 9.3   ± 5.7   ± 4.6   ± 5.0   ± 7.7   ± 4.3  ± 17.6
−∆H° (kcal/mol)         70.6    67.6    77.8    41.5    42.7    37.5    47.8    28.9    32.4    61.1    40.0    50.8    88.5    56.6    73.1    86.5
                       ± 4.0     ± 1   ± 1.4   ± 4.5   ± 0.4   ± 1.1   ± 1.3   ± 2.4   ± 0.7   ± 3.2   ± 1.6   ± 5.6   ± 1.7   ± 2.4   ± 1.5  ± 16.0

a RNA duplexes were melted in 10 mM Na+ cacodylate buffer, pH 6.94, in the presence of 10 mM MgCl2 and 100 mM NaCl. TM values were calculated at 10^-4 M oligomer concentration (Ct) using the average curve fits of at least 4 replicate experiments.

b RNA duplexes were melted in 10 mM Na+ cacodylate buffer, pH 6.94, in the presence of 1 M NaCl. TM values were calculated at 10^-4 M oligomer concentration (Ct) using the average curve fits of at least 4 replicate experiments.

Table 6.5.2. Effect of Mg2+ ions on the stabilization of S/R motif duplexes.

RNA DUPLEXES

∆∆TM (°C)                5.4    7.70     7.9     8.0     2.2     5.8     0.6     2.7     3.0     7.6    11.8    -1.3    -0.2     1.6    -1.6    -4.4
∆∆G37° (kcal/mol)       -3.8    -3.8    -3.4    -1.9    -1.8    -1.8   -2.02    -1.7    -2.6    -3.4    -5.7     0.8     0.4    -0.3     1.4     2.2
∆∆S° (e.u.)            -74.3   -67.5   -45.1   -27.7   -46.9   -40.5   -67.1   -49.8   -82.7     -52   -88.9    31.2    14.8     2.7    40.6    35.7
∆∆H° (kcal/mol)        -26.8   -24.8   -17.4   -10.4  -16.36   -14.4   -22.8   -17.3   -28.2   -19.6   -32.3    10.5     4.9     0.5    14.3    13.2

a Values were obtained by subtracting the S/R motif data measured in the absence of MgCl2 from the data measured in its presence (see Table 6.5.1). For example, for the prototype S/R motif molecule: ∆∆G37° = ∆G37°(Mg2+) − ∆G37°(1 M NaCl).
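
The subtraction described in the footnote can be illustrated with a short sketch. The dictionary layout and function name are hypothetical. The input numbers are the prototype values quoted in this chapter (TM, ∆S° and ∆H° from the first column of Table 6.5.1, ∆G37° as quoted in the text for the prototype), written with their signs.

```python
def delta_delta(with_mg, with_nacl):
    """Subtract each 1 M NaCl value from the matching 10 mM MgCl2 value."""
    return {key: round(with_mg[key] - with_nacl[key], 2) for key in with_mg}


# Prototype duplex; dG37 and dH in kcal/mol, dS in e.u., TM in deg C.
prototype_mg = {"TM": 72.7, "dG37": -16.6, "dS": -260.5, "dH": -97.4}
prototype_nacl = {"TM": 67.3, "dG37": -12.8, "dS": -186.2, "dH": -70.6}

if __name__ == "__main__":
    for key, value in delta_delta(prototype_mg, prototype_nacl).items():
        print(key, value)
```

The four differences reproduce the first column of Table 6.5.2 (5.4 °C, -3.8 kcal/mol, -74.3 e.u. and -26.8 kcal/mol). A negative ∆∆G37° means the duplex is more stable in the Mg2+ buffer than in 1 M NaCl.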

Figure 6.5.1. Melting temperature of the reference duplex as a function of [Mg2+]. At 10 mM MgCl2 the TM curve reaches its plateau.

Figure 6.5.2. Representative absorbance versus temperature profiles of 5 RNA duplexes melted in sodium cacodylate buffer, pH 6.94, with 10 mM MgCl2 and 0.1 M NaCl. The corresponding names are given in the legend. 2D structures of these molecules can be found in Figure 6.4.1.

Influence of tHS base pair combinations on S/R motif duplex stability. There are two conserved tHS base pair interactions in S/R motifs. We refer to the base pair between nucleotides 6 and 9 as the upper tHS because of its orientation on the 2D diagrams used throughout this paper.

Conversely, we call the tHS base pair between nucleotides 2 and 12 the lower tHS.

Upper tHS position. Four molecules, U2G/C12A, U2G/G9A/C12A, U2G/A6C/C12A and U2G/A6C/G9U/C12A, differ only in their base combination at the upper tHS position, while the lower tHS base pair A12G2 is the same in all of them. Apart from the control (CG), these combinations are isosteric and should preserve the conformation of the S/R motif. Comparison of their thermodynamic parameters reveals that the most stable RNA duplex in the presence of Mg2+ ions is U2G/C12A, which has an upper tHS of AG, with a ΔG37° of -15.9 kcal/mol. This tHS base pair combination is the most common in S/R motifs according to the WebFR3D results (summarised in Table 6.3.1). The single G9A mutation at the upper tHS position generates an AA tHS base pair. This dramatically decreases the stability of the RNA duplex, by 6.8 kcal/mol compared to tHS AG, as shown for the U2G/G9A/C12A duplex in Table 6.5.1. Molecule U2G/A6C/G9U/C12A, with upper tHS C6U9, has a ΔG37° value of -12.9 kcal/mol and is thus also less stable than the U2G/C12A molecule.

The control U2G/A6C/C12A S/R motif variant, with an upper CG combination, has a ΔG37° value of -12.4 kcal/mol, which is similar to that of the U2G/A6C/G9U/C12A molecule with a CU tHS. However, the CG tHS base pair combination does not exist in any 3D structure; these bases therefore most likely interact via a canonical cWW base pair, which would disrupt the S/R 3D conformation.

In the absence of divalent ions the order is different. The most stable RNA duplex, with a ΔG37° of -15.1 kcal/mol, is U2G/A6C/G9U/C12A, which has CU at the upper tHS. The second most stable is the control molecule, with a ΔG37° of -13.8 kcal/mol, followed by molecule U2G/C12A with the upper tHS AG. The least stable is molecule U2G/G9A/C12A, containing the tHS AA base pair combination, with ΔG37° = -9.9 kcal/mol.

The thermodynamic data for the upper tHS base pair substitutions suggest that the presence of

Mg2+ ions stabilizes only the U2G/C12A molecule (ΔΔG = -3.4 kcal/mol), which contains AG at

the upper tHS (Table 6.5.2). Mg2+ ions do not contribute to the stabilities of molecules

containing isosteric tHS AA or CU or the non-existing tHS CG. The isosteric upper tHS base

pair combinations can be ordered by stability as follows:

upper tHS with Mg2+ : AG > CU > AA

upper tHS no Mg2+: CU > AG > AA
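
The two orderings above follow directly from sorting the ΔG37° values. A minimal sketch, using only the ΔG37° values given in this section and in Table 6.5.1 for the three isosteric upper tHS combinations (the dictionary and function names are hypothetical):

```python
# dG37 (kcal/mol) of duplexes that differ only in the upper tHS base pair:
# U2G/C12A (AG), U2G/A6C/G9U/C12A (CU) and U2G/G9A/C12A (AA).
UPPER_THS_DG37 = {
    "with Mg2+": {"AG": -15.9, "CU": -12.9, "AA": -9.1},
    "no Mg2+": {"AG": -12.5, "CU": -15.1, "AA": -9.9},
}


def stability_order(dg_by_pair):
    """Return base combinations from most to least stable (most negative first)."""
    return sorted(dg_by_pair, key=dg_by_pair.get)


def format_order(dg_by_pair):
    """Render the order in the 'X > Y > Z' notation used in the text."""
    return " > ".join(stability_order(dg_by_pair))


if __name__ == "__main__":
    for condition, values in UPPER_THS_DG37.items():
        print(condition, ":", format_order(values))
```

Sorting on ΔG37° alone ignores the experimental uncertainties, so combinations whose values differ by less than the reported errors should be treated as tied.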

Lower tHS position. The lower tHS position exhibits much more sequence variability than the

upper tHS as shown in Table 6.3.1. There are six duplexes varying only in the lower tHS base

pair combinations. The distinct lower tHS base pair combinations are: the tHS CU, tHS CC, tHS

CA, tHS AU, tHS AG, and tHS AA. The data show that there is very little variation in the ΔG37° of these molecules: except for C12A, with a tHS AU base pair, the duplexes have similar stabilities, with ΔG37° between -15.3 and -16.6 kcal/mol.

In the presence of Mg2+ ions, the stability of the RNA duplexes follows this order of lower tHS base pair combinations:

tHS CU (ΔG37° = -16.6 kcal/mol) > tHS CA (ΔG37° = -16.5 kcal/mol) > tHS AG (ΔG37° = -15.9 kcal/mol) > tHS CC (ΔG37° = -15.6 kcal/mol) > tHS AA (ΔG37° = -15.3 kcal/mol) >> tHS AU (ΔG37° = -12.2 kcal/mol).

The C12A mutation, which gives a tHS AU combination, is ambiguous. The A at position 12 could form a Watson-Crick base pair with U2, which is expected to disrupt the motif. However, UA is also an allowed base combination for tSH, and so it is possible that the tSH pairing geometry is conserved. This

sequence variant tests the resilience of the motif to ambiguous base substitutions. In fact, this

sequence variant occurs in the central junction of yeast 26S rRNA as shown in Figure 6.2.2. In

the 3D structure (PDB 3U5H) we observe the U and A forming the tHS geometry, so we expect

this variant to form the motif.

On the other hand, the control molecule U2G has G2 and C12 in place of the lower tHS base pair. The tHS CG combination has never been observed in any 3D structure and is thus not expected to form a tHS pair. The UV-melting data show that this RNA duplex has a ΔG37° of -11.5

kcal/mol and suggests that this substitution significantly reduces the stability of the S/R motif.

Comparison of the lower tHS sequence combinations of the S/R motif RNA duplexes melted

in the presence of 1 M NaCl shows a slightly different order of stability compared to the

experiment with 10 mM Mg2+:

CA (ΔG37° = -13.1 kcal/mol) > CU (ΔG37° = -12.8 kcal/mol) > AG (ΔG37° = -12.5 kcal/mol) >

CC (ΔG37° = -11.8 kcal/mol) > AU (ΔG37° = -10.4 kcal/mol) > AA (ΔG37° = -9.6 kcal/mol).

Interestingly, the stabilities of all isosteric combinations at the lower tHS depend on the presence of Mg2+, as shown in Table 6.5.2. Mg2+ ions stabilize the S/R motif of these duplexes, while we do not observe this effect for the upper tHS combinations. Moreover, while isosteric base pair substitutions at the lower tHS position do not significantly affect the stability of the S/R motif, the effect of substitutions at the upper tHS may depend on the sequence. For example, the free energies of the U2G/C12A duplex, with AG at both the upper and lower tHS, and the U2A/C12A duplex, with upper tHS AG but lower tHS AA, are almost the same: the difference (ΔΔG) is only -0.6 kcal/mol. However, comparison of U2G/C12A with U2G/G9A/C12A, which has upper tHS AA and lower tHS AG, shows a much larger ΔΔG of -6.8 kcal/mol.

So why do we observe such large changes in the stabilities of RNA duplexes mutated in the upper tHS and not in the lower tHS? The stability of the S/R motif must be defined by the context of the tHS base pair family within the structure. Additionally, the stability of the S/R motif structure depends not only on non-canonical hydrogen bonding and stacking interactions but also on base-phosphate (BPh) interactions [109]. 3D structures of S/R motifs with different upper tHS base pair combinations reveal different BPh hydrogen bonding, as demonstrated in Figure 6.5.3. This figure shows that in the prototype S/R motif, nucleotide U2693, a part of the triplet, interacts with nucleotide G2701. This type of bonding belongs to the 5BPh family. The interaction between the corresponding nucleotides U891 and A906 in the S/R motif containing upper tHS AA belongs to the 6BPh family. Presumably, the BPh interactions are crucial for the stabilities of S/R motifs. The importance of BPh interactions was previously pointed out by Zgarbova et al. (2009) [110], and their evolutionary roles in ribosomal RNA have also been revealed [109]. It was shown that ~12% of nucleotides in large RNAs are involved in direct BPh interactions and that these interactions are the most conserved during evolution. This may explain why we see such large differences in stability when the upper tHS varies. Surprisingly, there is no BPh interaction between nucleotides U269 and U294 when the upper tHS is CU, as shown at the bottom of Figure 6.5.3; however, RNA duplexes containing the upper tHS CU are more stable than those containing tHS AA.

[Figure 6.5.3, three panels. Top: prototype S/R motif 23S-H95, H. marismortui (1S72), U2693---G2701, 5BPh family. Middle: S/R motif from 16S-H27 rRNA, E. coli (2AW7), U891---A906, 6BPh family. Bottom: S/R motif from 28S rRNA, S. cerevisiae (3U5H), U269 and U294 do not form a BPh interaction.]

Figure 6.5.3. Examples of BPh interactions between the upper tHS and the triplet U. When the upper tHS is AG, there is a hydrogen bond between a phosphate oxygen of U and the amino group of G, which belongs to the 5BPh family (top). When the upper tHS is AA, the interaction between the phosphate group of U and the amino group of A belongs to the 6BPh family (middle). There is no BPh interaction when the upper tHS has the CU base pair combination (bottom).

Bulged nucleotides stabilize the S/R motif duplex by high entropy contribution. Mutations of the bulged nucleotide result in significant differences in entropy. The prototype motif has a ∆S° of -260.5 e.u., while molecules G4A and G4C have ∆S° values of -114.2 and -158.3 e.u., respectively (Table 6.5.1). This effect is more evident in the absence of Mg2+ ions, where the ∆S° values of the G4A and G4C duplexes are -64.4 and -75.6 e.u., respectively. Bulged A and C nucleotides are very rare in the S/R motif context. As shown in the crystal structure of H-11 of S. cerevisiae 28S rRNA, the bulged A does not participate in the triple interaction, as the nucleotide is bulged out from the motif entirely. The same pattern is observed for the bulged U at the equivalent position of the S/R motif in H17 of 16S rRNA. The structural comparison of bulged G, A and U in the S/R motif is summarized in Figure 6.5.4. Additionally, NMR data indicate that replacement of the bulged G by A destabilizes S/R motif formation, although the overall conformation appears to be similar [111]. These data indicate that, while the bulged A and C do not participate in the triplet interaction, they likely help to stabilize the formation of the S/R motif through a large entropy contribution.

A. H-95 23S rRNA H. marismortui (1S72) nt. 2689-2695, 2700-2705

B. H-11 Domain I 28S rRNA S. cerevisiae (3U5H) nt. 46-52, 32-37

C. H-17 16S rRNA E. coli (2AW7) nt. 483-488, 446-450

Figure 6.5.4. Comparison of structures of S/R motifs varying in the bulged base. In most S/R motifs the bulged base is G, as exemplified by the S/R motif from H-95 of 23S rRNA (Panel A). The G is always observed to form a base triplet and a base-phosphate interaction. When this G is replaced by A (Panel B) or U (Panel C), the base is no longer observed to form these interactions.

Contribution of trans Hoogsteen/Hoogsteen base pair combinations to stability of S/R motif.

Three RNA duplexes, the prototype molecule, A3G and A11C contain different tHH base pair

combinations, tHH AA, tHH GA and tHH AC, respectively. Comparison of the stabilities of these duplexes in the presence and absence of Mg2+ ions revealed a consistent pattern:

Presence of 10 mM Mg2+

Prototype (tHH AA) ΔG37° = -16.6 kcal/mol > A11C (tHH AC) ΔG37° = -12.5 kcal/mol > A3G

(tHH GA) ΔG37° = - 10.9 kcal/mol

Absence of Mg2+

Prototype (tHH AA) ΔG37° = -12.8 kcal/mol > A11C (tHH AC) ΔG37° = -10.5 kcal/mol > A3G

(tHH GA) ΔG37° = - 9.1 kcal/mol

In the motif context, the tHH AA base pair combination is the most stable, followed by tHH AC, and tHH GA is the least stable. The tHH AA base pair combination is also the most frequent in existing 3D structures (Figure 6.2.2, summarised in Table 6.3.1). These data clearly demonstrate how important the tHH base pair combination is for the S/R motif: a single mutation of the tHH AA base pair significantly affects the stability of the motif.

6.6 Structural probing of S/R motifs

To determine which of the sequence variants studied by thermodynamic methods likely form

the S/R motif, we employed a range of established nuclease and chemical probes, including

RNases T1, A and V1 and DEPC (diethylpyrocarbonate).

Design of RNA molecules for structure probing studies.

Direct structure probing of the short RNA duplexes used in UV-melting experiment is

impossible using RNases. Thus, to carry out structure probing of S/R motif variants we designed

a set of RNA molecules that met these criteria:

1. RNA folding into a desired conformation.

2. Only two DNA primers are required to make variety of different RNA molecules.

3. Use the same (constant) sequence region that would form S/R motif among the different

RNAs (positive control) for a comparison.

4. Introduce mutation that would disrupt the S/R motif (negative control).

Using the hairpin design, we were able to implement an efficient method for S/R motif structure probing as follows: each RNA hairpin has two regions corresponding to S/R motif sequences, as shown in Figure 6.6.1. The lower S/R motif (enclosed in the dotted-line rectangle in

Figure 6.6.1) contains the sequence of the U2G/C12A S/R variant. It is the same in all molecules

and serves as a control for establishing the relative intensities of the cleavage when probing

different molecules under different conditions. The upper region is the variable region and is

highlighted by a red rectangle in Figure 6.6.1. This design strategy allows us to analyze the probing

results for all RNAs simultaneously.

All hairpins were 62 nucleotides long. Hairpins are named according to the S/R motifs in

their probe regions which is consistent with the UV-melting experiments.

Figure 6.6.1. Representative structures of RNA molecules used in structure probing experiments. All molecules contain the same GAGA hairpin loop sequence (GNRA motif). In addition, there are two other sequence regions: a constant region (dotted black rectangle), which corresponds to the sequence of molecule U2G/C12A and is conserved among all molecules, and a variable region (red rectangle), which serves as the signature of each individual molecule. The variable region contains the sequences of the duplexes used in the thermal denaturation experiments. Probe RNA molecules are named after the short duplexes inserted in the variable region. For instance, the prototype S/R motif hairpin has this name because the sequence of the prototype S/R motif short duplex was inserted into the variable region.

Nuclease probing experiments.

RNA hairpins containing S/R motif regions were probed with the helix-specific RNase V1 and two

single-strand-specific nucleases, T1 and A. The cleavage patterns were determined with 5’-end

labeled strands. All RNAs were probed in their native conformations in the presence of 10mM

MgCl2. The conditions were identical to the UV-melting experiments described above in order to

ensure that the RNA molecules adopt the same conformation. The cleavage data averaged over

several experiments are summarized in Figure 6.6.2. Autoradiogram images are provided in

Appendix 2.

In all constructs, nucleotide G33 was the most accessible to RNase T1 digestion (see Appendix 2 and the summary in Figure 6.6.2). After G33, G31 was consistently the most sensitive to RNase T1,

but significantly less than G33. This result provides evidence that in all the constructs the GAGA

sequence most likely forms the expected GNRA-type hairpin loop, in which G33 is unpaired while G31 is tSH paired with A34, consistent with 3D structures that contain this hairpin motif

[112, 113].

None of the other G’s in the constructs probed is significantly reactive to RNase T1, except for G24 in the molecule that contains the U2G/G9A/C12A sequence variant. G24 corresponds to G4, the “bulged” G that forms a cSH pair with U5 in the standard numbering system used for the duplexes studied by thermodynamics. In the U2G/G9A/C12A sequence variant the base pair that stacks on G4-U5 is tHS AA, rather than tHS AG. The increased sensitivity observed for G24 is consistent with the lower stability measured thermodynamically for the U2G/G9A/C12A sequence variant.

GAGA hairpins show the same cleavage pattern when treated with nuclease T1 which is

specific for unpaired Gs under native conditions (Figure 6.6.2, green arrows). In all cases, the

strong expected cleavage was observed in the GAGA tetraloop (nucleotides G31 and G33, Figure

6.6.1). Furthermore, the cleavage pattern of the unpaired G3 was similar among all RNA molecules. Interestingly, only nucleotide G24, which forms the triplet interaction in the

U2G/G9A/C12A hairpin, is accessible to T1 RNAse cleavage.

Presumably, this may be due to the conformational changes of the bulged G nucleotide

forming a triplet induced by the upper tHS AA base pair. The U2G/G9A/C12A duplex also

exhibited one of the lowest melting temperatures in the UV-melting experiments suggesting that

this motif instance is quite flexible.

Comparison of hairpin molecules treated with RNase A, which is specific for unpaired U’s and C’s, shows that nucleotides U12 and U25 are the most accessible to cleavage, as shown in Figure 6.2.2 (blue arrows) and in the autoradiogram in Appendix 2. This is consistent with the results obtained by RNase A treatment of 5S rRNA from X. laevis, in which loop E resembles the sequence of the S/R motif [114]. The U that participates in the triplet interaction with A and G is highly conserved in S/R motifs [115].

Interestingly, only in the control molecule U2G/A6C/C12A does U25 remain uncut, suggesting that this nucleotide pairs with A40 as a result of conformational changes caused by the C26 mutation. This is expected because the non-isosteric mutation of the tHS AG pair to a CG pair disturbs the S/R motif conformation.

The U2G/A6C/G9U/C12A hairpin also showed strong cleavage at position U39. Some secondary cuts were observed in the U2C hairpin RNA at G24, even though RNase A is not specific for this nucleotide. The slightly different conformation of the U2C molecule may potentially inhibit cleavage of G24. The different conformation of G24 was also observed using RNase T1 digestion (see above). It may be possible to use the unique cleavage pattern of triplet U’s produced by RNase A digestion to confirm the geometry of S/R motifs in solution efficiently.

Strong RNase V1 cuts are observed in all helical regions of the RNA hairpins, and in general we observed good correspondence among the cleavage patterns (Figure 6.6.2, red arrows, and Appendix 2). The variable and constant regions of the hairpins are resistant to nuclease V1 cuts; only in the U2G/A6C/C12A RNA molecule are nucleotides U25 and C26 susceptible to cleavage. This is evidence that the control molecule has a very different structure in the variable sequence region.

The S/R motif in the constant region corresponds to that of the U2G/C12A molecule (observed in H13 of the LSU of H. marismortui, see Figure 6.2.2). This motif exhibited the same cleavage pattern across all hairpins. In addition, in the one hairpin whose variable and constant regions are identical, the nuclease cleavage of the two regions was also identical. Overall, these results demonstrate that all 12 designed molecules folded into hairpins with two internal loops, with U25 forming a non-Watson-Crick base pair in all cases except for the control molecule.


Figure 6.6.2. Summary of the nuclease digestion experiments. Cleavage patterns on the RNA hairpin secondary structures are represented by symbols corresponding to the legend in the center.

Chemical probing. The reactivity of the N7 purine positions in 5'-end labeled RNAs (U2G/C12A, U2G/G9A/C12A, and U2G/A6C/C12A) was probed with DEPC, which modifies N7 of A much more strongly than N7 of G. An autoradiogram image and the results from several such experiments are provided in Figure 6.6.3 A and B. The reactivity of the adenine residues was determined by subsequent aniline cleavage of the 5'-end labeled RNA hairpins.

Chemical reactivity data for the variable and constant regions of the tested RNAs indicate that a complex conformation exists in these regions. DEPC accessibility of A32-N7 and A34-N7 in the hairpin loop region is very similar among all studied RNAs. This is also true for the G11 and G24 nucleotides that participate in triplet interactions. There is no significant difference in the cleavage patterns of the RNAs between native and semi-denaturing conditions; only slightly increased cleavage was observed under semi-denaturing conditions.

The control U2G/A6C/C12A hairpin behaves very differently in the variable region. In addition to G24 reactivity, nucleotides G39 and A40 become fully reactive with DEPC in native and semi-denaturing buffers, indicating that this internal loop does not have a stable conformation (Figure 6.6.3 B, rightmost molecule). These data are also consistent with the nuclease digestion experiments discussed above.

The variable region in U2G/A6C/C12A is completely different from that in all other RNA constructs. This difference is caused by a mutation of the upper tHS base pair, which prevents the formation of the S/R motif because the substitution is non-isosteric.

In conclusion, our structure probing experiments demonstrate that all S/R sequences fold into the 3D structure characteristic of this motif, with the exception of one control molecule, which is not expected to form the motif. Although the exact orientations of all nucleotides cannot be established by structure probing alone, G24 and U25 are cleaved in a manner consistent with their participation in the triplet interaction. The structure probing data validate our observations made during the UV-melting experiments.

Figure 6.6.3. Autoradiogram and summary of chemical probing analysis. A) Structure probing using DEPC to probe accessibility of N-7 positions of Hoogsteen edges of A and G residues. Lanes: 1 and 10 are “OH-”: alkali hydrolysis step ladder of U2G/C12A hairpin; 2 is “A>G”: Maxam-Gilbert sequencing reaction; 3 is Non-Reacted (“NR”) U2G/C12A RNA, treated with aniline as control; NATIVE 4, 5, and 6 are RNAs U2G/C12A, U2G/G9A/C12A, and U2G/A6C/C12A, samples treated with DEPC under native conditions (80 mM NaCB buffer pH = 6.94, 100 mM KCl, 10 mM MgCl2, 0.5 mM EDTA); SEMI-DENATURING 7, 8, and 9 are RNAs U2G/C12A, U2G/G9A/C12A, and U2G/A6C/C12A, samples treated with DEPC under semi-denaturing conditions (80 mM NaCB buffer pH = 6.94, 100 mM KCl, 0.5 mM EDTA). B) Summary of DEPC sensitivity given by the legend on top of the 2D structures.

6.7 Sequence variation in S/R motifs observed in 16S rRNA sequences from Eubacteria with known optimal growth temperatures

Thus far our analysis was based on the sequence variation data derived from the structural

database. These structures come from very few organisms and provide only limited information

on the motif sequence variability. To expand our knowledge of S/R motif sequence variation we

examine ribosomal sequence alignments. We hypothesize that the organisms living at higher

temperatures have more rigid RNA 3D motifs while organisms living at lower temperatures

should have less rigid structures. In order to test this hypothesis, we constructed a special dataset

containing 16S alignments with sequences annotated by the optimal growth temperature and

phylogenetic group of the source organism.

Finding organisms with known optimal growth temperatures.

Bacterial type strains with known optimal growth temperatures were manually extracted from the literature, with a particular effort devoted to identifying extremophiles. Manual extraction was used because the previous attempt at compiling these data, the PGTdb [116], is outdated and no longer maintained. Species that grow optimally below 20 °C were classified as psychrophilic; species growing optimally from 20 °C to less than 50 °C were classified as mesophilic; species growing optimally from 50 °C to less than 70 °C were classified as thermophilic; and species growing optimally at or above 70 °C were classified as hyperthermophilic [117]. The distribution of organisms in different phyla by their optimal growth temperatures can be found in Supplementary data S2.
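The temperature classes defined above amount to a simple threshold rule. The following Python sketch restates it; the function name `classify_growth_temperature` is hypothetical and is introduced here only for illustration.

```python
def classify_growth_temperature(optimal_temp_c: float) -> str:
    """Return the temperature class for an optimal growth temperature in °C.

    Thresholds follow the text: < 20 psychrophilic, 20 to < 50 mesophilic,
    50 to < 70 thermophilic, >= 70 hyperthermophilic.
    """
    if optimal_temp_c < 20:
        return "psychrophilic"
    if optimal_temp_c < 50:
        return "mesophilic"
    if optimal_temp_c < 70:
        return "thermophilic"
    return "hyperthermophilic"
```

Each boundary value belongs to the warmer class, so an organism with an optimum of exactly 50 °C is counted as thermophilic.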

Selecting a source of sequence alignments

There are several alternative sources of aligned rRNA sequences. We examined 16S alignments because the 16S rRNA is frequently sequenced and therefore provides a large dataset. Unfortunately, very few 23S sequences for extremophiles are available, so we did not attempt to analyze the large ribosomal subunit. In order to choose one 16S alignment, we aligned two 16S 3D structures from the distantly related bacteria E. coli (PDB 2AVY) and T. thermophilus (PDB 1J5E) using R3DAlign [118]. Next, we compared the resulting structural alignment with sequence alignments downloaded from Green Genes, RDP and SILVA [119-121]. The Green Genes alignment was chosen because it agreed best with the R3DAlign structural alignment.
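One way to quantify the agreement between a sequence alignment and a structural alignment is to count how many structurally aligned nucleotide pairs are also placed in the same column by the sequence alignment. The sketch below illustrates this idea under stated assumptions: the scoring measure, the function names, and the toy rows are hypothetical, and the measure actually used to compare the alignments is not specified in the text.

```python
from __future__ import annotations


def aligned_pairs(row_a: str, row_b: str) -> set[tuple[int, int]]:
    """Return (i, j) pairs of ungapped positions that share an alignment column.

    row_a and row_b are two rows of the same alignment; '-' and '.' are gaps.
    """
    pairs = set()
    i = j = 0  # running ungapped positions in row_a and row_b
    for a, b in zip(row_a, row_b):
        a_is_residue = a not in "-."
        b_is_residue = b not in "-."
        if a_is_residue and b_is_residue:
            pairs.add((i, j))
        if a_is_residue:
            i += 1
        if b_is_residue:
            j += 1
    return pairs


def agreement(candidate: set, reference: set) -> float:
    """Fraction of reference (structural) pairs reproduced by the candidate."""
    if not reference:
        return 0.0
    return len(candidate & reference) / len(reference)


# Toy example (hypothetical rows, not real 16S data).
sequence_pairs = aligned_pairs("AC-G", "A-UG")
structural_pairs = {(0, 0), (1, 1), (2, 2)}
score = agreement(sequence_pairs, structural_pairs)
```

Under this assumed measure, the candidate alignment with the highest score against the structural alignment would be the one retained.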

Selecting a representative sequence alignment dataset

The representative dataset was created by extracting all sequences corresponding to each species and all of its subspecies with known optimal growth temperatures from the Green Genes 16S alignment updated on October 6, 2010 (http://greengenes.lbl.gov). Each species, together with its subspecies, was represented by a single sequence, selected as the one with the highest A, C, G, and U content. This was done to reduce the bias due to overrepresentation of some organisms in the dataset. The dataset contains 561 sequences, of which 82 belong to psychrophilic, 328 to mesophilic, 108 to thermophilic, and 44 to hyperthermophilic organisms (Appendix 3). We used UniProt’s taxonomy to obtain the phylogenetic information for each organism.
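The selection of one representative sequence per species can be sketched as follows. This is a minimal illustration, not the script used in this work: the function names and the input format (pairs of species name and aligned sequence) are hypothetical, and converting T to U before counting is an assumption made because 16S gene sequences are often stored as DNA.

```python
def acgu_count(aligned_seq: str) -> int:
    """Count A, C, G, and U characters, ignoring gaps and ambiguity codes."""
    rna = aligned_seq.upper().replace("T", "U")  # assumption: treat T as U
    return sum(rna.count(base) for base in "ACGU")


def pick_representatives(records):
    """Keep, for each species, the sequence with the highest A/C/G/U count.

    records: iterable of (species, aligned_sequence) pairs, where subspecies
    have already been mapped to their parent species name.
    Ties keep the sequence encountered first.
    """
    best = {}
    for species, seq in records:
        if species not in best or acgu_count(seq) > acgu_count(best[species]):
            best[species] = seq
    return best


# Toy example (hypothetical records).
records = [("sp1", "AC--GU"), ("sp1", "ACGGGU"), ("sp2", "A-----")]
representatives = pick_representatives(records)
```

Choosing the row with the most unambiguous nucleotides favors the most complete sequence for each species.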


Correlating structures with sequence alignments

Since some nucleotides in crystal structures are not resolved, the Green Genes alignment and the sequence from the 3D structure may not match. Therefore, we need to determine the correspondence between the columns of the sequence alignment and the positions in the structure. To this end, we extracted all rows from the Green Genes alignment that came from E. coli, removed all gaps, and then aligned each sequence to the sequence from the E. coli 3D structure (PDB 2AVY). The Green Genes sequence with the largest number of aligned columns was used to correlate positions in the Green Genes alignment with the PDB sequence.
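The mapping from alignment columns to structure positions can be sketched as below. The pairwise alignment method used in this work is not stated, so Python's standard `difflib.SequenceMatcher` is used here purely as a stand-in; the function names and the toy sequences are hypothetical.

```python
from difflib import SequenceMatcher


def column_to_structure_index(alignment_row: str, structure_seq: str) -> dict:
    """Map alignment column indices to 0-based positions in structure_seq."""
    # Columns of the alignment row that hold a nucleotide ('-' and '.' are gaps).
    columns = [c for c, ch in enumerate(alignment_row) if ch not in "-."]
    ungapped = "".join(alignment_row[c] for c in columns)
    # Stand-in for a proper pairwise alignment: matching blocks of identical text.
    matcher = SequenceMatcher(None, ungapped, structure_seq, autojunk=False)
    mapping = {}
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[columns[block.a + k]] = block.b + k
    return mapping


def best_row(rows, structure_seq: str) -> str:
    """Return the alignment row with the largest number of mapped columns."""
    return max(rows, key=lambda row: len(column_to_structure_index(row, structure_seq)))


# Toy example: the structure lacks one nucleotide present in the alignment row.
mapping = column_to_structure_index("A-CGC-UA", "ACGUA")
chosen = best_row(["A-CGC-UA", "A-C---UA"], "ACGUA")
```

Columns that hold a nucleotide absent from the structure sequence (for example, an unresolved residue) simply receive no entry in the mapping.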

AG and AA trans Hoogsteen-Sugar base pairs correlate with organism growth temperature

Using the correlations between sequence alignments and 3D structures we now examine the

variations in S/R-like motifs H17, H27, and 3WJ found in the 16S rRNA. We extracted the columns from the alignment which corresponded to each motif. We then grouped each sequence by the phylum of its source organism and optimal growth temperature class and then counted unique sequences. The resulting tables can be found in Appendix 3.
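The extraction and counting step described above can be sketched in a few lines. This is an illustration under assumptions: the record format, the function name, and the toy data are hypothetical, and "counting unique sequences" is interpreted here as tallying how often each distinct motif sequence occurs within each phylum and temperature class.

```python
from collections import Counter


def motif_variants(records, columns):
    """Count motif sequence variants per (phylum, temperature class).

    records: iterable of (phylum, temp_class, aligned_sequence) tuples.
    columns: alignment column indices that correspond to the motif positions.
    """
    counts = Counter()
    for phylum, temp_class, aligned_seq in records:
        motif = "".join(aligned_seq[c] for c in columns)
        counts[(phylum, temp_class, motif)] += 1
    return counts


# Toy example (hypothetical sequences; columns 2 and 3 stand for one base pair).
records = [
    ("Firmicutes", "thermophilic", "GGAGC"),
    ("Firmicutes", "thermophilic", "GGAGC"),
    ("Firmicutes", "psychrophilic", "GGAAC"),
]
counts = motif_variants(records, columns=[2, 3])
```

Tables such as those in Appendix 3 could then be assembled by listing the counts for each phylum and temperature class.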

Examining the S/R motif in H27 of the 16S rRNA shows a strong correlation between the identity of the upper tHS base pair (AG or AA) and growth temperature. As temperature increases, the fraction of tHS AG also increases (Figure 6.7.1, top panel). This does not appear to be due to phylogeny, as the variation is found in Bacteroidetes, Firmicutes, and Betaproteobacteria. Other positions remain conserved or nearly conserved. This correlation is noticeable only in one tHS base pair of one S/R motif, but it is supported by our observations based on the UV-melting experiments.

The bottom panel in Figure 6.7.1 shows the sequence variations of the S/R motif in H17. The bulged base in the triplet is a G in thermophiles and hyperthermophiles, while in psychrophiles it may be a U. All other examined motifs do not show a strong correlation between sequence and optimal growth temperature.

Overall, we do see some variations which may be due to temperature adaptation. However, our current methods are not precise enough to detect subtle changes. Further statistical analysis on a larger dataset is needed to validate these initial, limited findings.

Top panel: S/R motif in the H27 of the 16S rRNA (551 organisms: 82 psychrophiles, 328 mesophiles, 108 thermophiles, 44 hyperthermophiles).

Bottom panel: S/R motif in the H17 of the 16S rRNA (548 organisms: 79 psychrophiles, 319 mesophiles, 109 thermophiles, 41 hyperthermophiles).

Figure 6.7.1. Survey of non-canonical base pairs of S/R motifs in 16S rRNA of Eubacteria. The figure contains 2D diagrams of two S/R motifs: the one from helix 27 on top and the one from helix 17 on the bottom. Each base pair family has a table with colored bars containing sequence variations of psychrophiles (blue bar), mesophiles (green bar), thermophiles (orange bar), and hyperthermophiles (red bar). Data in each cell are plotted according to the counts of organisms representing the sequence of a particular non-canonical base pair.

6.8 Conclusion

In the present work we systematically studied sequence variation at structurally conserved positions of the S/R motif in rRNA from different organisms. Our effort was aimed at obtaining thermodynamic parameters for isosteric and non-isosteric substitutions within the S/R motif. In addition to the stability investigation, structure probing analysis was performed to observe which mutations disturb the motif.

This study demonstrates that isosteric substitutions do not disturb the S/R motif, as was expected from the survey of 3D structures and the survey of 16S rRNA from organisms adapted to different temperatures.

The most frequent variation was observed at the ‘lower’ tHS base pair, and neutral substitutions at this position do not significantly affect the overall stability of the motif. On the other hand, non-isosteric substitutions destabilize the motif, but the conformational change appears to be modest, as reflected in the structure probing experiments.

Substitutions at the ‘upper’ tHS pair are the most crucial for the stability and conformation of the motif. A non-isosteric mutation destroys the geometry of the S/R motif. The isosteric mutations tHS AA or CU destabilize the motif, but the geometry is preserved. Moreover, tHS AA makes the motif more flexible, as reflected by the RNase T1 digestion pattern.

Any mutation at the bulged nucleotide participating in the triplet interaction significantly affects the stability of the motif, as shown by UV-melting experiments.

Our phylogenetic survey suggests a correlation between non-canonical base pairs and optimal growth temperatures. In high temperature environments tHS AG is the most frequent pair in S/R motifs, whereas tHS AA is prevalent in cold environments. In addition, the bulged G, which is conserved in most organisms, mutates to U, A or C mostly in psychrophiles and mesophiles. This suggests that organisms may adapt to different temperature environments not only by adjusting the number of GC and AU Watson-Crick pairs in the helical regions as demonstrated previously [89], but also by selecting optimal isosteric sequence combinations for their non-canonical base pairs.

Using the S/R motif as an example, we have demonstrated that non-canonical interactions in structured RNA motifs are very important. By changing the base pairs at certain non-canonical positions, we can control the stability, structure, and perhaps function of large RNAs.


6.9 References.

1. Taft RJ, P.M., Mattick JS., The relationship between non-protein-coding DNA and eukaryotic complexity. Bioessays, 2007. 29: p. 288-299.
2. Birney, E., et al., Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature, 2007. 447(7146): p. 799-816.
3. Core, L.J., J.J. Waterfall, and J.T. Lis, Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science, 2008. 322(5909): p. 1845-8.
4. Kapranov, P., et al., RNA maps reveal new RNA classes and a possible function for pervasive transcription. Science, 2007. 316(5830): p. 1484-8.
5. Murina, V.N. and A.D. Nikulin, RNA-Binding Sm-Like Proteins of Bacteria and Archaea. Similarity and Difference in Structure and Function. Biochemistry (Mosc), 2011. 76(13): p. 1434-49.
6. Mueser, T.C., N.G. Nossal, and C.C. Hyde, Structure of bacteriophage T4 RNase H, a 5' to 3' RNA-DNA and DNA-DNA exonuclease with sequence similarity to the RAD2 family of eukaryotic proteins. Cell, 1996. 85(7): p. 1101-12.
7. Bell, L.R., et al., Sex-lethal, a Drosophila sex determination switch gene, exhibits sex-specific RNA splicing and sequence similarity to RNA binding proteins. Cell, 1988. 55(6): p. 1037-46.
8. Taft, R.J., et al., Non-coding RNAs: regulators of disease. J Pathol, 2010. 220(2): p. 126-39.
9. Pan, Y.F., et al., Role of long non-coding RNAs in gene regulation and oncogenesis. Chin Med J (Engl), 2011. 124(15): p. 2378-83.
10. Ghildiyal, M. and P.D. Zamore, Small silencing RNAs: an expanding universe. Nat Rev Genet, 2009. 10(2): p. 94-108.
11. Brodersen, P. and O. Voinnet, The diversity of RNA silencing pathways in plants. Trends Genet, 2006. 22(5): p. 268-80.
12. Winter, J., et al., Many roads to maturity: microRNA biogenesis pathways and their regulation. Nat Cell Biol, 2009. 11(3): p. 228-34.
13. Matera, A.G., R.M. Terns, and M.P. Terns, Non-coding RNAs: lessons from the small nuclear and small nucleolar RNAs. Nat Rev Mol Cell Biol, 2007. 8(3): p. 209-20.
14. Taft, R.J., et al., Small RNAs derived from snoRNAs. RNA, 2009. 15(7): p. 1233-40.
15. Thompson, D.M. and R. Parker, Stressing out over tRNA cleavage. Cell, 2009. 138(2): p. 215-9.
16. Shi, W., et al., A distinct class of small RNAs arises from pre-miRNA-proximal regions in a simple chordate. Nat Struct Mol Biol, 2009. 16(2): p. 183-9.
17. Kruger, K., et al., Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of Tetrahymena. Cell, 1982. 31(1): p. 147-57.
18. Guerrier-Takada, C., et al., The RNA moiety of ribonuclease P is the catalytic subunit of the enzyme. Cell, 1983. 35(3 Pt 2): p. 849-57.
19. Ellington, A.D. and J.W. Szostak, In vitro selection of RNA molecules that bind specific ligands. Nature, 1990. 346(6287): p. 818-22.
20. Tuerk, C. and L. Gold, Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science, 1990. 249(4968): p. 505-10.
21. Sudarsan, N., et al., Riboswitches in eubacteria sense the second messenger cyclic di-GMP. Science, 2008. 321(5887): p. 411-3.
22. Tian, L. and B.A. Chen, [RNA interference used for reversal of multi-drug resistance in leukemia cells -- review]. Zhongguo Shi Yan Xue Ye Xue Za Zhi, 2010. 18(6): p. 1638-43.
23. Siu, R.W., et al., Antiviral RNA interference responses induced by Semliki Forest virus infection of mosquito cells: characterization, origin, and frequency-dependent functions of virus-derived small interfering RNAs. J Virol, 2011. 85(6): p. 2907-17.
24. Scherrer, T., et al., A screen for RNA-binding proteins in yeast indicates dual functions for many enzymes. PLoS One, 2010. 5(11): p. e15499.
25. Ulveling, D., C. Francastel, and F. Hube, When one is better than two: RNA with dual functions. Biochimie, 2011. 93(4): p. 633-44.
26. Woese, C.R., S. Winker, and R.R. Gutell, Architecture of ribosomal RNA: constraints on the sequence of "tetra-loops". Proc Natl Acad Sci U S A, 1990. 87(21): p. 8467-71.
27. Mathews, D.H., et al., Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J Mol Biol, 1999. 288(5): p. 911-40.
28. Abrahams, J.P., et al., Prediction of RNA secondary structure, including pseudoknotting, by computer simulation. Nucleic Acids Res, 1990. 18(10): p. 3035-44.
29. Zuker, M., Computer prediction of RNA structure. Methods Enzymol, 1989. 180: p. 262-88.
30. Ehresmann, C., et al., Probing the structure of RNAs in solution. Nucleic Acids Res, 1987. 15(22): p. 9109-28.
31. Furtig, B., et al., NMR spectroscopy of RNA. Chembiochem, 2003. 4(10): p. 936-62.
32. Holbrook, S.R. and S.H. Kim, RNA crystallography. Biopolymers, 1997. 44(1): p. 3-21.
33. Leontis, N.B. and E. Westhof, Analysis of RNA motifs. Curr Opin Struct Biol, 2003. 13(3): p. 300-8.
34. Sarver, M., et al., FR3D: finding local and composite recurrent structural motifs in RNA 3D structures. J Math Biol, 2008. 56(1-2): p. 215-52.
35. Lai, C.E., et al., FASTR3D: a fast and accurate search tool for similar RNA 3D structures. Nucleic Acids Res, 2009. 37(Web Server issue): p. W287-95.
36. Kirillova, S., S.C. Tosatto, and O. Carugo, FRASS: the web-server for RNA structural comparison. BMC Bioinformatics, 2010. 11: p. 327.
37. Dror, O., R. Nussinov, and H.J. Wolfson, The ARTS web server for aligning RNA tertiary structures. Nucleic Acids Res, 2006. 34(Web Server issue): p. W412-5.
38. Chang, Y.F., Y.L. Huang, and C.L. Lu, SARSA: a web tool for structural alignment of RNA using a structural alphabet. Nucleic Acids Res, 2008. 36(Web Server issue): p. W19-24.
39. Petrov, A.I., C.L. Zirbel, and N.B. Leontis, WebFR3D--a server for finding, aligning and analyzing recurrent RNA 3D motifs. Nucleic Acids Res, 2011. 39(Web Server issue): p. W50-5.
40. Nakashima H, F.S., Nishikawa K, Compositional changes in RNA, DNA and proteins for bacterial adaptation to higher and lower temperatures. J Biochem 2003. 133: p. 507-513.
41. Wang HC, H.D., Evidence for strong selective constraint acting on the nucleotide composition of 16S ribosomal RNA genes. Nucleic Acids Res, 2002. 30: p. 2501-2507.
42. Galtier, N.a.L., J.R., Relationships between genomic G+C content, RNA secondary structures, and optimal growth temperature in prokaryotes. J. Mol. Evol, 1997. 44: p. 632-636.
43. Hurst, L.D.a.M., A.R., High guanine-cytosine content is not an adaptation to high temperature: a comparative analysis amongst prokaryotes. Proc. R. Soc. Lond. B. Biol. Sci., 2001. 268: p. 493-497.
44. Galtier, N., Tourasse, N., and Gouy, M., A nonhyperthermophilic common ancestor to extant life forms. Science, 1999. 283(220-221).
45. Miralles, F., Compositional Properties and Thermal Adaptation of SRP-RNA in Bacteria and Archaea. J Mol Evol, 2010. 70: p. 181-189.
46. Serebrov, V., et al., Mg2+-induced tRNA folding. Biochemistry, 2001. 40(22): p. 6688-98.
47. Tan, Z.J. and S.J. Chen, Salt contribution to RNA tertiary structure folding stability. Biophys J, 2011. 101(1): p. 176-87.
48. Brion, P. and E. Westhof, Hierarchy and dynamics of RNA folding. Annu Rev Biophys Biomol Struct, 1997. 26: p. 113-37.
49. Xaplanteri, M.A., et al., Localization of spermine binding sites in 23S rRNA by photoaffinity labeling: parsing the spermine contribution to ribosomal 50S subunit functions. Nucleic Acids Res, 2005. 33(9): p. 2792-805.
50. Terui, Y., et al., Stabilization of nucleic acids by unusual polyamines produced by an extreme thermophile, Thermus thermophilus. Biochem J, 2005. 388(Pt 2): p. 427-33.
51. Behm-Ansmant, I., M. Helm, and Y. Motorin, Use of specific chemical reagents for detection of modified nucleotides in RNA. J Nucleic Acids, 2011. 2011: p. 408053.
52. Wagner, T.M., et al., A novel method for sequence placement of modified nucleotides in mixtures of transfer RNA. Nucleic Acids Symp Ser (Oxf), 2004(48): p. 263-4.
53. Grosjean, H., G. Keith, and L. Droogmans, Detection and quantification of modified nucleotides in RNA using thin-layer chromatography. Methods Mol Biol, 2004. 265: p. 357-91.
54. Ziomek, K., et al., The influence of various modified nucleotides placed as 3'-dangling end on thermal stability of RNA duplexes. Biophys Chem, 2002. 97(2-3): p. 243-9.
55. Dalluge, J.J., T. Hashizume, and J.A. McCloskey, Quantitative measurement of dihydrouridine in RNA using isotope dilution liquid chromatography-mass spectrometry (LC/MS). Nucleic Acids Res, 1996. 24(16): p. 3242-5.
56. Dalluge, J.J., et al., Conformational flexibility in RNA: the role of dihydrouridine. Nucleic Acids Res, 1996. 24(6): p. 1073-9.
57. Edmonds, C.G., et al., Posttranscriptional modification of tRNA in thermophilic archaea (Archaebacteria). J Bacteriol, 1991. 173(10): p. 3138-48.
58. Kowalak, J.A., et al., The role of posttranscriptional modification in stabilization of transfer RNA from hyperthermophiles. Biochemistry, 1994. 33(25): p. 7869-76.
59. McCloskey, J.A., et al., Post-transcriptional modification in archaeal tRNAs: identities and phylogenetic relations of nucleotides from mesophilic and hyperthermophilic Methanococcales. Nucleic Acids Res, 2001. 29(22): p. 4699-706.
60. Stombaugh, J., et al., Frequency and isostericity of RNA base pairs. Nucleic Acids Res, 2009. 37(7): p. 2294-312.
61. Leontis, N.B. and E. Westhof, The annotation of RNA motifs. Comp Funct Genomics, 2002. 3(6): p. 518-24.
62. Sargsyan, K. and C. Lim, Arrangement of 3D structural motifs in ribosomal RNA. Nucleic Acids Res, 2010. 38(11): p. 3512-22.
63. Burkard, M.E., R. Kierzek, and D.H. Turner, Thermodynamics of unpaired terminal nucleotides on short RNA helixes correlates with stacking at helix termini in larger RNAs. J Mol Biol, 1999. 290(5): p. 967-82.
64. Kierzek, R., M.E. Burkard, and D.H. Turner, Thermodynamics of single mismatches in RNA duplexes. Biochemistry, 1999. 38(43): p. 14214-23.
65. Leontis, N.B. and E. Westhof, Geometric nomenclature and classification of RNA base pairs. RNA, 2001. 7(4): p. 499-512.
66. Leontis, N.B., J. Stombaugh, and E. Westhof, The non-Watson-Crick base pairs and their associated isostericity matrices. Nucleic Acids Res, 2002. 30(16): p. 3497-531.
67. Crick, F.H., Codon–anticodon pairing: the wobble hypothesis. J. Mol. Biol., 1996. 19: p. 548-555.
68. Varani, G.a.M., W.H., The G · U wobble base pair. A fundamental building block of RNA structure crucial to RNA function in diverse biological systems. EMBO Rep., 2000. 1(18-23).
69. Leontis, N.B. and E. Westhof, Conserved geometrical base-pairing patterns in RNA. Q Rev Biophys, 1998. 31(4): p. 399-455.
70. Nasalean, L.S., Jesse; Zirbel, Craig L.; Leontis, Neocles B., RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching. Non-Protein Coding RNAs, Springer Series in Biophysics, 2009. 13: p. 1.
71. Leontis N. B., S.J., Westhof E., The non-Watson-Crick base pairs and their associated isostericity matrices. Nucleic Acids Res, 2002. 30(16): p. 3497-531.
72. Leontis N. B., W.E., Conserved geometrical base-pairing patterns in RNA. Q. Rev. Biophys., 1998. 31(4): p. 399-455.
73. Moore PB, S.T., The structural basis of large ribosomal subunit function. Annu Rev Biochem, 2003. 72: p. 813-850.
74. Klein DJ, S.T., Moore PB, Steitz TA., The kink-turn: a new RNA secondary structure motif. EMBO J, 2001. 20(15): p. 4214-4221.
75. Yusupov, M.M., et al., Crystal structure of the ribosome at 5.5 A resolution. Science, 2001. 292(5518): p. 883-96.
76. Zuker, M., Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res, 2003. 31(13): p. 3406-15.
77. Schuster, P., et al., From sequences to shapes and back: a case study in RNA secondary structures. Proc Biol Sci, 1994. 255(1344): p. 279-84.
78. Mathews, D.H., et al., Incorporating chemical modification constraints into a dynamic programming algorithm for prediction of RNA secondary structure. Proc Natl Acad Sci U S A, 2004. 101(19): p. 7287-92.
79. Kainz, P., A. Schmiedlechner, and H.B. Strack, Specificity-enhanced hot-start PCR: addition of double-stranded DNA fragments adapted to the annealing temperature. Biotechniques, 2000. 28(2): p. 278-82.
80. Borodina, T.A., H. Lehrach, and A.V. Soldatov, DNA purification on homemade silica spin-columns. Anal Biochem, 2003. 321(1): p. 135-7.
81. Beld, M., et al., Fractionation of nucleic acids into single-stranded and double-stranded forms. Nucleic Acids Res, 1996. 24(13): p. 2618-9.
82. Cavaluzzi, M.J. and P.N. Borer, Revised UV extinction coefficients for nucleoside-5'-monophosphates and unpaired DNA and RNA. Nucleic Acids Res, 2004. 32(1): p. e13.
83. Antao, V.P., S.Y. Lai, and I. Tinoco, Jr., A thermodynamic study of unusually stable RNA and DNA hairpins. Nucleic Acids Res, 1991. 19(21): p. 5901-5.
84. Siegfried, N.A., S.L. Metzger, and P.C. Bevilacqua, Folding cooperativity in RNA and DNA is dependent on position in the helix. Biochemistry, 2007. 46(1): p. 172-81.
85. Shiman, R. and D.E. Draper, Stabilization of RNA tertiary structure by monovalent cations. J Mol Biol, 2000. 302(1): p. 79-91.
86. Schroeder, S.J. and D.H. Turner, Optical melting measurements of nucleic acid thermodynamics. Methods Enzymol, 2009. 468: p. 371-87.
87. Petersheim, M. and D.H. Turner, Base-stacking and base-pairing contributions to helix stability: thermodynamics of double-helix formation with CCGG, CCGGp, CCGGAp, ACCGGp, CCGGUp, and ACCGGUp. Biochemistry, 1983. 22(2): p. 256-63.
88. McDowell, J.A. and D.H. Turner, Investigation of the structural basis for thermodynamic stabilities of tandem GU mismatches: solution structure of (rGAGGUCUC)2 by two-dimensional NMR and simulated annealing. Biochemistry, 1996. 35(45): p. 14077-89.
89. Galtier, N. and J.R. Lobry, Relationships between genomic G+C content, RNA secondary structures, and optimal growth temperature in prokaryotes. J Mol Evol, 1997. 44(6): p. 632-6.
90. Miralles, F., Compositional Properties and Thermal Adaptation of SRP-RNA in Bacteria and Archaea. J Mol Evol, 2010.
91. Hurst, L.D. and A.R. Merchant, High guanine-cytosine content is not an adaptation to high temperature: a comparative analysis amongst prokaryotes. Proc Biol Sci, 2001. 268(1466): p. 493-7.
92. Schroeder, S.J. and D.H. Turner, Factors affecting the thermodynamic stability of small asymmetric internal loops in RNA. Biochemistry, 2000. 39(31): p. 9257-74.
93. Leontis, N.B. and E. Westhof, A common motif organizes the structure of multi-helix loops in 16 S and 23 S ribosomal RNAs. J Mol Biol, 1998. 283(3): p. 571-83.
94. Leontis, N.B., J. Stombaugh, and E. Westhof, Motif prediction in ribosomal RNAs Lessons and prospects for automated motif prediction in homologous RNA molecules. Biochimie, 2002. 84(9): p. 961-73.
95. Nasalean, L., Stombaugh, J., Zirbel C. L., Leontis N.B., RNA 3D Structural Motifs: Definition, Identification, Annotation, and Database Searching. In Non-Protein Coding RNAs, Springer Series in Biophysics. Walter, Nils G.; Woodson, Sarah A.; Batey, Robert T. (Eds.) 2009. 13: p. 1-26.
96. Klein, D.J., P.B. Moore, and T.A. Steitz, The roles of ribosomal proteins in the structure assembly, and evolution of the large ribosomal subunit. J Mol Biol, 2004. 340(1): p. 141-77.
97. Schuwirth, B.S., et al., Structures of the bacterial ribosome at 3.5 A resolution. Science, 2005. 310(5749): p. 827-34.
98. Bulkley, D., F. Johnson, and T.A. Steitz, The antibiotic thermorubin inhibits protein synthesis by binding to inter-subunit bridge B2a of the ribosome. J Mol Biol, 2012. 416(4): p. 571-8.
99. Harms, J.M., et al., Translational regulation via L11: molecular switches on the ribosome turned on and off by thiostrepton and micrococcin. Mol Cell, 2008. 30(1): p. 26-38.
100. Krasilnikov, A.S., et al., Crystal structure of the specificity domain of ribonuclease P. Nature, 2003. 421(6924): p. 760-4.
101. Serganov, A., L. Huang, and D.J. Patel, Structural insights into binding and gene control by a lysine riboswitch. Nature, 2008. 455(7217): p. 1263-7.
102. Klinge, S., et al., Crystal structure of the eukaryotic 60S ribosomal subunit in complex with initiation factor 6. Science, 2011. 334(6058): p. 941-8.
103. Ben-Shem, A., et al., The structure of the eukaryotic ribosome at 3.0 A resolution. Science, 2011. 334(6062): p. 1524-9.
104. Lu, D., M.A. Searles, and A. Klug, Crystal structure of a zinc-finger-RNA complex reveals two modes of molecular recognition. Nature, 2003. 426(6962): p. 96-100.
105. Correll, C.C., et al., The common and the distinctive features of the bulged-G motif based on a 1.04 A resolution RNA structure. Nucleic Acids Res, 2003. 31(23): p. 6806-18.
106. AbouHaidar, M.G. and I.G. Ivanov, Non-enzymatic RNA hydrolysis promoted by the combined catalytic activity of buffers and magnesium ions. Z Naturforsch C, 1999. 54(7-8): p. 542-8.
107. Serra, M.J., et al., Effects of magnesium ions on the stabilization of RNA oligomers of defined structures. RNA, 2002. 8(3): p. 307-23.
108. Hermann, T. and E. Westhof, Exploration of metal ion binding sites in RNA folds by Brownian-dynamics simulations. Structure, 1998. 6(10): p. 1303-14.
109. Zirbel, C.L., et al., Classification and energetics of the base-phosphate interactions in RNA. Nucleic Acids Res, 2009. 37(15): p. 4898-918.
110. Zgarbova, M., et al., Noncanonical hydrogen bonding in nucleic acids. Benchmark evaluation of key base-phosphate interactions in folded RNA molecules using quantum-chemical calculations and molecular dynamics simulations. J Phys Chem A, 2011. 115(41): p. 11277-92.
111. Seggerson, K. and P.B. Moore, Structure and stability of variants of the sarcin-ricin loop of 28S rRNA: NMR studies of the prokaryotic SRL and a functional mutant. RNA, 1998. 4(10): p. 1203-15.
112. Lemieux, S. and F. Major, Automated extraction and classification of RNA tertiary structure cyclic motifs. Nucleic Acids Res, 2006. 34(8): p. 2340-6.
113. Jucker, F.M., et al., A network of heterogeneous hydrogen bonds in GNRA tetraloops. J Mol Biol, 1996. 264(5): p. 968-80.
114. Andersen, J., et al., 5S RNA structure and interaction with transcription factor A. 2. Ribonuclease probe of the 7S particle from Xenopus laevis immature oocytes and RNA exchange properties of the 7S particle. Biochemistry, 1984. 23(24): p. 5759-66.
115. Chan, Y.L. and I.G. Wool, The integrity of the sarcin/ricin domain of 23 S ribosomal RNA is not required for elongation factor-independent peptide synthesis. J Mol Biol, 2008. 378(1): p. 12-9.
116. Huang, S.L., et al., PGTdb: a database providing growth temperatures of prokaryotes. Bioinformatics, 2004. 20(2): p. 276-8.
117. Rothschild, L.J. and R.L. Mancinelli, Life in extreme environments. Nature, 2001. 409(6823): p. 1092-101.
118. Rahrig, R.R., N.B. Leontis, and C.L. Zirbel, R3D Align: global pairwise alignment of RNA 3D structures using local superpositions. Bioinformatics, 2010. 26(21): p. 2689-97. 119. Cole, J.R., et al., The Ribosomal Database Project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res, 2009. 37(Database issue): p. D141-5. 120. DeSantis, T.Z., et al., Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl Environ Microbiol, 2006. 72(7): p. 5069-72. 121. Pruesse, E., et al., SILVA: a comprehensive online resource for quality checked and aligned ribosomal RNA sequence data compatible with ARB. Nucleic Acids Res, 2007. 35(21): p. 7188-96.

Appendix 1. UV-melting curves and thermodynamic parameters for RNA duplexes

U2G/C12A 5’-CACCGAGUAGGUC-3’/3’-GUGGAAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

U2G/A6C/C12A 5’-CACCGAGUCGGUC-3’/3’-GUGGAAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

U2G/G9A/C12A 5’-CACCGAGUAGGUC-3’/3’-GUGGAAAACCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

A11C 5’-CACCUAGUAGGUC-3’/3’-GUGGCCAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

Prototype S/R motif 5’-CACCUAGUAGGUC-3’/3’-GUGGCAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

U2C 5’-CACCCAGUAGGUC-3’/3’-GUGGCAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

U2A 5’-CACCAAGUAGGUC-3’/3’-GUGGCAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

U2A/C12A 5’-CACCAAGUAGGUC-3’/3’-GUGGAAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

U2G/A6C/G9U/C12A 5’-CACCGAGUCGGUC-3’/3’-GUGGAAAUCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

U2G 5’-CACCGAGUAGGUC-3’/3’-GUGGCAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

C1U/U2C/G13C 5’-CACUCAGUAGGUC-3’/3’-GUGCCAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

C1U/U2C/G4A/G13C 5’-CACUCAAUAGGUC-3’/3’-GUGCCAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2 10mM NaCB, 0.5mM EDTA, 1M NaCl

Dependence of Tm on duplex concentration

U2G/C12A 5’-CACCGAGUAGGUC-3’/3’-GUGGAAAGCCAG-5’

10mM NaCB, 0.5mM EDTA, 100mM NaCl 10mM MgCl2
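
Concentration-dependence data of this kind are commonly analyzed with a van't Hoff plot. The sketch below is illustrative only. It assumes a two-state, bimolecular transition between non-self-complementary strands, for which 1/Tm = (R/ΔH°) ln(Ct/4) + ΔS°/ΔH°, with Ct the total strand concentration. Whether this exact treatment was applied to the U2G/C12A data is an assumption; the function names and all numerical inputs are hypothetical and are not measurements from this work.

```python
import math

# Gas constant in kcal/(mol*K), so that dH is in kcal/mol and dS in kcal/(mol*K).
R_KCAL = 1.987e-3


def predicted_tm(ct_molar, dh, ds):
    """Tm in kelvin for a non-self-complementary duplex (two-state model).

    Uses 1/Tm = (R/dH) * ln(Ct/4) + dS/dH.
    """
    inv_tm = (R_KCAL / dh) * math.log(ct_molar / 4.0) + ds / dh
    return 1.0 / inv_tm


def vant_hoff_fit(ct_molar, tm_kelvin):
    """Least-squares line through (ln(Ct/4), 1/Tm); returns (dH, dS)."""
    xs = [math.log(ct / 4.0) for ct in ct_molar]
    ys = [1.0 / tm for tm in tm_kelvin]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                  # slope = R / dH
    intercept = mean_y - slope * mean_x  # intercept = dS / dH
    dh = R_KCAL / slope
    ds = intercept * dh
    return dh, ds


if __name__ == "__main__":
    # Hypothetical inputs: synthetic Tm values generated from assumed
    # dH = -100 kcal/mol and dS = -0.28 kcal/(mol*K), not experimental data.
    concentrations = [1e-6, 5e-6, 2e-5, 1e-4]
    melting_temps = [predicted_tm(c, -100.0, -0.28) for c in concentrations]
    dh_fit, ds_fit = vant_hoff_fit(concentrations, melting_temps)
    print(f"dH = {dh_fit:.1f} kcal/mol, dS = {ds_fit * 1000:.1f} cal/(mol*K)")
```

A Tm that does not change with strand concentration would instead point to a unimolecular (hairpin) transition, in which case this bimolecular model would not apply.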

Appendix 2. Autoradiograms of 15% denaturing gels and nuclease cleavage patterns of RNA hairpins. Panels A show the results of structural analysis by nuclease digestion. All data represent cleavage patterns of 5’-end-labeled RNA hairpins treated with the specific nuclease indicated at the top of each picture. Lanes: 1, “G’s”, RNase T1 sequencing reaction in the presence of 8 M urea; 2 and 16, “OH⁻”, base hydrolysis ladders; 3, “NR”, non-reacted RNA (control); 4-15, individual RNA molecules treated with nuclease, labeled by name. Panels B show a 2D diagram of hairpin RNA U2G/C12A with the cleavage sites for the given nuclease.

RNase T1: panels A and B

RNase A: panels A and B

RNase V1: panels A and B

Appendix 3. Phylogenetic survey of 16S rRNA sarcin-ricin motifs

H17 - 16S rRNA

Deinococcus-Thermus Aquificae Firmicutes Chloroflexi Thermotogae Actinobacteria

5'-seq-3' 3'-seq-5' P M T H P M T H P M T H P M T H P M T H P M T H

ggua a ag 14 2 4 7 3 16 2 3 # 24 5

ggua c ag 4 3 8 1 # 3 7 5 23 3 4 4 14 5

guua g ag

guua a ag

agua u ag 1 4

ugua a ag

guua a aa

ggua a gg 1

auua a ag

ggua a aa 1 2

cgua a ag 1

ggua c aa 1

guua u ag

agua g ag 2

uuaa a ag

ggga a ag

agua a ag 1 1

ggua c gg 1

uuua a ag

uuaa c ag

ggua g ca 1

Group summ 26 19 56 6 139 28

H17 - 16S rRNA

Chlorobi Thermodesulfobacteria Spirochaetes Synergistetes Bacteroidetes Deferribacteres Verrucomicrobia Armatimonadetes

P M T H P M T H P M T H P M T H P M T H P M T H P M T H P M T H

5 # 6 1 1 1 3 1 1

4 1 1 1

1 3

1 0

2

3

2

1

1

1

1

113 0 2 3 1 3 2 1

H17 - 16S rRNA

Tenericutes Elusimicrobia Lentisphaerae Alpha-proteobacteria Beta-proteobacteria Gamma-proteobacteria Delta-proteobacteria Epsilon-proteobacteria

P M T H P M T H P M T H P M T H P M T H P M T H P M T H P M T H

4 1 2 1 5 7 4 1 4

1 1 7

3 2 1 9

1 1 7

# 1

1 1

5 1

1

1 1

1

1

1 4 1 95 7 24 12 5

H17 - 16S rRNA

SUMM                      total  %
Psyc  Meso  Therm  Hyper         Psyc    Meso    Therm   Hyper
33    151   53     16     253    13.04   59.684  20.95   6.324
12    55    46     20     133    9.023   41.353  34.59   15.04
31    29    0      0      60     51.67   48.333  0       0
1     17    0      0      18     5.556   94.444  0       0
0     10    6      0      16     0       62.5    37.5    0
0     13    0      0      13     0       100     0       0
0     11    0      0      11     0       100     0       0
0     10    1      0      11     0       90.909  9.091   0
0     6     0      0      6      0       100     0       0
0     1     1      2      4
1     2     0      0      3
1     1     1      0      3
0     3     0      0      3
0     1     0      2      3
0     2     0      0      2
0     2     0      0      2
0     1     1      0      2
0     2     0      0      2
0     1     0      0      1
0     1     0      0      1
0     0     0      1      1
79    319   109    41     548
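
The percentage columns of the H17 summary follow from the class counts: each count is divided by the row total. The sketch below reproduces that arithmetic for the first summary row (33, 151, 53, 16; total 253). The function name is hypothetical and is not taken from the original analysis, and rounding to two decimals is a choice made here for display only.

```python
def class_percentages(counts):
    """Return each count as a percentage of the row total, rounded to 2 decimals."""
    total = sum(counts)
    if total == 0:
        raise ValueError("row total is zero; percentages are undefined")
    return [round(100.0 * c / total, 2) for c in counts]


if __name__ == "__main__":
    # Counts in the order Psyc, Meso, Therm, Hyper (first row of the summary).
    first_row = [33, 151, 53, 16]
    print(sum(first_row))                # prints 253
    print(class_percentages(first_row))  # prints [13.04, 59.68, 20.95, 6.32]
```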

H27 - 16S rRNA

Deinococcus-Thermus Aquificae Firmicutes Chloroflexi Thermotogae Actinobacteria

5'-seq-3' 3'-seq-5' P M T H P M T H P M T H P M T H P M T H P M T H

gagua aa ag 19 8 3 16 3 64 47 11 7 5

gagua aa aa 4 50 1 2 7 1 5

aagua aa ag 1 4 4 7 1

gacua aa aa 2

uagua aa ag

gacua aa ag

aagua aa aa 1

aacua aa aa 1

gagua au aa

gagua ua aa 1

gaguc aa ag 1

gagua au aa 1

Group summ 27 19 57 5 141 28

H27 - 16S rRNA

Chlorobi Thermodesulfobacteria Spirochaetes Synergistetes Bacteroidetes Deferribacteres Verrucomicrobia

P M T H P M T H P M T H P M T H P M T H P M T H P M T H

21 85 1 1 1 3 1

6 1

4

2 1

114 1 2 4 1 3 2

H27 - 16S rRNA

Tenericutes Elusimicrobia Lentisphaerae Alpha-proteobacteria Beta-proteobacteria Gamma-proteobacteria Armatimonadetes

P M T H P M T H P M T H P M T H P M T H P M T H P M T H

1 1 1 3 1 1 3

1 34 65 1 20 1

1

4

1

1

1

1 1 4 1 103 7 24

H27 - 16S rRNA

Delta-proteobacteria Epsilon-proteobacteria SUMM total %

P M T H P M T H Psyc Meso Therm Hyper Psyc Meso Therm Hyper

3 25 159 83 43 310 8.065 51.29 26.77 13.9

7 1 1 4 52 140 13 0 205 25.37 68.29 6.341 0

4 13 7 1 25 16 52 28 4

0 4 2 0 6 0 66.67 33.33 0

0 4 0 0 4 0 100 0 0

0 3 0 0 3 0 100 0 0

1 0 2 1 0 3 0 66.67 33.33 0

0 1 1 0 2 0 33.33 33.33 0

0 1 0 0 1

0 1 0 0 1

0 0 1 0 1

1 0 0 0 1

12 5 82 328 108 44 561

3WJ - 16S rRNA

Deinococcus-Thermus Aquificae Firmicutes Chloroflexi Thermotogae Actinobacteria

5'-seq-3' 3'-seq-5' P M T H P M T H P M T H P M T H P M T H P M T H

uaguaa ua agu 18 8 4 16 4 49 1 6 10 64 51 10 4 4 14 6

uaguaa ca agu 1

uaguaa ua aga 1

uaauaa uu agu 1

uagaaa ua agu 1

Group summ 27 20 54 6 138 28

Chlorobi Thermodesulfobacteria Spirochaetes Synergistetes Bacteroidetes Deferribacteres Verrucomicrobia

P M T H P M T H P M T H P M T H P M T H P M T H P M T H

19 89 1 1 1 2 4 1 3 1 1

1 2

113 1 2 4 1 3 2

3WJ - 16S rRNA

Tenericutes Elusimicrobia Lentisphaerae Armatimonadetes Alpha-proteobacteria Beta-proteobacteria Gamma-proteobacteria

P M T H P M T H P M T H P M T H P M T H P M T H P M T H

1 1 4 1 34 66 1 4 1 1 23 1

1 1 4 1 100 7 24

Delta-proteobacteria Epsilon-proteobacteria SUMM total %

P M T H P M T H Psyc Meso Therm Hyper Psyc Meso Therm Hyper

7 5 1 4 79 319 106 43 547 14.44 58.3 19.4 7.9

1 2 1 0 4 25 50 25 0

0 0 1 0 1 0 0 100 0

0 0 1 0 1 0 0 100 0

0 0 0 1 1 0 0 0 100

12 5 80 321 109 44 554