The Pennsylvania State University

The Graduate School

Department of Chemistry

INVESTIGATING RNA FOLDING AND ADAPTATION

IN CELLULAR CONDITIONS

A Dissertation in

Chemistry

by

Kathleen A. Leamy

© 2018 Kathleen Leamy

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

August 2018

The dissertation of Kathleen A. Leamy was reviewed and approved* by the following:

Philip C. Bevilacqua
Distinguished Professor of Chemistry and Biochemistry and Molecular Biology
Dissertation Advisor
Chair of Committee

Joseph A. Cotruvo
Assistant Professor of Chemistry
Louis Martarano Career Development Professor

Frank L. Dorman
Associate Professor Biochemistry and Molecular Biology

Xin Zhang
Assistant Professor of Chemistry
Assistant Professor of Biochemistry and Molecular Biology
Paul Berg Early Career Professorship in the Eberly College of Science

Thomas E. Mallouk
Head of the Chemistry Department
Evan Pugh Professor of Chemistry, Biochemistry and Molecular Biology, Physics, and Engineering Science and Mechanics

*Signatures are on file in the Graduate School

Abstract

RNA is an essential biomolecule that controls many processes in cells by folding into complex three-dimensional structures. Regulation can occur at the transcriptional or translational level through mechanisms such as RNA self-cleavage, ligand binding, and unfolding.

The folding and structures of these so-called ‘functional RNAs’ that adopt tertiary structures have been studied for several decades. These studies have revealed that functional RNAs fold in a hierarchical manner in which secondary structures form before tertiary structure, that RNA folds over broad energetic landscapes, and that RNAs can become trapped in stable misfolded, non-functional structures that last for hours. The majority of these studies have been performed on just a few RNA sequences and in classic solution conditions of 1 M Na+. Although these studies have revealed fundamentals of RNA folding, they were performed in non-biological solutions and on a narrow range of sequences, which limits our knowledge of how RNA folds in cells.

In cells, ionic concentrations are much lower than those typically used in vitro. There is ~140 mM K+, 10 mM Na+, and 0.5-1.0 mM Mg2+ in eukaryotic cells, and 1.5-2.0 mM Mg2+ in prokaryotic cells. There are also numerous small molecules, including free nucleotides, amino acids, and metabolites, and biomolecules and organelles contribute 20-40% molecular crowding. Recent efforts have shown that mimicking some of these cellular conditions in vitro affects the folding and function of RNAs. In all chapters of the work herein, I describe the effects of cellular mimics on RNA folding processes, including the stability of secondary and tertiary structures, cotranscriptional folding, and RNA adaptation to high temperatures.

Several functional RNAs have been shown to fold in a more two-state manner in cellular mimics, but in a multi-state manner in classic in vitro conditions. However, the mechanism behind this two-state folding has been largely uninvestigated. In Chapter Two, I focus on how cooperativity arises under biologically relevant crowding and Mg2+ conditions by studying the influence of these conditions on the structure and stability of the secondary and tertiary structures of tRNAphe. I find that cooperativity arises through both a destabilization of secondary structures and a stabilization of tertiary structure. This is very similar to the folding of proteins, where weak secondary structure and strong tertiary structure drive cooperative folding to the native state.

Full-length RNAs are typically studied by renaturing a full-length transcript in vitro and watching it fold or unfold. However, in cells the growing transcript folds as it emerges from the polymerase, and the mechanism behind cotranscriptional structure formation is largely unknown. In Chapter Three, I study cotranscriptional structure formation in tRNAphe. I find that in cellular mimics, structures that form cotranscriptionally are weakened compared to those in buffer, until a single nucleotide is added and a large increase in transcript stability is observed. This large increase in stability results in cooperative folding to the native state, and only happens when all secondary and tertiary contacts can form. This single-nucleotide control of folding is observed both experimentally and computationally, and suggests that nature has selected for minimal structure formation in cotranscriptional intermediates until two-state folding to the functional state can be achieved.

Organisms grow at a wide range of temperatures, from freezing to boiling, and their RNAs need to adapt to fold and function in those extreme environments. Yet, only a few sequences of a given type of RNA are typically studied. In Chapter Four, I investigate how tRNAphe adapts to extreme temperatures, and how model transcripts from thermophiles fold under cellular conditions. The data reveal that tRNA adapts to extreme temperature by altering the stability of secondary structures, while nucleotides in tertiary structures are highly conserved. I also show that model transcripts from thermophiles maintain cooperative folding, but only in the most cellular-like solution conditions. This study reveals a novel mechanism for functional RNA adaptation.

A challenge in the RNA folding community is that there is not a good method for studying thermodynamics in complex solution conditions, such as cell extracts. This has two main causes: (1) cosolutes and biomolecules interfere with the signal from an RNA of interest, which is normally detected by UV absorbance at 260 or 280 nm, and (2) methods to study thermodynamics typically involve heating the sample to denature the RNA, which also denatures other biomolecules in solution so that the sample is no longer cellular-like. In Chapter Five, I describe efforts to develop a method to study RNA thermodynamics in complex cellular mimics that will overcome these challenges. Overall, the work presented in this thesis provides insights into how RNAs fold and adapt in cellular conditions, with implications for understanding how nature selects for RNA structures.

TABLE OF CONTENTS

LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
ACKNOWLEDGEMENTS

Chapter 1 INTRODUCTION
1.1 RNA folding pathways and an introduction to cooperativity
1.2 RNA folding in vitro and in vivo
1.3 Designing in vivo-like conditions
1.4 Influence of in vivo-like conditions on RNA folding and function
1.5 Influence of cotranscriptional folding and RNA adaptation
1.6 Transfer RNA maturation and processing
1.7 Thesis objectives
1.8 References

Chapter 2 COOPERATIVE RNA FOLDING UNDER CELLULAR CONDITIONS ARISES FROM BOTH TERTIARY STRUCTURE STABILIZATION AND SECONDARY STRUCTURE DESTABILIZATION
2.1 Abstract
2.2 Introduction
2.3 Results and discussion
2.3.1 Effects of Mg2+ and crowder are similar for WT and FL tRNA
2.3.2 Cooperative tRNA folding can be induced by Mg2+-driven tertiary structure stabilization
2.3.3 Cooperative tRNA folding can be induced by crowder-driven secondary structure destabilization
2.3.4 Cooperative and non-cooperative folding are observed on the nucleotide level
2.3.5 tRNAphe adopts a more compact structure under in vivo-like conditions
2.4 Conclusions
2.5 Acknowledgements
2.6 References

Chapter 3 COTRANSCRIPTIONAL FOLDING OF RNA IS COOPERATIVE UNDER PHYSIOLOGICAL CONDITIONS
3.1 Abstract
3.2 Introduction
3.3 Results
3.3.1 Effects of cellular conditions on cotranscriptional folding in eukaryotic Mg2+
3.3.2 Effects of cellular conditions on cotranscriptional folding in prokaryotic Mg2+
3.3.3 Cooperativity arises from depopulation of non-native states
3.3.4 Extensions on the 5’ and 3’ end do not affect tRNA core folding cooperativity
3.3.5 Cooperative cotranscriptional folding is computationally predicted
3.4 Discussion
3.5 Conclusions
3.6 Acknowledgments
3.7 References

Chapter 4 MOLECULAR MECHANISM FOR FOLDING COOPERATIVITY OF FUNCTIONAL RNAS IN LIVING ORGANISMS
4.1 Abstract
4.2 Introduction
4.3 Results
4.3.1 Dilute solution conditions lead to non-cooperative folding with stronger base pairing
4.3.2 In vivo-like conditions favor cooperative folding and thermostability with stronger base pairing
4.3.3 Nature selects for thermostable tRNAs through strengthening secondary structure not tertiary structure
4.4 Discussion
4.5 Acknowledgements
4.6 References

Chapter 5 CONCLUSIONS AND FUTURE DIRECTIONS
5.1 Conclusions
5.2 Future directions
5.2.1 Extension of thermodynamics into complex cytoplasm mimics
5.2.2 Development of a thermodynamics assay with fluorescence detection
5.2.3 Investigating if the 5’ leader of tRNA drives native folding
5.2.4 Investigate the method of adaptation of RNAs to extreme conditions
5.2.5 Discover novel RNA motifs using information content
5.3 References

Appendix A SUPPORTING INFORMATION: CHAPTER 2
A.1 Materials and methods
A.1.1 Chemicals
A.1.2 RNA constructs and preparation
A.1.3 Thermal denaturation and data analysis
A.1.4 Derivation of thermal denaturation data fitting
A.1.5 Temperature-dependent in-line probing and data analysis
A.1.6 SAXS data collection
A.1.7 SAXS data analysis
A.2 Supplemental tables and figures
A.3 References

Appendix B SUPPORTING INFORMATION: CHAPTER 3
B.1 Materials and methods
B.1.1 Chemicals
B.1.2 RNA constructs and preparation
B.1.3 Thermal denaturation and data analysis
B.1.4 In-line probing
B.1.5 Small angle X-ray scattering
B.1.6 Native PAGE
B.1.7 CoFold structure prediction
B.1.8 tRNA core enthalpy calculation
B.2 Supplemental tables and figures
B.3 References

Appendix C SUPPORTING INFORMATION: CHAPTER 4
C.1 Materials and methods
C.1.1 Chemicals
C.1.2 RNA constructs and preparation
C.1.3 Thermal denaturation
C.1.4 Thermal denaturation data analysis
C.1.5 Small angle X-ray scattering data collection and analysis
C.1.6 In-line probing
C.1.7 Optimal growth temperature analysis
C.1.8 Information content
C.2 Supplemental tables and figures
C.3 References

Appendix D SUPPORTING INFORMATION: CHAPTER 5
D.1 Materials and methods
D.1.1 Chemicals
D.1.2 RNA constructs and preparation
D.1.3 Fluorescence titrations with NMPs
D.1.4 Plate reader binding assays
D.1.5 Binding assay data fitting
D.1.6 CoFold structure prediction
D.2 Supplemental Figures

LIST OF FIGURES

Figure 1-1. Complexity of RNA secondary and tertiary structure

Figure 1-2. RNA folding pathways

Figure 1-3. Artist’s rendition of in vitro, in vivo-like, and in vivo conditions

Figure 1-4. Effects of additives on RNA folding equilibrium

Figure 1-5. Model of RNA cotranscriptional folding

Figure 1-6. Transfer RNA maturation and modification in yeast

Figure 2-1. Structures of FL yeast tRNAphe and its model helical fragments with their predicted folds

Figure 2-2. First derivative curves of thermal denaturation experiments of FL tRNA and SSS under physiological ionic and crowded conditions

Figure 2-3. Difference in TM of FL tRNA, SSS, and each of the four model HF in crowder compared to buffer

Figure 2-4. Cooperative folding can be induced through crowder-driven secondary structure destabilization

Figure 2-5. Nucleotide and helical stem fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 0.5 mM Mg2+

Figure 2-6. Global fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 0.5 mM Mg2+

Figure 2-7. Melting temperatures of FL tRNA and HF obtained by optical melting and ILP in buffer and crowded conditions

Figure 2-8. Physiological crowded and ionic conditions favor a compact folded state of FL tRNA

Figure 3-1. Full-length and cotranscriptional intermediate constructs of tRNAphe

Figure 3-2. Thermal denaturation of FL tRNA and late cotranscriptional intermediates in buffer and various physiological conditions

Figure 3-3. Native gels and ILP reactivity of FL tRNA and cotranscriptional intermediates in physiological concentrations of Mg2+

Figure 3-4. Sequence and thermal denaturation of 5’ leader and precursor constructs in buffer and physiological conditions with 2.0 mM free Mg2+

Figure 3-5. Global fitting of variable temperature in-line probing signal of the 5’ leader construct in buffer and physiological conditions

Figure 3-6. Computationally predicted structure formation during cotranscriptional folding of yeast tRNAphe

Figure 3-7. A model for the late steps of cotranscriptional folding of tRNAphe

Figure 4-1. Wild-type and variable stability mutant constructs of tRNAphe

Figure 4-2. WT and mutant thermal denaturation under in vivo-like solutions in the background of 2.0 mM free Mg2+

Figure 4-3. Aligned SAXS bead models and in-line probing of M1-M5 and WT in buffer with 2.0 mM Mg2+

Figure 4-4. Cooperative and non-cooperative enthalpy models of WT and mutant tRNAs

Figure 4-5. Analysis of tRNA sequences from organisms with a large range of optimal growing temperatures

Figure 4-6. Information content of each position in tRNAphe

Figure 4-7. Conceptual free energy diagram of WT and stabilizing mutants under dilute and in vivo-like conditions

Figure 5-1. Experimental setup and simulated data for the high-throughput thermodynamics assay

Figure 5-2. Quenching of fluorescein signal upon the addition of NMPs

Figure 5-3. RNA duplex binding assay using fluorescence detection

Figure 5-4. Variable temperature RNA duplex binding assay using fluorescence detection

Figure 5-5. Cotranscriptional folding pathway of tRNAphe without the 5’ leader

Figure A-1. Secondary and tertiary structures of FL and WT tRNAphe

Figure A-2. FL transcribed tRNA and WT tRNA behave in a similar manner in buffer and crowded conditions

Figure A-3. Temperature-dependent ILP PAGE gel of FL tRNA in 0.5 mM Mg2+

Figure A-4. Temperature-dependent ILP PAGE gel of FL tRNA in 2.0 mM Mg2+

Figure A-5. Helical stem fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 2.0 mM Mg2+

Figure A-6. Global fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 2.0 mM Mg2+

Figure A-7. UV-Vis detection and scattering intensity of in-line SEC SAXS of FL tRNA in buffer with 0.5 mM Mg2+

Figure A-8. Small angle X-ray scattering data fitting of FL tRNA in buffer and 20% PEG8000 in the background of low, physiological Mg2+

Figure A-9. Comparison of experimental SAXS scattering curves and theoretical tRNA scattering curves generated with FoXS

Figure B-1. Thermal denaturation of FL tRNA and early cotranscriptional intermediates in buffer and various physiological conditions with 0.5 or 2.0 mM free Mg2+

Figure B-2. Native gels of FL tRNA and cotranscriptional intermediates under in vitro and physiological conditions

Figure B-3. Small angle X-ray scattering data of intermediates in 0.5 and 2.0 mM free Mg2+

Figure B-4. ILP of FL tRNA and cotranscriptional intermediates under in vitro or physiological conditions with 0.5 mM Mg2+

Figure B-5. ILP of FL tRNA and cotranscriptional intermediates under in vitro or physiological conditions with 2.0 mM Mg2+

Figure B-6. Variable temperature in-line probing of 5’ leader construct in buffer and physiological conditions with 2.0 mM free Mg2+

Figure B-7. Variable temperature in-line probing of precursor construct in buffer and physiological conditions with 2.0 mM free Mg2+

Figure B-8. In-line probing reactivity of 5’ leader and precursor constructs at 35 °C with 2.0 mM free Mg2+

Figure B-9. Global fitting of variable temperature in-line probing signal of the precursor construct in buffer and physiological conditions

Figure B-10. Computationally predicted structure formation during cotranscriptional folding of the 3’ leader of yeast tRNAphe

Figure C-1. WT and mutant thermal denaturation under in vivo-like solutions in the background of 0.5 mM free Mg2+

Figure C-2. Small angle X-ray scattering data in buffer with 2.0 mM free Mg2+

Figure C-3. Comparison of experimental scattering curves of WT and mutant tRNAs

Figure C-4. In-line probing PAGE of WT and M5 RNAs

Figure C-5. Melting temperature and enthalpy of unfolding of WT and mutant tRNAs in 2.0 mM free Mg2+

Figure C-6. Average tRNA stem ΔGaverage from organisms with a large range of optimal growing temperatures

Figure C-7. Tertiary interactions in tRNAphe (PDB 1ehz)

Figure D-1. Fluorescein-labeled RNA calibration curve in 1 M NaCl on a qPCR

LIST OF TABLES

Table 2-1. Melting temperatures of FL tRNA and tRNA helical fragments as determined by optical melting and ILP in buffer and 20% PEG200 with 0.5 and 2.0 mM Mg2+

Table 2-2. Structural parameters obtained for FL tRNAphe using Small Angle X-Ray Scattering

Table 5-1. Binding constants for RNA duplex formation in 1 M NaCl

Table A-1. Melting temperatures of T7 tRNAphe, its individual HF, and the SSS in the background of 0 mM Mg2+ derived from optical melting

Table A-2. Melting temperatures of T7 tRNAphe, its individual HF, and the SSS in the background of 0.5 mM Mg2+ derived from optical melting

Table A-3. Melting temperatures of T7 tRNAphe, its individual HF, and the SSS in the background of 2.0 mM Mg2+ derived from optical melting

Table A-4. Evaluation of SAXS data fitting using FoXS and Supcomb

Table B-1. Cotranscriptional intermediate thermodynamic parameters determined by thermal denaturation

Table B-2. SAXS parameters of FL tRNA and cotranscriptional intermediates in buffer and 20% PEG8000 with 0.5 and 2.0 mM Mg2+

Table B-3. Thermodynamic parameters for the unfolding of the tRNA core in the 5’ Leader and precursor constructs, determined by global fitting of variable temperature in-line probing

Table C-1. Thermodynamic parameters for WT tRNA and mutant folding in 2.0 mM free Mg2+

Table C-2. Thermodynamic parameters for WT tRNA and mutant folding in 0.5 mM free Mg2+

Table C-3. Quality of the global fits of thermal denaturation data

Table C-4. Structural parameters for WT and MT tRNAs obtained by SAXS in 2.0 mM Mg2+

Table C-5. Composition of samples containing Mg2+-chelated amino acids

Table C-6. Organism optimal growth temperatures and tRNA stem free energy of folding

Table C-7. Organism optimal growth temperatures and ribosomal RNA GC percent

Table C-8. Potential isosteric changes in tRNAphe tertiary interactions

ABBREVIATIONS

RNA Ribonucleic acid

tRNA Transfer RNA

Phe Phenylalanine

PAGE Polyacrylamide gel electrophoresis

ILP In-line probing

SAXS Small angle X-ray scattering

SEC Size exclusion chromatography

UV-Vis Ultraviolet-visible

PCR Polymerase chain reaction

PEG Polyethylene glycol

FAM Fluorescein

BHQ1 Black hole quencher 1

qPCR Quantitative polymerase chain reaction

Kd Binding constant

Acknowledgments

I want to acknowledge the NIH and NSF for funding this work, and I acknowledge that the findings and conclusions in this thesis do not necessarily reflect the views of the funding agencies.

I have many people to thank. I first need to thank Phil Bevilacqua for being an incredible thesis advisor. He has been supportive through the ups and downs of research, and he has believed in me as a scientist. Phil has taught me how to be a good scientist, person, and mentor, and he helped foster my desire to continue learning and pursuing the unknown. I look forward to having you as a mentor throughout my career.

I want to thank other mentors in my life that I met before Penn State. Thank you for believing in me at every step of my development as a scientist. Dr. Jodi O’Donnell, you showed me what a strong and confident female scientist is like, and you have been a wonderful example for me. You’ve also offered me advice through every step of my career and I’m so grateful. Dr. Dan Moriarty, thank you for getting me involved in research as an undergraduate and for pushing me to think outside of the box. Dr. Lucas Tucker, thank you for showing me that struggle and failure are all right, and thank you for your advice throughout the years.

I also want to thank the members of my thesis committee: Dr. Amie Boal, Dr. Frank Dorman, Dr. Joey Cotruvo, and Dr. Xin Zhang for their helpful input on my thesis. I also want to thank Dr. Chris Keating and Dr. Tae-Hee Lee for their years of service on my committee and advice on my projects. I want to thank Dr. Neela Yennawar for her continued support throughout my time at Penn State. You have been an instrumental part of all my papers, and I greatly appreciate the time that you spent with me collecting and analyzing data, and for passing down your knowledge on SAXS to me. And for always driving me home from CHESS while I slept in your car!

In addition to my advisor and committee, I want to thank the people that I have worked with. The Bevilacqua lab has always been composed of a wonderful group of people, past and present, who make coming to work every day fun. You are such a supportive and fun group of people, and I will miss working alongside you. Thanks for being such great people as well as scientists! To my friends at Penn State, you have kept my mind off of research and have been so much fun!

Most importantly, I need to thank my family. My parents and sister have been a huge support to me throughout my time at Penn State. They have always been a phone call or text away, and have helped dry my tears countless times. They’ve believed in me when I didn’t believe in myself, and I would not be here without them. My grandmother, who is not here to see me defend, was always proud of my accomplishments, and she always made me feel like I could conquer anything. So thank you, grandma.

And finally, to my husband Josh, you have been there for me like nobody else. On all the long and hard days, you have made me smile and laugh, and you have made so many sacrifices to make my time as a graduate student easier. Most importantly, you always believed in me more than I believed in myself, and you have pushed me to keep achieving. I know that I would not be here without you. I also need to thank Gordon, Lynx, and Leona. These crazies kept me laughing, and kept me company during early mornings and late nights working at home.

Chapter 1

Introduction

Portions of this chapter were adapted from a review article entitled “Bridging the Gap Between In Vitro and In Vivo RNA Folding” by Kathleen A. Leamy, Sarah M. Assmann, David H. Mathews, and Philip C. Bevilacqua in Quarterly Reviews of Biophysics 49:e10-e36, 2016.

The work presented in this thesis is geared towards understanding how RNA folds in cellular conditions. RNA folding has classically been studied in dilute solution conditions with non-physiological concentrations of monovalent or divalent salts to stimulate structure formation. However, in the cell RNAs are exposed to much lower concentrations of salts and to high concentrations of small cosolutes and macromolecules, and all of these factors can affect RNA folding and function. I am a proponent of studying RNA folding in vivo, but cellular assays for RNA folding are limited to structure probing experiments that typically reflect the average RNA structure adopted, and the information that can be obtained on RNA thermodynamics is limited. Herein, the thermodynamics, folding, and evolution of functional RNAs are investigated in solutions that mimic important aspects of the cytoplasm, including the ionic, small molecule, and macromolecular crowding conditions. These studies are performed primarily on transfer RNA in Chapters 2, 3, and 4. Chapter 5 focuses on assay development and discovering new RNA functions in cells. This Introduction reviews what is currently known about RNA folding in vitro, in vivo, and in in vivo-like solutions, and provides background for the next four chapters. The last section of the Introduction outlines the thesis objectives by chapter.

1.1 RNA folding pathways and an introduction to cooperativity

Originally, RNA was thought of as just a messenger molecule between DNA and proteins, with proteins carrying out most of the functions in cells. However, recent work has shown that RNA carries out many important cellular functions. RNAs are single-stranded biomolecules that have the ability to fold upon themselves and form many complex secondary and tertiary structures (Figure 1-1). These structures allow RNA to form motifs, binding pockets, and clefts, so that RNA can carry out diverse functions, including binding small molecules, catalysis, and gene regulation. There are many classes of RNAs; the most well-known are messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA). These three RNAs are essential for protein synthesis: mRNA contains the information that codes for the protein sequence, tRNA transports amino acids to the ribosome for addition to the polypeptide chain, and rRNA is part of the ribosome and catalyzes peptide bond formation. However, there are many other classes of RNAs that carry out essential functions in cells, including riboswitches, which bind to a ligand and change structure, and ribozymes, which catalyze self-cleavage. Both of these types of RNAs function to regulate genes. Misfolding and mutations of RNA are characteristics of many cancers and diseases; for example, triplet repeat expansions are associated with Huntington’s disease, myotonic dystrophy, and Fragile X syndrome (1). Accordingly, understanding RNA structures and their dynamic regulation is an integral aspect of understanding RNA function.

Figure 1-1. Complexity of RNA secondary and tertiary structure. (A) RNA can adopt many secondary structures ranging from double stranded to hairpin motifs. (B) These secondary structures can fold into complex tertiary structures such as multiway junctions, pseudoknots, and kissing hairpins.

Biomolecules can fold along many different pathways that vary in complexity. RNA folds in a hierarchical manner where secondary structure forms through a series of intermediates before tertiary structure forms (Figure 1-2A) (2), indicating that tertiary structure forms from largely pre-formed secondary structures. Stable misfolded states with a high energy barrier to the native structure can be long-lived, lasting from minutes to hours (3, 4). When this type of multi-state folding pathway is observed, with highly stable misfolded and intermediate states, it is considered non-cooperative (Figure 1-2B). However, sometimes two-state folding is observed, where intermediate states are not populated. We refer to this type of folding as cooperative (Figure 1-2C) (5). Cooperative folding can allow for fast folding to the native state without getting stuck in misfolded traps. RNA secondary structures of perfect duplexes, in standard conditions of 1 M NaCl, are thought to be much stronger than tertiary structure (6), so cooperativity in the folding of these motifs can arise from (A) destabilization of strong secondary structure, (B) stabilization of weak tertiary structure, or (C) a combination of secondary structure destabilization and tertiary structure stabilization.

Figure 1-2. RNA folding pathways. (A) RNA folds in a hierarchical manner where secondary structure forms through a number of intermediates before tertiary structure forms. (B) Folding can occur in a non-cooperative manner where intermediates are populated and misfolded states (Mi) are similar in free energy to the native state (N). (C) Alternatively, folding can occur in a cooperative manner where misfolded states do not populate and an apparent two-state transition is observed between unfolded and folded.
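The distinction between cooperative (two-state) and non-cooperative (multi-state) folding can be illustrated with a small numerical sketch. The Python below is not taken from this work; it assumes a simple van't Hoff model with a temperature-independent folding enthalpy, and the function names, enthalpies, and melting temperatures are hypothetical values chosen only for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_folded(temp_c, dh_fold, tm_c):
    """Fraction folded for a two-state transition (van't Hoff model).

    temp_c  -- temperature in degrees Celsius
    dh_fold -- folding enthalpy in kcal/mol (negative, assumed constant)
    tm_c    -- melting temperature in degrees Celsius
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    ds_fold = dh_fold / tm            # from dG = 0 at the melting temperature
    dg_fold = dh_fold - t * ds_fold   # folding free energy at temperature t
    k_fold = math.exp(-dg_fold / (R_KCAL * t))
    return k_fold / (1.0 + k_fold)


def melt_profile(temps_c, transitions):
    """Average fraction folded over independent transitions.

    transitions -- list of (dh_fold, tm_c) pairs; one pair models cooperative
    two-state unfolding, several pairs with different tm_c values model
    non-cooperative unfolding (e.g. tertiary before secondary structure).
    """
    return [
        sum(fraction_folded(t, dh, tm) for dh, tm in transitions) / len(transitions)
        for t in temps_c
    ]


if __name__ == "__main__":
    temps = range(20, 101, 10)
    # Hypothetical parameters, for illustration only.
    cooperative = melt_profile(temps, [(-100.0, 60.0)])
    non_cooperative = melt_profile(temps, [(-40.0, 45.0), (-60.0, 75.0)])
    for t, c, n in zip(temps, cooperative, non_cooperative):
        print(f"{t:3d} C  two-state: {c:.3f}  multi-state: {n:.3f}")
```

In this sketch, a single transition gives one sharp drop in fraction folded, whereas two transitions with well-separated melting temperatures give a broader, stepwise profile, which is the qualitative signature of non-cooperative unfolding.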

1.2 RNA folding in vitro and in vivo

Most of what we currently know about RNA structure and folding comes from studies completed in vitro, under experimental conditions that favor a folded state. Such studies have typically been conducted in dilute solutions with high concentrations (~1 M) of monovalent ions (7) and/or (>10 mM) divalent ions (8), typically Mg2+, or under conditions that facilitate population of a desired folding intermediate, for example by renaturing the RNA at an unusual temperature or salt concentration (9). Thermodynamic and kinetic studies under in vitro conditions provide insight into the complex folding pathways of many functional RNAs. Secondary structures form on a fast time scale (μs to ms) followed by folding of the tertiary structure on a slower time scale (ms to s). Large RNAs have been shown to fold on a rugged pathway through populated intermediates, largely in a hierarchical manner, where secondary structures form before tertiary contacts, as demanded by the topologies of these complex RNAs (2, 6, 10-12). The folding pathways of large functional RNAs have proven to be quite complex with intermediates that can be trapped for minutes to hours (3, 4, 13). For example, 90% of the Tetrahymena ribozyme is found in a misfolded state that transitions to the native state with hour timescale kinetics (3), and the HDV ribozyme folds through numerous intermediates, some long-lived (4). The major limitation of in vitro experiments is that the solution conditions are very different from the cellular environment and unavoidably lack many of the components present in cells, which can influence RNA folding and function. These limitations necessitate the development of experiments and techniques under in vivo and in vivo-like conditions to determine how RNAs fold and respond to cellular environmental conditions (Figure 1-3).

Figure 1-3. Artist’s rendition of in vitro (left), in vivo-like (center), and in vivo conditions (right). Typical in vitro solutions are dilute with high monovalent ion concentrations that are very different from cellular conditions. The cellular environment is complex with monovalent and divalent salts, complex macromolecules, cosolutes, and organelles. In vivo-like conditions (center), which contain cellular additives, bridge in vitro and in vivo conditions and are more complex than in vitro conditions.

The ultimate goal of RNA folding studies is to understand how RNA behaves in the cell. The environment of the cytoplasm is very different from the environment typically used for in vitro experiments. In cells, the free concentrations of monovalent and divalent ions are much lower, around 140 mM K+ (14), 10 mM Na+ (15, 16), and 0.5-2.0 mM Mg2+ (17-21); additionally, there is an abundance of small molecules and cosolutes, as well as macromolecules that contribute 20%-40% crowding (22). The negative phosphate backbone of RNA, as well as the bases and sugars, can interact with these species, affecting its folding and function. Much of the progress made on understanding RNA structure in vivo comes from structure probing methods that sample the average RNA structure. These experiments can be done in a transcript-specific or a genome-wide manner to reveal a snapshot of structure. Structure probing experiments have revealed that certain RNAs tend to fold differently or be less structured in vivo than in vitro or in silico (23). One study from our lab comparing Arabidopsis RNA structure in vivo and in silico observed that mRNAs with the most difference in structure (bottom 5%) were enriched in annotations of biological function of stress and stimulus response, while the mRNAs with the least difference in structure (top 5%) were enriched in housekeeping functions (24). One possibility is that housekeeping RNAs have well-defined folds while stress-related RNAs have ill-defined folds or adopt many folds. It is likely that a range of factors in vivo contribute to RNA structure.

Since some RNAs have been shown to fold and function differently under cellular conditions, why not study RNA solely in living cells instead of in vitro conditions? The reality is that methods for directly studying RNA folding in vivo are inherently limited, and most current in vivo approaches rely on structure probing methods that do not probe RNA thermodynamics or folding pathways. Experiments performed in vivo provide information only on the average RNA structure in a cell or organism and lack information on RNA dynamics, the folding process (25), and the presence of multiple populated structures of the same transcript. These limitations motivate in vivo-like studies to understand the influence of cellular conditions on RNA folding.

1.3 Designing in vivo-like conditions

As mentioned in the previous section, the dilute solution conditions traditionally used to study RNA in vitro are vastly different from the cellular environment (Figure 1-3). The cellular environment is a complex solution containing biopolymers, metabolites, dilute free salts, and organelles, with up to 40% of the cellular volume occupied by macromolecular crowders (26, 27). As such, there is no single cellular environment to which RNA is exposed. As an mRNA passes from the nucleus to the cytosol, solution conditions change; in eukaryotes, the cell is compartmentalized and as the RNA is transported to different regions its fold can change.

It is of interest to consider the differences between RNA structure in eukaryotic and prokaryotic organisms. Functional RNAs have intricate structures with tertiary contacts that bring secondary structures close together in space. Cations, typically Mg2+, neutralize the negative charge of the phosphate backbone and promote tertiary structures. Free Mg2+ concentrations in prokaryotic and eukaryotic cells are lower than those used in most in vitro studies, and are somewhat different at ~1.5-2.0 mM and 0.5-1.0 mM, respectively (17, 18, 20, 21). Structured RNAs, such as ribozymes, riboswitches, and thermosensors, are found predominantly in prokaryotes, where the free Mg2+ levels are ~2-fold higher. Although a few ribozymes and one riboswitch have been identified in eukaryotes, they appear to be rare, and proteins are typically involved in forming requisite tertiary structures (28-30). Lambowitz and co-workers demonstrated that prokaryotic group II introns fold poorly in eukaryotic cells, although they could select variant RNAs that fold into active conformations at the low Mg2+ concentrations of eukaryotic cells (18). Thermodynamic studies cannot, however, readily be performed in vivo. The cell prohibits wide variations of temperature, pH, salt, and ligand concentration, all of which are necessary to obtain thermodynamic information, although studies in poikilotherms, such as plants and , offer promise. As a result, RNA is being increasingly studied in artificial cytoplasms that mimic aspects of the cellular environment while allowing biophysical studies.

1.4 Influence of in vivo-like conditions on RNA folding and function

Several recent studies have focused on mimicking aspects of the in vivo environment in vitro, conditions referred to herein as ‘in vivo-like’ conditions (Figure 1-3) (31-36). Effects of such conditions as cellular concentrations of monovalent and divalent ions and molecular crowding agents on the folding of RNAs have been a theme in a number of such studies. Experiments under these in vivo-like conditions have the potential to bridge our understanding of observations made in vitro and in vivo. Several studies have shown that synthetic crowding agents affect the thermodynamics and function of several RNAs (32, 33, 37-39). Findings of these studies are that RNAs fold cooperatively, structure becomes compact, and ribozymes cleave faster under in vivo-like conditions (33, 39-41). The kinetics of several small and large ribozymes have been probed under in vivo-like conditions and in all reported cases, rates of catalysis have increased in the presence of molecular crowders as compared to dilute solution conditions (31, 36, 40-42). For example, the hammerhead ribozyme has higher catalytic activity, 3.5- to 6.5-fold faster in the presence of 10%–30% (wt %) PEG200 or PEG8000 than in dilute solutions, suggesting a more populated active state in crowded conditions (41). In addition, in vivo-like solution conditions can favor the native state of ribozymes even in the presence of denaturants. For example, the rate of catalysis of the CPEB3 ribozyme in the presence of 2.5 M of the denaturant urea was recovered by the addition of 30% (w/v) PEG200, PEG8000, or Dextran10, at a rate higher than in buffer alone (40). Observation of increased hammerhead catalytic activity, up to 270-fold, at high temperatures in crowded conditions indicates a more thermostable RNA under in vivo-like conditions (41).

There is a plethora of small molecules in cells that can interact with the phosphate backbone, bases, and metal ions that help RNA fold. These interactions can affect the folding and function of RNAs. While molecular crowding agents generally facilitate the folding of functional RNAs, small cosolutes have varying and complicated effects on RNA thermostability and folding cooperativity. This arises in part because the effect on stability depends strongly on the particular cosolute and RNA considered. For instance, folding studies on several RNAs with secondary and/or tertiary structures report that cosolutes such as betaine, proline, and methanol almost always destabilize secondary structures, while having mixed effects on tertiary structure (38, 43-45). Stabilizing osmolytes have unfavorable interactions with the unfolded state of RNA, resulting in RNA compaction that buries functional groups and stabilization of the native state, while destabilizing osmolytes have favorable interactions with the unfolded state of RNA, driving unfolding (Figure 1-4) (38, 43, 46). Recent work has shown that free amino acids are present at over 100 mM in cells. These amino acids weakly chelate Mg2+ and interact with RNA in a favorable manner, helping to stabilize the folded state and increase catalytic activity (47).

Figure 1-4. The effects of additives on RNA folding equilibrium. (A) Equilibrium between folded and unfolded without additives. (B) The equilibrium is shifted toward the folded state with stabilizing additives (blue) and (C) shifted toward the unfolded state with destabilizing additives (red).
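The shift in the folding equilibrium caused by stabilizing or destabilizing additives can be sketched numerically. The example below is only an illustration and is not a model used in this work: it assumes two-state folding and a folding free energy that varies linearly with additive concentration (an m-value-style dependence), with a sign convention chosen here. The function name and all parameter values are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def folded_fraction_with_additive(dg_fold_0, m_value, conc_molar, temp_c=37.0):
    """Two-state fraction folded with a linear additive dependence.

    dg_fold_0  -- folding free energy without additive, kcal/mol
    m_value    -- change in folding free energy per molar additive,
                  kcal/(mol*M); negative stabilizes, positive destabilizes
                  (sign convention assumed for this sketch)
    conc_molar -- additive concentration in mol/L
    temp_c     -- temperature in degrees Celsius
    """
    t = temp_c + 273.15
    dg_fold = dg_fold_0 + m_value * conc_molar
    k_fold = math.exp(-dg_fold / (R_KCAL * t))
    return k_fold / (1.0 + k_fold)


if __name__ == "__main__":
    # Hypothetical values: a marginally stable fold and 1 M additive.
    for label, m in [("no additive", 0.0), ("stabilizing", -1.0), ("destabilizing", 1.0)]:
        print(label, round(folded_fraction_with_additive(-0.5, m, 1.0), 3))
```

A negative slope raises the folded population and a positive slope lowers it, mirroring panels (B) and (C) of Figure 1-4.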

1.5 Influence of cotranscriptional folding and RNA adaptation

Functional RNAs are typically studied as the minimal sequence needed to function. The RNA is normally transcribed in vitro and renatured prior to the experiment to ensure the RNA is populating the functional state. However, in cells, RNAs are transcribed with extensions on the 5’ and 3’ ends, and the RNA folds as it emerges from the polymerase (Figure 1-5) (48, 49). The effects of flanking sequences on aptamer folding and function are rarely considered. One study from our group showed that the HDV ribozyme cleaves faster when transcribed with native flanking sequences on the 5’ end (50). This suggests that flanking sequences could be important for function in cells.

Figure 1-5. Model of RNA cotranscriptional folding. As the RNA emerges from RNA polymerase it can fold and form complex structures.

Recently, more studies have considered how cotranscriptional folding affects structure formation. The vast majority of these experiments are performed on riboswitches, whose structure is changed in the presence of ligand (25, 51). These studies have shown that the functional state is more populated when the riboswitches are folded cotranscriptionally versus post-transcriptionally (51). In addition, the folding pathways of riboswitches are dictated by the presence of ligand during transcription (25). Thus, there is compelling reason to study the influence of cotranscriptional folding on RNA function.

It is also important to consider how RNA sequences adapt to extreme environments. There is no correlation between mRNA GC percent and organism growth temperature, but there is a positive correlation between ribosomal RNA GC percent and organism growth temperature (52). This suggests that functional RNAs from thermophiles are selected to have a high GC content in secondary structures to ensure stability at high temperatures. This could help them maintain structure, and therefore function, in extreme environments. It is necessary to consider how increases in GC percent affect folding pathways. Sequences with high GC content have been shown to get stuck in long-lived misfolded intermediates before refolding to the native state. The RNA field should consider these effects on folding and function of RNAs.
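As a concrete illustration of the quantity discussed above, GC percent can be computed directly from a sequence. The short Python sketch below is not from this work; the function name and the example stem sequences are hypothetical and serve only to show the calculation.

```python
def gc_percent(sequence):
    """Return the percentage of G and C among the A/C/G/U/T bases of a sequence."""
    bases = [b for b in sequence.upper() if b in "ACGUT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, U, or T bases")
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


if __name__ == "__main__":
    # Hypothetical 7-nt stem strands, not sequences from this work.
    for name, stem in [("AU-rich stem", "GCAUUUA"), ("GC-rich stem", "GCGGCCG")]:
        print(name, round(gc_percent(stem), 1))
```

Applied to a single helix strand rather than a whole transcript, the same calculation gives the kind of stem-level GC content that is relevant to secondary structure stability.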

1.6 Transfer RNA Maturation and Processing

Transfer RNA is a good model system to understand how small functional RNAs fold because it is similar in size to other functional RNAs and forms intricate secondary and tertiary interactions. Transfer RNA (tRNA) is one of the most abundant RNAs in cells, after ribosomal RNA, and there are thousands of copies of each tRNA transcript in cells (53). tRNA is one of the most important functional RNAs because it delivers amino acids to the ribosome for protein synthesis. Therefore, it is highly regulated, and there is a complex network of cellular machinery responsible for recognizing, fixing, and ultimately degrading misfolded or improperly modified transcripts. The tRNA transcript undergoes a series of modifications to both length and sequence before it is aminoacylated and delivers an amino acid to the growing peptide chain on the ribosome (Figure 1-6) (54).

Figure 1-6. Transfer RNA maturation and modification in yeast. Transfer RNAs are transcribed as precursors with flanking sequences on both the 5’ and 3’ ends of the core. After removal of the flanking sequences, the 3’ CCA motif is added and bases on the transcript are modified. After modifications the 3’ end is aminoacylated.

In the nucleus, the RNA is transcribed as a precursor with extensions on the 5’ and 3’ ends of the core. The first step of maturation is truncation of these ends using RNase P on the 5’ end, and either RNase Z or another exonuclease on the 3’ end (54). The 3’ end of the tRNA does not naturally contain the 3’ CCA motif. These nucleotides are added by nucleotidyl transferase after end-truncation (55). In yeast, the base modifications are made to the end-processed transcript, with around half occurring in the nucleus and half in the cytoplasm. Once modification in the cytoplasm has occurred, the tRNA is charged with an amino acid on the 3’ end and can deliver that amino acid to the ribosome, and that cycle can repeat.

1.7 Thesis Objectives

The main objective of this thesis is to improve the understanding of how the cellular environment influences RNA folding and function. The work herein can be used to predict how other RNAs will fold and function in the cellular environment, and it helps bridge the gap between the knowledge on RNA folding in vitro and in vivo. The chapters are all connected by the theme of understanding influences of the cellular environment on RNA folding.

In Chapter 2, the influence of the crowded environment on the folding of the secondary and tertiary structures of yeast tRNAphe is described. We find that in physiological crowded and ionic conditions, tRNA folds in a two-state manner without population of intermediates. Strong secondary structure is traditionally thought to drive RNA folding; however, in physiological conditions two-state folding occurs from destabilization of secondary structure and stabilization of tertiary structure. This work showed a novel mechanism for RNA folding in a very protein-like manner to the native state.

The findings in Chapter 2 were the inspiration for the work in Chapter 3. In Chapter 3, computational and experimental methods are used to investigate the effects of cellular crowding and small molecule conditions on yeast tRNAphe cotranscriptional folding. The results suggest that in physiological solutions, cotranscriptional folding intermediates are highly destabilized and form minimal structure until all secondary structure contacts can be made. Incorporation of a single nucleotide to complete the acceptor stem base-pairing dramatically increases the stability of the growing transcript, which results in cooperative folding to the native structure. The high level of structure control observed has implications for tRNA processing and recognition of misfolded transcripts.

In Chapter 4, the folding of tRNAphe sequences with high GC content in the acceptor stem was studied under physiological conditions. These sequences mimic those found in tRNAs from thermophiles. Using computational methods, the mechanism of tRNA thermophilic adaptation was found to involve strengthening of secondary structure rather than tertiary structure. These RNAs with highly stable secondary structure fold in a two-state manner only in physiological conditions.

The final chapter, Chapter 5, contains unpublished work centered on developing new methodologies for studying RNA thermodynamics in biologically relevant, but highly complex, cellular mimics and on discovering new RNA functions and testing their function in cellular conditions. A new graduate student in the Bevilacqua Lab is going to continue work on the project developing novel methods.

1.8 References

1. Osborne RJ & Thornton CA (2006) RNA-dominant diseases. Human Molecular Genetics 15(2):R162-R169.

2. Brion P & Westhof E (1997) Hierarchy and dynamics of RNA folding. Annu Rev Biophys Biomol Struct 26:113-137.

3. Banerjee AR & Turner DH (1995) The time dependence of chemical modification reveals slow steps in the folding of a group I ribozyme. Biochemistry 34:6504-6512.

4. Chadalavada DM, Senchak SE, & Bevilacqua PC (2002) The folding pathway of the genomic hepatitis delta virus ribozyme is dominated by slow folding of the pseudoknots. J Mol Biol 317(4):559-575.

5. Dill KA, et al. (1995) Principles of protein folding--a perspective from simple exact models. Protein Sci 4(4):561-602.

6. Tinoco IJ & Bustamante C (1999) How RNA folds. J Mol Biol 293:271-281.

7. Freier SM, et al. (1986) Improved free-energy parameters for predictions of RNA duplex stability. Proc Natl Acad Sci USA 83:9373-9377.

8. Herschlag D & Cech TR (1990) Catalysis of RNA cleavage by the Tetrahymena thermophila ribozyme. 1. Kinetic description of the reaction of an RNA substrate complementary to the active site. Biochemistry 29:10159-10171.

9. Baird NJ, Westhof E, Qin H, Pan T, & Sosnick TR (2005) Structure of a folding intermediate reveals the interplay between core and peripheral elements in RNA folding. J Mol Biol 352:712-722.

10. Mitchell DI & Russell R (2014) Folding pathways of the Tetrahymena ribozyme. J Mol Biol 426:2300-2312.

11. Solomatin SV, Greenfeld M, Chu S, & Herschlag D (2010) Multiple native states reveal persistent ruggedness of an RNA folding landscape. Nature 463:681-684.

12. Wan Y, Suh H, Russell R, & Herschlag D (2010) Multiple unfolding events during native folding of the Tetrahymena group I ribozyme. J Mol Biol 400:1067-1077.

13. Zarrinkar PP, Wang J, & Williamson JR (1996) Slow folding kinetics of RNase P RNA. RNA 2(6):564-573.

14. Feig AL & Uhlenbeck OC (1999) The role of metal ions in RNA biochemistry. The RNA world, 2nd ed., eds Gesteland RF, Cech TR, & Atkins JF (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York), pp 287-320.

15. Nagata S, Adachi K, Shirai K, & Sano H (1995) 23Na NMR spectroscopy of free Na+ in the halotolerant bacterium Brevibacterium sp. And . Microbiology (Reading, England) 141 (Pt 3):729-736.

16. Hirota N & Imae Y (1983) Na+-driven flagellar motors of an alkalophilic Bacillus strain YN-1. J Biol Chem 258(17):10577-10581.

17. Lusk JE, Williams RJ, & Kennedy EP (1968) Magnesium and the growth of Escherichia coli. J Biol Chem 243:2618-2624.

18. Truong DM, Sidote DJ, Russell R, & Lambowitz AM (2013) Enhanced group II intron retrohoming in magnesium-deficient Escherichia coli via selection of mutations in the ribozyme core. Proc Natl Acad Sci USA 110:E3800-E3809.

19. Alberts B, Bray D, Lewis J, Roberts K, & Watson JD (1994) Molecular biology of the cell 3rd ed.

20. London RE (1991) Methods for measurement of intracellular magnesium: NMR and fluorescence. Annu Rev Physiol 53:241-258.

21. Romani AM (2007) Magnesium homeostasis in mammalian cells. Front Biosci 12:308-331.

22. Zhou HX, Rivas G, & Minton AP (2008) Macromolecular crowding and confinement: Biochemical, biophysical, and potential physiological consequences. Annu Rev Biophys 37:375-397.

23. Rouskin S, Zubradt M, Washietl S, Kellis M, & Weissman JS (2014) Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo. Nature 505(7485):701-705.

24. Ding Y, et al. (2014) In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features. Nature 505:696-700.

25. Watters KE, Strobel EJ, Yu AM, Lis JT, & Lucks JB (2016) Cotranscriptional folding of a riboswitch at nucleotide resolution. Nat Struct Mol Biol 23(12):1124-1131.

26. Minton AP (2001) The influence of macromolecular crowding and macromolecular confinement on biochemical media. J Biol Chem 276:10577-10589.

27. Zimmerman SB & Trach SO (1991) Estimation of macromolecule concentrations and excluded volume effects for the cytoplasm of Escherichia coli. J Mol Biol 222(3):599-620.

28. Roth A, et al. (2014) A widespread self-cleaving ribozyme class is revealed by bioinformatics. Nat Chem Biol 10(1):56-60.

29. Salehi-Ashtiani K, Luptak A, Litovchick A, & Szostak JW (2006) A genomewide search for ribozymes reveals an HDV-like sequence in the human CPEB3 gene. Science 313:1788-1792.

30. Kubodera T, et al. (2003) Thiamine-regulated gene expression of Aspergillus oryzae thiA requires splicing of the intron containing a riboswitch-like domain in the 5'-UTR. FEBS Letters 555:516-520.

31. Desai R, Kilburn D, Lee H-T, & Woodson S (2014) Increased ribozyme activity in crowded solutions. J Biol Chem 289(5):2972-2977.

32. Dupuis NF, Holmstrom ED, & Nesbitt DJ (2014) Molecular-crowding effects on single-molecule RNA folding/unfolding thermodynamics and kinetics. Proc Natl Acad Sci USA 111(23):8464-8469.

33. Strulson CA, Boyer JA, Whitman EE, & Bevilacqua PC (2014) Molecular crowders and cosolutes promote folding cooperativity of RNA under physiological ionic conditions. RNA 20(3):331-347.

34. Nakano S-i, Kitagawa Y, Yamashita H, Miyoshi D, & Sugimoto N (2015) Effects of cosolvents on the folding and catalytic activities of the hammerhead ribozyme. ChemBioChem.

35. Tyrrell J, Weeks KM, & Pielak GJ (2015) Challenge of mimicking the influence of the cellular environment on RNA structure by PEG-induced macromolecular crowding. Biochemistry 54:6447-6453.

36. Paudel BP & Rueda D (2014) Molecular crowding accelerates ribozyme docking and catalysis. J Am Chem Soc 136:16700-16703.

37. Kilburn D, Roh JH, Guo L, Briber R, & Woodson S (2010) Molecular crowding stabilizes folded RNA structure by the excluded volume effect. J Am Chem Soc 132(25):8690-8696.

38. Lambert D, Leipply D, & Draper DE (2010) The osmolyte TMAO stabilizes native RNA tertiary structures in the absence of Mg2+: Evidence for a large barrier to folding from phosphate dehydration. J Mol Biol 404(1):138-157.

39. Kilburn D, Roh JH, Behrouzi R, Briber RM, & Woodson SA (2013) Crowders perturb the entropy of RNA energy landscapes to favor folding. J Am Chem Soc 135:10055-10063.

40. Strulson CA, Yennawar NH, Rambo RP, & Bevilacqua PC (2013) Molecular crowding favors reactivity of a human ribozyme under physiological ionic conditions. Biochemistry 52:8187-8197.

41. Nakano S-i, Karimata HT, Kitagawa Y, & Sugimoto N (2009) Facilitation of RNA enzyme activity in the molecular crowding media of cosolutes. J Am Chem Soc 131:16881-16888.

42. Strulson CA, Molden RC, Keating CD, & Bevilacqua PC (2012) RNA catalysis through compartmentalization. Nature Chemistry 4:941-946.

43. Lambert D & Draper DE (2007) Effects of osmolytes on RNA secondary and tertiary structure stabilities and RNA-Mg2+ ion interactions. J Mol Biol 370(5):993-1005.

44. Soto AM, Misra V, & Draper DE (2007) Tertiary structure of an RNA pseudoknot is stabilized by "diffuse" Mg2+ ions. Biochemistry 46(11):2973-2983.

45. Lambert D & Draper DE (2012) Denaturation of RNA secondary and tertiary structure by urea: Simple unfolded state models and free energy parameters account for measured m-values. Biochemistry 51:9014-9026.

46. Holmstrom ED, Dupuis NF, & Nesbitt DJ (2015) Kinetic and thermodynamic origins of osmolyte-influenced nucleic acid folding. J Phys Chem B.

47. Yamagami R, Bingaman JL, Frankel EA, & Bevilacqua PC (2018) Cellular conditions of weakly chelated magnesium ions strongly promote RNA folding, stability, and catalysis. Nat Comm Accepted.

48. Lai D, Proctor JR, & Meyer IM (2013) On the importance of cotranscriptional RNA structure formation. RNA 19(11):1461-1473.

49. Pan T & Sosnick T (2006) RNA folding during transcription. Annu Rev Biophys Biomol Struct 35(1):161-175.

50. Chadalavada DM, Knudsen SM, Nakano S-i, & Bevilacqua PC (2000) A role for upstream RNA structure in facilitating the catalytic fold of the genomic hepatitis delta virus ribozyme. J Mol Biol 301:349-367.

51. Uhm H, Kang W, Ha KS, Kang C, & Hohng S (2018) Single-molecule FRET studies on the cotranscriptional folding of a thiamine pyrophosphate riboswitch. Proc Natl Acad Sci USA 115(2):331-336.

52. Jegousse C, Yang Y, Zhan J, Wang J, & Zhou Y (2017) Structural signatures of thermal adaptation of bacterial ribosomal RNA, transfer RNA, and messenger RNA. PLOS ONE 12(9):e0184722.

53. Jakubowski H & Goldman E (1984) Quantities of individual aminoacyl-tRNA families and their turnover in Escherichia coli. Journal of Bacteriology 158(3):769-776.

54. Hopper AK (2013) Transfer RNA post-transcriptional processing, turnover, and subcellular dynamics in the yeast Saccharomyces cerevisiae. Genetics 194(1):43-67.

55. Hou Y-M (2010) CCA addition to tRNA: Implications for tRNA quality control. IUBMB Life 62(4):251-260.

Chapter 2

Cooperative RNA Folding Under Cellular Conditions Arises from both Tertiary Structure Stabilization and Secondary Structure Destabilization

Published as a paper entitled: “Cooperative RNA Folding Under Cellular Conditions Arises from both Tertiary Structure Stabilization and Secondary Structure Destabilization” by Kathleen A. Leamy, Neela H. Yennawar, and Philip C. Bevilacqua in Biochemistry 2017, 56 (3422). [All experiments were carried out by K.A.L. SAXS data were collected and analyzed with the help of N.H.Y. Experiments were planned by K.A.L. and P.C.B.]

2.1 Abstract

RNA folding has been studied extensively in vitro, typically under dilute solution conditions and abiologically high salt concentrations of 1 M Na+ or 10 mM Mg2+. These high salt concentrations typically favor folding to the native state; however, folding happens in a multi-state manner where the RNA often gets trapped in long-lived misfolded states. The cellular environment is very different, with 20-40% crowding and much lower concentrations of monovalent and divalent salts, around 10-40 mM Na+, 140 mM K+, and 0.5-2.0 mM Mg2+. As such, RNA structures and functions can be radically altered under cellular conditions. We previously reported that tRNAphe secondary and tertiary structures unfold together in a cooperative two-state fashion under crowded in vivo-like ionic conditions, but in a non-cooperative multi-state fashion under dilute in vitro ionic conditions unless in non-physiologically high concentrations of Mg2+. The mechanistic bases behind these effects remain unclear, however. To address the mechanism that drives RNA folding cooperativity, we probe effects of cellular conditions on structures and stabilities of individual secondary structure fragments comprising the full-length RNA. We elucidate effects of a diverse set of crowders on tRNA secondary structural fragments and full-length tRNA at three levels: at the nucleotide level by temperature-dependent in-line probing (ILP), at the tertiary structure level by small angle X-ray scattering (SAXS), and at the global level by thermal denaturation. We conclude that cooperative RNA folding is induced by two overlapping mechanisms: increased stability and compaction of tertiary structure through effects of Mg2+, and decreased stability of certain secondary structure elements through effects of molecular crowders. These findings reveal that, despite having very different chemical makeups, RNA and protein can both have weak secondary structures in vivo, leading to cooperative folding.

2.2 Introduction

RNA structure serves many roles in biology including catalysis, small molecule recognition, and gene regulation (1). High concentrations of monovalent (1 M Na+) (2, 3) and divalent salts (up to 50 mM Mg2+) (4, 5) have typically been used to study RNA folding in vitro. Rationales for these conditions are to force simple RNA duplexes to fold in a two-state-like fashion, which simplifies thermodynamic interpretation of data, or to induce catalytic RNAs to fully adopt a tertiary structure. However, the cellular environment is very different. In cells, the predominant monovalent ion is K+ and its concentration is only ~140 mM (6); Mg2+ concentrations are low at just 0.5-1.0 mM in eukaryotic cells and 1.5-2.0 mM in prokaryotic cells (7-11), and there is an estimated 10-40 mM Na+ (12, 13). There are also membranous and non-membranous compartments that concentrate RNA and an estimated 20%–40% molecular crowding in the cytoplasm (14, 15).

There are manifold reasons to study RNA folding under cellular-like conditions. First, genome-wide studies that probe RNA structure in cells have shown that certain classes of RNAs adopt very different structures in vivo than predicted in silico or measured in vitro (16), and there appears to be less extensive RNA structure in vivo than in vitro (17). Likewise, studies in model cytoplasms indicate that in solutions that mimic the cellular environment, certain RNAs can form more native-like structures, have improved functions, and fold through different pathways (15, 18-20). For instance, the adenine riboswitch adopts a more stable and compact structure under cellular conditions (18); a small group I intron with lower flexibility has a larger free energy gap for folding in Mg2+ (19); the CPEB3 ribozyme cleaves with a faster rate and is stable, even under semi-denaturing conditions, in cellular conditions (21); and tRNA folds in a two-state, cooperative manner when in the background of cellular mimics (20).

Functional RNAs typically fold in a hierarchical manner, with secondary structures forming first and tertiary structures forming after (22). This order is necessitated by the global architecture of RNA, wherein secondary structures provide a structural framework, and tertiary structures typically assemble out of that framework. Free energy parameters for RNA folding in the presence of 1 M NaCl have been measured for many model duplexes and are exceptionally favorable. For instance, 4-5 base pair RNA helices can have folding free energies of ≤ –10 kcal/mol. Such strong secondary structures have been suggested to drive RNA folding (23), which led to the notion that RNAs fold in a non-cooperative manner wherein weak tertiary structures unfold at lower temperatures and strong secondary structures at higher temperatures.
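To give a sense of scale for a folding free energy of –10 kcal/mol, the standard relation between free energy and equilibrium constant can be applied. The worked example below assumes a temperature of 37 °C (310.15 K), which is an assumption made here for illustration, and treats helix formation as a simple two-state equilibrium.

```latex
K_{\mathrm{fold}} = e^{-\Delta G^{\circ}_{\mathrm{fold}}/RT},
\qquad
RT = (1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}})(310.15\ \mathrm{K})
\approx 0.616\ \mathrm{kcal\,mol^{-1}}

\Delta G^{\circ}_{\mathrm{fold}} = -10\ \mathrm{kcal\,mol^{-1}}
\;\Rightarrow\;
K_{\mathrm{fold}} = e^{10/0.616} \approx e^{16.2} \approx 1\times10^{7}
```

Under these assumptions, the folded helix outnumbers the unfolded strand by roughly ten million to one, which illustrates why such secondary structures are expected to persist to higher temperatures than weaker tertiary contacts.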

Secondary structures of RNA may not be as stable in vivo as they are in vitro.

Indeed, RNA and DNA secondary structures have been reported to be destabilized in the presence of high concentrations of crowders and small osmolytes similar to small molecules that can be found in cells. For example, in 20% crowding agents and cosolutes, both DNA oligomer duplexes and hammerhead ribozyme secondary structures have a decrease in melting temperature (24, 25). In the case of the hammerhead ribozyme, this decrease is accompanied by an increase in the rate of self-cleavage (25). In addition, diverse small osmolytes have been reported to destabilize nearly all RNA secondary structures while having mixed effects on the thermostability of tertiary structures (26).

The above experiments, while leading to keen insight into RNA stability in crowded solutions, were performed at non-physiological concentrations of divalent ions, monovalent ions, or small osmolytes; moreover, these studies did not focus on RNA folding cooperativity.

In the protein-folding community, folding cooperativity is often defined as the coupling of secondary and tertiary structure unfolding, and we make an effort to understand RNA folding cooperativity in these terms herein. We differentiate this definition of cooperativity from the important but separate viewpoint of folding cooperativity amongst various RNA tertiary structural elements (27, 28). Woodson and co-workers showed that RNA tertiary structure can couple with secondary structure and guide proper formation of secondary structure. While pioneering, these studies were conducted in the absence of physiological concentrations of monovalent ions, crowders, or cosolutes (29). Recent studies from our lab showed that under in vivo-like conditions that mimic cellular crowding and ionic conditions, functional RNAs unfold in a cooperative all-or-none-like fashion in which tertiary and secondary structures unfold together (20).

However, the contributions of secondary structure destabilization and tertiary structure stabilization to the cooperative unfolding in RNA under in vivo-like conditions remain unclear.

Although we are strong proponents of studying RNA folding directly in vivo, it is very difficult to perform thermodynamic studies in living cells (16, 30). To complement our direct in vivo studies, we have adopted a detailed thermodynamic and structural study of RNA folding in model cytoplasms that mimic the crowding and ionic conditions of the cellular environment (31). We report here studies on full-length yeast tRNAphe and its secondary structural fragments under in vivo-like conditions, which reveal that tRNA folds in a cooperative manner owing to a combination of destabilizing secondary structures and stabilizing tertiary structure. We previously examined the folding of FL tRNA under solution conditions similar to typical in vitro environments, as well as solution conditions that emulate the cellular crowding and ionic conditions of the cell (20). In the presence of physiological Mg2+ concentrations, the RNA folded in a simple two-state manner with molecular crowding, but in a complex multi-state manner in dilute solution.

Folding cooperativity, defined here as the simultaneous unfolding of tertiary and all secondary RNA structure, could arise from stabilization of tertiary structure, destabilization of secondary structures, or both. We test these models by mechanistic investigations of the helical fragments (HFs) that comprise the secondary structure of tRNA.

2.3 Results and Discussion


2.3.1 Effects of Mg2+ and Crowder are Similar for WT and FL tRNA. The folding properties of wild-type (WT), which has the natural modifications, and T7-transcribed full-length (FL) tRNAphe (Figure A-1) were compared in different concentrations of Mg2+ and crowder in the background of physiological 140 mM K+ and 10 mM Na+. We used optical melting to compare the unfolding of WT and FL tRNA. Notably, WT and FL tRNA have the same tertiary structure and rate of aminoacylation (32).

WT and FL tRNA have similar folding properties in buffer and crowded conditions at physiological Mg2+ (Figure A-2). In buffer as the concentration of Mg2+ is increased from 0 to 2.0 mM, both RNAs fold more cooperatively. This is seen as sharpened melting transitions that occur at higher temperature, albeit with higher amplitudes for WT tRNA.

In 20% PEG8000, similar behavior of WT and FL tRNA is observed, as indicated by the amplitudes and TMs of the transitions. Given the modest effect of nucleotide modifications on tRNA folding and function, we conducted studies on unmodified tRNA and its fragments.

2.3.2 Cooperative tRNA Folding Can be Induced by Mg2+-Driven Tertiary Structure Stabilization. To separate out the effects of physiological conditions on RNA secondary and tertiary structures, the FL tRNA and the four secondary structure HFs that comprise tRNAphe (the acceptor stem, D stem-loop, anticodon stem-loop, and TΨC stem-loop) were prepared as separate model RNA hairpins (Figure 2-1). Each HF is designed from the parent tRNA, with the exception of the acceptor stem, where a stretch of 8 Us was inserted into the loop to create a hairpin structure. In this and the following section, these HFs were melted in the presence of 140 mM K+ and 10 mM Na+ with either 0, 0.5, or 2.0 mM Mg2+, with or without various crowding agents, and their absorbance curves were summed together to give the “sum of the secondary structure fragments” (SSS) (see Materials and Methods in Appendix A). These are compared to the melting of FL tRNA under identical solution conditions. Melting temperatures are compiled in Tables A-1 to A-3.
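To illustrate how a summed fragment curve and a derivative-based TM can be obtained, the following is a minimal Python sketch. It is not the analysis code used in this work (the actual procedure is described in Appendix A). It assumes that every melt was recorded on the same temperature grid, sums the absorbance curves point by point, and takes the temperature at the maximum of a central-difference first derivative as an apparent TM. All function names and the synthetic logistic melts are hypothetical.

```python
import math


def sum_melts(curves):
    """Sum absorbance curves point by point; all curves must share one temperature grid."""
    n_points = len(curves[0])
    if any(len(curve) != n_points for curve in curves):
        raise ValueError("all melts must be sampled at the same temperatures")
    return [sum(values) for values in zip(*curves)]


def first_derivative(temps, absorbance):
    """Central-difference dA/dT at the interior temperature points."""
    inner_temps, slopes = [], []
    for i in range(1, len(temps) - 1):
        inner_temps.append(temps[i])
        slopes.append((absorbance[i + 1] - absorbance[i - 1]) / (temps[i + 1] - temps[i - 1]))
    return inner_temps, slopes


def apparent_tm(temps, absorbance):
    """Temperature at which the first-derivative curve is largest."""
    inner_temps, slopes = first_derivative(temps, absorbance)
    return inner_temps[slopes.index(max(slopes))]


def synthetic_melt(temps, midpoint, width=3.0):
    """Hypothetical logistic melt, used only to exercise the functions above."""
    return [1.0 / (1.0 + math.exp(-(t - midpoint) / width)) for t in temps]


if __name__ == "__main__":
    temps = [float(t) for t in range(20, 96)]
    # Four made-up fragment melts standing in for the four model HFs.
    fragments = [synthetic_melt(temps, tm) for tm in (55.0, 60.0, 62.0, 68.0)]
    sss = sum_melts(fragments)
    print("apparent TM of the summed curve:", apparent_tm(temps, sss))
```

A broad summed transition of this kind, compared with a sharp FL tRNA transition, is what the derivative curves in this section are used to distinguish.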

Figure 2-1. Structures of FL yeast tRNAphe and its model helical fragments (HF) with their predicted folds. (A) Three-dimensional folded structure of yeast FL tRNAphe (PDB 1ehz) and (B) FL tRNA secondary structure and the model HF derived from each stem. In panel (A) Mg2+ (grey) and Mn2+ (black) ions associated with tRNA are shown as spheres. The colors of each structural element are the same in both panels. FL tRNA tertiary contacts, superimposed on the secondary structure, are provided in Figure A-1. The sequences of the model HF are the same as in FL tRNAphe and are predicted to form the shown hairpins by the mFold server for each model sequence.

In dilute buffer-only conditions and 2.0 mM Mg2+, FL tRNA folded cooperatively and at a higher TM, while secondary structure stability remained largely unchanged (Figure 2-2A). For instance, in buffer as the concentration of Mg2+ is increased from 0 to 2.0 mM, FL tRNA is stabilized by ~12 °C, yet SSS is stabilized by only ~4 °C (Tables A-1 to A-3). Since the cooperative transition encompasses both secondary and tertiary structure unfolding and because secondary structure is largely unchanged in stability, tertiary structure must be stabilized. In other words, Mg2+ is primarily stabilizing tertiary structure rather than secondary structure in buffer. The stabilization of RNA structures by Mg2+ in buffer is a well-studied phenomenon (33).

Figure 2-2. First derivative curves of thermal denaturation experiments of FL tRNA and SSS under physiological ionic and crowded conditions. First derivative curves of thermal denaturation experiments in (A) 0% PEG 200, (B) 20% PEG 200 or PEG 8000, and (C) 40% PEG 200 or PEG 8000 with increasing concentrations of Mg2+ from 0 to 0.5 to 2.0 mM (light to dark colors). The TMs are provided in Tables A-1, A-2 and A-3. Solutions contain a background of 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl.

This same phenomenon of Mg2+-driven tertiary structure stabilization holds in the background of 20 or 40% of crowder (Figure 2-2B, C). In particular, in 20% PEG200, the melting transition of SSS remained broad and the TM increased by only 4 °C, as the concentration of Mg2+ is increased from 0 to 2.0 mM (Figure 2-2B). Under the same crowded conditions, FL tRNA has a sharp transition in the derivative curve and an increase in thermostability of 12 °C upon raising the Mg2+ concentration. The large stabilization of FL tRNA compared to the secondary structures in cellular conditions indicates that Mg2+ stabilizes tertiary structure even in the presence of crowder. In 40% PEG200, similar effects are observed (Figure 2-2C). In sum, low millimolar concentrations of free Mg2+, like those found in vivo, strongly stabilize RNA tertiary structure under in vivo-like monovalent and crowding conditions.

To test if the effects are peculiar to PEG200, additional crowders were tested (Figures 2-2, 2-3, Tables A-1 to A-3). In general, at 0 mM Mg2+, lower molecular weight crowders, such as PEG200 and PEG4000, tend to destabilize secondary and tertiary structures, and larger crowders, such as PEG8000 and PEG20000, tend to stabilize those structures (Figure 2-3A). These observations are unique to 0 mM Mg2+, which is non-physiological. We thus examined physiological Mg2+ and crowding effects. Similar to PEG200, cooperativity of FL tRNA is induced by the addition of Mg2+ in 20 and 40% PEG8000, while SSS stability is largely unaffected (Figure 2-2B, 2-2C). Furthermore, at 0.5 and 2.0 mM Mg2+ almost all crowding agents (PEG200, PEG4000, PEG8000, and PEG20000) increase the stability of FL tRNA, resulting in a higher melting temperature than in buffer and more cooperative folding (Figure 2-3B, 2-3C), indicating that the effect is not peculiar to one crowding agent.


Figure 2-3. Difference in TM of FL tRNA, SSS, and each of the four model HF in crowder compared to buffer. TM in crowder minus TM in buffer in (A) 0, (B) 0.5, and (C) 2.0 mM Mg2+. In all panels TM differences in 20% PEG200, 40% PEG200, 20% PEG4000, 40% PEG4000, 20% PEG8000, 40% PEG8000, and 20% PEG20000 are in orange, red, light blue, dark blue, light green, dark green, and purple, respectively. The TMs are shown on the plot and in Tables A-1, A-2 and A-3. Solutions contain a background of 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl.

2.3.3 Cooperative tRNA Folding Can be Induced by Crowder-Driven Secondary Structure Destabilization. The previous section found that Mg2+ can stabilize RNA tertiary structure in buffer conditions, a phenomenon that is well known, as well as in crowded conditions. Increasing Mg2+ up to physiological concentrations led to cooperative folding because the stability of the underlying secondary structural framework was largely unaffected by Mg2+. Another way to induce cooperative folding of RNA, at least in principle, is by weakening secondary structure. Ken Dill defines protein folding cooperativity as destabilization of folding intermediates. We therefore measured effects of various crowders on the folding of FL tRNA and the SSS in different background Mg2+ concentrations.


Figure 2-4. Cooperative folding can be induced through crowder-driven secondary structure destabilization. First-derivative curves of thermal denaturation experiments of FL tRNA and SSS with increasing concentrations of (upper) PEG 200 and (lower) PEG 8000 in (A) 0, (B) 0.5, and (C) 2.0 mM Mg2+. The TMs are provided in Tables A-1, A-2 and A-3. Solutions contain a background of 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl.

Increasing the amount of crowding agent at constant physiological Mg2+ induced more cooperative folding of FL tRNA (Figure 2-4B, 2-4C). Remarkably, the observed increase in FL tRNA cooperativity is achieved by decreasing the stability of the secondary structures (Figures 2-3, 2-4, Tables A-1 to A-3). When the concentration of PEG200 is increased from 0 to 40%, the SSS TM decreased by almost 14 °C in 0.5 mM Mg2+ (Figure 2-4B) and 12 °C in 2 mM Mg2+ (Figure 2-4C), yet the overall stability of FL tRNA is largely unchanged. Because the cooperative transition encompasses both secondary and tertiary structure unfolding and because secondary structure is destabilized, tertiary structure must be stabilized.

Similar effects are observed when the concentration of PEG8000 is increased from 0 to 40% in 0.5 mM Mg2+ (Figure 2-4B) and 2.0 mM Mg2+ (Figure 2-4C), although destabilization of secondary structure is less dramatic. In 0.5 mM Mg2+, the addition of 20% PEG8000 decreased SSS TM by 5 °C, and addition of 40% PEG8000 decreased SSS TM by 6 °C. The weakening of SSS in PEG8000 is accompanied by an increase in tertiary structure stability and cooperative folding of FL tRNA. In 2.0 mM Mg2+, secondary structure is still destabilized by the addition of PEG8000, albeit less than that observed in 0.5 mM Mg2+, and tertiary structure is stabilized. PEG4000 and PEG20000 show similar effects (Figure 2-3). These findings suggest that the mechanism for cooperative folding changes with the molecular weight of the crowder. For instance, crowders with low MW induce cooperativity mainly through secondary structure destabilization, while higher MW crowders induce cooperativity through both secondary structure destabilization and tertiary structure stabilization as depicted in Figure 2-4. A general destabilization of RNA secondary structure under in vivo conditions is consistent with genome-wide data comparing in vivo and in vitro data sets.

We were curious as to how crowding agents could act on specific secondary structures. We found that crowding agents have differing effects on the stability of the model HFs depending on the number of base pairs. For instance, in 0.5 mM Mg2+ the short HF models of the anticodon (AC), D, and TΨC SLs are all destabilized by the addition of 20 or 40% PEG200, PEG4000, PEG8000, and PEG20000 (Figure 2-3B). These SL models contain only 4 or 5 base pairs. On the contrary, the addition of these crowders, with the exception of PEG200, marginally stabilizes the acceptor SL model, which is a longer hairpin comprised of 7 base pairs. Similarly, in 2.0 mM Mg2+, the addition of the above concentrations of crowders destabilizes all HFs (Figure 2-3C), and the shorter SL structures are more affected, specifically the AC SL and D SL. Similar observations have been made using DNA and RNA model helices (24, 25), in which cellular-like conditions have bigger effects on helices with fewer base pairs.

2.3.4 Cooperative and Non-Cooperative Folding are Observed on the Nucleotide Level. The previous two sections showed that cooperative folding of FL tRNA arises both from tertiary structure stabilization and secondary structure destabilization. According to these observations, in a cooperative folding environment all the nucleotides involved in secondary structure and tertiary structure interactions should be unfolding over approximately the same temperature range. To determine if FL tRNA is folding in a two-state manner at the nucleotide level, we turned to a technique with nucleotide resolution.

Temperature-dependent in-line probing (ILP) was carried out on 5’-end labeled FL tRNA in both buffer and model cytoplasms. FL tRNA was incubated at 12 temperatures between 35 and 75 °C, and the length of incubation at each temperature was chosen to achieve even RNA degradation across temperatures (see Materials and Methods in Appendix A).

When fractionated on a sequencing gel, the normalized ILP reactivities follow the cloverleaf structure of tRNA, with higher reactivity observed at single-stranded than double-stranded regions (Figure A-3, A-4). Normalized ILP reactivity for FL tRNA was plotted versus temperature and fit to a two-state unfolding model for individual nucleotides, helices, or FL tRNA as appropriate at both 0.5 (Figures 2-5, 2-6) and 2.0 mM Mg2+ (Figures A-5, A-6).
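As a rough illustration of what a two-state fit of reactivity versus temperature involves, the Python sketch below fits a generic van't Hoff two-state model to a synthetic data set sampled at 12 temperatures between 35 and 75 °C. This is an assumption-laden sketch rather than the fitting procedure used here (the fits in this work were carried out as described in Appendix A): it assumes a temperature-independent unfolding enthalpy and flat folded and unfolded baselines, and it uses a simple grid search in which the two baselines are solved by linear least squares at each grid point. All names, grid ranges, and the synthetic data are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(temp_c, tm_c, dh_kcal):
    """Two-state fraction unfolded from a van't Hoff equilibrium constant."""
    temp_k, tm_k = temp_c + 273.15, tm_c + 273.15
    ln_k = -(dh_kcal / R_KCAL) * (1.0 / temp_k - 1.0 / tm_k)  # ln K = 0 at the TM
    return 1.0 / (1.0 + math.exp(-ln_k))


def two_state_signal(temp_c, tm_c, dh_kcal, folded, unfolded):
    """Signal (for example, normalized reactivity) with flat baselines (an assumption)."""
    return folded + (unfolded - folded) * fraction_unfolded(temp_c, tm_c, dh_kcal)


def fit_two_state(temps_c, signal):
    """Grid search over TM and dH; baselines come from linear least squares."""
    n = len(temps_c)
    mean_y = sum(signal) / n
    best = None
    for tm_tenths in range(350, 751):          # TM from 35.0 to 75.0 in 0.1 steps
        tm_c = tm_tenths / 10.0
        for dh_kcal in range(20, 155, 5):      # dH from 20 to 150 kcal/mol
            f = [fraction_unfolded(t, tm_c, dh_kcal) for t in temps_c]
            mean_f = sum(f) / n
            spread = sum((x - mean_f) ** 2 for x in f)
            if spread == 0.0:
                continue
            slope = sum((x - mean_f) * (y - mean_y) for x, y in zip(f, signal)) / spread
            folded = mean_y - slope * mean_f
            sse = sum((folded + slope * x - y) ** 2 for x, y in zip(f, signal))
            if best is None or sse < best["sse"]:
                best = {"tm": tm_c, "dh": float(dh_kcal), "folded": folded,
                        "unfolded": folded + slope, "sse": sse}
    return best


if __name__ == "__main__":
    temps = [35.0 + i * 40.0 / 11.0 for i in range(12)]  # 12 temperatures, 35 to 75
    data = [two_state_signal(t, 60.4, 50.0, 0.1, 1.0) for t in temps]  # synthetic
    print(fit_two_state(temps, data))
```

Fitting each nucleotide separately gives one TM per nucleotide, whereas fitting all base-paired nucleotides of a stem (or of the whole RNA) with one shared TM corresponds to the helical and global fits discussed below.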

Figure 2-5. Nucleotide and helical stem fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 0.5 mM Mg2+. Nucleotide fits and helical fits were performed on buffer and 20% PEG200 samples, respectively, to obtain a TM for unfolding of each nucleotide or each stem in (A) D SL, (B) AC SL, and (C) TΨC SL. The TM values and residuals from the fits are provided in each figure and in Table 2-1.


Figure 2-6. Global fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 0.5 mM Mg2+. The same ILP data from Figure 2-5 were fit globally across each stem to obtain a single TM for the RNA and to look for two-state behavior. The TM values and residuals from the fits are provided in each figure and in Table 2-1.

For FL tRNA in the background of 0.5 mM Mg2+ and uncrowded conditions, non-cooperative folding is observed at the nucleotide level (Figure 2-5). Some of the base-paired nucleotides could not be fit to standard melting equations. For those base-paired nucleotides exhibiting two-state unfolding behavior, we fit them to obtain a TM. Varying stability is observed within and between each HF at 0.5 mM Mg2+. For example, the D SL (FL) unfolds between 55 °C and 62 °C (Figure 2-5A, LH; Table 2-1) and the AC SL (FL) unfolds between 63 °C and 64 °C, while the TΨC stem-loop is unstable. As might be expected, ILP data of different HF were fit poorly globally (Figure 2-6, LH), as indicated by a high error on the fit TM, a large χ2 of 10.3, and large residuals for the fit. The varying stability of the nucleotide fits and the large error on the global fit in 0.5 mM Mg2+ without crowder are indicative of non-cooperative folding.
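For readers less familiar with the statistic, the short Python sketch below shows one common way a goodness-of-fit χ2 is computed, here normalized by the degrees of freedom (a reduced χ2). Whether the χ2 values reported in this chapter are reduced, and how the measurement uncertainties were estimated, is not specified in this section, so both are assumptions of the sketch; the function name and numbers are hypothetical.

```python
def reduced_chi_squared(observed, predicted, sigma, n_params):
    """Sum of squared, uncertainty-weighted residuals divided by degrees of freedom."""
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    chi2 = sum(((o - p) / s) ** 2 for o, p, s in zip(observed, predicted, sigma))
    return chi2 / dof


if __name__ == "__main__":
    # Made-up reactivities, model predictions, and uncertainties.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    sigma = [0.1, 0.1, 0.1, 0.1]
    print(reduced_chi_squared(observed, predicted, sigma, n_params=1))
```

Under this convention, values near 1 indicate residuals comparable to the measurement uncertainty, while values well above 1 indicate that a single shared TM does not describe the data.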

The parameters and conclusions made from optical melting of each HF and of each helix from ILP of FL in the background of 0.5 mM Mg2+ and uncrowded conditions largely agree with one another (Table 2-1). We consider the D SL, AC SL, and TΨC SL HFs in turn.

In buffer at 0.5 mM Mg2+, two distinct transitions are observed using optical melting for the D SL (model), a low temperature transition at 60.5 °C and a high temperature transition at 71.0 °C (Table 2-1); in agreement, two transitions were also observed for the D SL (FL) in the context of FL tRNA using ILP and nucleotide fitting, a low temperature transition around 55 °C and a high temperature transition around 62 °C (Table 2-1, Figure 2-5A). The AC SL (model) has an optical TM of 63.0 °C and a similar ILP TM of 62.8-64.3 °C (Table 2-1, Figure 2-5A). From optical melting in buffer at 0.5 mM Mg2+ the TΨC SL (model) has a TM of 67.8 °C, while in the context of FL tRNA using ILP the entire RNA melts out below this transition (Figure 2-5C, LH). The wide range of optical melt TMs of the model HFs and the poor global fit of the ILP data are supportive of non-cooperative folding in dilute and low Mg2+ conditions.

Table 2-1. Melting temperatures of FL tRNA and tRNA Helical Fragments as Determined by Optical Melting and ILP in Buffer and 20% PEG200 with 0.5 and 2.0 mM Mg2+.

Structure Element          TM (°C), Optical Melting     TM (°C), ILP

Buffer with 0.5 mM Mg2+
FL tRNA                    57.8                         58.8 ± 5.4 (a)
D Stem-Loop                (1) 60.5, (2) 71.0           54.7-62.3 (b)
Anticodon Stem-Loop        63.0                         62.8-64.2 (b)
TΨC Stem-Loop              67.8                         --

20% PEG200 with 0.5 mM Mg2+
FL tRNA                    57.0                         60.4 ± 1.1 (a)
D Stem-Loop                63.5                         63.6 ± 7.5 (c)
Anticodon Stem-Loop        52.3                         63.3 ± 0.9 (c)
TΨC Stem-Loop              63.5                         54.9 ± 0.7 (c)

Buffer with 2.0 mM Mg2+
FL tRNA                    65.1                         60.0 ± 0.6 (a)
D Stem-Loop                66.5                         61.0 ± 0.9 (c)
Anticodon Stem-Loop        59.7                         63.8 ± 1.0 (c)
TΨC Stem-Loop              62.0                         58.8 ± 1.0 (c)

20% PEG200 with 2.0 mM Mg2+
FL tRNA                    64.8                         59.4 ± 0.4 (a)
D Stem-Loop                61.6                         60.4 ± 1.0 (c)
Anticodon Stem-Loop        55.8                         63.2 ± 0.7 (c)
TΨC Stem-Loop              58.2                         58.6 ± 0.4 (c)

Solutions contain 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl. There is an additional 0.5 or 2.0 mM MgCl2, and 0% or 20% PEG200 as indicated in the table.
(a) Normalized ILP reactivities of all base-paired nucleotides were fit globally to find a single TM for the entire RNA (see Figs. 2-6 and A-6).
(b) Individual nucleotide TMs were obtained by fitting each base-paired nucleotide in each stem to obtain a single TM for each nucleotide (see Fig. 2-5, LH).
(c) Helical TMs were obtained by globally fitting base-paired nucleotides in each helical stem to obtain a single TM for each helical secondary structure fragment (see Figs. 2-5, RH and A-5).
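The crowder-induced shifts implied by Table 2-1 can be tabulated directly. The Python sketch below computes TM in 20% PEG200 minus TM in buffer for the optical melting values at 2.0 mM Mg2+ listed in the table; the dictionary layout and function name are hypothetical and are used only for illustration.

```python
# Optical melting TMs (°C) at 2.0 mM Mg2+ from Table 2-1: (buffer, 20% PEG200).
OPTICAL_TM_2MM_MG = {
    "FL tRNA": (65.1, 64.8),
    "D Stem-Loop": (66.5, 61.6),
    "Anticodon Stem-Loop": (59.7, 55.8),
    "TΨC Stem-Loop": (62.0, 58.2),
}


def delta_tm(table):
    """TM in crowder minus TM in buffer, rounded to 0.1 °C."""
    return {name: round(crowded - buffer, 1) for name, (buffer, crowded) in table.items()}


if __name__ == "__main__":
    for name, shift in delta_tm(OPTICAL_TM_2MM_MG).items():
        print(f"{name}: {shift:+.1f} °C")
```

With these inputs the FL tRNA shift is a few tenths of a degree, while each model HF shifts downward by several degrees, which is the pattern the text attributes to crowder-driven secondary structure destabilization.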

For FL tRNA in the same background of 0.5 mM Mg2+ but with 20% PEG200, cooperative folding is now observed at the nucleotide level. Global fitting of ILP data within each helical stem reveals that all HFs unfold with a single transition with TMs clustered between 55 and 64 °C depending on HF (Figure 2-5, RH), which is similar to the single optical melting transition of FL tRNA at 57.0 °C (Table 2-1). Furthermore, global fitting of ILP data across all stems reveals a single transition of 60.4 °C (Figure 2-6, RH). The quality of this global fit of the data is evidenced by the low error of the fit, low residuals, and low χ2 of 0.99. Agreement in the TM between the helical level fits of stems and overall, as well as their agreement with the optical melting TM of FL tRNA, strongly supports cooperative unfolding.

Next we consider 2.0 mM Mg2+ data. Optical melting studies on FL tRNA showed that cooperative folding of FL tRNA is induced in buffer containing 2.0 mM Mg2+ (Figure 2-2A) and that the addition of a crowding agent increases the cooperative nature of folding (Figure 2-2B, 2-2C). The same trend is observed with temperature-dependent ILP in these conditions. In both uncrowded and 20% PEG200 conditions in the background of 2.0 mM Mg2+, helical fitting of ILP data within each stem shows that each HF unfolds in a single transition with TMs tightly clustered between 58.8 and 63.8 °C in the absence of PEG200 and between 58.6 and 63.2 °C in 20% PEG200 (Figure A-5). The range of TMs both with and without crowder is just ~5 °C, showing that all secondary structures melt with similar TMs that are very close to the optical melting temperature of FL tRNA in 2.0 mM Mg2+ without (65.1 °C) and with 20% PEG200 (64.8 °C) (Table 2-1). Global fitting of ILP data across all stems (for same TM) reveals that all stems can be well described with a single transition of 60.0 °C and 59.4 °C in buffer and 20% PEG200, respectively (Figure A-6, Table 2-1). The residuals for the global fits in 2.0 mM Mg2+ in both uncrowded and crowded solutions are low, indicating that the global fits represent the data well. These fits are also validated by the low error on each of the fits and the low χ2 values of 0.33 and 0.63 for the global fits in 2.0 mM Mg2+ with and without PEG200.

Figure 2-7. Melting temperatures of FL tRNA and HF obtained by optical melting and ILP in buffer and crowded conditions. TM values were obtained by (A) optical melting on FL tRNA or the model HF, or by (B) ILP on FL tRNA, data from which was globally fit for a TM of either FL tRNA or the HF in FL tRNA.

The data in this section and the previous section are summarized in Figure 2-7. The stabilization of tertiary structure upon the addition of Mg2+ and the destabilization of secondary structure upon the addition of crowding agent are clear when the TMs obtained by UV or ILP melting are plotted versus solution conditions. As the concentration of PEG200 increases in the background of 0.5 or 2.0 mM Mg2+, the UV- or ILP-detected TM of FL tRNA is relatively unchanged, suggesting crowding is not affecting overall stability (Figure 2-7). Yet, when the model HFs are probed with optical melting, a dramatic decrease in TM is observed upon the addition of crowding agent in 0.5 or 2.0 mM Mg2+, indicating that crowding destabilizes secondary structure (Figure 2-7A). In contrast, when ILP TMs are plotted, as conditions become more cooperative, the secondary structures have shifts in their TM closer to that of FL tRNA, indicative of cooperativity (Figure 2-7B). In other words, the various helices are stabilized by the presence of FL tRNA.

2.3.5 tRNAphe Adopts a More Compact Structure Under In Vivo-Like Conditions. The above sections provided evidence that both crowding agents and physiological concentrations of Mg2+ can induce cooperative folding of FL tRNA, which is observed on both the global and nucleotide levels. To assess tertiary structure directly, we used small angle X-ray scattering (SAXS). In-line size-exclusion chromatography (SEC) SAXS was collected on FL tRNA in buffer with 0.5 mM Mg2+ to determine if the RNA forms aggregates. (SEC SAXS could not be done with PEG; see Materials and Methods in Appendix A.) At FL tRNA concentrations of 0.4 and 0.6 mg/mL, two peaks were observed in both the absorbance- and scattering-detected SEC traces, indicating population of two distinct tRNA species (Figure A-7). Molecular weight analysis in BioXTAS RAW indicated that the species are a dimer (MW: 50 kDa) and a monomer (MW: 27 kDa; expected monomer MW of 25 kDa). At 0.2 mg/mL tRNA, the dimer peak disappeared in the scattering traces and is barely detectable in the absorbance traces, indicating that concentrations of ≤0.2 mg/mL are adequate concentrations to collect data in complex solution conditions.

Compaction of FL tRNA was observed upon an increase of Mg2+ or crowder in solution. This is evidenced by decreasing radius of gyration (Rg), Dmax, and excluded volume values as the concentration of Mg2+ or PEG increases (Figure A-8, Table 2-2). As Mg2+ concentration increases from 0 to 0.5 to 2.0 mM, the Rg decreases from 31.9 to 25.0 to 24.2 Å respectively (Table 2-2), and the Dmax decreases from 112 to 87 to 82 Å, all indicating that the addition of Mg2+ compacts tRNA. A similar compaction is observed in the presence of a crowding agent. In 0.5 mM Mg2+ when 20% PEG8000 is added, Dmax decreases from 87 to 83 Å, and in 2.0 mM Mg2+ when 20% PEG8000 is added, Rg decreases from 25.1 to 23.9 Å and Dmax decreases from 82 to 74 Å (Table 2-2). A theoretical scattering curve of the tRNA crystal structure was generated and found to have Rg and Dmax values of 23.1 and 82 Å, respectively, which are very similar to the values collected experimentally (Table 2-2). Using FoXS the theoretical and experimental scattering curves were overlaid (Figure A-9A-C), and for all experimental conditions a low χ2 value was obtained, indicating that the data fit well to the model.
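The Rg values in this work were obtained with BioXTAS RAW and GNOM. Purely to illustrate the underlying idea, the Python sketch below estimates Rg from the Guinier approximation, ln I(q) ≈ ln I(0) − (Rg² q²)/3, by a straight-line fit of ln I against q² at low q. It is not a substitute for those packages: it assumes the supplied points already lie in the Guinier region (commonly taken as q·Rg ≲ 1.3 for compact particles), ignores measurement errors, and uses synthetic, noise-free data. The function name and all numbers are hypothetical.

```python
import math


def guinier_rg(q, intensity):
    """Estimate (Rg, I0) from a linear least-squares fit of ln I versus q^2."""
    x = [qi ** 2 for qi in q]
    y = [math.log(value) for value in intensity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    if slope >= 0.0:
        raise ValueError("Guinier slope must be negative")
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)  # slope = -Rg^2 / 3


if __name__ == "__main__":
    true_rg, true_i0 = 25.0, 100.0                    # Å and arbitrary units (made up)
    q = [0.010 + 0.002 * i for i in range(21)]        # 0.01 to 0.05 Å^-1, so q*Rg <= 1.25
    intensity = [true_i0 * math.exp(-(true_rg * qi) ** 2 / 3.0) for qi in q]
    print(guinier_rg(q, intensity))
```

A smaller fitted Rg for the same molecule under different solution conditions is the kind of change interpreted above as compaction.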

Table 2-2. Structural parameters obtained for FL tRNAphe using Small Angle X-Ray Scattering.

Solution Condition           MW (a)    Rg (a)    Rg (b)    Dmax (b)    Excluded Volume (c)
                             (kDa)     (Å)       (Å)       (Å)         (Å3)
0 mM MgCl2                   32.2      31.9      33.8      112         63,700
0.5 mM MgCl2                 28.9      25.0      26.1      87          55,400
2.0 mM MgCl2                 28.5      24.2      25.1      82          47,700
0.5 mM MgCl2 + 20% PEG       25.7      25.9      25.7      83          67,900
2.0 mM MgCl2 + 20% PEG       41.7      21.2      23.9      74          39,600
1ehz crystal structure (d)   25.9      23.1      25.2      82          47,000

Solutions contain 25 mM HEPES (pH 7.5), 140 mM KCl, and 0, 0.5, or 2.0 mM MgCl2. Samples containing 20% PEG8000 are indicated.
(a) These parameters were obtained by analysis of the experimental scattering curves using BioXTAS RAW software. These values are reported in the text.
(b) These parameters were obtained by analysis of the experimental scattering curves using the pairwise distribution function in GNOM, using the ATSAS software package.
(c) The excluded volume was found using the alignment program SUPCOMB when the DAMAVER bead models were aligned to the tRNA crystal structure (PDB ID: 1ehz).
(d) The scattering curve for the tRNA crystal structure (PDB 1ehz) was calculated using FoXS and processed similar to experimentally collected scattering data.

Next we generated bead models from SAXS data and compared them to the crystal structure of tRNA (Figure 2-8). All of the bead models align well to the crystal structure, as assessed by the RMSD and visual inspection of the alignment, with the exception of the bead models for tRNA in buffer with 0 and 0.5 mM Mg2+, in which tRNA forms a more extended structure (Figure 2-8; Table A-4). The bead models, as well as the above-discussed Rg and Dmax values, show a compaction of tRNA upon the addition of Mg2+ or crowder (Figure 2-8). A bead model was also generated with the theoretical scattering curve of the tRNA crystal structure (Figure A-9D), and this bead model has a similar shape to the experimental bead models, which benchmarks our bead model approach.


Figure 2-8. Physiological crowded and ionic conditions favor a compact folded state of FL tRNA. The tRNAphe crystal structure (PDB 1ehz) was aligned with FL tRNA SAXS bead models in (A) 0, (B) 0.5, and (C) 2.0 mM Mg2+, without (top) and with (bottom) 20% PEG8000. The bead models were made using DAMMIF and DAMAVER and overlaid with the crystal structure using SUPCOMB.

The compaction of tRNA under in vivo-like conditions coincides with an increase in folding cooperativity under the same in vivo-like conditions. For instance, increasing the concentration of Mg2+ in the absence of crowder induces thermostability and more cooperative folding behavior of FL tRNA according to optical melts (Figure B-2). These same changes in conditions also induce compaction of the RNA. For example, scattering experiments show that the addition of Mg2+ in the absence of crowder induces compaction of FL tRNA (Table 2-2, Figure 2-8). Similar agreement between optical melting and scattering experiments is found in the presence of 20% PEG200 (Figures 2-2 and 2-8).

Similarities between folding cooperativity and compaction suggest that the excluded volume effect of the molecular crowders on FL tRNA helps induce two-state folding to the more stabilized folded structure.

2.4 Conclusions

Our data indicate that there are two distinct mechanisms for inducing RNA folding cooperativity. The first is a Mg2+-induced increase in tertiary structure stability. The second is a crowder-induced decrease in secondary structure stability. Under in vivo-like conditions, RNA folds cooperatively by a combination of these mechanisms.

It is of interest to consider the chemical and biological importance, as well as the ultimate origin, of such two-state-like folding of functional RNA. Strong tertiary structure can hold together intrinsically unstable secondary structures (Figure 2-7). This has chemical importance because it can assemble the secondary structural framework in a way that creates binding pockets for ligands and active sites for catalysts. Weakening of secondary structures, observed in the presence of crowders, may assist the search for a native tertiary structure by smoothing the RNA folding funnel. For proper function, RNAs need to fold into the correct structures on a biologically relevant time scale. Several in vitro studies under the high ionic strength conditions used to fold RNA have shown that misfolded states can last from minutes to hours (34, 35). These misfolds can have free energy values that are very similar to that of the native state, and the structural changes needed to reach the native state can be very large, involving breaking of bonds (36). Weakened secondary structures, like those that occur under in vivo-like conditions, should help smooth out the folding landscape and lead to faster adoption of the native state.

There is possible biological relevance of the helical fragments and conditions studied herein. Fragments of fully processed tRNAs regulate protein coding genes, RNA metabolism, and RNA interference in prokaryotes and eukaryotes during stress response (37), suggesting that there may be direct biological relevance of understanding the folding of tRNA fragments. In addition, some organisms, such as plants, can increase the concentrations of monovalent ions and compatible cosolutes dramatically during abiotic stress, which could aid RNA folding (38, 39).

RNA and proteins have starkly different chemical makeups and intermolecular forces. This has led to the belief that they fold differently. Indeed, unfolding of small functional RNAs has been widely regarded as non-cooperative, with weak tertiary structure unfolding before strong secondary structure (40), while proteins fold in a two-state manner and have weak secondary structure and strong tertiary structure (41).

However, our work suggests that RNA and proteins have similar folding properties under cellular conditions. We show that, like their protein counterparts, small RNAs unfold cooperatively under cellular conditions and that this is driven by weakened secondary structural elements and strengthened tertiary structure. These findings suggest that evolutionary forces may drive biopolymers, regardless of their chemical composition, to fold in a cooperative fashion.

2.5 Acknowledgements

We thank Dr. Richard Gillian and Dr. Jesse Hopkins for help with small angle X-ray scattering experiments. This work was supported by U.S. National Institutes of Health Grant R01-GM110237 (P.C.B.). This work is based on research conducted at the Cornell High Energy Synchrotron Source (CHESS), which is supported by the National Science Foundation and the National Institutes of Health/National Institute of General Medical Sciences under NSF award DMR-0936384, using the Macromolecular Diffraction at CHESS (MacCHESS) facility, which is supported by GM-103485 from the National Institutes of Health, through its National Institute of General Medical Sciences. We thank Erica Frankel for her help with global fitting of ILP data using IgorPro. We thank Elizabeth Whitman for collecting preliminary data on the RNA hairpins.

2.6 References

1. Cech TR & Steitz JA (2014) The noncoding RNA revolution-trashing old rules to forge new ones. Cell 157(1):77-94.

2. Freier SM, et al. (1986) Improved free-energy parameters for predictions of RNA duplex stability. Proc Natl Acad Sci USA 83:9373-9377.

3. Xia T, et al. (1998) Thermodynamic parameters for an expanded nearest-neighbor model for formation of RNA duplexes with Watson-Crick base pairs. Biochemistry 37:14719-14735.

4. Herschlag D & Cech TR (1990) Catalysis of RNA cleavage by the Tetrahymena thermophila ribozyme. Kinetic description of the reaction of an RNA substrate complementary to the active site. Biochemistry 29:10159-10171.

5. Tanner MA & Cech TR (1996) Activity and thermostability of the small self-splicing group I intron in the pre-tRNA(Ile) of the purple bacterium Azoarcus. RNA 2:74-83.

6. Feig AL & Uhlenbeck OC (1999) The role of metal ions in RNA biochemistry. The RNA world, 2nd ed., eds Gesteland RF, Cech TR, & Atkins JF (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York), pp 287-320.

7. Lusk JE, Williams RJ, & Kennedy EP (1968) Magnesium and the growth of Escherichia coli. J Biol Chem 243:2618-2624.

8. Truong DM, Sidote DJ, Russell R, & Lambowitz AM (2013) Enhanced group II intron retrohoming in magnesium-deficient Escherichia coli via selection of mutations in the ribozyme core. Proc Natl Acad Sci USA 110:E3800-E3809.

9. Alberts B, Bray D, Lewis J, Roberts K, & Watson JD (1994) Molecular biology of the cell, 3rd ed.

10. London RE (1991) Methods for measurement of intracellular magnesium: NMR and fluorescence. Annu Rev Physiol 53:241-258.

11. Romani AM (2007) Magnesium homeostasis in mammalian cells. Front Biosci 12:308-331.

12. Nagata S, Adachi K, Shirai K, & Sano H (1995) 23Na NMR spectroscopy of free Na+ in the halotolerant bacterium Brevibacterium sp. and Escherichia coli. Microbiology (Reading, England) 141 (Pt 3):729-736.

13. Hirota N & Imae Y (1983) Na+-driven flagellar motors of an alkalophilic Bacillus strain YN-1. J Biol Chem 258(17):10577-10581.

14. Zhou HX, Rivas G, & Minton AP (2008) Macromolecular crowding and confinement: Biochemical, biophysical, and potential physiological consequences. Annu Rev Biophys 37:375-397.

15. Strulson CA, Yennawar NH, Rambo RP, & Bevilacqua PC (2013) Molecular crowding favors reactivity of a human ribozyme under physiological ionic conditions. Biochemistry 52:8187-8197.

16. Ding Y, et al. (2014) In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features. Nature 505:696-700.

17. Rouskin S, Zubradt M, Washietl S, Kellis M, & Weissman JS (2014) Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo. Nature 505(7485):701-705.

18. Tyrrell J, McGinnis JL, Weeks KM, & Pielak GJ (2013) The cellular environment stabilizes adenine riboswitch RNA structure. Biochemistry 52:8777-8785.

19. Kilburn D, Roh JH, Behrouzi R, Briber RM, & Woodson SA (2013) Crowders perturb the entropy of RNA energy landscapes to favor folding. J Am Chem Soc 135:10055-10063.

20. Strulson CA, Boyer JA, Whitman EE, & Bevilacqua PC (2014) Molecular crowders and cosolutes promote folding cooperativity of RNA under physiological ionic conditions. RNA 20(3):331-347.

21. Desai R, Kilburn D, Lee H-T, & Woodson S (2014) Increased ribozyme activity in crowded solutions. J Biol Chem 289(5):2972-2977.

22. Brion P & Westhof E (1997) Hierarchy and dynamics of RNA folding. Annu Rev Biophys Biomol Struct 26:113-137.

23. Tinoco I, Jr. & Bustamante C (1999) How RNA folds. J Mol Biol 293(2):271-281.

24. Nakano S-i, Karimata HT, Ohmichi T, Kawakami J, & Sugimoto N (2004) The effect of molecular crowding with nucleotide length and cosolute structure on DNA duplex stability. J Am Chem Soc 126:14330-14331.

25. Nakano S-i, Karimata HT, Kitagawa Y, & Sugimoto N (2009) Facilitation of RNA enzyme activity in the molecular crowding media of cosolutes. J Am Chem Soc 131:16881-16888.

26. Lambert D & Draper DE (2007) Effects of osmolytes on RNA secondary and tertiary structure stabilities and RNA-Mg2+ ion interactions. J Mol Biol 370(5):993-1005.

27. Sattin BD, Zhao W, Travers K, Chu S, & Herschlag D (2008) Direct measurement of tertiary contact cooperativity in RNA folding. J Am Chem Soc 130(19):6085-6087.

28. Behrouzi R, Roh JH, Kilburn D, Briber RM, & Woodson SA (2012) Cooperative tertiary interaction network guides RNA folding. Cell 149(2):348-357.

29. Chauhan S & Woodson SA (2008) Tertiary interactions determine the accuracy of RNA folding. J Am Chem Soc 130:1296-1303.

30. Kwok CK, Ding Y, Tang Y, Assmann SM, & Bevilacqua PC (2013) Determination of in vivo RNA structure in low-abundance transcripts. Nat Comm 4.

31. Leamy KA, Assmann SM, Mathews DH, & Bevilacqua PC (2016) Bridging the gap between in vitro and in vivo RNA folding. Q Rev Biophys 49:e10.

32. Hall KB, Sampson JR, Uhlenbeck OC, & Redfield AG (1989) Structure of an unmodified tRNA molecule. Biochemistry 28(14):5794-5801.

33. Stein A & Crothers DM (1976) Conformational changes of transfer RNA. The role of magnesium(II). Biochemistry 15(1):160-168.

34. Banerjee AR & Turner DH (1995) The time dependence of chemical modification reveals slow steps in the folding of a group I ribozyme. Biochemistry 34:6504-6512.

35. Chadalavada DM, Senchak SE, & Bevilacqua PC (2002) The folding pathway of the genomic hepatitis delta virus ribozyme is dominated by slow folding of the pseudoknots. J Mol Biol 317(4):559-575.

36. Mitchell DI, Jarmoskaite I, Seval N, Seifert S, & Russell R (2013) The long-range P3 helix of the Tetrahymena ribozyme is disrupted during folding between the native and misfolded conformations. J Mol Biol 425:2670-2686.

37. Gebetsberger J & Polacek N (2013) Slicing tRNAs to boost functional ncRNA diversity. RNA Biol. 10(12):1798-1806.

38. Greenway H & Munns R (1980) Mechanisms of salt tolerance in nonhalophytes. Annu. Rev. Plant. Physiol. 31(1):149-190.

39. Handa S, Bressan RA, Handa AK, Carpita NC, & Hasegawa PM (1983) Solutes contributing to osmotic adjustment in cultured plant cells adapted to water stress. Plant Physiol. 73(3):834-843.

40. Tinoco I, Jr. & Bustamante C (1999) How RNA folds. J Mol Biol 293:271-281.

41. Dill KA, et al. (1995) Principles of protein folding--a perspective from simple exact models. Protein Sci. 4(4):561-602.

Chapter 3

Cotranscriptional Folding of RNA is Cooperative Under Physiological Conditions

Under Revision as a paper entitled: “Cotranscriptional Folding of RNA is Cooperative Under Physiological Conditions” by Kathleen A. Leamy, Neela H. Yennawar, and Philip C. Bevilacqua [All experiments were carried out by K.A.L. SAXS data were collected and analyzed with the help of N.H.Y. Experiments were planned by K.A.L. and P.C.B.]

3.1 Abstract

RNA folding is often studied by renaturing full-length RNA in vitro and tracking folding transitions. However, in cells the growing transcript folds as it emerges from the RNA polymerase. RNA folding pathways and population of the native state can differ profoundly depending on whether the transcript is folded co- or post-transcriptionally. Here, we investigate the folding pathways and stability of numerous cotranscriptional intermediates of yeast tRNAphe. tRNA is a highly regulated functional RNA that undergoes multiple steps of posttranscriptional modification and is found in very different lengths during its lifetime in the cell. The full-length precursor transcript is extended on both the 5’ and 3’ ends of the cloverleaf core, and these extensions get trimmed before addition of the 3’ CCA and aminoacylation. We studied the thermodynamics and structures of the precursor tRNA and of cotranscriptional intermediates of the cloverleaf structure. Folding is examined at both the secondary and tertiary structural levels using multiple biophysical approaches, and cotranscriptional folding is modeled computationally. Our findings suggest that nature has selected for a single base addition to control folding to the functional three-dimensional structure of tRNAphe. In physiological conditions, the transcript folds in a single, cooperative transition only when nearly all of the nucleotides in the cloverleaf are transcribed, and extensions on the 5’ and 3’ ends do not interfere with cooperative folding. This highly controlled cooperative folding has implications for the recognition of tRNA by processing enzymes and regulation of tRNA in cells.

3.2 Introduction

Functional RNAs adopt unique three-dimensional structures that allow them to perform essential functions in the cell, including catalysis, protein synthesis, and gene regulation. RNA typically folds in a hierarchical manner, wherein secondary structure forms before tertiary structure (2). In typical in vitro solution conditions of high monovalent or divalent salt, classically 1 M Na+ (3) or 10 mM Mg2+ (4), transcripts can get trapped in very stable misfolded structures that can persist for minutes to hours (5).

However, recent studies have shown that in cytoplasm mimics with physiological salt concentrations of just ~140 mM K+ (6) and 0.5-2 mM Mg2+ (7-11), functional RNAs fold in a two-state manner without detectable population of intermediates (12, 13).

Folding is often studied on the minimal length RNA needed to obtain function; however, in cells RNA is transcribed with extensions on the 5’ and 3’ ends (14). These extended regions can interact with the functional RNA sequence, affecting its fold and function (15-17). Additionally, most folding experiments renature a full-length RNA and then study its folding; however, in cells RNA folds cotranscriptionally as it is emerging from the polymerase (18). RNA function can be modulated when folded cotranscriptionally versus posttranscriptionally, and the fraction of folded RNA can differ when the RNA folds cotranscriptionally or posttranscriptionally (19-21). For example, recent cotranscriptional folding studies on riboswitches have shown that the aptamer folding pathway is dependent upon the presence of ligand during transcription. These studies provide compelling reasons to consider the folding of functional RNAs cotranscriptionally.

We chose to look at the native folding of tRNA, as it is one of the most prevalent RNAs in cells. The native structure consists of four stems (the acceptor stem, D stem, anticodon stem, and TΨC stem) and has extensive tertiary interactions between the D stem-loop and TΨC stem-loop, with the acceptor stem and anticodon stem-loop not engaged in tertiary interactions (Figure 3-1). The precursor tRNA undergoes extensive posttranscriptional processing and modification, including multiple changes to its length and sequence identity during maturation. tRNAs are transcribed as precursor molecules with 5’ leader and 3’ trailer sequences that are removed in the nucleus by RNase P at the 5’ end and multiple nucleases at the 3’ end (22). In yeast, the 3’ CCA is added posttranscriptionally by nucleotidyl transferase, and then the CCA-containing transcript is modified and aminoacylated. Extensive folding studies on tRNA have been performed on the fully processed cloverleaf sequence (12, 13, 23, 24); however, folding of tRNAs with a 5’ leader and a 3’ trailer sequence has not been studied cotranscriptionally. Proper folding of the cloverleaf as well as the 5’ and 3’ extensions during transcription is important for 5’ and 3’ end processing, posttranscriptional modifications, and biological function (25, 26).

Herein, we analyze effects of several key components of the cytoplasm on cotranscriptional tRNA folding intermediates. Influences of cellular ionic conditions, crowding, Mg2+-chelated amino acids, and combinations thereof are investigated. Furthermore, the effects of the 5’ leader and 3’ trailer sequences on the folding pathway are investigated. Cooperative folding of full-length tRNA transcripts in cytoplasm mimics has been previously reported and attributed to secondary structure destabilization and tertiary structure stabilization (13). Thus, we hypothesized that intermediates on cotranscriptional folding pathways would also be destabilized in cytoplasm mimics, since intermediates have weak secondary structure under these conditions and might not form all tertiary contacts. Consistent with this notion, we observe folding cooperativity herein only in the cellular conditions and only in intermediates with complete base-pairing in the acceptor stem. We observe that constructs with precursor 5’ and 3’ extensions also fold in a two-state manner with no change in cloverleaf structure. Thus, nature appears to have selected for the cloverleaf structure to fold in a two-state manner only once the acceptor stem is fully formed, and it does so without interference from flanking nucleotides.


3.3 Results

We previously reported that secondary structure is destabilized in crowded cytoplasm mimics (13). Here, we investigated effects of such conditions on cotranscriptional folding, which involves folding as the nascent transcript emerges from RNA polymerase. A series of model tRNA cotranscriptional intermediates that range in length from 65 (I65) to 73 (I73) nucleotides was prepared (Figure 3-1). We hypothesized that I65 lacks proper tertiary interactions because the secondary structure is weak without formation of the acceptor stem, which normally facilitates long-range interactions; therefore, I65 is a good model for non-native folding. Intermediates I69, I71, I72, and I73 were chosen because tertiary structure has the potential to form as the stability of the acceptor stem increases, and two-state folding to the native state might build in. All of these constructs model those found during tRNA transcription in vivo.

Folding of these intermediates and of full-length (FL) tRNA was probed in buffer and in three in vivo-like solution conditions that mimic different aspects of the cytoplasm: (A) a crowder (20% PEG8000), (B) amino acids chelated to millimolar amounts of Mg2+ (aaCM), and (C) a combination of crowding and aaCM. Each condition was tested in the background of either 0.5 or 2.0 mM free Mg2+, which mimic eukaryotic and prokaryotic divalent conditions, respectively.


Figure 3-1. Full-length and cotranscriptional intermediate constructs of tRNAphe. (A) Two-dimensional and (B) three-dimensional tertiary structures of FL tRNA and cotranscriptional intermediates (I65, I69, I71, I72, I73) that are truncated at the indicated positions. Orange lines depict tertiary interactions.

3.3.1 Effects of cellular conditions on cotranscriptional folding in eukaryotic Mg2+.

We first discuss folding of intermediates under eukaryotic-like ionic conditions of 0.5 mM free Mg2+. Intermediates without complete transcription of the acceptor stem, I65, I69, and I71, exhibit highly non-cooperative folding in buffer as well as in all three in vivo-like solution conditions. In particular, the melting curves show multiple transitions over a broad temperature range. These observations suggest that these cotranscriptional intermediates are highly unstable and/or adopt multiple conformations (Figure B-1A-D).

Figure 3-2. Thermal denaturation of FL tRNA and late cotranscriptional intermediates in buffer and various physiological conditions (rows). Intermediates unfolding in buffer, 20% PEG8000, amino acid-chelated Mg2+, and 20% PEG8000 with amino acid-chelated Mg2+ in the background of (A-D) 0.5 mM Mg2+ and (E-H) 2.0 mM Mg2+. Colors and symbols for each construct are provided in the figure.

We then turned to longer intermediates that might fold into native-like structures, specifically the late cotranscriptional intermediates I72 and I73. In buffer alone, without cellular additives, broad melting transitions with multiple peaks and a low-magnitude ∆H are observed (Figure 3-2A, Table B-1). These observations indicate multi-state folding of these late cotranscriptional intermediates. Notably, I72 and I73 unfold at 54.5 and 57.4 °C, respectively, which are lower temperatures than FL, indicating lower stability (Table B-1).

Addition of the commonly used crowding agent, 20% PEG8000, stabilizes full-length tRNA, which unfolds in a single transition with a more negative ∆H of -54.8 kcal/mol compared to -47.4 kcal/mol in buffer alone (Figure 3-2B). Strikingly, late cotranscriptional intermediates are not stabilized by these same crowding conditions, as they still unfold in a multi-state manner, with ∆H values of just -42.4 kcal/mol for I72 and -22.8 kcal/mol for I73.

Observing that crowding dramatically destabilizes even late cotranscriptional intermediates compared to full-length, we next probed folding of the late intermediates in Mg2+-chelated amino acids. Free amino acids are found at a concentration over 100 mM in the cell and have the ability to weakly chelate Mg2+ at concentrations exceeding 10 mM. Weakly Mg2+-chelated amino acids have been shown to increase ribozyme activity and drive folding cooperativity (27, 28), and can therefore affect cotranscriptional folding. In the presence of aaCM, there is a large change in the folding transitions of the late intermediates. There is now striking similarity between the melting curves of I72, I73, and FL (Figure 3-2C). All three curves have the same minor low temperature transition at ~45 °C, and the high temperature transition TMs are similar, falling in a narrow range of 62.8 to 65.6 °C for I72 to FL. The low temperature transition, which is the same for all constructs, is attributed to the tertiary structure unfolding, and the high temperature transition, which increases with construct length, is attributed to secondary structure. These unfolding assignments are based on the hierarchical nature of RNA folding, as well as prior assignments of tRNA folding transitions (24).

Observing destabilization of cotranscriptional intermediates in PEG and stabilization in aaCM, we next probed folding in the conditions of combined PEG and aaCM. In this combined condition, all of the constructs unfold cooperatively for the first time, exhibiting just a single transition, with TMs in the narrow range between 64.7 and 67.4 °C, increasing with length (Figure 3-2D). Given the hierarchical nature of RNA folding, this observation supports two-state, cooperative folding in which secondary structure melts out concomitant with tertiary structure, and secondary structure influences thermostability (28).

The extent of cooperativity can be measured by the value of ∆H, where a large negative ∆H indicates a cooperative folding transition (29). We compare the cooperativity of the intermediates to FL tRNA by the ratio of their ∆H values, ∆HI/∆HFL (12, 13). If the ratio of ∆HI/∆HFL is close to 1, the intermediates and full length have similar extents of cooperativity, but if the ratio is <1 the intermediates are less cooperative than FL. In buffer, 20% PEG8000, and aaCM, the ∆HI/∆HFL ratios are <1, indicating differing extents of partial to noncooperative folding of cotranscriptional intermediates (Table B-1). In the condition of crowding with aaCM, the ∆H values for I72, I73, and FL are similar, at around -110 kcal/mol, with ∆HI/∆HFL close to 1.0. Thus, only under the most physiological conditions do late cotranscriptional intermediates fold in a highly cooperative manner, similar to FL tRNA.
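As a small worked illustration of the ∆HI/∆HFL comparison, the sketch below computes the ratio for the 20% PEG8000 condition in 0.5 mM free Mg2+, using only the ∆H values quoted earlier in this section (-54.8, -42.4, and -22.8 kcal/mol for FL, I72, and I73). The function and variable names are hypothetical and are not part of the analysis used in this work.

```python
# Folding enthalpies (kcal/mol) quoted in the text for 20% PEG8000
# with 0.5 mM free Mg2+. Names below are illustrative only.
DELTA_H_PEG = {"FL": -54.8, "I72": -42.4, "I73": -22.8}


def cooperativity_ratios(delta_h, reference="FL"):
    """Return dH_I / dH_FL for every construct except the reference."""
    ref = delta_h[reference]
    if ref == 0:
        raise ValueError("reference enthalpy must be non-zero")
    return {
        name: value / ref
        for name, value in delta_h.items()
        if name != reference
    }


ratios = cooperativity_ratios(DELTA_H_PEG)
for name in sorted(ratios):
    # A ratio near 1 means the intermediate unfolds about as
    # cooperatively as FL; a ratio well below 1 means it does not.
    print(f"{name}: {ratios[name]:.2f}")
# prints:
# I72: 0.77
# I73: 0.42
```

Both ratios fall well below 1, consistent with the statement that the late intermediates remain less cooperative than FL in crowder alone.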

3.3.2 Effects of cellular conditions on cotranscriptional folding in prokaryotic Mg2+.

Next, we examined the effects of cellular conditions on folding in prokaryotic Mg2+, at the moderate Mg2+ level of 2.0 mM. Similar to 0.5 mM Mg2+, the early cotranscriptional intermediates I65, I69, and I71 display non-cooperative behavior in all four solution conditions (Figure B-1E-H). Therefore, we again focus on the late intermediates I72 and I73.

In buffer with 2.0 mM free Mg2+, FL tRNA folds in a two-state manner, and the I72 and I73 cotranscriptional intermediates fold in a three-state manner, similar to their behavior in 0.5 mM Mg2+ (Figure 3-2E). Specifically, there is a single unfolding transition for FL tRNA at 65.1 °C, but two distinct transitions for I72 and I73, comprising broad transitions at 37.8 °C and 39.4 °C and sharper transitions at 62.2 °C and 63.9 °C, respectively. As described above, the low temperature transition is attributed to tertiary structure unfolding, and the high temperature transition is attributed to secondary structure unfolding, consistent with the hierarchical nature of RNA folding.

Upon the addition of 20% PEG8000, FL and late intermediates all fold in a two-state manner, with TMs of 67.9, 69.5, and 69.0 °C for I72, I73, and FL tRNA, respectively (Figure 3-2F, Table B-1). Furthermore, with aaCM as well as the combined PEG with aaCM condition, very similar unfolding transitions of intermediates and FL are observed (Figure 3-2G, 3-2H). In both of these cellular-like conditions, a single transition is observed for I72, I73, and FL, with TMs clustered between 69 and 72 °C. The magnitude of ∆Hfolding is slightly larger for all constructs in aaCM with crowder than in aaCM alone, at approximately –125 and –105 kcal/mol, respectively (Table B-1). Notably, the ∆HI/∆HFL ratios for I72 and I73 are near unity in aaCM and in aaCM with crowder, indicating that these constructs unfold just as cooperatively as full-length, but only in physiological solutions.

3.3.3 Cooperativity arises from depopulation of non-native states. Two hypotheses are that folding cooperativity could arise from either stabilization of the final folded state or destabilization of non-native states. To test these possibilities, changes in tertiary structure of the final folded state were monitored by native gel electrophoresis (Figure B-2) and small angle X-ray scattering (SAXS) (Figure B-3, Table B-2), while local changes in secondary and tertiary structure were monitored by in-line probing (ILP) (Figures B-4, B-5). On native gels containing physiological concentrations of Mg2+, no change in compaction is observed in the late cotranscriptional intermediates under all four solution conditions (Figure 3-3A, 3-3B), supporting no global change in structure. The SAXS data further support the conclusion that there is no difference in final global structure amongst the late RNA folding intermediates and the FL (Figure B-3, Table B-2). In particular, the Rg and Dmax values, measures of size and aspect ratio, are similar amongst the late cotranscriptional intermediates and FL tRNA in buffer and crowded conditions.

The ILP profiles of FL and intermediates in buffer are also very similar, indicating that native final structure is adopted (Figure 3-3C). Furthermore, the ILP signal is similar for each RNA construct in dilute and cellular solution conditions, again providing indication of no change in structure (Figure 3-3D-F). In sum, it appears that late intermediates do not misfold globally or locally. Thus, the observed differences in folding cooperativity between the folding intermediates and the FL tRNA appear to be due to differences in population of non-native states, as presented in the previous sections.

Figure 3-3. Native gels and ILP reactivity of FL tRNA and cotranscriptional intermediates in physiological concentrations of Mg2+. (A, B) Native gels of FL, I73, and I72 in (A) 0.5 and (B) 2.0 mM Mg2+. (C) Normalized ILP signals of FL tRNA, I73, and I72 in buffer with 0.5 mM Mg2+. Normalized ILP signals of (D) FL, (E) I73, and (F) I72 in buffer, 20% PEG8000, amino acid chelated-Mg2+, and 20% PEG8000 with amino acid chelated-Mg2+. Colors and symbols for each construct are in the figure. ILP data were normalized to nucleotides 33-35 in the anticodon loop, which were always single-stranded.

3.3.4 Extensions on the 5’ and 3’ ends do not affect tRNA core folding cooperativity. In cells, tRNAs are transcribed with extensions off the 5’ and 3’ ends, and these have the potential to alter the folding and structure of the tRNA core. We therefore tested the folding of the tRNA core fused to the native 5’ leader, referred to as the “5’ leader” construct, and of the tRNA core fused to both the native 5’ leader and 3’ trailer sequences, referred to as the “precursor” construct (Figure 3-4A). In buffer with 2.0 mM Mg2+, both of these constructs show two distinct transitions with very broad unfolding transitions that occur over ~50 °C (Figure 3-4B). The same behavior is observed in the most physiological condition tested above, crowding with aaCM (Figure 3-4C).

We hypothesized that perhaps the tRNA core was still folding in a two-state manner and that the broad transitions observed could be attributed to base unstacking in the extended regions. To uncouple changes in the 5’ and 3’ extensions and the core, we used the more detailed method of in-line probing, which reports structure at the nucleotide level rather than globally like UV-detected thermal denaturation (Figure B-6, B-7). We first examined an ILP profile at a single, physiologically relevant temperature. At 35 °C in buffer and physiological conditions, the ILP profiles of both the 5’ leader and precursor constructs show high signal in the 5’ and 3’ extensions and native ILP patterns of alternating strong and weak signal in the tRNA core (Figure B-8). Thus, in both of these solution conditions the tRNA core is adopting the native secondary structure, while the extensions off the 5’ and 3’ ends are unfolded and not interfering with the core.


Figure 3-4. Sequence and thermal denaturation of 5’ leader and precursor constructs in buffer and physiological conditions with 2.0 mM free Mg2+. (A) Sequence of 5’ leader and precursor tRNA. The “5’ leader” construct contains the 5’ leader and tRNA core, and the “precursor” construct contains the 5’ leader, tRNA core, and 3’ trailer sequences. (B, C) Thermal denaturation curves of 5’ leader (pink) and precursor (purple) sequences in (B) buffer and (C) 20% PEG8000 with amino acid chelated-Mg2+.

To gain insight into the folding pathway, the core tRNA portion of the ILP signal from the above constructs was analyzed at 12 temperatures between 35 and 75 °C. The signals of the nucleotides in each stem were globally fit as a set of melting curves to obtain a single TM and a single ∆H of folding for the entire core (Figures 3-5, B-9). We first consider the 5’ leader construct. Global fits of all base-paired nucleotides in the 5’ leader construct in buffer and physiological conditions are excellent, with χ2 values of 2.2 and 0.9, respectively, and errors on the TM and ∆H of less than 10% (Table B-3, Figure 3-5). The quality of the fits supports the conclusion that in these conditions the tRNA core is unfolding in a two-state manner. The TM values are 60 and 65.5 °C and the ∆H values are -129 and -146 kcal/mol in buffer and physiological conditions, respectively. The enthalpy value in physiological conditions is similar to the theoretical fully cooperative ∆H of -150.0 kcal/mol, as calculated using nearest-neighbor parameters, and close to the experimental tRNA core ∆H of -126 kcal/mol measured in physiological conditions by thermal denaturation (Table B-1).
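The global-fitting idea can be sketched in a few lines, assuming a two-state van't Hoff model with a temperature-independent ∆H and flat folded and unfolded baselines for each stem. This is not the fitting procedure used in this work: it replaces nonlinear least squares with a brute-force grid search, and the temperatures, baselines, TM, and ∆H below are synthetic values chosen only for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(temp_c, tm_c, dh_fold):
    """Two-state fraction unfolded; dh_fold is the folding enthalpy (kcal/mol, negative)."""
    t, tm = temp_c + 273.15, tm_c + 273.15
    # ln K_fold = -(dH/R) * (1/T - 1/Tm), so K_fold = 1 exactly at T = Tm.
    ln_k_fold = -(dh_fold / R_KCAL) * (1.0 / t - 1.0 / tm)
    return 1.0 / (1.0 + math.exp(ln_k_fold))


def line_fit_sse(xs, ys):
    """Sum of squared residuals of the best straight line ys = a + b*xs."""
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = y_mean - slope * x_mean
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))


def global_fit(temps_c, signals_by_stem, tm_grid, dh_grid):
    """Find the one (Tm, dH) pair that best describes every stem at once.

    Each stem keeps its own baselines (solved linearly), but all stems
    share the same Tm and dH, which is what makes the fit global.
    """
    best = None
    for tm in tm_grid:
        for dh in dh_grid:
            fu = [fraction_unfolded(t, tm, dh) for t in temps_c]
            sse = sum(line_fit_sse(fu, ys) for ys in signals_by_stem.values())
            if best is None or sse < best[0]:
                best = (sse, tm, dh)
    return best[1], best[2]


# Synthetic, noise-free data: 12 temperatures between 35 and 75 degrees C.
temps = [35.0 + 40.0 * i / 11 for i in range(12)]
TRUE_TM, TRUE_DH = 60.0, -130.0  # illustrative values, not measured ones
baselines = {  # (folded signal, unfolded signal) per stem, hypothetical
    "acceptor": (0.20, 1.00),
    "D": (0.10, 0.80),
    "anticodon": (0.30, 0.90),
    "T_psi_C": (0.15, 1.10),
}
signals = {
    stem: [lo + (hi - lo) * fraction_unfolded(t, TRUE_TM, TRUE_DH) for t in temps]
    for stem, (lo, hi) in baselines.items()
}

tm_grid = [50.0 + 0.5 * i for i in range(41)]     # 50.0 to 70.0 degrees C
dh_grid = [-200.0 + 1.0 * i for i in range(151)]  # -200 to -50 kcal/mol
fit_tm, fit_dh = global_fit(temps, signals, tm_grid, dh_grid)
print(fit_tm, fit_dh)
```

Because a single TM and ∆H must account for every stem simultaneously, a poor global fit would argue against two-state unfolding of the core, whereas a good one is consistent with it.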

Figure 3-5. Global fitting of variable temperature in-line probing signal of the 5’ leader construct in buffer and physiological conditions (columns). Global fitting was performed on all double stranded regions (rows) simultaneously in tRNA in (A-D) buffer and (E-H) 20% PEG8000 with amino acid chelated-Mg2+ to obtain a single TM and ∆H of folding for each condition, which can be found in Table B-3. All samples contain 2.0 mM free Mg2+. Global fits (lines) and data points are shown for each stem in tRNA.

Next, we turned to the precursor construct. As with the 5’ leader construct, high quality global fits on the precursor construct in buffer and physiological conditions were obtained, supporting cooperative tRNA core folding when both flanking regions are present (Table B-3, Figure B-9). Global fits of the core tRNA portion of the ILP data provide reliable TM values of 58.5 and 65.2 °C, in agreement with those for the 5’ leader construct, further supporting the conclusion that flanking regions do not contribute to core folding. In addition, the ∆H value for folding of the precursor construct in buffer was -136 kcal/mol, in agreement with that for the 5’ leader construct. Overall, the results in this section suggest that the extended regions of tRNA do not affect the final cloverleaf structure nor the cooperativity of folding.

3.3.5 Cooperative cotranscriptional folding is computationally predicted. As described above, we observed a cooperative folding transition when the cloverleaf core is fully transcribed that is unaffected by native flanking sequences. To obtain additional insight into the mechanism of folding, we tested our experimental data against a computational model of cotranscriptional folding. CoFold, a program that predicts the thermodynamic structure of RNA taking cotranscriptional folding into account (1), was used to predict cotranscriptional tRNA structures. Sequences with a precursor 5’ end and varying lengths on the 3’ end, which replicate the growing transcript, were input to CoFold. We tested early transcripts ending at the last nucleotide of the D stem-loop, anticodon stem-loop, and TΨC stem-loop, as well as late transcripts with small increments along the acceptor stem, which mimic our experimental constructs of I69, I71, I72, and I73. We observe that the D, anticodon, and TΨC stem-loops adopt native secondary structure contacts as they are transcribed (Figure 3-6); however, the 5’ strand of the acceptor stem forms non-native contacts with the 5’ leader sequence up until the 72nd residue is transcribed. Surprisingly, addition of this single residue in the core leads to an exchange of the non-native acceptor stem base pairing with native core pairing. This observation agrees with our experimental data, which show cooperative folding with the I72 construct but not I71 (Figures 3-2, B-1). Furthermore, native structure in the tRNA core is maintained upon addition of the final core nucleotide as well as several nucleotides in the 3’ trailer (Figure 3-6).
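The construction of such an input series can be illustrated as follows. This sketch only builds the 3’-truncated sequences; it uses placeholder leader and core sequences rather than the yeast tRNAphe sequences, the function name is hypothetical, and no CoFold invocation is shown because its interface is not described here.

```python
def transcript_series(core, leader="", lengths=()):
    """Return {label: sequence} for transcripts holding the first n core nucleotides.

    Each transcript keeps the full 5' leader and is truncated at the 3' end,
    mimicking the RNA that has emerged from the polymerase so far.
    """
    series = {}
    for n in lengths:
        if not 0 < n <= len(core):
            raise ValueError(f"length {n} is outside the core (1..{len(core)})")
        series[f"I{n}"] = leader + core[:n]
    return series


# Placeholder sequences for illustration only; these are NOT the
# yeast tRNAphe leader or core sequences.
PLACEHOLDER_LEADER = "AAUUGG"
PLACEHOLDER_CORE = "ACGU" * 18 + "A"  # 73 nt, matching the core length in the text

series = transcript_series(
    PLACEHOLDER_CORE, PLACEHOLDER_LEADER, lengths=(69, 71, 72, 73)
)
for label, seq in series.items():
    # Each entry could then be passed to a structure-prediction program.
    print(label, len(seq))
```

Labeling each transcript by the number of core nucleotides it contains keeps the naming aligned with the experimental constructs, so predictions for I71 and I72 can be compared directly.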

As stated, this single nucleotide-mediated transition to the native state agrees with our thermodynamic and structure probing data. Upon addition of ten or more nucleotides to the 3’ trailer, CoFold predicts base pair formation between the 3’ trailer and 5’ leader, which maintains native tRNA core folding (Figure B-10). However, upon transcription of the full 3’ trailer, CoFold predicts loss of native secondary structure in the acceptor and D stems. This loss of native structure disagrees with our experimental structure probing of the precursor construct (Figures B-7, B-8, B-9), and we attribute this to the lack of tertiary contact data in the CoFold structure prediction. One role of tertiary structure, which can fold cotranscriptionally, might be to block such non-native pairing.
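The truncation series described above can be illustrated with a short sketch. It only builds the 3’-truncated input sequences; it does not call CoFold. The function name, the placeholder sequences, and the assumption that a construct named I72 ends at the 72nd core nucleotide downstream of the full 5’ leader are illustrative choices, not taken from the original work.

```python
def growing_transcripts(leader, core, core_ends):
    """Build 3'-truncated constructs that mimic a growing transcript.

    Each construct keeps the full 5' leader and stops after core position
    `end` (1-based), so "I72" is assumed to end at the 72nd core nucleotide.
    """
    precursor = leader + core
    constructs = {}
    for end in core_ends:
        if not 1 <= end <= len(core):
            raise ValueError(f"core position {end} is outside 1..{len(core)}")
        constructs[f"I{end}"] = precursor[: len(leader) + end]
    return constructs


# Placeholder sequences only: these are NOT the yeast tRNAphe precursor.
leader = "AUAUAUAUAU"  # hypothetical 10-nt 5' leader
core = "GCAU" * 19     # hypothetical 76-nt core

constructs = growing_transcripts(leader, core, [69, 71, 72, 73])
for name, seq in constructs.items():
    print(name, len(seq))
```

Each resulting sequence would then be submitted to the structure prediction program separately, so that the predicted pairing can be compared from one transcript length to the next.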


Figure 3-6. Computationally predicted structure formation during cotranscriptional folding of yeast tRNAphe. Predicted cotranscriptional structure formation starting with transcription of the D stem-loop through the 3’ trailer using CoFold (1). The regions of tRNA are colored as follows: 5’ leader and 3’ trailer (black), acceptor stem (purple), D stem-loop (blue), anticodon stem-loop (green), and TΨC stem-loop (pink). Nucleotides with native base pairing are depicted with colored lines, and non-native base pairing is depicted with black lines.

3.4 Discussion

RNA performs many essential functions in the cell, including catalysis, small molecule binding, and gene regulation. To function optimally, RNAs need to fold into the proper three-dimensional structures without significant population of misfolded states. Proper folding to the native state is especially important for tRNA. Misfolding of tRNA is implicated in stalling of the ribosome, incorrect reading of the anticodon, and human disease (26, 30). In cells, RNAs fold during transcription, and the functional regions of RNAs are often transcribed with flanking nucleotides on the 5’ and 3’ ends. To prevent misfolds, nature can select for sequences that will not adopt stable non-native structures during cotranscriptional folding. In this study we examined the effects of cytoplasm mimics on the folding, stability, and structure of tRNA cotranscriptional intermediates. This was explored in traditional buffer and physiological conditions, and based on our previous work (13) we predicted that cotranscriptional folding intermediates would be destabilized in physiological conditions. Indeed, we observe folding cooperativity only in the most cellular of conditions and only in late cotranscriptional intermediates with complete base pairing in the acceptor stem.

Biopolymers need to fold on a physiologically relevant time scale to their native state without getting trapped in stable misfolds. In this study, we examined both early and late cotranscriptional intermediates, of 65-71 nt and 72-73 nt, respectively. In buffer with physiological concentrations of Mg2+, the early cotranscriptional intermediates that can partially form the acceptor stem show multiple peaks in melting curves, indicating that there are many populated states with varying stability (Figure B-1A, B-1E). This type of folding is unfavorable because stable misfolds populate, leading to very rugged folding to the native state. Remarkably, in cytoplasm mimics, these short intermediates are highly destabilized (Figure B-1B-D, B-1F-H), indicating that they are not appreciably populating stable structures under more physiological conditions. In contrast, in physiological conditions the late cotranscriptional intermediates fold very similarly to FL tRNA, albeit at slightly lower TMs (Figure 3-2D, 3-2H). This suggests that late cotranscriptional intermediates are adopting similar structure to FL tRNA, and are following the same two-state folding pathway as FL tRNA.

We also modeled cotranscriptional folding using CoFold, a folding program that considers the growing 3’ end. Excellent agreement was observed between the length of the RNA needed for cooperative folding in experiments and the length needed for formation of native base pairing in computation, with 72 nt required (Figures 3-2, 3-7). The modeling only takes into account RNA secondary structure formation, suggesting that tertiary structure is determined by native secondary structure, which is consistent with the hierarchical nature of RNA folding. This suggests that in cells, premature transcripts adopt low stability cotranscriptional intermediates until the transcript is long enough that native folding pathways can be followed (Figure 3-7).

Figure 3-7. A model for the late steps of cotranscriptional folding of tRNAphe. Without the 3’ portion of the acceptor stem transcribed (I65, I71) the structures that form are weak. As the 3’ portion of the acceptor stem is transcribed, the TΨC loop docks into the D SL, beginning with I72. The 5’ leader and 3’ trailer sequences do not interfere with folding of the cloverleaf and are single-stranded.

Functional RNAs are often studied in the minimal length construct needed to function; however, in vivo these constructs are flanked by sequences on the 5’ and 3’ ends. In some cases, these flanking regions have been shown to modulate function (31). tRNAs are transcribed with a 5’ leader and a 3’ trailer sequence, and the effects of these sequences on the cloverleaf structure folding pathway are largely unexplored (12, 13, 23).

Our experimental data suggest that nature has selected for 5’ leader and 3’ trailer sequences to be highly unstructured and not interact with the fully folded core tRNA structure (Figures B-6, B-7, B-8). Additionally, with and without flanking sequences, the cloverleaf folds in a two-state, highly cooperative manner (Figures 3-5, B-9), suggesting that nature selected for flanking sequences that would not change the tRNA folding pathway. Flanking sequences have been previously reported to influence ribozyme function both positively and negatively through formation of self-structures and ribozyme-inhibiting structures (15, 32, 33). What is remarkable about tRNA is that in the context of the full length transcript, the flanking sequences do not appear to form any structure at all, either self-structure or structure with the tRNA core.

Maturation of tRNAs is a highly controlled process. In yeast, it involves cleavage of the 5’ leader with RNase P, while trimming of the 3’ end is thought to be carried out by tRNase Z or endonucleases (34). The CCA motif is then added to the 3’ end with tRNA nucleotidyl transferase and the tRNA is transported out of the nucleus. Our data suggest that the cloverleaf is fully folded with the 5’ and 3’ extensions, which may help the processing enzymes recognize only fully transcribed RNA. Indeed, cross-linking experiments and crystallographic analysis suggest that RNase P recognizes the folded state of the tRNA, making contacts with the acceptor stem, as well as the D and TΨC stem-loops where tertiary contacts are concentrated (25, 35). Much less is known about processing of the 3’ end, but our data suggest that correct structure formation of the cloverleaf and an unstructured 3’ trailer could be necessary for processing.

After the 5’ and 3’ ends have been processed there is a single dangling A on the 3’ end, which we call I73 (22). This truncated transcript is recognized by nucleotidyl transferase, which adds the CCA motif onto the 3’ end. Similar to RNase P, nucleotidyl transferase recognizes the tertiary interactions between the D and TΨC loops, so the cloverleaf has to be folded into the native tertiary structure (36), which we find occurs with I72 and is maintained in I73 (Figure 3-2). Overall, our data suggest that there is inherent structure regulation in the sequence of tRNA, so that truncated transcripts do not fold properly and are not improperly processed.

3.5 Conclusions

In cells, RNAs fold as the growing transcript emerges from RNA polymerase. Recent studies have shed light on cotranscriptional folding of riboswitches and revealed that riboswitch folding pathways are highly dependent on the presence of ligand during transcription (19, 20). However, not as much is known about the cotranscriptional folding of non-ligand dependent functional RNAs. Here we investigated the stabilities of tRNAphe cotranscriptional intermediates in cellular conditions. Our data reveal that in physiological conditions, cotranscriptional intermediates are unstable and form weak structures. The first stable intermediate forms when all of the secondary structure interactions can be made. Perhaps nature selects for this type of cotranscriptional folding to avoid population of misfolded states.

Many functional RNAs have 5’ and 3’ ends in close proximity, similar to tRNA. These include several classes of riboswitches and ribozymes (37, 38). This long-range interaction could ensure that native tertiary structure only forms once the entire RNA is transcribed. This could provide a general mechanism for avoiding misprocessing and misfolding of tertiary structure-containing RNAs.

3.6 Acknowledgments

We thank Dr. Richard Gillilan and Dr. Jesse Hopkins for help with SAXS experiments and discussions. This work is based upon research conducted at the Cornell High Energy Synchrotron Source (CHESS), which is supported by the National Science Foundation under award DMR-1332208, using the Macromolecular Diffraction at CHESS (MacCHESS) facility, which is supported by award GM-103485 from the National Institute of General Medical Sciences, National Institutes of Health. This work was also supported by awards R01-GM110237 and R35-GM127064 to P.C.B. from the National Institutes of Health.

3.7 References

1. Proctor JR & Meyer IM (2013) CoFold: An RNA secondary structure prediction method that takes co-transcriptional folding into account. Nucleic Acids Res 41(9):e102-e102.

2. Brion P & Westhof E (1997) Hierarchy and dynamics of RNA folding. Annu Rev Biophys Biomol Struct 26:113-137.

3. Leamy KA, Assmann SM, Mathews DH, & Bevilacqua PC (2016) Bridging the gap between in vitro and in vivo RNA folding. Q Rev Biophys 49:e10-e36.

4. Mitchell DI, Jarmoskaite I, Seval N, Seifert S, & Russell R (2013) The long-range P3 helix of the Tetrahymena ribozyme is disrupted during folding between the native and misfolded conformations. J Mol Biol 425:2670-2686.

5. Mitchell DI & Russell R (2014) Folding pathways of the Tetrahymena ribozyme. J Mol Biol 426:2300-2312.

6. Feig AL & Uhlenbeck OC (1999) The role of metal ions in RNA biochemistry. The RNA world, 2nd ed., eds Gesteland RF, Cech TR, & Atkins JF (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York), pp 287-320.

7. Lusk JE, Williams RJ, & Kennedy EP (1968) Magnesium and the growth of Escherichia coli. J Biol Chem 243:2618-2624.

8. Truong DM, Sidote DJ, Russell R, & Lambowitz AM (2013) Enhanced group II intron retrohoming in magnesium-deficient Escherichia coli via selection of mutations in the ribozyme core. Proc Natl Acad Sci USA 110:E3800-E3809.

9. Alberts B, Bray D, Lewis J, Roberts K, & Watson JD (1994) Molecular biology of the cell 3rd ed.

10. London RE (1991) Methods for measurement of intracellular magnesium: NMR and fluorescence. Annu Rev Physiol 53:241-258.

11. Romani AM (2007) Magnesium homeostasis in mammalian cells. Front Biosci 12:308-331.

12. Strulson CA, Boyer JA, Whitman EE, & Bevilacqua PC (2014) Molecular crowders and cosolutes promote folding cooperativity of RNA under physiological ionic conditions. RNA 20(3):331-347.

13. Leamy KA, Yennawar NH, & Bevilacqua PC (2017) Cooperative RNA folding under cellular conditions arises from both tertiary structure stabilization and secondary structure destabilization. Biochemistry 56(27):3422-3433.

14. Lai D, Proctor JR, & Meyer IM (2013) On the importance of cotranscriptional RNA structure formation. RNA 19(11):1461-1473.

15. Chadalavada DM, Knudsen SM, Nakano S-i, & Bevilacqua PC (2000) A role for upstream RNA structure in facilitating the catalytic fold of the genomic hepatitis delta virus ribozyme. J Mol Biol 301:349-367.

16. Khvorova A, Lescoute A, Westhof E, & Jayasena SD (2003) Sequence elements outside the hammerhead ribozyme catalytic core enable intracellular activity. Nat Struct Mol Biol 10:708.

17. Breaker RR (2017) Mechanistic debris generated by twister ribozymes. ACS Chem Biol 12(4):886-891.

18. Pan T & Sosnick T (2006) RNA folding during transcription. Annu Rev Biophys Biomol Struct 35(1):161-175.

19. Uhm H, Kang W, Ha KS, Kang C, & Hohng S (2018) Single-molecule FRET studies on the cotranscriptional folding of a thiamine pyrophosphate riboswitch. Proc Natl Acad Sci USA 115(2):331-336.

20. Watters KE, Strobel EJ, Yu AM, Lis JT, & Lucks JB (2016) Cotranscriptional folding of a riboswitch at nucleotide resolution. Nat Struct Mol Biol 23(12):1124-1131.

21. Lutz B, Faber M, Verma A, Klumpp S, & Schug A (2014) Differences between cotranscriptional and free riboswitch folding. Nucleic Acids Res 42(4):2687-2696.

22. Hopper AK (2013) Transfer RNA post-transcriptional processing, turnover, and subcellular dynamics in the yeast Saccharomyces cerevisiae. Genetics 194(1):43-67.

23. Crothers DM, Cole PE, Hilbers CW, & Shulman RG (1974) The molecular mechanism of thermal unfolding of Escherichia coli formylmethionine transfer RNA. J Mol Biol 87:63-88.

24. Stein A & Crothers DM (1976) Conformational changes of transfer RNA. The role of magnesium(II). Biochemistry 15(1):160-168.

25. Reiter NJ, et al. (2010) Structure of a bacterial ribonuclease P holoenzyme in complex with tRNA. Nature 468(7325):784-789.

26. Abbott JA, Francklyn CS, & Robey-Bond SM (2014) Transfer RNA and human disease. Front Genet 5:158.

27. Yamagami R, Bingaman JL, Frankel EA, & Bevilacqua PC (2018) Cellular conditions of weakly chelated magnesium ions strongly promote RNA folding, stability, and catalysis. Nat Comm Accepted.

28. Leamy KA, Yennawar NH, & Bevilacqua PC (2018) Molecular mechanism for folding cooperativity of functional RNAs in living organisms. Biochemistry In Press.

29. Puglisi JD & Tinoco I, Jr. (1989) Absorbance melting curves of RNA. Methods Enzymol 180:304-325.

30. Jones CN, Jones CI, Graham WD, Agris PF, & Spremulli LL (2008) A disease-causing point mutation in human mitochondrial tRNAMet results in tRNA misfolding leading to defects in translational initiation and elongation. J Biol Chem 283(49):34445-34456.

31. Chadalavada DM, Cerrone-Szakal AL, & Bevilacqua PC (2007) Wild-type is the optimal sequence of the HDV ribozyme under cotranscriptional conditions. RNA 13:2189-2201.

32. Cao Y & Woodson SA (1998) Destabilizing effect of an rRNA stem-loop on an attenuator hairpin in the 5' exon of the Tetrahymena pre-rRNA. RNA 4:901-914.

33. Pan J & Woodson SA (1998) Folding intermediates of a self-splicing RNA: Mispairing of the catalytic core. J Mol Biol 280(4):597-609.

34. Skowronek E, Grzechnik P, Späth B, Marchfelder A, & Kufel J (2014) tRNA 3′ processing in yeast involves tRNase Z, REX1, and RRP6. RNA 20(1):115-130.

35. Khanova E, Esakova O, Perederina A, Berezin I, & Krasilnikov AS (2012) Structural organizations of yeast RNase P and RNase MRP holoenzymes as revealed by UV-crosslinking studies of RNA–protein interactions. RNA 18(4):720-728.

36. Yamashita S, Takeshita D, & Tomita K (2014) Translocation and rotation of tRNA during template-independent RNA polymerization by tRNA nucleotidyltransferase. Structure 22(2):315-325.

37. McCown PJ, Corbino KA, Stav S, Sherlock ME, & Breaker RR (2017) Riboswitch diversity and distribution. RNA 23(7):995-1011.

38. Jimenez RM, Polanco JA, & Lupták A (2015) Chemistry and biology of self-cleaving ribozymes. Trends Biochem Sci 40(11):648-661.

Chapter 4

Molecular Mechanism for Folding Cooperativity of Functional RNAs in Living Organisms

Published as a paper entitled: “Molecular Mechanism for Folding Cooperativity of Functional RNAs in Living Organisms” by Kathleen A. Leamy, Neela H. Yennawar, and Philip C. Bevilacqua in Biochemistry 57(20):2994-3002 2018. [All experiments were carried out by K.A.L. SAXS data were collected and analyzed with the help of N.H.Y. Experiments were planned by K.A.L. and P.C.B.]

4.1 Abstract

A diverse set of organisms has adapted to live under extreme conditions. The molecular origin of the stability is unclear, however. It is not known whether the adaptation of functional RNAs, which have intricate tertiary structures, arises from strengthening of tertiary or secondary structure. Herein we evaluate effects of sequence changes on the thermostability of tRNAphe using experimental and computational approaches. To separate out effects of secondary and tertiary structure, we modify base pairing strength in the acceptor stem, which does not participate in tertiary structure. In dilute solution conditions, strengthening secondary structure leads to non-two-state thermal denaturation curves and has small effects on thermostability, or the temperature at which tertiary structure and function are lost. In contrast, under cellular conditions with crowding and Mg2+-chelated amino acids, where two-state cooperative unfolding is maintained, strengthening secondary structure enhances thermostability. Investigation of stabilities of each tRNA stem across 44 organisms with a range of optimal growing temperatures revealed that organisms that grow in warmer environments have more stable stems. We also used Shannon entropies to identify positions of higher and lower information content, or sequence conservation, in tRNAphe and found that secondary structures have modest information content allowing them to drive thermal adaptation, while tertiary structures have maximal information content preventing them from participating in thermal adaptation. Base paired regions with no tertiary structure and modest information content thus offer a facile evolutionary route to enhancing the thermostability of functional RNA by simple molecular rules of base pairing.

4.2 Introduction

Life exists at temperatures ranging from freezing to boiling water, raising questions as to the molecular mechanisms for thermal adaptation. Extensive studies have been conducted on protein folding under extreme conditions (1-3), but relatively little is known about molecular mechanisms for adaptation of RNA sequence to demanding environments. Elucidation of such mechanisms can help establish how extant life has flourished on Earth, as well as provide plausible pathways for the evolution of RNA sequences on early Earth in an RNA World scenario.

Of special interest are so-called ‘functional RNAs’, which require precise tertiary structures for function. These naturally occurring RNAs include ribozymes, riboswitches, rRNA, and tRNA and are key components of an RNA World (4, 5). Crystal structures reveal that tertiary contacts in functional RNAs are diverse and complex and include base triples, ribose zippers, and U-turns (6). Underlying these tertiary structures are relatively simple base paired regions that provide the structural framework of the RNA. This leads to hierarchical folding of RNA in which secondary structure formation precedes tertiary structure formation (7, 8).

Given the need for functional RNAs to adapt to diverse conditions, the question arises as to whether thermostability, the temperature at which tertiary structure is lost, comes from strengthening tertiary structures or secondary structures. When tertiary structure unfolds, RNA function is lost. Therefore, it might seem that thermostability would increase by strengthening tertiary interactions rather than secondary structure.

However, another mechanism is possible. Sosnick and Pan introduced the concept of functional stability, which is the difference in free energy between the fully functional state and the penultimately stable state (9). If that penultimately stable state is not secondary structure but rather unfolded RNA, then folding would be cooperative and secondary structure strength could increase thermostability.

To investigate this notion, we study the folding of a series of tRNAphe constructs with variable strength acceptor stems under diverse solution conditions including dilute buffer, crowders, and Mg2+-chelated amino acids, which we and others have found induce cooperative RNA folding (10-13). The strength of the acceptor stem was tuned by switching G•U and AU pairs to GC pairs. We observe that strengthened tRNA mutants fold cooperatively under the most biological conditions of crowding with Mg2+-chelated amino acids and that such strengthening of base pairing increases thermostability. Moreover, computational analysis of a set of tRNAphe sequences from organisms that live in a wide range of temperatures reveals that base-paired regions strengthen in thermophiles while tertiary structure-participating nucleotides are invariant. This suggests that in nature, tRNA adapts to harsh conditions not by strengthening tertiary interactions but by strengthening base pairing. Given the simple nature of base pairing and the large energetic effects of single nucleotide changes, this offers a simple route to adaptation of RNA and emphasizes the importance of cooperative RNA folding conditions to RNA evolution.

4.3 Results

4.3.1 Dilute Solution Conditions Lead to Non-Cooperative Folding with Stronger Base Pairing. Folding of WT and mutant tRNAs (Figure 4-1, Tables C-1, C-2) was first examined under dilute solution conditions using a background of 140 mM K+ (14) and either 0.5 or 2.0 mM free Mg2+, characteristic of eukaryotic and prokaryotic divalent concentrations, respectively (15-19). Data were fit according to a two-state model to calculate thermodynamic parameters and test for level of cooperativity (10). In this approach, the transition with the major absorbance change dominates the fit in a multi-transition system. The quality of the global fits can be found in Table C-3.
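As a minimal sketch of what a two-state model implies, the fraction of folded RNA at each temperature can be written in terms of a single ∆Hfolding and TM through the van’t Hoff relation. The function name is hypothetical, baselines and the global multi-wavelength fitting are omitted, and the parameter values are taken from numbers quoted in this chapter purely for illustration; this is not the fitting procedure used for the data.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def fraction_folded(temp_c, dh_fold, tm_c):
    """Two-state (folded <-> unfolded) fraction folded at temp_c.

    dh_fold is the folding enthalpy in kcal/mol (negative for favorable
    folding) and tm_c is the melting temperature in degrees C.
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    # ln K_fold from the van't Hoff equation, with K_fold = 1 at TM
    ln_k = (dh_fold / R_KCAL) * (1.0 / tm - 1.0 / t)
    return 1.0 / (1.0 + math.exp(-ln_k))


# Illustrative parameters: a sharp transition (-80.4 kcal/mol) versus a
# broad one (-40 kcal/mol), both centered at 68.0 degrees C.
for temp in (50.0, 60.0, 68.0, 76.0, 86.0):
    sharp = fraction_folded(temp, -80.4, 68.0)
    broad = fraction_folded(temp, -40.0, 68.0)
    print(f"{temp:5.1f} C  sharp={sharp:.3f}  broad={broad:.3f}")
```

In this picture a smaller magnitude of ∆Hfolding produces a broader transition around the same TM, which is why a reduced apparent enthalpy is read as a loss of cooperativity.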

Figure 4-1. Wild-type and variable stability mutant constructs of tRNAphe. Tertiary interactions are depicted on the WT construct with yellow lines, based on PDB 1ehz. Mutant constructs (M1-M6) vary sequence and base pairing strength in the tunable region of the acceptor stem (shaded blue) and have ‘ACCA’ at the 3’ end. Base-pairing regions that are different between mutants and WT are in black boxes. Colors of WT and mutant are maintained in all display items.

In 2.0 mM Mg2+, WT tRNA unfolds in a single sharp transition and was well fit to a two-state model suggesting cooperativity, wherein secondary structure unfolds concomitant with tertiary structure (Figure 4-2A). The six acceptor stem mutants (M1-M6) were fit to the same two-state model. Their apparent melting temperatures (TMs) increase monotonically over a range of 11.4 °C, reflecting increased acceptor stem base pairing strength. Folding of the low stability constructs, WT and M1, in dilute buffer is cooperative as confirmed through their highly negative ∆Hfolding values of ~–80 kcal/mol (Table C-1). However, the three highest stability mutants, M4-M6, display a broad unfolding transition and a ∆Hfolding that is smaller in magnitude, suggesting that secondary and tertiary structure do not unfold together (Figure 4-2A). In particular, these mutants have ∆Hfolding values of only ~–65 to ~–40 kcal/mol, reflecting fewer bonds broken in the major unfolding transition and thus multi-state folding. Loss of cooperativity is also mirrored in relative enthalpy values between mutant and WT (∆HM/∆HWT), which reveal fractional values as low as 0.5 as acceptor stem base pairing strength increases (Table C-1). Similar effects are found in 0.5 mM Mg2+ (Table C-2, Figure C-1). As such, the apparent TM does not accurately reflect tertiary structure unfolding for the more stable mutants in dilute buffer conditions.


Figure 4-2. WT and mutant thermal denaturation under in vivo-like solutions in the background of 2.0 mM free Mg2+. Each construct was globally fit every 2 nm between 250 and 290 nm. Thermal denaturation scans at 260 nm, normalized using global fitting parameters, in (A) buffer, (B) 20% PEG8000, (C) additional 14.0 mM Mg2+ weakly chelated to amino acids, and (D) 20% PEG8000 with additional 14.0 mM Mg2+ weakly chelated to amino acids. All four panels are in the background of 2.0 mM Mg2+ and 140 mM K+. Low temperature data were truncated in the fitting to avoid excess baselines, as is plotted above.

We verified that loss of cooperativity was not correlated with lack of native folding using small angle X-ray scattering (SAXS) to judge globular structure (20) and in-line probing (ILP) to assess local structure (21). Overlay of WT and mutant SAXS scattering envelopes is excellent, supporting retention of native structure (Figure 4-3A). The SAXS structural parameters of P(r), Rg, and Dmax, as well as excluded volume are similar for all constructs (Table C-4, Figure C-2). Experimental scattering curves align well with the simulated scattering curve of tRNAphe (PDB 1ehz), again supporting WT-like overall structure for all constructs (Figure C-3). Moreover, the ILP profiles for WT and M5, representing low and high stability constructs, were virtually identical and consistent with native secondary structure including the critical 5’ end of the RNA where the mutations reside, supporting the same native base pairing patterns (Figures 4-3C, 4-3D). Thus, base pair changes in the acceptor stem modulate tRNA stability and folding pathway without affecting the final structure.

Figure 4-3. Aligned SAXS bead models of M1-M5 and WT and in-line probing in buffer with 2.0 mM Mg2+. (A) Alignment of WT (grey) and M1 (purple), M2 (blue), M3 (light blue), M4 (yellow), and M5 (orange) DAMAVER envelopes. (B) Normalized ILP signal comparing WT (black) with M5 (teal) in buffer. Normalized ILP signal of (C) WT and (D) M5 in (black) buffer, (purple) 20% PEG8000, (blue) Mg2+-chelated amino acids, and (pink) 20% PEG8000 with Mg2+-chelated amino acids in the background of 2.0 mM free Mg2+. Nucleotides 1-15 were not analyzed in samples containing Mg2+-chelated amino acids due to salt contamination.

4.3.2 In Vivo-Like Conditions Favor Cooperative Folding and Thermostability with Stronger Base Pairing. We tested effects of cellular crowding in 0.5 or 2.0 mM Mg2+, using 20% w/v PEG8000, and/or cellular levels of amino acids with weakly chelated Mg2+ (aaCM) (13). Previous studies from our lab revealed that the cooperativity of WT tRNA folding is enhanced equally in diverse crowders, including 20% PEG4000, PEG8000, Dextran10, Dextran70, and Ficoll70 (10); 20% PEG8000 is thus a representative choice. In the presence of 20% PEG8000 and 2.0 mM Mg2+, WT tRNA again unfolds in a single sharp transition, with a TM of 68.0 °C, supporting cooperative folding (Figure 4-2B). Indeed, the ∆Hfolding of WT tRNA is slightly larger in magnitude than in dilute buffer, by almost –10 kcal/mol (Table C-1). The unfolding of mutant tRNAs was sensitive to crowding agent as well (Figure 4-2B). Notably, the ∆Hfolding values for the WT and the mutant tRNAs are similar in crowded conditions, indicating cooperative unfolding for WT and mutants. For example, WT and M5 have ∆Hfolding values of –80.4 and –62.7 kcal/mol, respectively, and ∆HM/∆HWT values range from 1.1 to 0.8. Similar effects were found in 0.5 mM Mg2+ (Table C-1). However, the TMs of the mutant tRNAs are relatively unchanged in crowder. For instance, the range in TM is only 4.1 °C in crowder compared to 11.4 °C found without crowder (Table C-1).
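The relative enthalpy used as a cooperativity indicator is a simple ratio, shown here as a brief worked example using the two ∆Hfolding values quoted above for crowded conditions. The function name is hypothetical.

```python
def relative_enthalpy(dh_mutant, dh_wt):
    """Ratio of mutant to WT folding enthalpy (dimensionless)."""
    return dh_mutant / dh_wt


# Values quoted in the text for M5 and WT in 20% PEG8000 (kcal/mol)
ratio = relative_enthalpy(-62.7, -80.4)
print(round(ratio, 2))  # 0.78, i.e. about 0.8
```

A ratio near 1 means the mutant retains the WT enthalpy in its major transition, while smaller ratios indicate that part of the structure unfolds outside of that transition.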

Next, we tested the stability of the tRNAs in amino acid-chelated Mg2+, which contains 2.0 mM free Mg2+ and 14.0 mM Mg2+ that is weakly chelated to a mixture of 96.0 mM L-glutamate, 4.3 mM L-aspartate, 3.8 mM L-glutamine, and 2.6 mM L-alanine (described in detail in the Materials and Methods, Appendix C), the four most abundant amino acids in E. coli. This system mimics the one that is found naturally in bacteria (13). In aaCM, the observed TM and ∆H are higher in magnitude than in PEG8000, by ~3 °C and ~–70 kcal/mol, respectively (Figure 4-2C). Although the TMs are generally higher and range over 6 °C, as secondary structure stability increases cooperativity is lost, with ∆Hmut/∆HWT as low as 0.6. In summary, neither crowded nor aaCM solutions alone have optimal properties for thermostability: crowding has modest effects on thermostability, while aaCM loses cooperativity.

Finally, we studied the combination of aaCM and crowding to test if the favorable effects of each were additive and to more closely mimic in vivo conditions. In this combination, we observed that RNAs maintain two-state folding over a wide range of thermostability (Figures 4-2D, 4-4A). Strikingly, large gains in thermostability and relatively high folding cooperativity are observed for all but M6, the most stable mutant (Figure 4-2D). We were interested in quantifying the extent of cooperativity change for each mutant. In a cooperative system, as secondary structure strength increases, ∆Hfolding should become larger in magnitude since more hydrogen bonds are made. We constructed theoretical fully cooperative and non-cooperative ∆Hfolding limits for the most biological condition of 20% PEG8000 and aaCM (Figure 4-4B). The non-cooperative limit was calculated using nearest neighbor parameters for the acceptor stems of WT and mutant tRNA (22), while the fully cooperative limit was calculated as the experimentally derived ∆H of WT (Table C-1) plus the nearest neighbor model ∆H of the mutant acceptor stems, as described in the Materials and Methods.

As depicted in Figure 4-4, constructs M1, WT, and M2 fit to a ∆Hfolding very close to the fully cooperative limit. However, as acceptor stem strength increases, the ∆H indicates somewhat less cooperative folding (Figure 4-4B). M3-M5 maintain partially cooperative behavior. However, the strongest mutant, M6, unfolds in an almost fully non-cooperative manner, presumably because its all-GC acceptor stem unfolds subsequent to tertiary structure. Nonetheless, it is clear that strengthening base pairing in the acceptor stem leads to greater thermal stability for WT and M1-M5. When we look at less cellular solution conditions, cooperativity is lost at much lower secondary structure strength (Figure C-5). Finally, we note that the various crowded and aaCM conditions did not affect the global tRNA structure, as the ILP profiles for WT and M5 were unaffected by buffer, 20% PEG8000, aaCM, or a combination of 20% PEG8000 and aaCM (Figure 4-3C, D).

Figure 4-4. Cooperative and non-cooperative enthalpy models of WT and mutant tRNAs. Melting temperature and enthalpy of folding of WT and mutant tRNAs in 2.0 mM free Mg2+ with 20% PEG8000 with Mg2+-chelated amino acids. (A) Melting temperature and (B) ∆Hfolding of tRNA and mutants in 20% PEG8000 and additional Mg2+-chelated amino acids. See Appendix C, Materials and Methods for the calculation of (pink) non-cooperative ∆H and (purple) fully cooperative ∆H limits.

4.3.3 Nature Selects for Thermostable tRNAs Through Strengthening Secondary Structure not Tertiary Structure. Given that our above results show that increased secondary structure strength drives increased functional stability under biological conditions, we hypothesized that strong secondary structures may be found in functional RNAs from thermophiles owing to natural selection. We analyzed the following characteristics in a series of organisms with a wide range of optimal growth temperatures (OGT): genomic and rRNA GC percentage, ∆G of each stem in tRNAphe, and average ∆G (∆Gavg) of the four stems in tRNAphe.

Figure 4-5. Analysis of tRNA sequences from organisms with a large range of optimal growing temperatures. A comparison between optimal growth temperature and (A) whole genome GC percent, (B) ribosomal RNA GC percent, and the stability of the (C) D stem, (D) acceptor stem, (E) anticodon stem, (F) TΨC stem. In panels B, D-F there are sequence bins based on OGT (pink dots) and threshold ∆Gfolding lines (pink lines) for stem stability as a function of growth temperature; see Appendix C, Materials and Methods for further details.

No observable trend was found between OGT and genome GC percent, where the

R² of a linear fit is 0.063 (plotted as 100–GC percentage in Figure 4-5A). This indicates that there is no underlying GC bias to the genome of thermophilic organisms. However, a strong positive linear correlation, with an R² of 0.73, exists between rRNA GC percent and

OGT (plotted as 100–GC percentage in Figure 4-5B); clearly, organisms that grow optimally at higher temperatures have higher GC content in their rRNA, similar to previous reports (23-25). Strikingly, linear trends also exist between OGT and a threshold

∆Gfolding for tRNA acceptor, anticodon, and TΨC stems, as well as ∆Gavg; the R² values for linear fits of these four plots are all at or above 0.85 (Figure 4-5D-F, Figure C-6). In these plots, the term “threshold ∆Gfolding” means that we average the five weakest ∆Gfolding values within a given temperature bin, as defined in the Materials and Methods in Appendix C, which represent a minimal stability needed to maintain function (see Section 4.4

Discussion). For the D stem, no correlation of ∆G and OGT was found; rather, we observed two sequence clusters, each of which is fairly weak (Figure 4-5C). Notably, the D stem is the only stem in tRNAphe that participates in tertiary interactions (Figure 4-1), with three of its four base pairs engaged in such interactions. We hypothesize that the highly conserved tertiary contacts preclude variation of these nucleotides.
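The threshold calculation described above can be sketched in a few lines of Python. This is only an illustration and not the procedure from Appendix C: the 10 °C bin width, the function name `threshold_dg`, and the example values are assumptions made here, and "weakest" is taken to mean the least negative ∆Gfolding.

```python
from collections import defaultdict

def threshold_dg(records, bin_width=10.0, n_weakest=5):
    """Average the n weakest (least negative) dG values in each OGT bin.

    records: iterable of (ogt_celsius, dg_kcal_per_mol) pairs.
    n_weakest=5 follows the text; bin_width is a hypothetical choice.
    Returns {bin_start_celsius: threshold_dg}.
    """
    bins = defaultdict(list)
    for ogt, dg in records:
        bin_start = (ogt // bin_width) * bin_width
        bins[bin_start].append(dg)
    thresholds = {}
    for bin_start, dgs in sorted(bins.items()):
        # Sort so the least negative (weakest) stabilities come first.
        weakest = sorted(dgs, reverse=True)[:n_weakest]
        thresholds[bin_start] = sum(weakest) / len(weakest)
    return thresholds

# Made-up stem stabilities (kcal/mol), purely illustrative.
example = [(37, -8.0), (37, -9.5), (35, -7.5), (85, -12.0), (88, -13.5)]
thresholds = threshold_dg(example, bin_width=10.0, n_weakest=2)
```

Averaging only the weakest stems in each bin estimates a lower bound on stability at that temperature, which is why stronger stems in the same bin do not shift the threshold.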

To look for variable residues, tRNAphe sequences from 44 organisms were aligned and the information content, a measure of conservation, of each position in the RNA was calculated according to Shannon entropies (6, 26, 27). As described in Appendix C, nucleotides that are highly conserved have 2 bits of information, nucleotides that are not conserved have 0 bits of information, while two nucleotides that co-vary with Watson-

Crick base pairing share 2 bits of information. Strikingly, nucleotides that are involved in tertiary contacts have an information content close to 2 (Figure 4-6A). There are eighteen nucleotides that participate in tertiary contacts and fifteen of these have 2 bits of

information each. This suggests that strengthening of tertiary structure is not a route to thermal stability in tRNA.
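A per-position information content of this kind can be sketched as follows. This is a minimal illustration, assuming the common definition of 2 bits minus the Shannon entropy of the four-letter nucleotide distribution at each alignment column; it omits any small-sample correction, ignores gap characters, and does not treat co-varying base pairs jointly, so it may differ from the calculation in Appendix C. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def information_content(column, alphabet="ACGU"):
    """Information (bits) at one alignment position: log2(4) - Shannon entropy."""
    counts = Counter(base for base in column if base in alphabet)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(alphabet)) - entropy

def alignment_information(sequences):
    """Per-position information content for equal-length aligned sequences."""
    return [information_content(col) for col in zip(*sequences)]

# Toy alignment (hypothetical sequences, not the 44 tRNAphe sequences).
toy = ["GCAU", "GCCU", "GCGU", "GCUU"]
bits = alignment_information(toy)
```

In the toy alignment, the invariant columns carry 2 bits and the column in which all four nucleotides appear equally often carries 0 bits, matching the two limits described in the text.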

In contrast, nucleotides involved in secondary structure have moderate (1.26-1.75 bits) or low (0.00-1.25 bits) information content. This suggests that secondary structure base pairs tend to co-vary to maintain base pairing. When mapped onto the tRNA secondary structure, positions of lower information content and conservation cluster in the acceptor, anticodon, and TΨC stems, with lowest information content in the acceptor stem where we made the mutants that were tested experimentally (Figure 4-6B). In summary, bases with lower information content are found in the stems and tend towards higher GC content in thermophiles, while bases with higher information content are found in tertiary interactions and are not routes to thermal stability.

Figure 4-6. Information content of each position in tRNAphe. (A) The information bits for each position of tRNAphe. Positions are color coded as follows: tertiary contacts (green), secondary structure without tertiary contacts (pink), loops without tertiary contacts (blue), and the anticodon stem (purple). (B) Information content mapped onto the secondary structure of tRNA. Regions of high information content (1.76-2.00 bits), moderate information content (1.26-1.75 bits) and low information content (0.00-1.25 bits) are colored in red, orange, and black respectively. Joining lines between the D and TΨC loops are passing behind the AC stem.

In theory, tertiary structure is more difficult to modify than secondary structure. Secondary interactions are predominantly WC interactions, where A is base paired with U and G is base paired with C. With WC interactions, AU and GC base pairs can be swapped without altering the structure. However, tertiary interactions are often non-canonical, so changing the tertiary interactions is conceivably more difficult because it can result in a loss of structure. There are known tertiary interaction changes that can be made, yet they are often not as simple as WC changes. These have been collected into a series of isostericity matrices (28).

The tertiary interactions in tRNAphe were examined in PyMOL (PDB 1ehz) to look for potential tertiary interaction isostericity changes. Tertiary interactions were classified based upon which face of the bases the interactions are on (WC, Hoogsteen, Sugar-Edge), the strand directionality, and the sugar pucker. We find that most of the tertiary interactions have a potential isosteric change that can be made (Table C-8). However, when the interactions are examined in the crystal structure, we observe that making these isosteric changes would disrupt another interaction in the RNA, such as a metal binding site, base triple interaction, or base modification (Figure C-7). This observation suggests that tertiary structure is conserved because changes to tertiary structure will result in a loss of global fold.

4.4 Discussion

RNA can have diverse functions including catalysis in ribozymes and small molecule binding in riboswitches to regulate gene expression. In addition, certain RNAs such as tRNA and rRNA have extensive tertiary structures and mediate protein expression. The pathways for adaptation of these functional RNAs to extreme temperatures have remained elusive. Since loss of tertiary structure leads to loss of function, one route to thermal stability might be strengthening of tertiary interactions. This could occur, for example, through enhanced metal ion interactions and long-range interactions. Such a means for stability was found in a SELEX experiment on a group I intron (29), but the question remains as to how thermal stability is obtained during natural selection.

Proteins are known to fold cooperatively in nature, especially small compact proteins (30, 31). Less is known about whether RNAs fold cooperatively. Crothers and co-workers showed that tRNA can fold cooperatively in high Mg2+ conditions (32), and we found that yeast tRNAphe folds cooperatively in crowded conditions, with cooperativity arising from both stabilization of tertiary structure and destabilization of secondary structure (10, 11).

When model RNAs unfold cooperatively, strengthening of base pairing, even in positions without tertiary interactions, can contribute to thermostability (33). We thus entertained the notion that variations in stems apart from tertiary interactions could tune the temperature at which tertiary interactions in natural RNAs melt. Indeed, we found this to be the case. We found that strengthening the acceptor stem by changing G•U and AU pairs to GC pairs led to greater thermal stability, but only when folding was cooperative,

i.e. under cellular crowded conditions containing Mg2+-chelated amino acids. Figure 4-7 provides a conceptual framework for this observation using the Pan and Sosnick concept of functional free energy (9). Under dilute conditions, where folding is non-cooperative, strengthening of secondary structure leads to no change in the functional free energy since secondary structure is the penultimately stable state and also contained in the fully folded RNA. However, under in vivo-like conditions, where folding is cooperative, strengthening of secondary structure enhances the functional free energy since the unfolded RNA is the penultimately stable state.
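The argument can be written out with a small amount of notation. The symbols below are introduced here for illustration and are not taken from the original: U is the unfolded state, I is the intermediate with secondary structure only, N is the native state, and δ > 0 is the stabilization added to the acceptor stem. The sketch assumes the mutation lowers the free energy of I and N by the same amount and leaves U unchanged.

```latex
\begin{align*}
\Delta G_{\mathrm{func}} &= G_{N} - G_{\mathrm{penultimate}} \\[4pt]
\text{Dilute (non-cooperative), penultimate state } I:\quad
\Delta G_{\mathrm{func}}^{\mathrm{mut}} &= (G_{N} - \delta) - (G_{I} - \delta)
  = \Delta G_{\mathrm{func}}^{\mathrm{WT}} \\[4pt]
\text{In vivo-like (cooperative), penultimate state } U:\quad
\Delta G_{\mathrm{func}}^{\mathrm{mut}} &= (G_{N} - \delta) - G_{U}
  = \Delta G_{\mathrm{func}}^{\mathrm{WT}} - \delta
\end{align*}
```

Under these assumptions the stabilization cancels when the penultimate state already contains the strengthened stem, and it carries through in full when the penultimate state is the unfolded RNA.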

Figure 4-7. Conceptual free energy diagram of WT and stabilizing mutants under dilute and in vivo-like conditions. (A) In dilute solution conditions folding is non-cooperative, and stabilization of secondary structure does not increase cooperativity because the penultimate state is also stabilized. (B) In more in vivo-like conditions folding is cooperative, and stabilization of secondary structure leads to an increase in cooperativity because the penultimate state is primary structure, which does not change in stability.

These findings were corroborated by analysis of a database we constructed of tRNAphe from 44 organisms that live at a broad range of temperatures. This analysis

revealed that long-range tertiary interacting nucleotides have maximal information content and thus cannot change. With the exception of a single GC pair fixed toward the end of each helix, presumably present to stabilize the helix ends (34), the regions of moderate information content were largely found to be in the stems (Figure 4-6). Indeed, in nature the stems strengthen with growing temperature, suggesting that they provide a route to stability. Such base pairing changes can lead to several kcal/mol gain in free energy from just a single base change (35). Thus, strengthening secondary structure of a cooperatively folding natural RNA provides a facile route to large effects in stability, albeit the sequence may be subject to other selection pressures such as serving as an identity element for aminoacylation. Moreover, the observation that secondary structure stability increases with OGT in nature supports the idea that tRNAs fold cooperatively in vivo in a diverse set of organisms.

Changing the tertiary structure of the RNA, rather than secondary structure, could be detrimental, given the generally greater molecular complexity of tertiary structure.

For instance, new tertiary interactions in tRNA might interfere with the translation machinery. Additionally, the observation that nucleotides involved in tertiary structure do not change with elevated growth temperature suggests that population of tertiary structure is relatively temperature independent or even endothermic; this contrasts sharply with secondary structure formation, which is strongly exothermic (6). Given that formation of tertiary structure is generally accompanied by water release (36, 37), tertiary structure formation may be entropically driven. Indeed, early studies show that metal binding to ATP is endothermic, presumably driven by the entropy gain from water release

(38). In addition, docking of the P1 helix into the catalytic core of the Tetrahymena ribozyme, which is mediated by 2’ hydroxyl tertiary interactions, is entropically favored with a ∆S ranging from +37 to +62 eu and a ∆H ranging from +8.5 to +19 kcal/mol (39,

40). The origin of this effect has been attributed to water release. The Woodson lab reported that the entropy of the unfolded state of Azoarcus ribozyme is decreased in the presence of crowding agents, which decreases the entropy loss upon tertiary structure docking (12, 41). In other cases, folding of RNA tertiary interactions has been reported to be modestly enthalpically favored, indicating that the actual results will depend on the details of number and type of tertiary interactions (42, 43).
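The sign argument behind "entropically driven" follows from the standard Gibbs relation. The block below is a general thermodynamic identity rather than an analysis from the original, and it assumes that ∆H and ∆S are independent of temperature over the range of interest.

```latex
\begin{align*}
\Delta G &= \Delta H - T\,\Delta S \\
\Delta H > 0,\ \Delta S > 0 \ &\Longrightarrow\ \Delta G < 0 \iff T > \frac{\Delta H}{\Delta S} \\
\left(\frac{\partial\,\Delta G}{\partial T}\right)_{P} &= -\Delta S
\end{align*}
```

With ∆S > 0 the derivative is negative, so an endothermic, entropy-driven contact becomes more favorable as temperature rises. An exothermic helix with ∆H < 0 and ∆S < 0 shows the opposite trend and melts on heating, which is consistent with selection acting on the stems rather than on the tertiary contacts.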

The effects observed herein were lost with the most stable stem of all GC base pairs in M6. However, our experiments were conducted on tRNAs without the natural modifications. It is well established that modifications strengthen tertiary structure (44,

45), thus it is possible that even this mutant would fold cooperatively in this background.

Hyperthermophiles tend to have more modified bases than mesophiles. These modifications are found in positions that participate in both secondary and tertiary structure, often strengthening contacts or increasing structure rigidity, suggesting that nature has multiple methods of selecting for RNA thermostability (46, 47). Thermostable proteins use several methods to improve stability, including more charged interactions, more disulfide bonds, and increasing rigidity (48).

In this study, we introduced the notion of threshold free energy, which we define as the minimal free energy needed to keep a helix folded. In other words, it is possible to have a helix with greater stability than the threshold but not less. A similar notion was

used by Szostak and co-workers in describing the threshold information content needed to attain an RNA with a given function, such as catalytic rate (49). Inspection of Figure 4-5 suggests that the threshold is approached by most sequences in all but the lowest growth temperatures presumably because of the entropy gain from lower information content. It remains to be explored whether similar principles apply to other functional

RNAs. Our results suggest a simple mechanism for functional RNAs to adapt to higher temperatures by increasing the stability of secondary structure regions with low to moderate information content while maintaining tertiary contacts, structure, and stability.

4.5 Acknowledgements

We thank Dr. Richard Gillian and Dr. Jesse Hopkins for help with small-angle X-ray scattering experiments. This work was supported by U.S. National Institutes of Health

Grant R01-GM110237 (P.C.B.). Experiments conducted at the Cornell High Energy

Synchrotron Source (CHESS) were supported by the National Science Foundation and the

National Institutes of Health/National Institute of General Medical Sciences under NSF award DMR-0936384, using the Macromolecular Diffraction at CHESS (MacCHESS) facility, which is supported by GM-103485 from the National Institutes of Health, through its

National Institute of General Medical Sciences. We thank Elizabeth Jolley, Raghav

Poudyal, Laura Ritchey, Jacob Sieg, and Ryota Yamagami for helpful comments and discussions on the manuscript.

4.6 References

1. Berezovsky IN & Shakhnovich EI (2005) Physics and evolution of thermophilic

adaptation. Proc Natl Acad Sci USA 102(36):12742-12747.

2. Greaves RB & Warwicker J (2007) Mechanisms for stabilisation and the

maintenance of solubility in proteins from thermophiles. BMC Struct Biol 7:18.

3. Bae E & Phillips GN (2004) Structures and analysis of highly homologous

psychrophilic, mesophilic, and thermophilic adenylate kinases. J Biol Chem

279(27):28202-28208.

4. Trevino SG, Zhang N, Elenko MP, Lupták A, & Szostak JW (2011) Evolution of

functional nucleic acids in the presence of nonheritable backbone heterogeneity.

Proc Natl Acad Sci USA 108(33):13492-13497.

5. Chen X, Li N, & Ellington AD (2007) Ribozyme catalysis of metabolism in the RNA

world. Chem Biodivers 4(4):633-655.

6. Bloomfield VA, Crothers DM, & Tinoco I (2000) Nucleic acids: Structures,

properties, and functions (University Science Books, Sausalito, California).

7. Tinoco I & Bustamante C (1999) How RNA folds. J Mol Biol 293(2):271-281.

8. Brion P & Westhof E (1997) Hierarchy and dynamics of RNA folding. Annu Rev

Biophys Biomol Struct 26:113-137.

9. Fang XW, et al. (2001) The thermodynamic origin of the stability of a

thermophilic ribozyme. Proc Natl Acad Sci USA 98(8):4355-4360.

10. Strulson CA, Boyer JA, Whitman EE, & Bevilacqua PC (2014) Molecular crowders

and cosolutes promote folding cooperativity of RNA under physiological ionic

conditions. RNA 20(3):331-347.

11. Leamy KA, Yennawar NH, & Bevilacqua PC (2017) Cooperative RNA folding under

cellular conditions arises from both tertiary structure stabilization and secondary

structure destabilization. Biochemistry 56(27):3422-3433.

12. Kilburn D, Roh JH, Behrouzi R, Briber RM, & Woodson SA (2013) Crowders

perturb the entropy of RNA energy landscapes to favor folding. J Am Chem Soc

135:10055-10063.

13. Yamagami R, Bingaman JL, Frankel EA, & Bevilacqua PC (2018) Cellular conditions

of weakly chelated magnesium ions strongly promote RNA folding, stability, and

catalysis. Nat Comm Accepted.

14. Feig AL & Uhlenbeck OC (1999) The role of metal ions in RNA biochemistry. The

RNA world, 2nd ed., eds Gesteland RF, Cech TR, & Atkins JF (Cold Spring Harbor

Laboratory Press, Cold Spring Harbor, New York), pp 287-320.

15. Lusk JE, Williams RJ, & Kennedy EP (1968) Magnesium and the growth of

Escherichia coli. J Biol Chem 243:2618-2624.

16. Truong DM, Sidote DJ, Russell R, & Lambowitz AM (2013) Enhanced group II

intron retrohoming in magnesium-deficient Escherichia coli via selection of

mutations in the ribozyme core. Proc Natl Acad Sci USA 110:E3800-E3809.

17. Alberts B, Bray D, Lewis J, Roberts K, & Watson JD (1994) Molecular biology of

the cell 3rd ed.

18. London RE (1991) Methods for measurement of intracellular magnesium: NMR

and fluorescence. Annu Rev Physiol 53:241-258.

19. Romani AM (2007) Magnesium homeostasis in mammalian cells. Front Biosci

12:308-331.

20. Pollack L (2011) Time resolved SAXS and RNA folding. Biopolymers 95(8):543-

549.

21. Regulski EE & Breaker RR (2008) In-line probing analysis of riboswitches. Post-

transcriptional gene regulation, ed Wilusz J (Humana Press, Totowa, NJ), pp 53-

67.

22. Serra MJ & Turner DH (1995) Predicting thermodynamic properties of RNA.

Methods enzymol, (Academic Press), Vol 259, pp 242-261.

23. Wang H-c & Hickey DA (2002) Evidence for strong selective constraint acting on

the nucleotide composition of 16S ribosomal RNA genes. Nucleic Acids Res

30(11):2501-2507.

24. Jegousse C, Yang Y, Zhan J, Wang J, & Zhou Y (2017) Structural signatures of

thermal adaptation of bacterial ribosomal RNA, transfer RNA, and messenger

RNA. PLOS ONE 12(9):e0184722.

25. Galtier N & Lobry JR (1997) Relationships between genomic GC content, RNA

secondary structures, and optimal growth temperature in prokaryotes. J Mol

Evol 44(6):632-636.

26. Shannon CE (1948) A mathematical theory of communication. Bell Syst Tech J

27(3):379-423.

27. Schneider TD, Stormo GD, Gold L, & Ehrenfeucht A (1986) Information content of

binding sites on nucleotide sequences. J Mol Biol 188(3):415-431.

28. Leontis N, Stombaugh J, & Westhof E (2002) The non-Watson-Crick base pairs and

their associated isostericity matrices. Nucleic Acids Res 30(16):3497-3531.

29. Juneau K & Cech TR (1999) In vitro selection of RNAs with increased tertiary

structure stability. RNA 5(8):1119-1129.

30. Malhotra P & Udgaonkar JB (2016) How cooperative are protein folding and

unfolding transitions? Protein Sci 25(11):1924-1941.

31. Portman JJ (2010) Cooperativity and protein folding rates. Curr. Opin. Struct. Biol.

20(1):11-15.

32. Stein A & Crothers DM (1976) Conformational changes of transfer RNA. The role

of magnesium(II). Biochemistry 15(1):160-168.

33. Blose JM, Silverman SK, & Bevilacqua PC (2007) A simple molecular model for

thermophilic adaptation of functional nucleic acids. Biochemistry 46:4232-4240.

34. Xia T, et al. (1998) Thermodynamic parameters for an expanded nearest-

neighbor model for formation of RNA duplexes with Watson-Crick base pairs.

Biochemistry 37:14719-14735.

35. Blose JM, et al. (2007) Non-nearest-neighbor dependence of the stability for RNA

bulge loops based on the complete set of group I single-nucleotide bulge loops.

Biochemistry 46(51):15123-15135.

36. Miyoshi D, Karimata H, & Sugimoto N (2006) Hydration regulates

thermodynamics of G-quadruplex formation under molecular crowding

conditions. J Am Chem Soc 128(24):7957-7963.

37. Nakano S-i, Karimata HT, Kitagawa Y, & Sugimoto N (2009) Facilitation of RNA

enzyme activity in the molecular crowding media of cosolutes. J Am Chem Soc

131:16881-16888.

38. Banyasz JL & Stuehr JE (1973) Interactions of divalent metal ions with inorganic

and nucleoside phosphates. III. Temperature dependence of the magnesium(II)--

adenosine 5'-triphosphate, --adenosine 5'-diphosphate, and --cytidine 5'-

diphosphate systems. J Am Chem Soc 95(22):7226-7231.

39. Li Y, Bevilacqua PC, Mathews D, & Turner DH (1995) Thermodynamics and

activation parameters for binding of pyrene-labeled substrate by the

Tetrahymena ribozyme: Docking is not diffusion-controlled and is driven by a

favorable entropy change. Biochemistry 34:14394-14399.

40. Narlikar GJ & Herschlag D (1996) Isolation of a local tertiary folding transition in

the context of a globally folded RNA. Nat Struct Mol Biol 3:701.

41. Kilburn D, et al. (2016) Entropic stabilization of folded RNA in crowded solutions

measured by SAXS. Nucleic Acids Res 44(19):9452-9461.

42. Crothers DM, Cole PE, Hilbers CW, & Shulman RG (1974) The molecular

mechanism of thermal unfolding of Escherichia coli formylmethionine transfer

RNA. J Mol Biol 87:63-88.

43. Szewczak AA, Podell ER, Bevilacqua PC, & Cech TR (1998) Thermodynamic

stability of the P4−P6 domain RNA tertiary structure measured by temperature

gradient gel electrophoresis. Biochemistry 37(32):11162-11170.

44. Helm M (2006) Post-transcriptional nucleotide modification and alternative

folding of RNA. Nucleic Acids Res 34(2):721-733.

45. Nobles KN, Yarian CS, Liu G, Guenther RH, & Agris PF (2002) Highly conserved

modified nucleosides influence Mg2+-dependent tRNA folding. Nucleic Acids Res

30(21):4751-4760.

46. Kowalak JA, Dalluge JJ, McCloskey JA, & Stetter KO (1994) The role of

posttranscriptional modification in stabilization of transfer RNA from

hyperthermophiles. Biochemistry 33(25):7869-7876.

47. Lorenz C, Lünse CE, & Mörl M (2017) tRNA modifications: Impact on structure

and thermal adaptation. Biomolecules 7(2):35.

48. Pucci F & Rooman M (2017) Physical and molecular bases of protein thermal

stability and cold adaptation. Curr Opin Struct Biol 42:117-128.

49. Carothers JM, Oestreich SC, Davis JH, & Szostak JW (2004) Informational

complexity and functional activity of RNA structures. J Am Chem Soc

126(16):5130-5137.

Chapter 5

Conclusions and Future Directions

5.1 Conclusions

The objective of this thesis was to improve our understanding of RNA folding and processes in cellular conditions. Through the use of both experimental and computational approaches I have demonstrated that in cellular conditions functional

RNAs fold in a two-state manner through both destabilization of secondary structure and stabilization of tertiary structure, cotranscriptional intermediates are destabilized until a transcript is long enough to form secondary and tertiary interactions in a single transition, and RNAs adapt to high temperatures by stabilizing secondary structures. Transfer RNA was the primary RNA used in these studies. It is one of the most abundant functional RNAs in the cell, and misfolding of tRNA has been implicated in cell death and many diseases.

It is our hope that the findings presented in this thesis will motivate others to study and consider how cellular conditions affect their RNA of interest.

In Chapter Two, studies demonstrate that tRNA folds in a more two-state cooperative manner under conditions that mimic the crowded and ionic conditions of the cytoplasm. Cooperative folding in cellular conditions arises from both destabilization of

secondary structure and stabilization of tertiary structure. This mechanism of achieving cooperativity is quite surprising because strong RNA secondary structure has been thought to be the driving force behind RNA folding. We hypothesize that by destabilizing secondary structure, misfolded intermediate states will also be destabilized, resulting in a smoother folding pathway to the functional state. Strong tertiary interactions can lock in weak secondary interactions to form the native state. The work presented here is one of the first reports of RNAs folding in a protein-like manner, with weak secondary structure and strong tertiary structure.

In Chapter Three, the hypothesis that intermediate states will be destabilized in cellular conditions due to secondary structure destabilization is further investigated. The structure and stability of cotranscriptional intermediates of tRNAphe were probed using both experimental and computational approaches. Cotranscriptional intermediates of tRNAphe that are too short to form all base pairing interactions are highly destabilized in cellular conditions. The addition of a single nucleotide at the end of the acceptor stem results in a large increase in stability, and cooperative folding to the native state occurs.

The work presented in this chapter has implications for how functional RNAs fold in cells, and suggests that there is a high level of folding control at the sequence level.

In Chapter Four, the mechanism of tRNA adaptation in thermophiles is probed. In organisms that grow at high temperatures, secondary structure pairings become more stable, while tertiary contacts are largely conserved. This is a simple mode of adaptation that enhances the overall thermostability of the RNA while maintaining Watson-Crick base pairing in stem regions by changing AU or G•U pairs to GC pairs. Likewise, by

conserving tertiary structure, where compensatory changes are not simple or often not possible, function can be maintained. When secondary structure is strengthened in a cooperatively folding background it stabilizes the native fold of the RNA.

5.2 Future Directions

5.2.1 Extension of Thermodynamics Into Complex Cytoplasm Mimics. Recent studies have shown that certain classes of RNAs form different structures in vitro and in vivo. For example, certain RNAs are less structured in cells (1) and certain RNAs adopt different structures in cells (2). While these studies have shown that RNA structures differ in vitro and in vivo, they are just a snapshot of overall structure, and do not report on RNA folding pathways. Thermodynamics and kinetics are extremely difficult to study in cells because most thermodynamics methods involve heating the sample to denature it, and if cells are heated all the biomolecules will denature and the cell will die. In addition, there is not a good method to detect thermodynamics in cells because most methods rely on UV detection at 260 or 280 nm, and many biomolecules in cells also absorb at these wavelengths and will interfere with detection.

Recent studies have probed the thermodynamics of functional RNAs in simple cellular mimics using molecular crowders to replicate the crowding in cells, and small molecules to replicate the cosolutes in cells (3). While these have revealed insights into

RNA folding pathways and function in cellular conditions, they are very simple model cytoplasms. This is due to experimental limitations of signal overlap and high

temperatures that will denature more complex cytoplasm mimics. Ideally cytoplasm mimics would contain DNA, RNA, proteins, and a variety of small molecules found in cells.

However, these experiments require new methodology to be developed.

There have been efforts to overcome the issue of UV detection at 260 or 280 nm by using fluorescence as a method of detection (4-6). Fluorescence detection allows for more complex solution conditions that include biomolecules, such as nucleic acids and proteins. Examples of these efforts include using the modified nucleobase 2-aminopurine (5), intercalators that associate with double stranded RNA (4), and fluorophores attached to the ends of RNA (6). However, a major limitation is that these assays still use high temperatures to denature the RNA, which will denature the other biomolecules used as a cytoplasm mimic.

Recently, an assay with moderate changes in temperature that uses fluorescence detection was reported (7, 8). This approach uses fluorescence competition to report on binding constants of short RNA duplexes. When binding constants are found over a temperature range, thermodynamic parameters can be calculated using the van’t Hoff analysis. While this method is a step in the right direction, it is low-throughput and laborious, and requires high volumes of fluorescently labeled RNA, which is expensive.

Herein, I describe efforts to develop a high-throughput assay to study RNA thermodynamics in complex cellular mimics. I present efforts to develop an assay to measure RNA duplex binding, similar to those previously described (7, 8). However, this assay will be performed in a fluorescence plate reader over a narrow temperature range, making it high-throughput and non-denaturing. Development of this assay will allow for

high-throughput study of RNA thermodynamics in complex cellular extracts using minimal labeled sample.

5.2.2 Development of a Thermodynamics Assay with Fluorescence Detection. I present here preliminary efforts to develop a high-throughput fluorescence assay for the measurement of RNA thermodynamics in complex cytoplasm mimics. The assay is simple.

In 12 wells across a 96-well tray, a constant concentration of fluorescently labeled RNA will be mixed with its complement strand (Figure 5-1). Upon strand binding, the fluorescence will change, and the change in fluorescence can be used to find a binding constant for the duplex. When this experiment is repeated at varying temperatures, the van’t Hoff analysis can be used to find thermodynamic parameters.
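The van’t Hoff step can be sketched in Python as a linear fit of ln(Kd) against 1/T. This is a minimal illustration under stated assumptions: ∆H and ∆S refer to duplex formation and are taken to be temperature independent, Kd is expressed in molar units, and the function names and the synthetic ∆H and ∆S values are hypothetical rather than results from this work.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)

def vant_hoff(temps_celsius, kds_molar):
    """Fit ln(Kd) = dH/(R*T) - dS/R by ordinary least squares.

    Returns (dH in kcal/mol, dS in kcal/(mol*K)) for duplex formation.
    """
    xs = [1.0 / (t + 273.15) for t in temps_celsius]
    ys = [math.log(kd) for kd in kds_molar]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # equals dH / R
    intercept = y_mean - slope * x_mean    # equals -dS / R
    return slope * R, -intercept * R

def kd_from_parameters(dh, ds, temp_celsius):
    """Dissociation constant (M) implied by dH and dS of duplex formation."""
    t = temp_celsius + 273.15
    return math.exp((dh - t * ds) / (R * t))

# Synthetic check: generate Kd values from assumed parameters, then recover them.
true_dh, true_ds = -60.0, -0.160   # hypothetical values for illustration
temps = [27, 32, 37, 42, 47]       # same temperature points as the assay
kds = [kd_from_parameters(true_dh, true_ds, t) for t in temps]
fit_dh, fit_ds = vant_hoff(temps, kds)
```

Because the fit is linear in 1/T, curvature in a real van’t Hoff plot would signal that the constant-∆H assumption, or the two-state assumption behind each Kd, does not hold over the chosen temperature range.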

Figure 5-1. Experimental setup and simulated data for the high-throughput thermodynamics assay. (A) Experimental setup of a strand titration in a 96 well tray. When the strands bind, the fluorescence is modulated. (B) The assay can be performed at several temperatures to calculate temperature dependent binding constants, and thus thermodynamic parameters. Represented here are simulated binding curves at varying temperature, with higher temperature in warm colors and lower temperatures in cool colors.

To test the effects of nucleotides as fluorescence quenchers, the fluorescence of an RNA strand labeled on the 5’ end with fluorescein (FAM) was titrated with increasing concentrations of AMP, CMP, GTP, or UMP. The fluorescence was monitored between

500 and 560 nm. GTP was used instead of GMP due to low solubility of GMP in water.

With increasing concentrations of all NMPs, a decrease in FAM fluorescence is observed

(Figure 5-2). The largest decreases in fluorescence arise from addition of CMP or GTP to solution. For example, upon addition of 25 mM CMP or GTP there is a ~30% loss in FAM signal, suggesting that a dangling C or G on the 3’ end of a complementary oligonucleotide could be used to quench FAM on the 5’ end of the RNA.

Figure 5-2. Quenching of fluorescein signal upon the addition of NMPs. Fluorescence quenching of 5’-fluorescein labeled RNA, AGCAGGAU, upon the addition of AMP (purple), CMP (pink), UMP (blue), and GTP (green) as detected at 520 nm. Titrations were performed in the background of 1 M NaCl, except for the GTP titrations which were performed in 1 M LiCl to avoid formation of G-quadruplexes.

To determine if a qPCR instrument could be used as a fluorescent plate reader, a calibration curve of fluorescence as a function of FAM concentration was determined.

The fluorescence signal with increasing concentrations of FAM labeled RNA was collected on a StepOnePlus qPCR instrument, which contains a temperature controlled 96-well tray.

A linear increase in fluorescence signal is observed with increasing concentrations of labeled RNA (Figure D-1), indicating that we were working in an RNA concentration range with reliable signal changes, although linearity appears to be lost at 80 µM at temperatures of 27 and 32 °C.

Next we performed a binding assay at 27 °C with two complementary RNA strands.

One strand was labeled with FAM on the 5’ end, and the complement strand was labeled with either one, two, or three dangling Cs on the 3’ end. We chose C because CMP gave a 30% quench in Figure 5-2. To our surprise, upon binding to the complement strand the

FAM fluorescence increases almost 2-fold (Figure 5-3A). The observed increase in fluorescence is similar with one, two, or three dangling Cs, indicating that fluorescence is not dependent on the number of dangling nucleotides. The binding assay was repeated as a function of temperature, over the moderate temperature range of 27 °C to 47 °C.

We observe an increase in fluorescence with increasing concentrations of complement

RNA (Figure 5-3B). The fluorescence increase is temperature dependent, with a 2-fold increase at 27 °C and 1.5-fold at 47 °C, possibly due to greater collisional quenching of

FAM with the dangling Cs or quenchers in solution such as O2. As expected, the dissociation constant, Kd, is larger (weaker binding) at high temperatures, as indicated by a less steep increase in fluorescence at high temperature.

Figure 5-3. RNA duplex binding assay using fluorescence detection. (A) Binding assay between an RNA strand labeled with 5’ FAM, and its complement labeled with either one, two, or three dangling 3’ Cs. (B) Variable temperature binding assay between 5’ FAM labeled RNA, and its complement labeled on the 3’ end with three dangling Cs.

To obtain a larger change in fluorescence upon strand binding, the 3’ dangling end Cs on the complement strand were replaced with a Black Hole Quencher 1 (BHQ1) purchased from IDT. The variable temperature binding assay was repeated with this new system, and near complete loss of fluorescence was observed upon complement strand binding (Figure 5-4). As described above, the change in fluorescence upon strand binding is temperature dependent, with larger changes in fluorescence observed upon binding at lower temperatures. The binding data collected at each temperature were fit to a two-state model to obtain binding constants, according to the materials and methods in Appendix D. Again, larger Kd values are observed at higher temperatures, an indication of weaker binding (Table 5-1). The Kd values obtained from the binding curves at low temperature, ~15 nM at 27 °C, largely agree with the estimated Kd of these oligos of ~10 nM using nearest neighbor parameters, indicating that this is an accurate assay for measurement of binding constants.

Figure 5-4. Variable temperature RNA duplex binding assay using fluorescence detection. Variable temperature binding assay between 5’ FAM-labeled RNA, and its complement labeled on the 3’ end with Black Hole Quencher 1. RNA sequences are provided above the figure.

Table 5-1. Binding constants for RNA duplex formation in 1 M NaCl.

Temperature (°C)    Kd (nM)
27                  14.8 ± 2.6
32                  17.1 ± 3.1
37                  25.8 ± 4.9
42                  54.4 ± 11
47                  173 ± 32
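
As a rough illustration of how the dissociation constants in Table 5-1 relate to free energies, the sketch below converts each Kd into a standard free energy of duplex formation using ΔG° = RT ln(Kd). The 1 M standard state, the function name, and the dictionary layout are assumptions made for this example; they are not taken from the fitting procedure in Appendix D.

```python
import math

R_KCAL = 0.001987  # gas constant in kcal K^-1 mol^-1

# Kd values (nM) from Table 5-1, keyed by temperature in degrees Celsius
KD_NM = {27: 14.8, 32: 17.1, 37: 25.8, 42: 54.4, 47: 173.0}


def delta_g_binding(kd_nm, temp_c):
    """Free energy of duplex formation (kcal/mol) from a Kd given in nM.

    Assumes a 1 M standard state, so dG = RT ln(Kd / 1 M).
    """
    kd_molar = kd_nm * 1e-9
    return R_KCAL * (temp_c + 273.15) * math.log(kd_molar)


if __name__ == "__main__":
    for temp_c, kd in sorted(KD_NM.items()):
        dg = delta_g_binding(kd, temp_c)
        print(f"{temp_c} C: dG = {dg:.2f} kcal/mol")
```

A less negative value at higher temperature corresponds to the weaker binding noted in the text.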

Overall, these preliminary data suggest that a qPCR instrument can be used to obtain high-throughput thermodynamic data on small RNA duplexes. The next steps to validating this method are to verify that the thermodynamic data obtained using this high-throughput method agree with the data obtained using traditional methods, such as thermal denaturation. Then the method needs to be tested in solutions of increasing complexity.

The method can be further optimized to use fluorophores and quenchers that are thermostable, bright, and resistant to photobleaching.

5.2.3 Investigating if the 5’ Leader of tRNA drives native folding. The effects of the 5’ leader sequence on directing tRNA folding pathways are largely uninvestigated. To test effects of cotranscriptional folding without the 5’ leader sequence of tRNA, cotranscriptional intermediates of tRNA with varying lengths on the 3’ end were input to the CoFold webserver, an RNA folding algorithm that takes cotranscriptional folding into account. As described in Chapter 3, when the 5’ leader is included, the tRNA core folds on a native pathway, without forming strong misfolded structures (Figure 3-6). However, if the same procedure is performed without the 5’ leader, the core forms a remarkably stable misfolded structure, with a free energy of -15 kcal/mol, that has minimal native base pairing (Figure 5-5).

These preliminary data suggest that perhaps a biological role of the 5’ leader is to guide native tRNA folding to avoid stable misfolds by tying up the 5’ strand of the acceptor stem. Similar folding guides have been proposed computationally (9) and supported experimentally (10) for the HDV ribozyme. This needs to be further investigated both computationally with other tRNA sequences and experimentally on a subset of tRNA sequences. Computationally, CoFold could be used to test the effects of folding tRNAs with and without their native 5’ leader sequences. This analysis could help identify if 5’ leaders are generally guiding tRNA folding. Furthermore, sequence alignments can be used to look for covariation between putative guides and mature tRNA. Experimentally, native cotranscriptional structure formation and folding of tRNAs with and without 5’ leaders could be tested. Using native purifications of tRNA with and without 5’ leaders and structure probing, the population of RNA folded into the native state could be determined.

Figure 5-5. Cotranscriptional folding pathway of tRNAphe without the 5’ leader. Without the 5’ leader, a very stable misfolded structure is predicted to form between the 5’ portion of the acceptor stem, the D stem loop, and the anticodon stem loop.

5.2.4 Investigate the method of adaptation of RNAs to extreme conditions. In Chapter Four the mechanism for thermal adaptation in tRNAphe is investigated. To adapt to extreme temperatures, thermophilic tRNAs strengthen secondary structures while tertiary contacts are unmodified. In a cooperatively folding background, strengthening secondary structure will increase the stability of the native state. Transfer RNA has almost twenty tertiary contacts, but not all functional RNAs have such a complex structure. The mode of adaptation of other functional and nonfunctional RNAs should be investigated. Information content, a measure of conservation, can be calculated for each position in an RNA to determine which residues vary. Computational methods can be used to test the strength and GC content of RNA secondary structures to determine if regions are increasing in stability in organisms that grow at high temperature. Base pairs that strengthen are proposed to be functional either as isolated helices or as portions of larger functional RNAs with invariant tertiary structures. This analysis will provide powerful information on RNA evolution and adaptation.

5.2.5 Discover novel RNA Motifs using information content. In Chapter 4, we showed that tRNA adapts to extreme temperature by altering the strength of nucleotides involved in secondary structure, while nucleotides involved in tertiary structure are conserved. This analysis was performed using sequence alignments and calculating information content at each position in the RNA. This method could be used to find novel RNA motifs from transcriptome sequence alignments. Sliding windows that calculate average information content over 10-20 nucleotides can identify regions of high information content, and therefore high conservation. Regions of high information are associated with tertiary structure in functional RNAs, so this method might be a simple way to identify and discover RNA motifs.
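
The sliding-window idea can be sketched in a few lines of Python. This is a minimal illustration and not the analysis code used in this work: it assumes the common Shannon definition of information content for a four-letter alphabet (2 bits minus the column entropy), ignores gaps and ambiguous characters, and applies no small-sample correction. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def information_content(column, alphabet="ACGU"):
    """Information content (bits) of one alignment column: log2(4) - entropy."""
    counts = Counter(base for base in column if base in alphabet)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # column contains only gaps or unknown characters
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(len(alphabet)) - entropy


def sliding_window_information(alignment, window=10):
    """Average per-position information content over a sliding window."""
    columns = list(zip(*alignment))  # transpose sequences into columns
    per_position = [information_content(col) for col in columns]
    return [
        sum(per_position[i:i + window]) / window
        for i in range(len(per_position) - window + 1)
    ]


if __name__ == "__main__":
    # Toy alignment: conserved ends flanking a variable middle
    toy = ["GGGGAAAACCCC", "GGGGACGUCCCC", "GGGGCAUGCCCC", "GGGGUUCACCCC"]
    for start, value in enumerate(sliding_window_information(toy, window=4)):
        print(start, round(value, 3))
```

Windows with averages near 2 bits mark fully conserved stretches, while values near 0 mark positions where all four nucleotides are equally represented.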

5.2 References

1. Rouskin S, Zubradt M, Washietl S, Kellis M, & Weissman JS (2014) Genome-wide probing of RNA structure reveals active unfolding of mRNA structures in vivo. Nature 505(7485):701-705.

2. Ding Y, et al. (2014) In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features. Nature 505:696-700.

3. Leamy KA, Assmann SM, Mathews DH, & Bevilacqua PC (2016) Bridging the gap between in vitro and in vivo RNA folding. Q Rev Biophys 49:e10.

4. Silvers R, Keller H, Schwalbe H, & Hengesbach M (2015) Differential scanning fluorimetry for monitoring RNA stability. ChemBioChem 16:1109-1114.

5. Ballin JD, et al. (2007) Site-specific variations in RNA folding thermodynamics visualized by 2-aminopurine fluorescence. Biochemistry 46(49):13948-13960.

6. Tsourkas A, Behlke MA, Rose SD, & Bao G (2003) Hybridization kinetics and thermodynamics of molecular beacons. Nucleic Acids Res 31(4):1319-1330.

7. Liu B, Shankar N, & Turner DH (2010) Fluorescence competition assay measurements of free energy changes for RNA pseudoknots. Biochemistry 49(3):623-634.

8. Liu B, Diamond JM, Mathews DH, & Turner DH (2011) Fluorescence competition and optical melting measurements of RNA three-way multibranch loops provide a revised model for thermodynamic parameters. Biochemistry 50(5):640-653.

9. Isambert H & Siggia ED (2000) Modeling RNA folding paths with pseudoknots: Application to hepatitis delta virus ribozyme. Proc Natl Acad Sci U S A 97(12):6515-6520.

10. Brown TS, Chadalavada DM, & Bevilacqua PC (2004) Design of a highly reactive HDV ribozyme sequence uncovers a facilitation of RNA folding by alternative pairings and physiological ionic strength. J Mol Biol 341:695-712.

Appendix A

Supporting Information: Chapter 2

Published as a paper entitled: “Cooperative RNA Folding Under Cellular Conditions Arises from both Tertiary Structure Stabilization and Secondary Structure Destabilization” by Kathleen A. Leamy, Neela H. Yennawar, and Philip C. Bevilacqua in Biochemistry 2017, 56 (3422). [All experiments were carried out by K.A.L. SAXS data were collected and analyzed with the help of N.H.Y. Experiments were planned by K.A.L. and P.C.B.]

A.1 Materials and Methods

A.1.1 Chemicals. PEG200, PEG4000, PEG8000, and PEG20000, MgCl2, HEPES, and sodium cacodylate were purchased from Sigma. KCl was purchased from J. T. Baker. Calf intestinal phosphatase and polynucleotide kinase were purchased from NEB.

A.1.2 RNA Constructs and Preparation. Wild type (WT) tRNAphe was purchased from Sigma, purified on a 10% denaturing PAGE gel, and recovered by a crush and soak/ethanol precipitation procedure.

T7 full length (FL) tRNA was transcribed from a hemi-duplex DNA template from IDT (Coralville, IA) that was used without further purification. The T7 promoter binding site is underlined in the DNA template. The T7 promoter and DNA template were annealed at 95 °C for 3 min in 100 mM NaCl and cooled at room temperature for 10 min. The tRNAphe was transcribed using T7 polymerase in 40 mM Tris (pH 7.5), 25 mM MgCl2, 2 mM DTT, 1 mM spermidine, and 3 mM NTPs, with incubation at 37 °C for 4 h. The RNA was purified by 10% denaturing PAGE gel and recovered by a crush and soak/ethanol precipitation procedure.

T7 promoter: 5’d(TAATACGACTCACTATA)

FL tRNAphe DNA Template: 5’d(TGGTGCGAATTCTGTGGATCGAACACAGGACCTCCAGATCTTCAGTCTGGCGCTCTCCCAACTGAGCTAAATCCGCTATAGTGAGTCGTATTA)

FL tRNAphe: 5’GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCACCA
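
The relationship between the three sequences above can be checked with a short script: the top strand is the reverse complement of the template, the T7 promoter sits at its 5’ end, and the remainder, with T replaced by U, should equal the FL tRNAphe transcript. The sketch below is only a consistency check on the listed sequences; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

T7_PROMOTER = "TAATACGACTCACTATA"
TEMPLATE = (
    "TGGTGCGAATTCTGTGGATCGAACACAGGACCTCCAGATCTTCAGTCTGG"
    "CGCTCTCCCAACTGAGCT"
    "AAATCCGCTATAGTGAGTCGTATTA"
)
FL_TRNA = (
    "GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA"
    "UCCACAGAAUUCGCACCA"
)


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def expected_transcript(template, promoter):
    """RNA encoded downstream of the promoter on the strand opposite the template."""
    top_strand = reverse_complement(template)
    if not top_strand.startswith(promoter):
        raise ValueError("promoter not found at the 5' end of the top strand")
    return top_strand[len(promoter):].replace("T", "U")


if __name__ == "__main__":
    rna = expected_transcript(TEMPLATE, T7_PROMOTER)
    print(len(rna), rna == FL_TRNA)
```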

The four secondary structure RNA fragments derived from the arms of FL tRNA were synthesized and HPLC-purified by GE Dharmacon (Lafayette, CO), deblocked upon receiving, and dialyzed into 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl using a microdialysis system (Gibco-BRL Life Technologies).

Acceptor stem RNA: 5’ GCGGAUUUUUUUUUUAAUUCGC

D stem-loop RNA: 5’ GCUCAGUUGGGAGAGC

Anticodon stem-loop RNA: 5’ CCAGACUGAAGAUCUGG

TΨC stem-loop RNA: 5’ CUGUGUUCGAUCCACAG

A.1.3 Thermal Denaturation and Data Analysis. RNA was renatured by incubating at 95 °C for 3 min and cooled at room temperature over 10 min in the presence of KCl and sodium cacodylate (pH 7.0). Polyethylene glycol (PEG), the crowding agent, and/or MgCl2 was added to the RNA solution, and the sample was incubated at 55 °C for 3 min and allowed to cool at room temperature for 10 min. Prior to thermal denaturation experiments, the sample was spun down at 14,000 rpm for 5 min at 4 °C to remove air bubbles and particulates.

Thermal denaturation experiments were performed on a Gilford Response II spectrophotometer, with a data point collected every 0.5 °C and a heating rate of ~0.6 °C/min with absorbance detection at 260 nm. This method is referred to as ‘optical melting’. Single-transition melt data were fit to a two-state model using sloping baselines and analyzed using a Marquardt algorithm for nonlinear curve fitting in KaleidaGraph v. 4.5.0 (Synergy Software). Data were smoothed with an 11-point window prior to taking the derivative. Samples were 1.0 µM RNA in 10 mM sodium cacodylate (pH 7.0), 140 mM KCl, 0-2.0 mM MgCl2, and 0-40% (w/v) crowder. To analyze the unfolding of the four secondary structures as a whole and compare them to FL tRNA, raw absorbance data for the melting curves of each secondary structure were summed (eq 1), smoothed with 11-point smoothing, and then the derivative was taken.

$$A_{\text{Sum of SS}} = A_{\text{Acceptor Stem}} + A_{\text{D Stem-Loop}} + A_{\text{Anticodon Stem-Loop}} + A_{\text{T}\Psi\text{C Stem-Loop}} \qquad \text{(eq 1)}$$
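
The summing, smoothing, and differentiation steps can be sketched as follows. This is an illustration only: the smoothing routine in KaleidaGraph is not specified here, so a centered 11-point moving average is assumed, the derivative is taken by central differences, and all function names are hypothetical. The curves are assumed to share one temperature grid.

```python
def sum_curves(curves):
    """Sum absorbance curves point by point (eq 1)."""
    return [sum(values) for values in zip(*curves)]


def moving_average(values, window=11):
    """Centered moving average; edge points without a full window are dropped."""
    half = window // 2
    return [
        sum(values[i - half:i + half + 1]) / window
        for i in range(half, len(values) - half)
    ]


def first_derivative(temps, values):
    """Central-difference dA/dT at the interior points of the curve."""
    return [
        (values[i + 1] - values[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(values) - 1)
    ]


if __name__ == "__main__":
    # Synthetic placeholder curves on a 0.5 degree grid, not measured data
    temps = [20.0 + 0.5 * i for i in range(30)]
    fragment_a = [0.010 * t for t in temps]
    fragment_b = [0.020 * t for t in temps]
    total = sum_curves([fragment_a, fragment_b])
    smoothed = moving_average(total, window=11)
    # Smoothing trims 5 points from each end, so trim the temperature axis too
    print(first_derivative(temps[5:-5], smoothed)[:3])
```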

These processed data are referred to as the “sum of the secondary structure fragments” or “SSS”. The melting temperature of each RNA transition acquired by optical melting in each condition was found using a non-linear Marquardt algorithm in KaleidaGraph (1) using equation 2, the derivation for which can be found in section A.1.4.

$$f(T) = \frac{(m_u T + b_u) + (m_f T + b_f)\, e^{\left[\frac{\Delta H}{R}\right]\left[\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right]}}{1 + e^{\left[\frac{\Delta H}{R}\right]\left[\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right]}} \qquad \text{(eq. 2)}$$

where mu and mf are the slopes of the upper (unfolded) and lower (folded) baselines, bu and bf are the y-intercepts of the upper and lower baselines, ΔH is the enthalpy of folding in kcal mol-1, and TM is the melting temperature in degrees Celsius. R is the gas constant, 0.001987 kcal K-1 mol-1.
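
For readers who want to evaluate equation 2 directly, a minimal Python version is sketched below. It is not the KaleidaGraph or IgorPro fitting routine; the function name and the parameter values in the demonstration are placeholders. With the equation written this way, a negative ΔH returns the lower (folded) baseline at low temperature and the upper (unfolded) baseline at high temperature.

```python
import math

R_KCAL = 0.001987  # gas constant in kcal K^-1 mol^-1


def two_state_absorbance(temp_c, delta_h, tm_c, m_u, b_u, m_f, b_f):
    """Evaluate equation 2 at temp_c (degrees Celsius).

    delta_h is in kcal/mol and tm_c in degrees Celsius; the baselines are
    upper = m_u * T + b_u and lower = m_f * T + b_f.
    """
    exponent = (delta_h / R_KCAL) * (
        1.0 / (tm_c + 273.15) - 1.0 / (temp_c + 273.15)
    )
    k = math.exp(exponent)  # may overflow for extreme parameter values
    upper = m_u * temp_c + b_u
    lower = m_f * temp_c + b_f
    return (upper + lower * k) / (1.0 + k)


if __name__ == "__main__":
    # Placeholder parameters chosen only to show the shape of the curve
    for temp in (20.0, 60.0, 95.0):
        value = two_state_absorbance(temp, -60.0, 60.0, 0.0, 1.2, 0.0, 1.0)
        print(temp, round(value, 4))
```

At T = TM the exponential equals 1, so the function returns the mean of the two baselines. A nonlinear least-squares routine can then adjust the six parameters against measured absorbance.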

A.1.4 Derivation of Thermal Denaturation Data Fitting. We assume that there are two states in solution, folded, F, and unfolded, U. At low temperatures the RNA is folded so we are measuring the absorbance of the folded state, AF, and at high temperatures the RNA is unfolded so we are measuring the absorbance of the unfolded state, AU. The absorbance measured, A, which is a function of temperature, depends on the fraction of RNA in the folded and unfolded states.

$$A(T) = A_U f_U + A_F f_F \qquad \text{(eq. 2.1)}$$

The absorbance of the folded state can be defined by the y-intercept, bF, and slope, mF, of the linear lower baseline. The absorbance of the unfolded state can be defined in a similar manner, where bU and mU are the y-intercept and slope of the linear upper baseline.

$$A_F(T) = m_F T + b_F \qquad \text{(eq. 2.2)} \qquad \text{and} \qquad A_U(T) = m_U T + b_U \qquad \text{(eq. 2.3)}$$

Using the equilibrium constant, K, for the two-state system the terms for fraction folded and unfolded can be defined.

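$$F \rightleftharpoons U, \qquad K = \frac{[F]}{[U]} \qquad \text{(eq. 2.4)} \qquad \text{and} \qquad f_U = \frac{[U]}{[F] + [U]} \qquad \text{(eq. 2.5)}$$

We will consider U to be the reference state and divide the numerator and denominator of fU by [U]; then, substituting in the equilibrium constant, we get

$$f_U = \frac{1}{1 + K} \qquad \text{(eq. 2.6)}$$

We can define the partition function, Q, as

$$Q = 1 + K \qquad \text{(eq. 2.7)}$$

The equilibrium constant is a function of temperature, and using the van’t Hoff equation it can be written in terms of the gas constant, R, and the enthalpy of folding, ΔH,

$$\frac{\partial \ln K}{\partial (1/T)} = \frac{-\Delta H^\circ}{R} \qquad \text{(eq. 2.8)}$$

The above equation can be integrated from 1/TM to 1/T. K = 1 at the TM, where the fractions of folded and unfolded RNA are equal.

$$\ln K = \frac{\Delta H^\circ}{R}\left(\frac{1}{T_M} - \frac{1}{T}\right) \qquad \text{(eq. 2.9)} \qquad \text{or} \qquad K = \exp\left[\frac{\Delta H^\circ}{R}\left(\frac{1}{T_M} - \frac{1}{T}\right)\right] \qquad \text{(eq. 2.10)}$$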

To get eq.2 to fit melting data, the above derived equations can be substituted into equation 2.1.
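
The substitution can be written out explicitly. Because the two fractions sum to one, eq. 2.6 gives fF = 1 - fU = K/(1 + K). Inserting this together with eqs. 2.2, 2.3, and 2.10 into eq. 2.1, and converting temperatures from degrees Celsius to kelvin by adding 273.15, recovers the form of eq. 2:

```latex
\begin{aligned}
A(T) &= A_U f_U + A_F f_F = \frac{A_U + A_F K}{1 + K} \\
     &= \frac{(m_U T + b_U) + (m_F T + b_F)\,
        \exp\!\left[\frac{\Delta H^\circ}{R}
        \left(\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right)\right]}
       {1 + \exp\!\left[\frac{\Delta H^\circ}{R}
        \left(\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right)\right]}
\end{aligned}
```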

A.1.5 Temperature-Dependent In-line Probing and Data Analysis. The triphosphate on the 5’ end of FL tRNA was removed by incubation with calf intestinal phosphatase (CIP) (NEB) at 37 °C for 20 min, and the RNA was recovered by phenol/chloroform extraction and ethanol precipitation. This RNA was labeled on its 5’ end using [γ-32P]ATP and polynucleotide kinase (NEB), with incubation at 37 °C for 30 min. Labeled RNA was purified by 10% PAGE followed by a crush and soak and ethanol precipitation procedure.

In-line probing (ILP) experiments on FL tRNA were performed at 12 temperatures between 35 and 75 °C using a Biometra Tgradient thermocycler as an incubator. Before beginning ILP experiments, 1 µL of 500,000 cpm/µL 32P-labeled RNA was renatured with Tris buffer (pH 8.3) and KCl for 1 min at 95 °C and cooled at room temperature for 10 min. Subsequently, MgCl2 and/or crowder were added, and the sample was heated at 55 °C for 1 min and cooled at room temperature for 10 min. Final sample conditions were 20 mM Tris (pH 8.3), 140 mM KCl, 0.5 or 2.0 mM MgCl2, and 0 or 20% crowder; the slightly elevated pH aids RNA self-cleavage without alkaline denaturation of the sample. The RNA was then incubated in the thermocycler at the appropriate temperature and aliquots were removed at a specified time. To achieve relatively even degradation across different temperatures, shorter time points were used for higher temperatures. At 35.0 °C, 38.6 °C, and 42.2 °C, the 36 h time point was analyzed; at 45.8 °C, 49.5 °C, and 53.2 °C, the 24 h time point was analyzed; at 56.8 °C and 60.4 °C, the 5 h time point was analyzed; at 64.0 °C and 67.7 °C, the 3 h time point was analyzed; and at 71.4 °C and 75.0 °C, the 1 h time point was analyzed. Time points were quenched with 2X formamide loading dye containing 50 mM Tris (pH 7.0) and 20 mM EDTA, which lowered the pH and sequestered Mg2+ ions. RNA was fractionated on 10% PAGE gels, which were visualized using a PhosphorImager.

Gel data were evaluated by semi-automated footprinting analysis (SAFA) software (2) to provide reactivities of individual bands. The ILP reactivities output from SAFA were analyzed for percent reacted and corrected for loading differences by normalizing the raw data to nucleotides 34-36 in the anticodon loop, which were single stranded at all temperatures tested. Normalized ILP data were fit to the two-state melting model (eq. 2, above) (1) using a Marquardt algorithm for non-linear curve fitting, but were fit in IgorPro to allow the option of global fitting. When non-cooperative folding of FL tRNA was observed by ILP (e.g. in buffer with 0.5 mM Mg2+), non-global fitting of the nucleotide traces of each secondary structure was performed to obtain separate TM values. However, when cooperative folding of FL tRNA was observed by ILP (e.g. in solutions of 20% PEG200 with 0.5 and 2.0 mM Mg2+ and buffer with 2.0 mM Mg2+), global fitting of the base-paired nucleotides was performed to obtain a single TM for either a given helix or for FL tRNA unfolding, while slopes, y-intercepts, and ΔH were still fit separately for each nucleotide.

A.1.5 SAXS Data Collection. FL tRNA was purified and precipitated as described above. RNA was buffer exchanged into 1X SAXS buffer (25 mM HEPES, pH 7.5 and 140 mM KCl) using an Amicon ultracentrifugal filter (3 kDa molecular weight cutoff). Stock solutions of PEG8000 and MgCl2 were prepared in the same 1X SAXS buffer to ensure buffer matching. Prior to data collection, the RNA was renatured in the presence of HEPES (pH 7.5) by heating at 95 °C for 3 min and cooling at room temperature. After cooling, Mg2+ and PEG8000 were added to the solution, which was heated at 55 °C for 3 min and cooled at room temperature for 10 min. Samples were centrifuged at 14,000 rpm for 10 min to minimize aggregation and remove dust particles. SAXS data were collected on the G1 station at MacCHESS (3, 4), the solution scattering beamline at the Cornell High Energy Synchrotron Source (CHESS). The detector for data recording was a dual 100K-S SAXS/WAXS detector (Pilatus). The sample capillary-to-detector distance setup allowed for simultaneous collection of small- and wide-angle scattering data, covering a broad momentum-transfer range (q range) of 0.0075-0.8 Å−1 (q = 4π sin(θ)/λ, where 2θ is the scattering angle). The energy of the X-ray beam was 9.8528 keV (1.2548 Å) and the synchrotron X-ray beam diameter was 250 µm × 250 µm.

Data were collected for SAXS either by in-line size-exclusion chromatography (SEC) or hand mixing. SEC helps determine if aggregates are present at SAXS concentrations. The SEC SAXS data were collected on FL tRNA in buffer with 0.5 mM Mg2+ at 4 °C. Samples were injected into a Shodex KW420.5-4F size-exclusion column using a GE HPLC (AKTApurifier) that routed sample through the column and into the BioSAXS flow cell where scattering images were collected. Sample flowed at a rate of 0.15 mL/min, and each frame was collected with 2 sec exposures. RNA elution was monitored by UV-vis detection in line with scattering detection. SEC-SAXS was not performed on PEG8000 samples because the X-ray beam causes the polymer to aggregate on the capillary tube walls and the viscous solution would result in high column pressure (5). No aggregation was observed for RNA injected at 0.2 mg/mL, which is diluted by ~10-fold during the SEC run.

For non-SEC samples, plugs of ~40-45 µL were delivered by hand to the quartz capillary tube, as the MacCHESS robot had trouble pipetting viscous samples with PEG. To test for aggregation in hand-loaded experiments, samples were prepared at 0.1, 0.15, 0.2, and 0.4 mg/mL RNA and 0, 0.5, and 2.0 mM Mg2+ with and without 20% PEG8000, and a buffer scattering curve was measured before and after each sample set. To ensure absence of PEG aggregation on the sample cell, buffer scans before and after sample acquisition were checked for matching. As with the SEC SAXS samples, only RNA concentrations of 0.2 mg/mL or lower are reported herein due to RNA aggregation.

To reduce radiation damage and known X-ray induced aggregation of PEG, a computer-controlled syringe pump was used to keep the hand-loaded sample oscillating in the X-ray beam. Ten scattering images were collected per sample. In the absence of PEG, we collected 1 sec exposures, while in the presence of PEG the exposure time was reduced to 0.5 sec and the oscillating rate was increased. In addition, after each PEG-containing sample was exposed to the X-ray beam, the sample cell was washed with several mL of Hellmanex III solution, rinsed with several mL of water and ethanol, and dried with forced air.

A.1.6 SAXS Data Analysis. Scattering curves were analyzed with BioXTAS RAW software (6). The collected scattering images were examined for signs of X-ray damage, which manifested itself as increases in signal at low q-range, and images containing non-damaged samples were averaged. The scattering curves of the buffer were subtracted from the scattering curves of the RNA. The linear region of the ln(I) vs q² plot, wherein qmaxRg < 1.3, was identified for each sample scattering curve using the Guinier analysis, and the radius of gyration (Rg) and molecular weight of the sample were determined. The pairwise distribution function in GNOM (7), using the ATSAS software package (8), was used to determine Rg using the Porod approximation and the maximum particle dimension (Dmax).
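
The Guinier step can be illustrated with a short sketch. In the Guinier approximation ln I(q) = ln I(0) - (Rg²/3)q², so a straight-line fit of ln(I) against q² yields Rg from the slope. The code below is an assumption-laden illustration and not the algorithm used by BioXTAS RAW: it starts from the full curve, drops the highest-q point until qmaxRg < 1.3, and uses an unweighted least-squares line. The function names and the synthetic curve (with an arbitrary Rg of 24 Å) are hypothetical.

```python
import math


def linear_fit(x, y):
    """Unweighted least-squares slope and intercept of y against x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def guinier_fit(q, intensity, limit=1.3, min_points=5):
    """Return (Rg, I0, points used) from ln(I) vs q^2 with q_max * Rg < limit.

    Assumes q is sorted in ascending order and all intensities are positive.
    """
    x = [qi ** 2 for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(q)
    while n >= min_points:
        slope, intercept = linear_fit(x[:n], y[:n])
        if slope < 0:
            rg = math.sqrt(-3.0 * slope)
            if q[n - 1] * rg < limit:
                return rg, math.exp(intercept), n
        n -= 1  # drop the highest-q point and try again
    raise ValueError("no range satisfies the Guinier criterion")


if __name__ == "__main__":
    # Synthetic, noise-free Guinier curve; the values are arbitrary
    q = [0.0075 + 0.0025 * i for i in range(40)]
    intensity = [100.0 * math.exp(-(qi * 24.0) ** 2 / 3.0) for qi in q]
    rg, i0, n_used = guinier_fit(q, intensity)
    print(round(rg, 3), round(i0, 3), n_used)
```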

Bead models were created by putting the GNOM output file into the online server for DAMMIF (9). Twenty individual DAMMIF bead models were created per reaction condition and DAMCLUST (8) was used to cluster individual DAMMIF models, which resulted in 2-3 groups of similar models that were used to assess the ambiguity of the reconstructions. For all experimental conditions, the DAMCLUST bead models were very similar to each other, so a consensus model was calculated by averaging all the DAMMIF models using DAMAVER (10). The output bead model from DAMAVER was aligned with the crystal structure of tRNAphe (PDB 1ehz) using SUPCOMB (8) and compared in PyMOL (11). As an additional check on our experimental bead models, we generated a theoretical bead model directly from a tRNA crystal structure (PDB 1ehz), which was done by first generating a theoretical scattering curve from the crystal structure using FoXS software and then converting that scattering curve into a bead model. We then compared this theoretical bead model and its Rg and Dmax values with our experimental data.

A.2 SUPPLEMENTAL TABLES AND FIGURES

Table A-1. Melting temperatures of T7 tRNAphe, its individual HF, and the SSS in the background of 0 mM Mg2+ derived from optical melting.

Melting Temperature (°C) in 0 mM Mg2+

Additive        FL T7 tRNAphe   Acceptor Stem   D SL    Anticodon SL   TΨC SL   Sum of SS
None            53.2            56.8            60.5    63.0           58.3     59.6
20% PEG200      52.8            52.9            58.8    61.7           53.4     55.6
40% PEG200      50.5            45.5            44.0    60.0           44.0     48.4
20% PEG4000     45.5            57.8            64.9    66.7           56.8     61.3
40% PEG4000     64.0            59.3            64.6    67.6           60.0     62.2
20% PEG8000     66.0            68.1            69.0    72.5           72.4     70.4
40% PEG8000     54.0            57.0            51.0    68.8           60.5     58.9
20% PEG20000    --              60.5            61.5    57.0           57.0     57.0

Samples contain a background of 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl.

Table A-2. Melting temperatures of T7 tRNAphe, its individual HF, and the SSS in the background of 0.5 mM Mg2+ derived from optical melting.

Melting Temperature (°C) in 0.5 mM Mg2+

Additive        FL T7 tRNAphe   Acceptor Stem   D SL    Anticodon SL   TΨC SL   Sum of SS
None            57.8 57.1 (1) 60.5 67.8 59.6 63.4 (2) 67.8
20% PEG200      57.0            52.3            63.5    63.5           55.3     56.4
40% PEG200      59.1            44.6            51.5    55.6           50.3     49.3
20% PEG4000     63.0            57.7            78.0    67.1           57.2     58.2
40% PEG4000     65.5            59.3            64.6    67.7           61.0     62.3
20% PEG8000     64.0            59.0            55.1    67.6           57.4     58.4
40% PEG8000     66.0            57.2            54.2    66.7           53.6     57.7
20% PEG20000    64.0            58.5            --      62.0           58.5     58.5

All samples contain a background of 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl.

Table A-3. Melting temperatures of T7 tRNAphe, its individual HF, and the SSS in the background of 2.0 mM Mg2+ derived from optical melting.

Melting Temperature (°C) in 2.0 mM Mg2+

Additive        FL T7 tRNAphe   Acceptor Stem   D SL    Anticodon SL   TΨC SL   Sum of SS
None            65.1            59.7            66.5    69.7           62.0     63.4
20% PEG200      64.8            55.8            61.6    66.1           58.2     59.3
40% PEG200      64.1            50.4            58.4    54.6           61.4     51.9
20% PEG4000     69.5            59.3            64.6    67.6           61.9     62.3
40% PEG4000     87.5            58.0            52.5    67.6           61.0     62.3
20% PEG8000     69.0            59.8            62.7    70.1           61.1     64.3
40% PEG8000     64.0            57.6            59.5    66.5           61.0     60.6
20% PEG20000    70.0            58.0            --      69.0           58.5     62.5

All samples contain a background of 10 mM sodium cacodylate (pH 7.0) and 140 mM KCl.

Table A-4. Evaluation of SAXS data fitting using FoXS and SUPCOMB.

SAXS Sample                    RMSD (Å)
Buffer with 0 mM Mg2+          6.56
Buffer with 0.5 mM Mg2+        4.13
Buffer with 2.0 mM Mg2+        3.30
20% PEG with 0.5 mM Mg2+       3.55
20% PEG with 2.0 mM Mg2+       2.75

The RMSD was found using the SUPCOMB alignments of the DAMAVER envelopes to the tRNA crystal structure.

Figure A-1. Secondary and tertiary structures of FL and WT tRNAphe. Tertiary contacts, yellow lines, are superimposed on (left) FL tRNAphe and (right) WT tRNAphe, which contains modifications (black bases). Coloring as per Figure 2-1.

Figure A-2. FL transcribed tRNA and WT tRNA behave in a similar manner in buffer and crowded conditions. In (left) buffer and (right) 20% PEG8000 FL tRNA (closed circles) and WT modified tRNA (open circles) have similar folding transitions at physiological concentrations of Mg2+.

Figure A-3. Temperature-dependent in-line probing (dT-ILP) PAGE gel of FL tRNAphe in buffer and 20% PEG200 with a background of 10 mM sodium cacodylate, 140 mM KCl, and 0.5 mM Mg2+. Guanosines on the T1 ladder are marked with a pink dot, and the regions of the gel that contain nucleotides in the 3’ strand of the acceptor stem, the D loop, anticodon loop, variable loop, and TΨC loop are noted. The colors of those regions match the colors in Fig. 2-1. At 35.0 °C, 38.6 °C, and 42.2 °C the 36 h time point was analyzed; at 45.8 °C, 49.5 °C, and 53.2 °C the 24 h time point was analyzed; at 56.8 °C and 60.4 °C the 5 h time point was analyzed; at 64.0 °C and 67.7 °C the 3 h time point was analyzed; and at 71.4 °C and 75.0 °C the 1 h time point was analyzed.

Figure A-4. Temperature-dependent in-line probing (dT-ILP) PAGE gels of FL tRNAphe in buffer and 20% PEG200 with a background of 10 mM sodium cacodylate, 140 mM KCl, and 2.0 mM Mg2+. Guanosines on the T1 ladder are marked with a pink dot, and the regions of the gel that contain nucleotides in the 3’ strand of the acceptor stem, the D loop, anticodon loop, variable loop, and TΨC loop are noted. The colors of those regions match the colors in Fig. 2-1. At 35.0 °C, 38.6 °C, and 42.2 °C the 36 h time point was analyzed; at 45.8 °C, 49.5 °C, and 53.2 °C the 24 h time point was analyzed; at 56.8 °C and 60.4 °C the 5 h time point was analyzed; at 64.0 °C and 67.7 °C the 3 h time point was analyzed; and at 71.4 °C and 75.0 °C the 1 h time point was analyzed.

Figure A-5. Helical stem fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 2.0 mM Mg2+. Helical fits were performed on buffer and 20% PEG200 samples to obtain a TM for unfolding of each stem in (A) D SL, (B) AC SL, and (C) TΨC SL. The TM values and the residuals of the fits are provided in each figure and in Table 2-1.

Figure A-6. Global fitting of temperature-dependent ILP data in buffer and 20% PEG200 with 2.0 mM Mg2+. The same ILP data from Figure A-5 were fit globally across each stem to obtain a single TM for the RNA and to look for two-state behavior. The TM values and the residuals from the fits are provided in each figure and in Table 2-1.

Figure A-7. UV-vis detection and scattering intensity of in-line SEC SAXS of FL tRNA in buffer with 0.5 mM Mg2+. (Left) Absorbance detection at 260 nm of size-exclusion traces and (Right) integrated scattering intensity of FL tRNA in buffer with 0.5 mM Mg2+ at 0.2 (pink), 0.4 (blue), and 0.6 mg/mL (black). The molecular weight of both peaks as determined from the integrated intensity in BioXTAS RAW is labeled. The left peak has a higher molecular weight (~50 kDa) and is attributed to the formation of a dimer, and the right peak has a lower molecular weight (~27 kDa) very close to that expected of the monomer (estimated 25 kDa). In the 0.2 mg/mL curves, trace amounts of the dimer peak are detected by absorbance but not by scattering. The curves of 0.6 mg/mL RNA (black) were offset to align with the other curves. During the 0.6 mg/mL experiment a bubble was found in the SEC line and the beam had to be turned off; therefore the time of exposure and volume eluted were offset.

Figure A-8. Small angle X-ray scattering data fitting of FL tRNA in buffer and 20% PEG8000 in the background of low, physiological Mg2+. (A) Scattering curves, (B) Kratky plots, and (C) Porod plots of FL tRNA in buffer with 0 (blue), 0.5 (light yellow), and 2.0 mM Mg2+ (red), and in 20% PEG8000 with 0.5 (dark yellow) and 2.0 mM Mg2+ (maroon).

Figure A-9. Comparison of experimental SAXS scattering curves and the theoretical scattering curve of a tRNAphe crystal structure (PDB ID: 1ehz) generated with FoXS. (A-C) Overlay of experimental scattering curves with the theoretical tRNAphe scattering curve generated with FoXS in (A) 0, (B) 0.5, and (C) 2.0 mM Mg2+ with and without PEG8000. (D) DAMAVER bead model generated from the tRNAphe theoretical scattering curve overlaid with the crystal structure. The χ2 values of the FoXS fits are provided on each plot.

A.3 References

1. Siegfried NA & Bevilacqua PC (2009) Thinking inside the box: Designing, implementing, and interpreting thermodynamic cycles to dissect cooperativity in RNA and DNA folding. Methods Enzymol 455:365-393.

2. Das R, Laederach A, Pearlman SM, Herschlag D, & Altman RB (2005) SAFA: Semi-automated footprinting analysis software for high-throughput quantification of nucleic acid footprinting experiments. RNA 11:344-354.

3. Acerbo AS, Cook MJ, & Gillilan RE (2015) Upgrade of MacCHESS facility for X-ray scattering of biological macromolecules in solution. J Synchrotron Radiat 22(1):180-186.

4. Skou S, Gillilan RE, & Ando N (2014) Synchrotron-based small-angle X-ray scattering (SAXS) of proteins in solution. Nat Protoc 9(7):1727-1739.

5. Kilburn D, Roh JH, Behrouzi R, Briber RM, & Woodson SA (2013) Crowders perturb the entropy of RNA energy landscapes to favor folding. J Am Chem Soc 135:10055-10063.

6. Nielsen SS, et al. (2009) BioXTAS RAW, a software program for high-throughput automated small-angle X-ray scattering data reduction and preliminary analysis. J Appl Crystallogr 42(5):959-964.

7. Svergun D (1992) Determination of the regularization parameter in indirect-transform methods using perceptual criteria. J Appl Crystallogr 25(4):495-503.

8. Petoukhov MV, et al. (2012) New developments in the ATSAS program package for small-angle scattering data analysis. J Appl Crystallogr 45(2):342-350.

9. Franke D & Svergun DI (2009) DAMMIF, a program for rapid ab-initio shape determination in small-angle scattering. J Appl Crystallogr 42(2):342-346.

10. Volkov VV & Svergun DI (2003) Uniqueness of ab initio shape determination in small-angle scattering. J Appl Crystallogr 36:860-864.

11. DeLano WL (2002) The PyMOL molecular graphics system.

Appendix B

Supporting Information: Chapter 3

Under Revision as a paper entitled: “Cotranscriptional Folding of RNA is Cooperative Under Physiological Conditions” by Kathleen A. Leamy, Neela H. Yennawar, and Philip C. Bevilacqua [All experiments were carried out by K.A.L. SAXS data were collected and analyzed with the help of N.H.Y. Experiments were planned by K.A.L. and P.C.B.]

B.1 Materials and Methods

B.1.1 Chemicals. PEG8000, HEPES, and sodium cacodylate were purchased from Sigma.

KCl was purchased from J. T. Baker. Calf intestinal phosphatase and polynucleotide kinase were purchased from NEB.

B.1.2 RNA Constructs and Preparation. Full-length (FL) tRNAphe, I73, I72, I69, and I65 cotranscriptional intermediates were transcribed from a hemi-duplex DNA template purchased from IDT that was used without further purification. The T7 promoter and DNA template were annealed by heating to 95 °C for 3 min in 100 mM NaCl and cooling at room temperature for 10 min.

The 5’ leader and the precursor constructs were transcribed from duplex templates that were made by PCR amplification of double-stranded gBlocks purchased from IDT. The 5’ nucleotides of these RNA sequences were suboptimal for transcription, so a hammerhead ribozyme was engineered on the 5’ end of the RNA (1). The hammerhead ribozyme cleaved cotranscriptionally, and the cleaved 5’ leader or precursor product was purified. To amplify the gBlocks, the template (20 ng) was mixed with the top strand and bottom strand primers (0.5 µM each), dNTPs (0.5 mM each), and Q5 Polymerase, incubated at 98 °C for 30 sec, annealed at 64 °C for 30 min, and extended at 72 °C for 1 min. This process was repeated for 25 cycles, and the PCR product was verified on a 1.5% agarose gel.

All RNAs were transcribed using T7 polymerase in 40 mM Tris (pH 7.5), 25 mM MgCl2, 2 mM DTT, 1 mM spermidine, and 3 mM NTPs, with incubation at 37 °C for 4 hours.

The RNA was purified on a 10% PAGE gel followed by a crush and soak/ethanol precipitation procedure. The RNA was buffer exchanged into 10 mM sodium cacodylate using an Amicon ultracentrifugal filter (3 kDa molecular weight cutoff).

FL tRNAphe: 5′GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA UCCACAGAAUUCGCACCA

Intermediate 73: 5’GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA UCCACAGAAUUCGCA

Intermediate 72: 5’GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA UCCACAGAAUUCGC

Intermediate 71: 5’GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA UCCACAGAAUUCG

Intermediate 69: 5’GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA UCCACAGAAUU

Intermediate 65: 5’GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA UCCACAG

5’ Leader: 5’GGGAGAUGGACUUUUACCUGAUGAGGCCGAAAGGCCGAAACUCCACGAAAGUGGAGUAG UAAAAGUCCAUUAGUUGUAAGCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAU CUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCA

Precursor: 5’GGGAGAUGGACUUUUACCUGAUGAGGCCGAAAGGCCGAAACUCCACGAAAGUGGAGUAG UAAAAGUCCAUUAGUUGUAAGCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAU CUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCAUUCCUUUUUGCUAGCAUUCU

The underlined region is base pairing with the hammerhead region and the italic region is the hammerhead ribozyme.
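The cotranscriptional intermediates listed above correspond to 3′ truncations of the FL tRNAphe sequence. The following is a minimal Python sketch of that relationship; the helper name `intermediate` is hypothetical and is only meant to illustrate how the constructs relate to one another.

```python
# Full-length yeast tRNAphe transcript (76 nt), copied from the list above.
FL = ("GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA"
      "UCCACAGAAUUCGCACCA")


def intermediate(length: int, full_length: str = FL) -> str:
    """Return the 5'-proximal `length` nucleotides of the transcript.

    Hypothetical helper: models a cotranscriptional intermediate as the
    full-length sequence truncated from the 3' end.
    """
    if not 0 < length <= len(full_length):
        raise ValueError("length must be between 1 and the full length")
    return full_length[:length]


# Print the last ten nucleotides of each intermediate named in the text.
for n in (73, 72, 71, 69, 65):
    print(f"I{n}: ...{intermediate(n)[-10:]}")
```

Under this reading, I73 lacks only the 3′ CCA and I65 ends inside the 3′ strand of the TΨC stem.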

B.1.3 Thermal Denaturation and Data Analysis. RNA was denatured at 95 °C for 3 min and annealed at room temperature for 10 min in the presence of KCl and sodium cacodylate (pH 7.0). MgCl2 and/or polyethylene glycol (PEG) 8000 and/or amino acids were added to the RNA solution, and the sample was heated at 55 °C for 3 min and cooled at room temperature for 10 min. Samples were spun down at 14,000 rpm for 5 min at 4 °C to remove air bubbles and particulates. Samples were at a final concentration of 0.5 μM RNA in 10 mM sodium cacodylate (pH 7.0), 140 mM KCl, 0.5-2.0 mM MgCl2, 0-20% (w/v) PEG8000, and 0 or 106 mM amino acids. Samples with amino acids were prepared as previously described (2).

Thermal denaturation experiments were performed on an HP 8452 diode-array refurbished by OLIS, Inc with a data point collected every 0.5 °C with absorbance detection from 230-330 nm. This method is referred to as “optical melting”. When observed, single-transition melt data were fit using a two-state model using sloping baselines and a nonlinear Marquardt algorithm in KaleidaGraph using Eq. 2 (the derivation can be found in section A.1.4), where mu and mf are the slopes of the unfolded and folded baselines, bu and bf are the y-intercepts of the unfolded and folded baselines, ΔH is the enthalpy of folding, and TM is the melting temperature in degrees Celsius. R is the gas constant of 0.001987 kcal K-1 mol-1. Data were smoothed with an 11-point window prior to the derivative being taken.

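$$f(T) = \frac{(m_u T + b_u) + (m_f T + b_f)\, e^{\left[\frac{\Delta H}{R}\right]\left[\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right]}}{1 + e^{\left[\frac{\Delta H}{R}\right]\left[\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right]}} \qquad \text{(Eq. 2)}$$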
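As an illustration of how the two-state model with sloping baselines behaves, the sketch below evaluates Eq. 2 in Python. The function name and the baseline values are hypothetical; the ΔH and TM used in the checks are the buffer values for FL tRNA with 0.5 mM Mg2+ from Table B-1. The fits themselves were performed in KaleidaGraph, so this is not the fitting code.

```python
import math

R = 0.001987  # gas constant, kcal K^-1 mol^-1


def two_state_absorbance(T, dH, Tm, mu, bu, mf, bf):
    """Evaluate the two-state melt model at temperature T (degrees Celsius).

    dH is the enthalpy of folding (kcal/mol, negative for folding) and
    Tm the melting temperature (degrees Celsius). (mu, bu) and (mf, bf)
    are the slope and y-intercept of the unfolded and folded baselines.
    """
    exponent = (dH / R) * (1.0 / (Tm + 273.15) - 1.0 / (T + 273.15))
    K = math.exp(exponent)  # weight of the folded state
    return ((mu * T + bu) + (mf * T + bf) * K) / (1.0 + K)


# Hypothetical flat baselines: folded absorbance 1.0, unfolded 1.2.
for T in (20.0, 58.2, 95.0):
    A = two_state_absorbance(T, dH=-47.4, Tm=58.2, mu=0.0, bu=1.2, mf=0.0, bf=1.0)
    print(f"{T:5.1f} C  A = {A:.4f}")
```

At T = TM the exponential term equals 1, so the model returns the mean of the two baselines; well below TM it approaches the folded baseline and well above it the unfolded baseline.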

B.1.4 In-line Probing. The triphosphate on the 5’ end of the T7 transcribed RNAs was removed by incubation with calf intestinal phosphatase at 37 °C for 20 min, and the RNA was recovered by phenol/chloroform extraction and ethanol precipitation. This RNA was labeled on the 5’ end with [γ-32P]ATP using polynucleotide kinase at 37 °C for 20 min. Labeled RNA was purified on a 10% PAGE gel followed by a crush and soak/ethanol precipitation procedure.

One µL of 500,000 cpm/µL 32P-labeled RNA was renatured in the presence of 140 mM KCl by heating at 95 °C for 1 min and cooling at room temperature for 10 min. Tris (pH 8.3), MgCl2, PEG8000, and amino acids were then added to the solution, and the sample was heated at 55 °C for 1 min and cooled at room temperature for 10 min. Final sample conditions were 20 mM Tris (pH 8.3), 140 mM KCl, 0.5-2.0 mM MgCl2, 0-20% PEG8000 (w/v), and 0 or 106 mM amino acids. For single temperature ILP, the RNA was incubated at 37 °C and aliquots were removed and quenched with 2X formamide loading dye containing 50 mM Tris (pH 7.0) and 20 mM EDTA at 12, 24, 36, and 48 h. The 36 h time points were fractionated on 10% PAGE gels, which were visualized using a phosphorimager. For variable temperature ILP, time points were analyzed as follows: at 35.0 °C, 38.6 °C, and 42.2 °C, the 36 h time point was analyzed; at 45.8 °C, 49.5 °C, and 53.2 °C, the 24 h time point was analyzed; at 56.8 °C and 60.4 °C, the 5 h time point was analyzed; at 64.0 °C and 67.7 °C, the 3 h time point was analyzed; and at 71.4 °C and 75.0 °C, the 1 h time point was analyzed. These were quenched as above with 2X formamide loading dye containing 50 mM Tris (pH 7.0) and 20 mM EDTA, and fractionated on a 10% gel, which was visualized with a phosphorimager.

Gel data were analyzed by semiautomated footprinting analysis (SAFA) software to obtain intensities of individual bands. The ILP intensities output from SAFA were normalized to nucleotides 33-35 in the anticodon loop, which were single stranded in each sample. Notably, the 15 nucleotides closest to the 5’ end were not analyzed in the presence of amino acids because of interference from high salt concentrations.

Normalized ILP intensity in stem regions was globally fit to a two-state unfolding model (equation 1) for a single TM and ΔH of folding in Igor Pro.
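The normalization step described above can be sketched as follows. This is a minimal illustration that assumes "normalized to nucleotides 33-35" means dividing every band intensity by the mean intensity of those three reference bands; the function name and the example intensities are hypothetical and do not come from SAFA output.

```python
def normalize_ilp(intensities, reference_positions=(33, 34, 35)):
    """Scale band intensities by the mean of the reference nucleotides.

    `intensities` maps nucleotide number to raw band intensity.
    Assumption: normalization is a division by the mean reference
    intensity (anticodon-loop positions 33-35 by default).
    """
    reference = [intensities[p] for p in reference_positions]
    reference_mean = sum(reference) / len(reference)
    if reference_mean <= 0:
        raise ValueError("reference intensity must be positive")
    return {pos: value / reference_mean for pos, value in intensities.items()}


# Hypothetical raw band intensities keyed by nucleotide number.
bands = {30: 200.0, 33: 900.0, 34: 1000.0, 35: 1100.0, 40: 150.0}
normalized = normalize_ilp(bands)
print(normalized)
```

Because the reference bands are single stranded in every sample, this scaling puts lanes with different loading or extent of reaction on a common scale before the stem nucleotides are fit.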

B.1.5 Small Angle X-ray Scattering. RNAs were transcribed by T7 polymerase and purified as described above. RNA was renatured by heating at 95 °C in the presence of KCl and HEPES buffer (pH 7.0) for 3 min then cooled at room temperature for 10 min. After cooling, MgCl2 and PEG8000 were added and the solution was heated at 55 °C for 3 min and then cooled at room temperature for 10 min. Samples containing only buffer were collected via in-line size exclusion SAXS and samples containing PEG were collected via hand loading SAXS. Data were collected and analyzed as previously described (3).

B.1.6 Native PAGE. RNA transcribed by T7 transcription was radiolabeled with [γ-32P]ATP and purified as described above. RNA was renatured by heating at 95 °C in the presence of KCl and sodium cacodylate buffer (pH 7.0) for 3 min then cooled at room temperature for 10 min. After cooling, MgCl2, PEG8000, and amino acids were added and the solution was heated at 55 °C for 3 min and then cooled at room temperature for 10 min. A final concentration of 10% glycerol was added to each sample and 5 µL of each sample was loaded onto a 10% polyacrylamide native gel. Samples containing RNA renatured in 0.5 or 2.0 mM Mg2+ were run on gels containing 0.5 or 2.0 mM Mg2+, respectively. The RNA was fractionated on the native gels for 5 hours at 5 °C and visualized with a phosphorimager.

B.1.7 CoFold Structure Prediction. Structure formation of cotranscriptional intermediates was predicted using the CoFold webserver, part of the Vienna package (4). Intermediates with varying lengths on the 3’ end were input to the webserver to predict secondary structure contacts.

B.1.8 tRNA Core Enthalpy Calculation. The theoretical fully cooperative enthalpy of unfolding for the tRNAphe core was calculated using nearest-neighbor parameters for base pairs in each of the four stems, according to equation 2.

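$$\Delta H_{\mathrm{tRNA}} = \Delta H_{\mathrm{acceptor\ stem}} + \Delta H_{\mathrm{D\ stem}} + \Delta H_{\mathrm{anticodon\ stem}} + \Delta H_{\mathrm{T\Psi C\ stem}} \qquad \text{(eq. 2)}$$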

B.2 SUPPLEMENTAL TABLES AND FIGURES

Table B-1. Cotranscriptional intermediate thermodynamic parameters determined by thermal denaturation.

Columns: Construct; then for 0.5 mM Mg2+: ∆Hfolding (kcal/mol), ∆HI/∆HFL, TM (°C); then for 2.0 mM Mg2+: ∆Hfolding (kcal/mol), ∆HI/∆HFL, TM (°C).

Buffer
FL  -47.4  1.0  58.2  -89.6  1.0  65.1
I73  -21.7  0.5  57.4  -30.2  0.3  63.9
I72  -23.9  0.5  54.5  -30.1  0.3  62.2

20% PEG8000
FL  -54.8  1.0  59.7  -91.9  1.0  69.0
I73  -22.8  0.5  55.5  -40.5  0.4  69.5
I72  -42.4  0.9  54.3  -56.7  0.6  67.9

Mg2+-Amino Acids
FL  -84.2  1.0  65.6  -107.1  1.0  71.0
I73  -78.8  0.9  65.1  -106.5  1.0  70.4
I72  -70.6  0.8  62.8  -97.4  0.9  68.9

Mg2+-Amino Acids & 20% PEG8000
FL  -103.0  1.0  67.4  -125.6  1.0  71.7
I73  -110.6  1.1  66.4  -140.0  1.1  71.3
I72  -111.3  1.1  64.7  -121.5  1.0  70.2

FL, I73, and I72 are abbreviations for full-length tRNA, cotranscriptional intermediate 73, and cotranscriptional intermediate 72. ∆HI/∆HFL is the ∆Hfolding of the intermediate divided by the ∆Hfolding of full-length tRNA. A ∆HI/∆HFL close to 1 indicates that both constructs have similar levels of cooperativity, a ∆HI/∆HFL >1 indicates the intermediate is more cooperative than full length, and a ∆HI/∆HFL <1 indicates that the intermediate is less cooperative than full length. The errors on the ∆H and TM values are less than 5%.
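As a worked example of the ∆HI/∆HFL ratio, the short Python sketch below recomputes the buffer, 0.5 mM Mg2+ column of Table B-1 from the tabulated enthalpies. The function name is illustrative only.

```python
# Enthalpies of folding (kcal/mol) in buffer with 0.5 mM Mg2+, from Table B-1.
dH_folding = {"FL": -47.4, "I73": -21.7, "I72": -23.9}


def cooperativity_ratio(dH_intermediate, dH_full_length):
    """Ratio of an intermediate's folding enthalpy to that of full-length tRNA."""
    return dH_intermediate / dH_full_length


ratios = {
    name: round(cooperativity_ratio(value, dH_folding["FL"]), 1)
    for name, value in dH_folding.items()
}
print(ratios)
```

Both intermediates give a ratio near 0.5 in buffer, which by the definition above means they fold less cooperatively than full-length tRNA under this condition.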

Table B-2. SAXS parameters of FL tRNA and cotranscriptional intermediates in buffer and 20% PEG8000 with 0.5 and 2.0 mM Mg2+.

Co-transcription Intermediate  MWa (kDa)  Rga (Å)  Rgb (Å)  Dmaxb (Å)

20% PEG8000 with 0.5 mM Mg2+
FL tRNAd  25.7  25.9  25.7  83
I73  27.6  20.9  21.5  74
I72  23.6  20.2  20.8  73

Buffer with 2.0 mM Mg2+
FL tRNAd  28.5  24.2  25.1  82
I73  31.2  23.3  24.1  82
I72  25.7  22.7  24.5  83

20% PEG8000 with 2.0 mM Mg2+
FL tRNAd  41.7  21.2  23.9  74
I73  36.8  25.6  26.5  90
I72  37.7  27.1  25.8  82

Solutions contain 25 mM HEPES (pH 7.5), 140 mM KCl, and 0.5, or 2.0 mM MgCl2. Samples containing 20% PEG8000 are indicated. aObtained by analysis of the experimental scattering curves using BioXTAS RAW software (5). bObtained by analysis of the experimental scattering curves using the pairwise distribution function in GNOM, using the ATSAS software package (6). cSAXS parameters for FL tRNA were previously published (2, 3).

Table B-3. Thermodynamic parameters for the unfolding of the tRNA core in the 5’ Leader and precursor constructs, determined by global fitting of variable temperature in-line probing.

Columns: RNA Construct; then for Buffer: TM (°C), ∆Hfolding (kcal/mol), χ2 b; then for Crowding & aaCMc: TM (°C), ∆Hfolding (kcal/mol), χ2 b.

5’ leadera  59.9 ± 0.2  -128.9 ± 13.5  2.2  65.5 ± 0.5  -145.6 ± 23.5  0.9
precursora  58.5 ± 0.2  -135.8 ± 11.8  0.4  65.2 ± 1.8  N.A.d  0.8

aGlobal fitting of all double-stranded nucleotides in tRNA. The fits can be found in Figure 3-5. bχ2 reflects the quality of the global and helical fits obtained. A low χ2 indicates a good fit to the data. cSamples contain 20% PEG8000, 14.0 mM Mg2+ chelated to 106 mM amino acids, and 2.0 mM free Mg2+. dN.A. is “Not Available” due to absence of an upper baseline.


Figure B-1. Thermal denaturation of FL tRNA and shorter cotranscriptional intermediates in buffer and various physiological conditions (rows) with 0.5 or 2.0 mM free Mg2+ (columns). Intermediates unfolding in buffer, 20% PEG8000, amino acid-chelated Mg2+, and 20% PEG8000 with amino acid-chelated Mg2+ in the background of (A-D) 0.5 mM Mg2+ and (E-H) 2.0 mM Mg2+. Colors and symbols for each construct are provided in the figure.


Figure B-2. Native gels of FL tRNA and cotranscriptional intermediates under in vitro and physiological conditions. Native gels of FL tRNA and cotranscriptional intermediates in (A) 0.5 mM and (B) 2.0 mM Mg2+. RNA constructs were incubated in either buffer, 20% PEG8000, amino acid-chelated Mg2+, or 20% PEG8000 with amino acid-chelated Mg2+ in the background of 2.0 mM Mg2+ and 140 mM KCl.


Figure B-3. Small angle X-ray scattering data of intermediates in 0.5 and 2.0 mM free Mg2+. (A) Size-exclusion chromatography (SEC) scattering intensities in buffer; (B) scattering curves; (C) Kratky plots; and (D) Porod plots. Full length tRNA is black, I73 is represented by closed circles, and I72 is represented by open circles. Samples in buffer with 2.0 mM Mg2+ are green, 20% PEG8000 with 0.5 mM Mg2+ are blue, and 20% PEG8000 with 2.0 mM Mg2+ are orange.


Figure B-4. ILP of FL tRNA and cotranscriptional intermediates under in vitro or physiological conditions with 0.5 mM Mg2+. RNA constructs were incubated in either buffer, 20% PEG8000, amino acid chelated Mg2+, or 20% PEG8000 with amino acid chelated Mg2+ in the background of 0.5 mM Mg2+, 140 mM KCl, and 25 mM HEPES (pH 8.3).


Figure B-5. ILP of FL tRNA and cotranscriptional intermediates under in vitro or physiological conditions with 2.0 mM Mg2+. RNA constructs were incubated in either buffer, 20% PEG8000, amino acid chelated Mg2+, or 20% PEG8000 with amino acid chelated Mg2+ in the background of 2.0 mM Mg2+, 140 mM KCl, and 25 mM HEPES (pH 8.3).


Figure B-6. Variable temperature in-line probing of 5’ leader construct in buffer and physiological conditions with 2.0 mM free Mg2+. Guanosines on the T1 ladder are marked with a blue dot, and the regions of the gel that contain nucleotides in the 5’ leader, 5’ portion of the acceptor stem, the D loop, anticodon loop, and TΨC loop are labeled. See Materials and Methods for time points analyzed on the gel at each temperature.


Figure B-7. Variable temperature in-line probing of precursor construct in buffer and physiological conditions with 2.0 mM free Mg2+. Guanosines on the T1 ladder are marked with a blue dot, and the regions of the gel that contain nucleotides in the 5’ leader, 5’ portion of the acceptor stem, the D loop, anticodon loop, TΨC loop, and 3’ trailer are labeled. See Materials and Methods for time points analyzed on the gel at each temperature.


Figure B-8. In-line probing reactivity of 5’ leader and precursor constructs at 35 °C with 2.0 mM free Mg2+. Normalized ILP reactivity of (A) 5’ leader and (B) precursor constructs in buffer (open circles) and 20% PEG8000 with amino acid-chelated Mg2+ (closed circles). ILP gels can be found in Figures B-6 and B-7, with just the 35 °C data plotted here.


Figure B-9. Global fitting of variable temperature in-line probing signal of the precursor construct in buffer and physiological conditions (columns). Global fitting was performed on all double stranded regions (rows) simultaneously in tRNA in (A-D) buffer and (E-H) 20% PEG8000 with amino acid-chelated Mg2+ to obtain a single TM and ΔH of folding for each condition, which can be found in Table B-3. All samples contain 2.0 mM free Mg2+. Global fits (lines) and data points are shown for each stem in tRNA.


Figure B-10. Computationally predicted structure formation during cotranscriptional folding of the 3’ trailer of yeast tRNAphe. Predicted cotranscriptional structure formation in tRNAphe as the 3’ trailer is elongated. The regions of tRNA are colored as follows: 5’ leader and 3’ trailer (black), acceptor stem (purple), D stem-loop (blue), anticodon stem-loop (green), and TΨC stem-loop (pink). Nucleotides with native base pairing are depicted with colored lines, and non-native base pairing is depicted with black lines. Earlier predicted cotranscriptional structures can be found in Figure 3-6.

B.3 References

1. Ferré-D'Amaré AR & Doudna JA (1996) Use of cis- and trans-ribozymes to remove 5′ and

3′ heterogeneities from milligrams of in vitro transcribed RNA. Nucleic Acids Res

24(5):977-978.

2. Leamy KA, Yennawar NH, & Bevilacqua PC (2018) Molecular mechanism for folding

cooperativity of functional RNAs in living organisms. Biochemistry In Press.

3. Leamy KA, Yennawar NH, & Bevilacqua PC (2017) Cooperative RNA folding under cellular

conditions arises from both tertiary structure stabilization and secondary structure

destabilization. Biochemistry 56(27):3422-3433.

4. Proctor JR & Meyer IM (2013) CoFold: An RNA secondary structure prediction method

that takes co-transcriptional folding into account. Nucleic Acids Res 41(9):e102-e102.

5. Hopkins JB, Gillilan RE, & Skou S (2017) BioXTAS RAW: Improvements to a free open-

source program for small-angle X-ray scattering data reduction and analysis. J Appl

Crystallogr 50(5):1545-1553.

6. Svergun DI (1992) Determination of the regularization parameter in indirect-transform

methods using perceptual criteria. J Appl Crystallogr 25(4):495-503.

Appendix C

Supporting Information: Chapter 4

Published as a paper entitled: “Molecular Mechanism for Folding Cooperativity of

Functional RNAs in Living Organisms” by Kathleen A. Leamy, Neela H. Yennawar, and

Philip C. Bevilacqua in Biochemistry 57(20):2994-3002 2018.

C.1 Materials and Methods

C.1.1 Chemicals. PEG8000, HEPES, MgCl2, and sodium cacodylate were purchased from

Sigma. KCl was purchased from J. T. Baker. Calf intestinal phosphatase and polynucleotide kinase were purchased from NEB.

C.1.2 RNA Constructs and Preparation. Wild-type (WT) tRNAphe and mutants (M) were transcribed with a hemi-duplex DNA template purchased from IDT that was used without further purification, as previously described (1). The RNA was buffer exchanged into 10 mM sodium cacodylate using an Amicon ultracentrifugal filter (3 kDa molecular weight cutoff). The RNA sequences are below, and nucleotides mutated from WT are in bold.

FL tRNAphe:

5’GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA

UCCACAGAAUUCGCACCA

Mutant 1 (M1):

5’GCUAAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA

UCCACAGAAUUAGCACCA 3’

Mutant 2 (M2):

5’GCGAAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA

UCCACAGAAUUCGCACCA 3’

Mutant 3 (M3):

5’GCGCAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA

UCCACAGAAUGCGCACCA 3’

Mutant 4 (M4):

5’GCGCGUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA

UCCACAGAACGCGCACCA 3’

Mutant 5 (M5):

5’GCGCGCUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA

UCCACAGAGCGCGCACCA 3’

Mutant 6 (M6):

5’GCGCGCGUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA

UCCACAGCGCGCGCACCA 3’
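Because the bold formatting that marks mutated nucleotides does not survive in plain text, the positions can be recovered by comparing each mutant with WT. The sketch below does this in Python for Mutant 2; the function name is hypothetical, and the sequences are copied from the list above with the line break removed.

```python
WT = ("GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA"
      "UCCACAGAAUUCGCACCA")
M2 = ("GCGAAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGA"
      "UCCACAGAAUUCGCACCA")


def mutated_positions(wild_type, mutant):
    """Return (1-based position, WT base, mutant base) for every mismatch."""
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length")
    return [
        (position, wt_base, mut_base)
        for position, (wt_base, mut_base) in enumerate(zip(wild_type, mutant), start=1)
        if wt_base != mut_base
    ]


print(mutated_positions(WT, M2))
```

The same comparison can be applied to the other mutants to list their acceptor-stem substitutions.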

C.1.3 Thermal Denaturation. RNA was renatured by denaturing at 95 °C for 3 min and annealing at room temperature for 10 min in the presence of KCl and sodium cacodylate (pH 7.0). After cooling, MgCl2, polyethylene glycol (PEG) 8000, and amino acids were added to the RNA solution. The sample was then heated at 55 °C for 3 min and cooled at room temperature for 10 min. Samples were spun down at 14,000 rpm for 5 min at 4 °C to remove air bubbles and particulates. Final sample concentrations were 0.5 μM RNA in 10 mM sodium cacodylate (pH 7.0), 140 mM KCl, 0.5-2.0 mM MgCl2, 0-20% (w/v) PEG8000, and 0-106 mM amino acids, as indicated (2). Further explanation of Mg2+-chelated samples is in Table C-5, as previously described (2). Thermal denaturation experiments were performed on an HP 8452 diode-array refurbished by OLIS, Inc with a data point collected every 0.5 °C at ~0.5 °C/min with absorbance detection from 230-330 nm. This method is referred to as “optical melting”.

C.1.4 Thermal Denaturation Data Analysis. Thermal denaturation data on WT and mutant RNAs were truncated, as in Figure 4-2 and Figure C-1, to remove excess baselines and fit globally from 250-290 nm using a two-state model with sloping baselines and a nonlinear Marquardt algorithm in IgorPro (3) using equation 2 (the derivation for which can be found in Section A.1.4), where mu and mf are the slopes of the unfolded and folded baselines, bu and bf are the y-intercepts of the unfolded and folded baselines, ΔH is the enthalpy of folding, and TM is the melting temperature in Kelvin. R is the gas constant of 0.001987 kcal K-1 mol-1. In global fitting the slopes and y-intercepts were allowed to vary at each wavelength, but the ΔH and TM were held constant. The traces at 260 nm were normalized for fraction unfolded using equation 3, which uses the sloping baselines from the global fits, where f is the fraction unfolded and AT is the raw absorbance at a temperature, T. There is noise in some of the polyethylene glycol data for mutants with high stem stability because the absorbance change in these transitions was low.

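$$A_T = \frac{(m_u T + b_u) + (m_f T + b_f)\, e^{\left[\frac{\Delta H}{R}\right]\left[\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right]}}{1 + e^{\left[\frac{\Delta H}{R}\right]\left[\frac{1}{T_M + 273.15} - \frac{1}{T + 273.15}\right]}} \qquad \text{(eq. 2)}$$

$$f = \frac{A_T - (m_f T + b_f)}{(m_u T + b_u) - (m_f T + b_f)} \qquad \text{(eq. 3)}$$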
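The normalization for fraction unfolded can be illustrated with a short Python sketch. It assumes the standard two-baseline form, in which the raw absorbance is placed on a 0 to 1 scale between the folded and unfolded sloping baselines; the function name and the baseline values are hypothetical.

```python
def fraction_unfolded(A_T, T, mu, bu, mf, bf):
    """Fraction unfolded at temperature T from raw absorbance A_T.

    (mu, bu) and (mf, bf) are the slope and y-intercept of the unfolded
    and folded baselines taken from the global fit. Assumed form:
    f = (A_T - folded baseline) / (unfolded baseline - folded baseline).
    """
    folded = mf * T + bf
    unfolded = mu * T + bu
    if unfolded == folded:
        raise ValueError("baselines coincide at this temperature")
    return (A_T - folded) / (unfolded - folded)


# Hypothetical baselines: folded A = 0.001*T + 1.00, unfolded A = 0.001*T + 1.20
T = 60.0
for A in (1.06, 1.16, 1.26):
    f = fraction_unfolded(A, T, mu=0.001, bu=1.20, mf=0.001, bf=1.00)
    print(f"A = {A:.2f}  f = {f:.2f}")
```

A point on the folded baseline maps to f = 0, a point on the unfolded baseline to f = 1, and the midpoint between them to f = 0.5.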

The theoretical non- and fully cooperative enthalpies of folding were calculated to compare with the experimental ∆H. The non-cooperative ∆H is the ∆Hfolding derived using nearest neighbor parameters of the WT and mutant acceptor stems (4). The fully cooperative ∆H is the experimentally derived ∆Hfolding of WT plus or minus the nearest neighbor model ∆∆H of the mutant acceptor stems.

C.1.5 Small Angle X-ray Scattering Data Collection and Analysis. WT and MT tRNA constructs were transcribed, purified, and buffer exchanged as described previously (1).

RNA was renatured in 1X SAXS buffer by denaturing at 95 °C for 3 min and annealing at room temperature for 10 min. After cooling, MgCl2 was added to the sample which was heated at 55 °C for 3 min then cooled at room temperature for 10 min. Samples were centrifuged at 14k rpm for 10 min to minimize aggregation (5).

SAXS data were collected on the G1 station at MacCHESS, the solution scattering beamline at the Cornell High Energy Synchrotron Source (CHESS) (6), using in-line size-exclusion chromatography (SEC) to separate monomers from aggregates, as previously described (1). Samples were at a final concentration of 0.2 mg/mL in buffer with 2.0 mM Mg2+. SAXS data were analyzed as previously described using the linear Guinier region to obtain structural data (1). The scattering curves were compared with simulated tRNA scattering curves using the FoXS webserver (7, 8).
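For readers unfamiliar with Guinier analysis, the sketch below shows the underlying idea in Python: at low q, ln I(q) is approximately linear in q^2 with slope -Rg^2/3, so a straight-line fit yields Rg and I(0). This is a generic illustration on synthetic, noise-free data, not the procedure implemented in the software used for this work; the function name is hypothetical, and the Rg of 24.2 Å is the WT value from Table C-4.

```python
import math


def guinier_fit(q, intensity):
    """Least-squares line through ln I versus q^2; returns (Rg, I0).

    Uses the Guinier approximation ln I(q) = ln I(0) - (Rg^2 / 3) q^2,
    which is valid only at low q (roughly q*Rg < 1.3).
    """
    x = [qi * qi for qi in q]
    y = [math.log(value) for value in intensity]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    if slope >= 0:
        raise ValueError("Guinier slope must be negative")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic data generated with Rg = 24.2 Å and I(0) = 1.0.
rg_true, i0_true = 24.2, 1.0
q_values = [0.010 + 0.002 * k for k in range(15)]  # Å^-1, q*Rg stays below 1
intensities = [i0_true * math.exp(-((qi * rg_true) ** 2) / 3.0) for qi in q_values]

rg, i0 = guinier_fit(q_values, intensities)
print(f"Rg = {rg:.2f} Å, I(0) = {i0:.3f}")
```

With real data the fit range has to be chosen so that the points stay within the linear region, which is why aggregation at the lowest q values is removed by SEC before analysis.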

C.1.6 In-line Probing. The triphosphate on the 5’ end of T7 transcripts was removed by incubation with calf intestinal phosphatase (CIP) at 37 °C for 20 min, followed by a phenol/chloroform extraction and an ethanol precipitation. This RNA was labeled on the 5’ end with [γ-32P]ATP by incubation with polynucleotide kinase at 37 °C for 30 min. The labeled RNA was purified by 10% PAGE followed by a crush and soak and ethanol precipitation procedure. 500k cpm of labeled RNA with 140 mM KCl and 20 mM Tris (pH 8.3) was renatured by denaturing at 95 °C for 1 min and annealing at room temperature for 5 min. After annealing, PEG8000, MgCl2, and amino acids were added as indicated, and the solution was heated at 55 °C for 1 min then cooled at room temperature for 5 min. The samples were incubated at 37 °C and aliquots removed at 12, 24, 36, and 48 hours, which were quenched with 50 mM Tris (pH 7.0) and 20 mM EDTA. RNA was fractionated on a 10% PAGE gel and visualized with a PhosphorImager. Lanes containing different amounts of salts and amino acids were separated on the gel to avoid running contamination. Gel data were analyzed using semi-automated footprinting analysis (SAFA) software (9) to obtain reactivities of each individual band. The reactivity of each band was normalized to nucleotides 34-36 in the anticodon loop, which were single-stranded in all solution conditions. This corrects for percent reacted and loading differences.

C.1.7 Optimal Growth Temperature Analysis. We analyzed the strength of tRNA stems, ribosomal RNA GC percent, and genome GC percent against optimal growth temperature of organisms. The growth temperatures of the organisms were found in the DSMZ German Collection of Microorganisms (11), and can be found in Table C-6. The base pairs in the acceptor, D, anticodon, and TΨC stem for RNA sequences were found using the Transfer RNA database (10). The portions of the sequence that compose the 5’ and 3’ ends of each stem were input to the RNACofold webserver, part of the Vienna Package, to determine the ∆G of each stem (Table C-6). The sequences from the organisms were binned into six temperature ranges (20-29 °C with 14 sequences, 30-40 °C with 11 sequences, 50-69 °C with 5 sequences, 70-79 °C with 4 sequences, 80-89 °C with 5 sequences, and 90-99 °C with 4 sequences), the temperature in each bin was averaged, and the five weakest ∆Gfolding in each bin were averaged together. A linear fit for threshold ∆Gfolding was calculated using the averaged values in Igor Pro. Genome and rRNA GC content vs. OGT was fit for a linear correlation in Igor Pro (Table C-7).
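The binning and threshold fit described above can be sketched in Python as follows. The function names and the example records are hypothetical, "weakest" is taken to mean the least negative ∆Gfolding, and an ordinary least-squares line stands in for the linear fit that was performed in Igor Pro.

```python
def threshold_points(records, bins, n_weakest=5):
    """Average temperature and mean of the weakest free energies per bin.

    `records` is a list of (growth temperature in °C, stem ∆G in kcal/mol).
    `bins` is a list of inclusive (low, high) temperature ranges.
    Assumption: "weakest" means least negative ∆G.
    """
    points = []
    for low, high in bins:
        members = [(t, g) for t, g in records if low <= t <= high]
        if not members:
            continue
        mean_temperature = sum(t for t, _ in members) / len(members)
        weakest = sorted((g for _, g in members), reverse=True)[:n_weakest]
        points.append((mean_temperature, sum(weakest) / len(weakest)))
    return points


def linear_fit(points):
    """Ordinary least-squares line through (x, y) points: (slope, intercept)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Temperature ranges from the text; the records below are hypothetical.
BINS = [(20, 29), (30, 40), (50, 69), (70, 79), (80, 89), (90, 99)]
records = [(25, -8.0), (28, -10.0), (35, -9.0), (37, -11.0), (75, -13.0), (95, -15.0)]

points = threshold_points(records, BINS)
slope, intercept = linear_fit(points)
print(points)
print(f"slope = {slope:.4f} kcal/mol per °C, intercept = {intercept:.2f} kcal/mol")
```

Bins with fewer than five sequences simply average all of their members in this sketch, which is one possible reading of the procedure for the smaller high-temperature bins.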

C.1.8 Information Content. Using a sequence alignment, information content was calculated for each position in tRNA to determine conservation. Nucleotides that are highly conserved have 2 bits of information, nucleotides that are not conserved have 0 bits of information, while two nucleotides that co-vary with Watson-Crick base pairing share 2 bits of information. Information content was calculated separately for the loop and stem positions, as previously described (12). For the nucleotides in loops, information content was calculated by finding the Shannon uncertainty (H) from the sequence alignment using equation 4, where Pi is the probability of finding a particular base (A, C, G, U) at position i in the sequence. The Shannon uncertainty for genomes is approximately 2, and the information content (IC) is Hgenome minus Hposition, according to equation 5.

$$H = -\sum P_i \log_2 P_i \qquad \text{(eq. 4)}$$

$$IC = H_{\mathrm{genome}} - H_{\mathrm{position}} \qquad \text{(eq. 5)}$$

For stem regions, covariation of base pairs was accounted for when finding information content. Equations 4 and 5 were used to find the Shannon uncertainty and the information content.
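The loop-position calculation can be illustrated with a minimal Python sketch. It computes the Shannon uncertainty of one alignment column in bits and subtracts it from a genome uncertainty of 2; the function names are hypothetical, gap characters are not handled, and the covariation treatment used for stem positions is not shown.

```python
import math
from collections import Counter


def shannon_uncertainty(column):
    """Shannon uncertainty (bits) of the bases observed in one alignment column."""
    counts = Counter(column)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_content(column, h_genome=2.0):
    """Information content (bits): genome uncertainty minus position uncertainty."""
    return h_genome - shannon_uncertainty(column)


# Each string is one alignment column (one base per aligned sequence).
for column in ("AAAA", "AACC", "ACGU"):
    print(column, information_content(column))
```

A fully conserved column carries 2 bits, a column split evenly between two bases carries 1 bit, and a column with all four bases equally represented carries 0 bits, matching the limits stated above.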

C.2 Supplemental Tables and Figures

Table C-1. Thermodynamic parameters for WT tRNA and mutant folding in 2.0 mM free Mg2+.

RNA Construct  TM (°C)  ΔH (kcal/mol)  TM(M) - TM(WT) (°C)  ΔHM/ΔHWT

Buffer
M1  64.0  -80.4  -2.2  1.1
WT  66.2  -72.2  0.0  1.0
M2  66.1  -81.5  -0.1  1.1
M3  69.2  -66.1  3.0  0.9
M4  72.6  -51.7  6.4  0.7
M5  75.4  -40.4  9.2  0.6
M6  72.8  -38.0  6.6  0.5

20% PEG8000
M1  66.1  -77.7  -1.9  1.0
WT  68.0  -80.4  0.0  1.0
M2  68.8  -91.4  0.8  1.1
M3  71.1  -66.8  3.1  0.8
M4  67.2  -74.3  -0.8  0.9
M5  70.2  -62.7  2.2  0.8
M6  61.1  -21.4  -6.9  0.3

Mg2+ Chelated Amino Acids
M1  70.2  -149.3  -0.7  1.0
WT  70.9  -145.9  0.0  1.0
M2  72.0  -141.6  1.1  1.0
M3  74.7  -94.7  3.8  0.6
M4  75.5  -86  4.6  0.6
M5  74.1  -84.7  3.2  0.6
M6  73.5  -88.1  2.6  0.6

20% PEG8000 & Mg2+ Chelated Amino Acids
M1  71.9  -152.3  0.2  1.0
WT  71.8  -148.0  0.0  1.0
M2  73.3  -155.5  1.5  1.1
M3  75.9  -130.2  3.9  0.9
M4  77.3  -130.3  3.9  0.9
M5  76.7  -119.8  4.9  0.8
M6  78.7  -81.3  6.9  0.5

Constructs are ordered in the table according to the stability of the acceptor stem. Thermodynamic parameters are derived from global fits of thermal denaturation data from 250 nm to 290 nm according to a two-state model.

Table C-2. Thermodynamic parameters for WT tRNA and mutant folding in 0.5 mM free Mg2+.

RNA Construct  TM (°C)  ΔH (kcal/mol)  TM(M) - TM(WT) (°C)  ΔHM/ΔHWT

Buffer
M1  35.3  -24.5  -25.9  0.5
WT  61.2  -44.9  0.0  1.0
M2  59.9  -46.6  -1.3  1.0
M3  65.8  -38.8  4.6  0.5
M4  42.4  -14.7  -18.8  1.0
M5  64.9  -26.6  3.7  0.5
M6  9.43  -18.0  -51.8  1.0

20% PEG8000
M1  60.0  -74.9  -1.8  1.1
WT  61.8  -70.7  0.0  1.0
M2  65.2  -43.7  3.4  0.6
M3  65.3  -52.8  3.5  0.7
M4  63.4  -48.9  1.6  0.7
M5  59.5  -35.1  -2.3  0.5
M6  53.7  -47.4  -8.1  0.7

Mg2+ Chelated Amino Acids
M1  64.4  -78.5  -1.7  0.6
WT  66.1  -125.9  0.0  1.0
M2  66.7  -107.1  0.6  0.6
M3  71.9  -59.6  5.8  0.9
M4  73.6  -45.9  7.5  0.5
M5  69.6  -25.9  3.5  0.4
M6  69.6  -25.9  3.5  0.2

20% PEG8000 & Mg2+ Chelated Amino Acids
M1  68.6  -79.4  -0.1  1.0
WT  68.7  -83  0.0  1.0
M2  69.1  -103.9  0.4  1.3
M3  71.9  -77  3.2  0.9
M4  67.8  -139.7  -0.8  1.7
M5  71.4  -63.9  2.7  0.8
M6  67.9  -139.7  -0.8  1.7

Constructs are ordered in the table according to the stability of the acceptor stem. Thermodynamic parameters are derived from global fits of thermal denaturation data from 250 nm to 290 nm according to a two-state model.

Table C-3. Quality of the global fits of thermal denaturation data.

Columns: RNA Construct; then for 0.5 mM Mg2+: TM (°C), ΔH (kcal/mol), χ2; then for 2.0 mM Mg2+: TM (°C), ΔH (kcal/mol), χ2.

Buffer
M1  4.3E-04  1.3E-04  8.94E-02  3.3E-02  3.3E-02  1.77E-02
WT  1.0E-03  5.2E-04  2.24E-01  2.4E-04  2.4E-04  8.46E-03
M2  2.2E-02  1.7E-01  2.13E-03  2.3E-02  2.3E-02  1.16E-02
M3  6.5E-02  2.8E-01  7.88E-03  1.8E-02  1.8E-02  2.27E-03
M4  5.1E-04  4.7E-04  2.25E-02  3.8E-02  3.8E-02  1.92E-03
M5  6.0E-01  9.3E-01  7.42E-03  1.3E-01  1.3E-01  3.05E-03
M6  2.4E-04  2.4E-04  4.60E-02  1.4E-01  1.4E-01  2.61E-03

20% PEG8000
M1  2.2E-02  4.4E-01  3.18E-03  3.3E-02  3.3E-02  1.46E-02
WT  3.3E-02  5.8E-01  5.64E-03  4.1E-02  4.1E-02  1.61E-02
M2  6.3E-02  3.7E-01  1.11E-02  2.0E-02  2.0E-02  6.88E-03
M3  6.5E-02  5.7E-01  1.17E-02  3.1E-02  3.1E-02  6.36E-03
M4  1.6E-01  1.2E+00  2.29E-02  1.5E-01  1.5E-01  1.09E-02
M5  2.8E-01  1.0E+00  4.03E-03  1.3E-01  1.3E-01  4.68E-03
M6  2.9E-01  2.0E+00  6.95E-03  1.0E+00  1.0E+00  1.09E-02

Mg2+ Chelated Amino Acids
M1  1.3E-02  7.0E-01  3.17E-04  8.8E-03  6.6E-01  2.57E-04
WT  1.3E-02  7.0E-01  3.17E-04  9.6E-03  6.7E-01  1.56E-04
M2  1.8E-02  6.6E-01  3.55E-04  8.3E-03  5.2E-01  1.87E-04
M3  9.7E-02  4.4E-01  1.28E-03
M4  4.2E-02  3.3E-01  1.28E-03  5.4E-02  6.1E-01  3.05E-04
M5  1.0E-04  3.8E-05  1.02E-03  2.9E-02  6.8E-01  6.95E-04
M6  1.0E-04  3.8E-05  1.02E-03  3.2E-02  7.9E-01  5.29E-04

20% PEG8000 & Mg2+ Chelated Amino Acids
M1  4.1E-02  8.1E-01  7.86E-04  1.4E-02  1.0E+00  6.57E-04
WT  4.5E-02  1.1E+00  8.15E-03  1.4E-02  1.0E+00  9.01E-04
M2  2.2E-02  8.0E-01  2.37E-03  1.7E-02  1.3E+00  8.45E-04
M3  2.8E-02  5.2E-01  1.85E-03  2.1E-02  8.3E-01  4.58E-04
M4  3.8E-02  2.5E+00  7.93E-03  2.1E-02  8.3E-01  4.58E-04
M5  8.5E-02  8.7E-01  1.24E-03  9.2E-02  2.2E+00  2.43E-03
M6  3.8E-02  2.5E+00  7.93E-03  8.2E-02  7.5E-01  8.63E-04

Table C-4. Structural parameters for WT and MT tRNAs obtained by SAXS in 2.0 mM Mg2+.

RNA Construct  MWa (kDa)  Rga (Å)  Rgb (Å)  Dmaxb (Å)  Excluded Volumec (Å3)  RMSDc (Å)  FoXS χ2 d
WT  28.5  24.2  25.1  82  47,700  3.30  1.03
M1  35.2  23.6  25.7  83  50,400  3.35  1.02
M2  28.7  25.6  26.4  85  53,000  3.90  1.21
M3  32.2  25.5  26.9  85  56,100  3.84  1.14
M4  31.9  26.1  26.6  87  50,800  3.80  1.25
M5  25.2  26.3  27.4  87  51,400  3.94  1.10

a Solutions contain 25 mM HEPES (pH 7.5), 140 mM KCl, and 2.0 mM MgCl2. Parameters were obtained by analysis of the experimental scattering curves using BioXTAS RAW software. bValues were obtained using the pairwise distribution function in GNOM using the ATSAS software package. cParameter was found using the alignments of the DAMAVER bead models with the tRNA crystal structure in SUPCOMB.

Table C-5. Composition of samples containing Mg2+-chelated amino acids.

Concentration  0.5 mM free Mg2+  2.0 mM free Mg2+
Amino acidsa  106.6 mM  106.6 mM
Total Mg2+  4.6 mM  16.0 mM
Free Mg2+  0.5 mM  2.0 mM
Amino acid chelated Mg2+  4.1 mM  14.0 mM
KCl  140 mM  140 mM
Sodium cacodylate  10 mM  10 mM

aComposed of 96.0 mM glutamate, 4.2 mM aspartate, 3.8 mM glutamine, and 2.6 mM alanine at pH 7.0.
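The entries in Table C-5 are internally consistent: total Mg2+ is the sum of free and amino acid-chelated Mg2+, and the four amino acids sum to the stated total. A short Python check of that arithmetic, using only values from the table and its footnote (variable names are illustrative):

```python
# Amino acid concentrations (mM) from the Table C-5 footnote.
amino_acids_mM = {"glutamate": 96.0, "aspartate": 4.2, "glutamine": 3.8, "alanine": 2.6}

# Mg2+ concentrations (mM) for the two conditions in Table C-5.
conditions = {
    "0.5 mM free Mg2+": {"free": 0.5, "chelated": 4.1, "total": 4.6},
    "2.0 mM free Mg2+": {"free": 2.0, "chelated": 14.0, "total": 16.0},
}

total_amino_acids = round(sum(amino_acids_mM.values()), 1)
print(f"Total amino acids: {total_amino_acids} mM")

for name, mg in conditions.items():
    computed_total = round(mg["free"] + mg["chelated"], 1)
    print(f"{name}: free + chelated = {computed_total} mM (table: {mg['total']} mM)")
```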

Table C-6. Organism optimal growth temperatures and tRNA stem free energy of folding. *Values are in °C. •Values are in kcal/mol.

Organism  Growth Temperature*  Acceptor Stem ΔG•  D Stem ΔG•  AC Stem ΔG•  TΨC Stem ΔG•
Rhodopseudomonas palustris  25  -13.4  -3.8  -5.2  -6.4
Rhodopirellula baltica  25  -7.6  -3.8  -8.3  -5.6
Candidatus Protochlamydia amoebophila  25  -12  -3.8  -4  -5.4
Xanthomonas campestris  26  -12.8  -3.8  -7.7  -5.4
Bradyrhizobium japonicum  26  -13.4  -3.8  -6.8  -4
Mesorhizobium loti  26  -12  -3.8  -6.9  -6.4
Pseudomonas syringae  26  -12  -3.8  -7.7  -5.6
Streptomyces avermitilis  28  -13.4  -3.8  -4.7  -8.3
Vibrio cholerae  28  -11.4  -3.8  -4  -6.4
Ralstonia solanacearum  28  -8.7  -3.8  -6.9  -6.4
Nitrosomonas europaea  28  -11  -3.8  -4  -6.4
Oceanobacillus iheyensis  28  -10.7  -3.8  -4  -5.6
Photorhabdus luminescens subsp. laumondii  28  -13.5  -3.8  -4  -6.4
Leifsonia xyli subsp. xyli str.  28  -10.6  -3.8  -3.1  -6.6
Bacillus cereus  30  -12.8  -3.8  -4  -5.6
Bdellovibrio bacteriovorus  30  -9.2  -3.8  -4  -5.6
Deinococcus radiodurans  30  -11.1  -3.8  -2.8  -8.3
Lactobacillus plantarum  30  -6.5  -3.8  -3.8  -5.6
Pseudomonas aeruginosa  30  -13.4  -3.8  -7.7  -5.6
Spiroplasma melliferum  30  -8.9  -3.8  -1.3  -4.1
Bartonella henselae  37  -12  -3.8  -6.9  -5.5
Corynebacterium diphtheriae  37  -12  -3.8  -4.1  -6.8
Lactobacillus johnsonii  37  -2.7  -3.8  -1.7  -4.2
Methanospirillum hungatei  37  -11.3  -3.8  -5.3  -8.2
Staphylococcus epidermidis  37  -8.9  -3.8  -3.8  -4.1
Methanococcus aeolicus  40  -12.6  -1.4  -5.3  -6.8
Geobacillus stearothermophilus  55  -12.8  -3.8  -3.6  -5.6
Thermoplasma acidophilum  55  -12.6  -3.8  -5.3  -6.1
Symbiobacterium thermophilum  60  -12.7  -3.8  -6  -6.8
Thermoplasma volcanium  60  -12.6  -3.8  -5.3  -6.1
Methanothermobacter thermautotrophicus  65  -12.6  -1.4  -5.3  -6.1
Sulfolobus acidocaldarius  70  -14.1  -3.8  -8.2  -8.2
Sulfolobus solfataricus  70  -14.1  -3.8  -8.2  -8.2
Thermus thermophilus  75  -13.4  -3.8  -5.2  -8.3
Sulfolobus tokodaii  75  -14.1  -3.8  -8.2  -8.2
Thermotoga maritima  80  -13.4  -3.8  -5.3  -8.3
Methanocaldococcus jannaschii  80  -14.1  -1.4  -5.3  -6.8
Archaeoglobus fulgidus  85  -13.1  -3.8  -6.6  -8.2
Staphylothermus marinus  88  -14.1  -3.8  -8.2  -8.2
Pyrococcus abyssi  90  -14.9  -3.8  -6.8  -9.1
Aeropyrum pernix K1, K1  90  -14.1  -3.8  -8.2  -8.2
Pyrococcus horikoshii  95  -14.9  -3.8  -6.8  -9.1
Pyrococcus furiosus  97  -14.9  -3.8  -6.8  -9.1
Pyrobaculum aerophilum  98  -14.1  -3.8  -8.2  -8.2

Table C-7. Organism optimal growth temperatures and ribosomal RNA GC percent.

Organism  Growth Temperature*  5S GC Percent  16S GC Percent  23S GC Percent  Average rRNA GC Percent
Rhodopseudomonas palustris  25  63%  50%  54%  56%
Rhodopirellula baltica  25  63%  55%  54%  57%
Candidatus Protochlamydia amoebophila  25  50%  49%  50%
Xanthomonas campestris  26  60%  55%  53%  56%
Bradyrhizobium japonicum  26  58%  56%  57%
Mesorhizobium loti  26  62%  56%  59%
Pseudomonas syringae  26
Streptomyces avermitilis  28  57%  58%  57%  57%
Vibrio cholerae  28  52%  53%  52%  52%
Ralstonia solanacearum  28  60%  54%  53%  56%
Nitrosomonas europaea  28  53%  53%  49%  52%
Oceanobacillus iheyensis  28  59%  53%  52%  55%
Photorhabdus luminescens subsp. laumondii  28
Leifsonia xyli subsp. xyli str.  28  60%  60%
Bacillus cereus  30  56%  53%  52%  54%
Bdellovibrio bacteriovorus  30  49%  49%  49%
Deinococcus radiodurans  30  61%  55%  58%
Lactobacillus plantarum  30  56%  50%  50%  52%
Pseudomonas aeruginosa  30  54%  54%  53%  54%
Spiroplasma melliferum  30  53%  50%  52%
Bartonella henselae  37  59%  55%  51%  55%
Corynebacterium diphtheriae  37  60%  56%  53%  56%
Lactobacillus johnsonii  37  56%  52%  50%  53%
Methanospirillum hungatei  37  55%  56%  51%  54%
Staphylococcus epidermidis  37  54%  51%  50%  52%
Methanococcus aeolicus  40  49%  56%  53%  53%
Geobacillus stearothermophilus  55  59%  59%  59%
Thermoplasma acidophilum  55  54%  54%
Symbiobacterium thermophilum  60  60%  60%  60%  60%
Thermoplasma volcanium  60  50%  55%  53%  53%
Methanothermobacter thermautotrophicus  65  53%  58%  57%  56%
Sulfolobus acidocaldarius  70  65%  63%  60%  63%
Sulfolobus solfataricus  70  63%  63%  62%  63%
Thermus thermophilus  75  67%  64%  64%  65%
Sulfolobus tokodaii  75  66%  64%  64%  65%
Thermotoga maritima  80  66%  64%  63%  64%
Methanocaldococcus jannaschii  80  68%  64%  63%  65%
Archaeoglobus fulgidus  85  62%  64%  63%  63%
Staphylothermus marinus  88  68%  67%  67%  67%
Pyrococcus abyssi  90  67%  67%
Aeropyrum pernix K1, K1  90  68%  68%  69%  68%
Pyrococcus horikoshii  95  70%  66%  66%  67%
Pyrococcus furiosus  97  71%  66%  66%  68%
Pyrobaculum aerophilum  98  72%  68%  70%  70%
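As a rough illustration of the trend in Table C-7, the sketch below computes a Pearson correlation between growth temperature and average rRNA GC percent for ten rows copied from the table. The choice of rows and the use of Pearson's r are assumptions made here for illustration; they are not the analysis used in this work.

```python
import math

# (growth temperature, average rRNA GC percent), copied from Table C-7.
# The subset of rows is an arbitrary choice for this illustration.
rows = [
    (25, 56),  # Rhodopseudomonas palustris
    (28, 52),  # Vibrio cholerae
    (30, 54),  # Bacillus cereus
    (37, 55),  # Bartonella henselae
    (55, 59),  # Geobacillus stearothermophilus
    (60, 60),  # Symbiobacterium thermophilum
    (70, 63),  # Sulfolobus acidocaldarius
    (80, 64),  # Thermotoga maritima
    (90, 68),  # Aeropyrum pernix K1
    (98, 70),  # Pyrobaculum aerophilum
]

def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(rows)
print(f"Pearson r = {r:.2f}")
```

A value of r near 1 would indicate that average rRNA GC percent rises with growth temperature across the selected organisms; running the same function over every complete row of the table would give a less selective summary.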

Table C-8. Potential isosteric changes in tRNAphe tertiary interactions.

Tertiary Interaction    Potential Compensatory Change    Interaction Lost
WC U8*:A14* H           WC C8:C14 H                      C8 interaction with Mg2+
H A9*:A23* H            H A9:C23 H
                        H C9:A23 H
                        H G9:C23 H
                        H G9:G23 H
WC G15*:C48* WC         WC C15:G48 WC                    Base triple with U12*
WC G18*:U55* WC         WC C18:A55 WC                    U55 pseudouridine modification
WC G19*:C56* WC         WC U19:A56 WC
                        WC G19:A56 WC
                        WC A19:U56 WC
H G22*:G46* WC                                           Base triple with C13*
WC G26:A44 WC           WC A26:C44 WC                    G26 2-methyl modification
WC U54*:A58* H          WC C54:C58 H                     U54 5-methyl modification
                                                         A58 1-methyl modification

*Information content of this base is 2, and it is universally conserved. H stands for Hoogsteen face interaction and WC stands for Watson-Crick face interaction. The interactions are depicted in molecular detail in Figure C-7.

Figure C-1. WT and mutant thermal denaturation under in vivo-like solutions in the background of 0.5 mM free Mg2+. Each construct was globally fit every 2 nm between 250 and 290 nm. Thermal denaturation scans at 260 nm normalized using global fitting parameters in (A) buffer, (B) 20% PEG8000, (C) Mg2+-chelated amino acids, and (D) 20% PEG8000 and Mg2+-chelated amino acids. All four panels are in the background of 0.5 mM Mg2+ and 140 mM K+. Low temperature data was truncated in the fitting to avoid excess baselines, as is plotted above.

Figure C-2. Small angle X-ray scattering data in buffer with 2.0 mM free Mg2+. The experimental (A) scattering curves, (B) Kratky plots, and (C) Porod plots of tRNA mutants. Colors are as in Figure 4-1.

Figure C-3. Comparison of experimental scattering curves of WT and mutant tRNAs (data points) and the theoretical scattering curve (smooth curve) of tRNA crystal structure, PDB: 1ehz, generated with FoXS. Experimental scattering curves of (A) WT, (B) M1, (C) M2, (D) M3, (E) M4, and (F) M5.

Figure C-4. In-line probing PAGE of WT and M5 RNAs. ILP in buffer, 20% PEG8000, Mg2+-chelated amino acids, and 20% PEG8000 with Mg2+-chelated amino acids. Control lanes are unreacted RNA, a hydrolysis ladder, and a denaturing T1 ladder.

Figure C-5. Melting temperature and enthalpy of unfolding of WT and mutant tRNAs in 2.0 mM free Mg2+. (Top) Melting temperature and (Bottom) ΔHfolding of tRNA and mutants in buffer (open squares), 20% PEG8000 (closed squares), aaCM (circles), and 20% PEG8000 and additional aaCM (triangles). The non-cooperative ΔH was calculated using nearest neighbor parameters for the WT and mutant acceptor stems. See Materials and Methods for the calculation of non-cooperative ΔH and fully cooperative ΔH limits.

Figure C-6. Average tRNA stem ΔGaverage from organisms with a large range of optimal growing temperatures. The threshold ΔGfolding for stem stability is in pink. See Materials and Methods for further details.

Figure C-7. Tertiary interactions in tRNAphe (PDB 1ehz).

C.3 References

1. Leamy KA, Yennawar NH, & Bevilacqua PC (2017) Cooperative RNA folding under cellular conditions arises from both tertiary structure stabilization and secondary structure destabilization. Biochemistry 56(27):3422-3433.

2. Yamagami R, Bingaman JL, Frankel EA, & Bevilacqua PC (2018) Cellular conditions of weakly chelated magnesium ions strongly promote RNA folding, stability, and catalysis. Nat Comm Accepted.

3. Siegfried NA & Bevilacqua PC (2009) Thinking inside the box: Designing, implementing, and interpreting thermodynamic cycles to dissect cooperativity in RNA and DNA folding. Methods Enzymol 455:365-393.

4. Serra MJ & Turner DH (1995) Predicting thermodynamic properties of RNA. Methods Enzymol 259:242-261.

5. Skou S, Gillilan RE, & Ando N (2014) Synchrotron-based small-angle X-ray scattering (SAXS) of proteins in solution. Nat Protoc 9(7):1727-1739.

6. Acerbo AS, Cook MJ, & Gillilan RE (2015) Upgrade of MacCHESS facility for X-ray scattering of biological macromolecules in solution. J Synchrotron Radiat 22(1):180-186.

7. Schneidman-Duhovny D, Hammel M, Tainer JA, & Sali A (2013) Accurate SAXS profile computation and its assessment by contrast variation experiments. Biophys J 105(4):962-974.

8. Schneidman-Duhovny D, Hammel M, Tainer JA, & Sali A (2016) FoXS, FoXSDock and MultiFoXS: Single-state and multi-state structural modeling of proteins and their complexes based on SAXS profiles. Nucleic Acids Res 44(Web Server issue):W424-W429.

9. Das R, Laederach A, Pearlman SM, Herschlag D, & Altman RB (2005) SAFA: Semi-automated footprinting analysis software for high-throughput quantification of nucleic acid footprinting experiments. RNA 11:344-354.

10. Jühling F, et al. (2009) tRNAdb 2009: Compilation of tRNA sequences and tRNA genes. Nucleic Acids Res 37(suppl_1):D159-D162.

11. Söhngen C, et al. (2016) BacDive – the bacterial diversity metadatabase in 2016. Nucleic Acids Res 44(D1):D581-D585.

12. Carothers JM, Oestreich SC, Davis JH, & Szostak JW (2004) Informational complexity and functional activity of RNA structures. J Am Chem Soc 126(16):5130-5137.

Appendix D

Supporting Information: Chapter 5

D.1 Materials and Methods

D.1.1 Chemicals. Lithium chloride, sodium cacodylate, CMP, and GTP were purchased from Sigma, NaCl was purchased from JT Baker, and AMP and UMP were purchased from USB.

D.1.2 RNA constructs and preparation. RNA constructs labeled with fluorescein (FAM) and black hole quencher 1 (BHQ1) were purchased HPLC purified from the Keck Oligo Synthesis Resource at Yale University. The unlabeled RNA constructs were purchased HPLC purified from Integrated DNA Technologies (IDT) and used without further purification. The sequences are provided below. RNAs were dialyzed in an eight-well microdialysis apparatus (Gibco-BRL Life Technologies) at a flow rate of 25 mL/min in three steps: 100 mM NaCl for 6 hours, water for 6 hours, and 10 mM sodium cacodylate, pH 7.0, for at least 12 hours.

FAM labeled RNA: 5’ FAM-AGCAGGUA

BHQ1 labeled RNA: 5’ UACCUGCU-BHQ1

Unlabeled RNA: 5’ UACCUGCU(C1-3)

D.1.3 Fluorescence titrations with NMPs. Fluorescence titrations were performed on a Horiba Fluorolog FL3-11 spectrometer. The excitation was at 495 nm and emission spectra were collected between 500 and 560 nm, with a FAM peak at 520 nm, as a function of NMP concentration at room temperature. The fluorescence at 520 nm in each titration was normalized to the fluorescence without added NMPs. Fluorescein-labeled RNA, 80 nM, was renatured by heating at 95 °C for 3 min with 10 mM sodium cacodylate and 1 M NaCl, then cooled at room temperature for 10 min. A solution containing AMP, CMP, UMP, or GTP, 80 nM fluorescein, 10 mM sodium cacodylate (pH 7.0), and 1 M NaCl was titrated into the labeled RNA to a final concentration of 25 mM NMPs. Of note, GTP was used instead of GMP because there was no commercially available lithium salt of GMP, and in the presence of Na+ or K+ guanosine will form G-quadruplexes. For the same reason, the GTP titrations were performed with 1 M LiCl instead of 1 M NaCl.
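The normalization step described in D.1.3 can be sketched as follows. The intensity values are hypothetical placeholders, not data from this work; the only operation taken from the text is dividing the 520 nm signal at each NMP concentration by the signal recorded with no added NMP.

```python
def normalize_to_no_nmp(fluorescence_520):
    """Divide each 520 nm reading by the first reading (0 mM NMP)."""
    reference = fluorescence_520[0]
    if reference == 0:
        raise ValueError("reference fluorescence must be nonzero")
    return [value / reference for value in fluorescence_520]

# Hypothetical titration: NMP concentration (mM) and raw 520 nm intensity.
nmp_mM = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
raw_520 = [1.00e5, 9.6e4, 9.1e4, 8.8e4, 8.4e4, 8.0e4]

for conc, norm in zip(nmp_mM, normalize_to_no_nmp(raw_520)):
    print(f"{conc:5.1f} mM  {norm:.3f}")
```

Expressing every reading relative to the NMP-free sample puts titrations with different nucleotides on a common scale.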

D.1.4 Plate Reader Binding Assays. Binding assays were performed with fluorescein-labeled RNA renatured with increasing concentrations of the complement strand, which carried either black hole quencher 1 or dangling Cs on the 3’ end. Titrations were performed over 12 wells of a 96-well tray in a StepOnePlus qPCR instrument. The complementary strands were annealed in 1 M NaCl and 10 mM sodium cacodylate by heating at 95 °C for 3 min and cooling at room temperature for 10 min. The fluorescently labeled strand was at a concentration of 50 nM, and the complement strand was titrated from 0 to 1600 nM. Three data points were collected at each concentration, and the titration was repeated at 27, 32, 35, 42, and 47 °C.

D.1.5 Binding Assay Data Fitting. Binding curves collected on the qPCR instrument were fit in KaleidaGraph to obtain a Kd according to equation 3. The derivation of this equation is below.

We assume that there are two states, unbound and bound, where F is the unbound fluorescently labeled RNA, Q is the unbound complement strand, and FQ is the bound complex. The equilibrium is $F + Q \rightleftharpoons FQ$, and the Kd for this system is

$$K_D = \frac{[F][Q]}{[FQ]} \qquad (1)$$

The total concentrations of both strands can be defined as:

$$[F_T] = [F] + [FQ] \quad \text{or} \quad [F] = [F_T] - [FQ] \qquad (2)$$

$$[Q_T] = [Q] + [FQ] \quad \text{or} \quad [Q] = [Q_T] - [FQ] \qquad (3)$$

The fraction of free fluorophore, $f_F$, and bound fluorophore, $f_{FQ}$, can be described as:

$$f_F = \frac{[F]}{[F_T]} \quad (4a) \qquad \text{and} \qquad f_{FQ} = \frac{[FQ]}{[F_T]} \quad (4b)$$

The fractions of free and bound fluorophore must sum to 1:

$$1 = f_F + f_{FQ} \quad (5a) \qquad \text{or} \qquad f_{FQ} = 1 - f_F \quad (5b) \qquad \text{or} \qquad f_F = 1 - f_{FQ} \quad (5c)$$

The observed fluorescence, $Fl_{Obs}$, comes from the fluorescence of the free fluorophore, $Fl_F$, and the bound fluorophore, $Fl_{FQ}$:

$$Fl_{Obs} = f_F Fl_F + (1 - f_F) Fl_{FQ} \qquad (6a)$$

which can be rearranged to:

$$Fl_{Obs} = Fl_{FQ} + f_F (Fl_F - Fl_{FQ}) \qquad (6b)$$

We can substitute equations 2 and 3 into the equation for Kd (equation 1) to get:

$$K_D = \frac{([F_T] - [FQ])([Q_T] - [FQ])}{[FQ]} \qquad (7)$$

This can be rearranged to:

$$[FQ]^2 - [FQ]([F_T] + [Q_T] + K_D) + [F_T][Q_T] = 0 \qquad (8)$$

The quadratic can be solved for [FQ]. The root with the negative sign is the physically meaningful one, because [FQ] cannot exceed either [FT] or [QT]:

$$[FQ] = \frac{([F_T] + [Q_T] + K_D) - \sqrt{([F_T] + [Q_T] + K_D)^2 - 4[F_T][Q_T]}}{2} \qquad (9)$$

And this can be used to find the fraction of bound fluorophore, $f_{FQ}$:

$$f_{FQ} = \frac{[FQ]}{[F_T]} = \frac{([F_T] + [Q_T] + K_D) - \sqrt{([F_T] + [Q_T] + K_D)^2 - 4[F_T][Q_T]}}{2[F_T]} \qquad (10)$$

The fraction of unbound fluorophore, $f_F$, can also be found:

$$f_F = 1 - f_{FQ} = \frac{2[F_T]}{2[F_T]} - \frac{([F_T] + [Q_T] + K_D) - \sqrt{([F_T] + [Q_T] + K_D)^2 - 4[F_T][Q_T]}}{2[F_T]} \qquad (11)$$

The terms for unbound and bound fluorophore can be substituted into the equation for $Fl_{Obs}$ to obtain the equation for fitting data for a Kd value (equation 3):

$$Fl_{Obs} = Fl_{FQ} + (Fl_F - Fl_{FQ}) \, \frac{2[F_T] - ([F_T] + [Q_T] + K_D) + \sqrt{([F_T] + [Q_T] + K_D)^2 - 4[F_T][Q_T]}}{2[F_T]} \qquad (\text{eq. 3})$$
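As a check on the algebra, the binding model of equation 9 and equation 3 can be written as two small Python functions. This is a minimal sketch for illustration only: the fits in this work were done in KaleidaGraph, and the Kd of 20 nM and the fluorescence end points used below are assumed values, not results. Only the 50 nM labeled strand concentration comes from D.1.4.

```python
import math

def bound_complex(f_total, q_total, kd):
    """Equation 9: concentration of FQ, in the same units as the inputs."""
    b = f_total + q_total + kd
    return (b - math.sqrt(b * b - 4.0 * f_total * q_total)) / 2.0

def observed_fluorescence(q_total, f_total, kd, fl_free, fl_bound):
    """Equation 3: Fl_Obs = Fl_FQ + f_F * (Fl_F - Fl_FQ)."""
    fraction_free = 1.0 - bound_complex(f_total, q_total, kd) / f_total
    return fl_bound + fraction_free * (fl_free - fl_bound)

# 50 nM labeled strand (D.1.4); Kd and fluorescence end points are assumed.
f_total, kd = 50.0, 20.0
for q_total in (0.0, 50.0, 200.0, 1600.0):
    fq = bound_complex(f_total, q_total, kd)
    signal = observed_fluorescence(q_total, f_total, kd,
                                   fl_free=1.0, fl_bound=0.1)
    print(f"[Q_T] = {q_total:6.1f} nM  [FQ] = {fq:6.2f} nM  "
          f"Fl_Obs = {signal:.3f}")
```

Substituting the returned [FQ] back into equation 7 should reproduce the input Kd, which is a quick way to confirm that the correct root of the quadratic was taken. In a fit, `kd`, `fl_free`, and `fl_bound` would be the adjustable parameters and `f_total` would be held fixed.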

These data can be fit individually to obtain a KD at each temperature, which can be used to find thermodynamic parameters according to the following derivation (eq. 15).

From the free energy (ΔG), the entropy (ΔS) and enthalpy (ΔH) of folding can be determined. Free energy is also related to KD.

$$\Delta G = \Delta H - T\Delta S \quad (12) \qquad \Delta G = -RT \ln K_D \quad (13)$$

Equations 12 and 13 can be set equal to each other and rearranged to get:

$$\ln K_D = -\frac{\Delta H}{RT} + \frac{\Delta S}{R} \qquad (14)$$

which can be rearranged to the van’t Hoff equation:

$$K_D = \exp\left(-\frac{\Delta H}{RT} + \frac{\Delta S}{R}\right) \qquad (15)$$

The data from binding assays can be globally fit at all temperatures to obtain the thermodynamic data.
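The van’t Hoff relationship can be illustrated with a short Python sketch. It assumes equation 14 has the form ln KD = −ΔH/(RT) + ΔS/R, which follows from setting equations 12 and 13 equal, and it uses a two-step linear regression of ln KD against 1/T as a simpler stand-in for the global fit described above. The KD values are synthetic, generated from hypothetical ΔH and ΔS, and are not measured data.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def vant_hoff_fit(temps_celsius, kd_molar):
    """Least-squares line through ln(Kd) versus 1/T (equation 14).

    Returns (delta_h, delta_s) in kcal/mol and kcal/(mol K) for the
    reaction direction that Kd describes.
    """
    x = [1.0 / (t + 273.15) for t in temps_celsius]
    y = [math.log(k) for k in kd_molar]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    # slope = -dH/R and intercept = dS/R
    return -slope * R, intercept * R

# Synthetic Kd values from assumed parameters (hypothetical, not measured).
true_dh, true_ds = 60.0, 0.165          # kcal/mol and kcal/(mol K)
temps = [27.0, 32.0, 35.0, 42.0, 47.0]  # temperatures listed in D.1.4
kds = [math.exp(-true_dh / (R * (t + 273.15)) + true_ds / R) for t in temps]

dh, ds = vant_hoff_fit(temps, kds)
print(f"dH = {dh:.1f} kcal/mol, dS = {ds * 1000:.1f} cal/(mol K)")
```

Because KD is a dissociation constant, the ΔH and ΔS returned by this form describe the dissociation direction; the signs flip for the association reaction. A global fit instead shares one ΔH and one ΔS across all temperatures while fitting the raw binding curves directly.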

D.1.6 CoFold Structure Prediction. Structure formation of cotranscriptional intermediates was predicted using the CoFold webserver, part of the Vienna package (4). Intermediates with varying lengths on the 3’ end were input to the webserver to predict secondary structure contacts.

D.2 Supplemental Figures

[Plot: FAM fluorescence versus [FAM Labeled RNA] (nM), from 0 to 120 nM, at 27, 32, 37, 42, and 47 °C.]

Figure D-1. Fluorescein-labeled RNA calibration curve in 1 M NaCl on a qPCR. Calibration curves for fluorescein labeled RNA at five temperatures on a StepOnePlus qPCR. Data is linear until ~80 nM fluorescein at low temperatures.
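One way to estimate where a calibration curve stops being linear is to fit a line to the lowest concentrations and find the first point that deviates from it. The sketch below does this with hypothetical data; the 5% tolerance, the number of points used for the fit, and the function names are all assumptions made for illustration, not the procedure used for Figure D-1.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x

def linear_range_limit(conc, signal, n_fit=4, tolerance=0.05):
    """Highest concentration whose signal stays within `tolerance`
    (fractional) of the line fit to the first `n_fit` points."""
    slope, intercept = linear_fit(conc[:n_fit], signal[:n_fit])
    limit = conc[n_fit - 1]
    for c, s in zip(conc[n_fit:], signal[n_fit:]):
        predicted = slope * c + intercept
        if abs(s - predicted) > tolerance * predicted:
            break
        limit = c
    return limit

# Hypothetical calibration data (nM, arbitrary fluorescence units).
conc = [0, 10, 20, 40, 60, 80, 100, 120]
signal = [0, 1.0e4, 2.0e4, 4.0e4, 6.0e4, 7.9e4, 9.0e4, 9.8e4]
print(linear_range_limit(conc, signal))
```

Working below the returned concentration keeps the fluorescence signal proportional to the amount of labeled RNA, which is the assumption behind treating the observed fluorescence as a weighted sum of free and bound contributions.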

VITA
Kathleen A. Leamy

Education
Ph.D., Chemistry, The Pennsylvania State University, August 2018
B.S., Chemistry and Biochemistry, Siena College, May 2013

Publications
1. Leamy, K. A., Yennawar, N. H., Bevilacqua, P. C., “Cellular conditions stabilize near full-length RNA cotranscription intermediates”, Under Review, PNAS
2. Leamy, K. A., Yennawar, N. H., Bevilacqua, P. C. (2018) “Molecular Mechanism for Folding Cooperativity of Functional RNAs in Living Organisms”, Biochemistry, 57 (20), pp 2994–3002
3. Leamy, K. A., Yennawar, N. H., Bevilacqua, P. C. (2017) “Cooperative RNA folding under cellular conditions arises from both tertiary structure stabilization and secondary structure destabilization”, Biochemistry, 56(27):3422-3433
4. Bingaman, J. L., Frankel, E. A., Hull, C. M., Leamy, K. A., Messina, K. J., Mitchell, D., 3rd, Park, H., Ritchey, L. E., Babitzke, P., Bevilacqua, P. C. (2016) “Eliminating blurry gel bands in gels with a simple cost-effective repair to the gel cassette.” RNA, 22.
5. Leamy, K. A., Assmann, S. M., Mathews, D. H., Bevilacqua, P. C. (2016) “Bridging the gap between in vitro and in vivo RNA folding” Q. Rev. Biophys. 49.
6. Gunsch, M. J., Paske, A. C., Leamy, K. A., O'Donnell, J. L., (2013) "Chlorocarbon and Alcohol Vapor Discrimination by Electropolymerized Ultrathin Chromophore Films", J. Electrochem. Soc. 160:2 (B13-B16).

Invited Oral Presentations
Penn State Chemical Biology Seminar Series, April 2018
255th ACS Meeting, March 2018
Penn State RNA Club, February 2018
Siena College, October 2017

Selected Poster Presentations
255th ACS Meeting, April 2018
22nd Annual RNA Society Meeting, June 2017
4th Annual Albany RNA Institute Meeting, March 2017
31st Annual Gibbs Conference, September 2016

Selected Awards and Honors
Teaching Innovation Award, 2017
Penn State Graduate Student Award, 2017
Best Poster Award, 4th Annual Albany RNA Institute Meeting, 2017
Penn State Travel Awards, 2015-2018