RNA Secondary Structures: from Biophysics to Bioinformatics
DISSERTATION

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By William D. Baez, M.S.
Graduate Program in Physics
The Ohio State University
2018

Dissertation Committee:
Dr. Ralf Bundschuh, Advisor
Dr. Kurt Fredrick
Dr. Amy Connelly
Dr. Comert Kural

© Copyright by William D. Baez 2018

Abstract

We investigate aspects of RNA secondary structure from the viewpoint of theoretical biophysics and from the viewpoint of bioinformatics. From the existence of a novel thermodynamic phase transition to the fundamental mechanisms of life, RNA continues to act as a wellspring of new discoveries.

RNA forms elaborate secondary structures through intramolecular base pairing. These structures perform critical biological functions within each cell. Because a polynomial-time algorithm is available to calculate the partition function, they are also a suitable model system for the statistical physics of disordered systems. In this model, below the denaturation temperature, random RNA secondary structures can exist in one of two phases: a strongly disordered, low-temperature glass phase and a weakly disordered, high-temperature molten phase. The probability of two bases pairing has been shown to decay as a power of the distance between the two bases, with exponents 3/2 and 4/3 in the molten and glass phases, respectively. Drawing on previous results from a renormalized field theory of the glass transition, we numerically study this transition, introduce two order parameters that determine the location of the critical point, and explore the driving mechanism behind this transition.

Within a cell's genome, regulatory elements can often be found in the vicinity of the genes they regulate.
In prokaryotes, a common translational regulatory element, the Shine-Dalgarno sequence, has been found to be largely absent from entire phyla of bacteria. This sequence element is part of the textbook model of translation initiation. To understand how Shine-Dalgarno-independent bacteria, such as F. johnsoniae, a member of the phylum Bacteroidetes, initiate translation, we used high-throughput RNA sequencing and ribosome profiling data to investigate the impact of mRNA secondary structure near a gene's initiation site. We found evidence that strongly implicates unstructured or unstable mRNA structures in translation initiation in these understudied organisms.

Finally, we again use high-throughput RNA sequencing and ribosome profiling data to study the impact of Fhit loss on human cells. Our findings show that Fhit expression impacts the translation of a number of cancer-associated genes, and they support the hypothesis that Fhit's genome-protective/tumor-suppressor function is associated with post-transcriptional changes in expression of genes whose dysregulation contributes to malignancy.

To Mrs. Jacobs, my 10th grade biology teacher. This is all your fault.

Acknowledgments

There are too many people I should thank who have helped me along my journey to this PhD. Fortunately, I have forgotten most of them. So, I will, instead, thank those that I do remember.

First and foremost, I would like to thank my advisor, Dr. Ralf A. Bundschuh, for his guidance, support, and Buddha-like patience. Little did we know back in 2012 that my introductory research problem, the phase transition of RNA secondary structures, would become a six-year academic adventure. Along with Ralf, I must thank my fellow graduate students with whom I've had the pleasure of working in the Bundschuh group: Blythe Moreland, Robert Patton, Kenji Oman, and Dengke Zhao.

I would also like to thank my collaborators here at Ohio State and abroad.
In particular, I must extend my gratitude to Dr. Kurt Fredrick, Dept. of Microbiology, and Dr. Daniel Schoenberg, Dept. of Biological Chemistry and Pharmacology, both of whom opened my mind to the complexities of biology beyond what one might learn in a textbook. Their patience and guidance during my initial months collaborating with their respective groups were instrumental to our success. Many thanks to Dr. Kay Wiese, of the École normale supérieure in Paris, France, whose contributions were instrumental in the RNA phase transition project.

It would be remiss of me to not acknowledge the support, guidance, and mentorship of those from my academic past: Dr. J. Andrew Hauger and Dr. Trinanjan Datta of Augusta University, Augusta, GA, and Mrs. Jean Jacobs of Lindenhurst Senior High School, Lindenhurst, NY. As chair of the Dept. of Chemistry and Physics, Dr. Andy Hauger taught me more than just the art of electronics and the wonders of quantum mechanics. From high atop a mountain, he taught me the fundamental lesson of being a leader: take care of your people. It is a lesson that I continue to benefit from to this day. It was from Dr. Trinanjan Datta that I learned the skill necessary to survive graduate school and physics research: grit. For without grit, it is all too easy to give up after countless dead ends, numerous soul-crushing failures, and endless nights of little to no sleep. I am forever indebted to these two academic heroes.

But perhaps the most valuable lesson I was ever taught, but did not appreciate until the final years of graduate school, came from my high school biology teacher, Mrs. Jean Jacobs. As an ambitious high school student, I attempted to take as many courses as one could fit into nine forty-minute periods. But it was she who would often remind me, "There are not enough hours in the day, Mr. Baez." 20+ years later, I finally learned that lesson.

Vita

September 3, 1980: Born, Lindenhurst, NY
May, 2010: B.S., Augusta State University, Augusta, GA
December, 2015: M.S., Ohio State University, Columbus, OH

Publications

Daniel L Kiss, William Baez (co-first author), Kay Huebner, Ralf Bundschuh, Daniel R. Schoenberg. Impact of FHIT loss on the translation of cancer-associated mRNAs. Mol. Cancer, 16:179 (2017).

Daniel L Kiss, William Baez (co-first author), Kay Huebner, Ralf Bundschuh, Daniel R. Schoenberg. Loss of fragile histidine triad (Fhit) protein expression alters the translation of cancer-associated mRNAs. BMC Res. Notes, 11:178 (2018).

Fields of Study

Major Field: Physics
Studies in Biophysics Theory, Bioinformatics: Ralf Bundschuh

Table of Contents

Abstract
Dedication
Acknowledgments
Vita
List of Figures
List of Tables

Chapters

1 Introduction
  1.1 Nucleic Acids and The Central Dogma
    1.1.1 DNA and RNA
    1.1.2 The Universality of the Central Dogma
  1.2 Biophysics and Statistical Mechanics
  1.3 Bioinformatics
    1.3.1 High-Throughput RNA Sequencing and Ribosome Profiling
  1.4 Outline
2 Phase Transition of RNA Secondary Structures
  2.1 Introduction
  2.2 RNA Secondary-Structure Model
    2.2.1 RNA Secondary Structures
    2.2.2 Energy Model
    2.2.3 Partition Function
    2.2.4 Observables
    2.2.5 Numerical Approach
  2.3 Scaling of the contact and overlap observables
    2.3.1 Initial Estimate of Phase Transition Temperature
    2.3.2 Order Parameters for the Transition, and a More Precise Estimation of the Phase Transition Temperature
  2.4 Transition Mechanism
  2.5 Discussion & Conclusions
3 The role of RNA secondary structure in translation initiation in Flavobacterium johnsoniae
  3.1 Introduction
  3.2 F. johnsoniae as the model organism for Bacteroidetes
  3.3 Materials and Methods
    3.3.1 Cell culture and library preparations
    3.3.2 RNA-seq, Ribo-seq, and the selection of representative set of genes
  3.4 Results
    3.4.1 Ribosome footprints from F. johnsoniae are shorter and more uniform in length than those of other bacteria
    3.4.2 Start codon usage and AUG trinucleotide representation in the F. johnsoniae TIR
    3.4.3 RNA-seq data identifies promoters with well-conserved -7 elements
    3.4.4 Rate of translation is tuned by mRNA secondary structure near the start codon
  3.5 Discussion/Conclusion
4 The role of FHIT on the translation of cancer-associated mRNAs
  4.1 Introduction: Background on Fhit
    4.1.1 Fhit as 5' cap scavenger
  4.2 Materials and Methods
    4.2.1 Cell culture and library preparations
    4.2.2 RNA-seq, Ribo-seq, and informatics
  4.3 Results
    4.3.1 Identifying mRNAs whose translation is controlled by Fhit
    4.3.2 Fhit loss associated with changes in ribosome distribution