Structure-Function Relationships of Rna and Protein in Synaptic Plasticity
Total Page:16
File Type:pdf, Size:1020Kb
University of Pennsylvania ScholarlyCommons Publicly Accessible Penn Dissertations 2017 Structure-Function Relationships Of Rna And Protein In Synaptic Plasticity Sarah Middleton University of Pennsylvania, [email protected] Follow this and additional works at: https://repository.upenn.edu/edissertations Part of the Bioinformatics Commons, Biology Commons, and the Neuroscience and Neurobiology Commons Recommended Citation Middleton, Sarah, "Structure-Function Relationships Of Rna And Protein In Synaptic Plasticity" (2017). Publicly Accessible Penn Dissertations. 2474. https://repository.upenn.edu/edissertations/2474 This paper is posted at ScholarlyCommons. https://repository.upenn.edu/edissertations/2474 For more information, please contact [email protected]. Structure-Function Relationships Of Rna And Protein In Synaptic Plasticity Abstract Structure is widely acknowledged to be important for the function of ribonucleic acids (RNAs) and proteins. However, due to the relative accessibility of sequence information compared to structure information, most large genomics studies currently use only sequence-based annotation tools to analyze the function of expressed molecules. In this thesis, I introduce two novel computational methods for genome-scale structure-function analysis and demonstrate their application to identifying RNA and protein structures involved in synaptic plasticity and potentiation—important neuronal processes that are thought to form the basis of learning and memory. First, I describe a new method for de novo identification of RNA secondary structure motifs enriched in co-regulated transcripts. I show that this method can accurately identify secondary structure motifs that recur across three or more transcripts in the input set with an average recall of 0.80 and precision of 0.98. Second, I describe a tool for predicting protein structural fold from amino acid sequence, which achieves greater than 96% accuracy on benchmarks and can be used to predict protein function and identify new structural folds. Importantly, both of these tools scale linearly with increasing numbers of input sequences, making them feasible to run on thousands of sequences at a time. Finally, I use these tools to investigate RNA localization and local translation in dendrites—two processes that are prerequisites for long-lasting synaptic potentiation. Using soma- and dendrite-specific RNA-sequencing data as a starting point, I define the full set of RNAs localized to the dendrites, identify novel secondary structure motifs enriched in these RNAs that may act as dendritic localization signals, and predict the structure of all proteins that would be produced by these localized RNAs during local translation. The results shed new light on potential regulatory mechanisms of dendritic localization and roles of locally translated proteins at the synapse, and demonstrate the utility of structure-based tools in genomics analysis. Degree Type Dissertation Degree Name Doctor of Philosophy (PhD) Graduate Group Genomics & Computational Biology First Advisor Junhyong Kim Keywords long-term potentiation, protein structure, RNA localization, RNA structure, single cell, structure motifs Subject Categories Bioinformatics | Biology | Neuroscience and Neurobiology This dissertation is available at ScholarlyCommons: https://repository.upenn.edu/edissertations/2474 STRUCTURE-FUNCTION RELATIONSHIPS OF RNA AND PROTEIN IN SYNAPTIC PLASTICITY Sarah A. Middleton A DISSERTATION in Genomics and Computational Biology Presented to the Faculties of the University of Pennsylvania in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy 2017 Supervisor of Dissertation _________________________ Junhyong Kim, Ph.D., Professor of Biology Graduate Group Chairperson _________________________ Li-San Wang, Ph.D., Associate Professor of Pathology and Laboratory Medicine Dissertation Committee Li-San Wang, Ph.D., Associate Professor of Pathology and Laboratory Medicine (Chair) Danielle Bassett, Ph.D., Associate Professor of Bioengineering Russ Carstens, M.D., Associate Professor of Medicine James Eberwine, Ph.D., Professor of Pharmacology Isidore Rigoutsos, Ph.D., Professor of Pathology, Anatomy, and Cell Biology, Thomas Jefferson University STRUCTURE-FUNCTION RELATIONSHIPS OF RNA AND PROTEIN IN SYNAPTIC PLASTICITY COPYRIGHT 2017 Sarah A. Middleton This work is licensed under the Creative Commons Attribution- NonCommercial-ShareAlike 3.0 License To view a copy of this license, visit https://creativecommons.org/licenses/by-nc-sa/3.0/us/ ACKNOWLEDGMENTS I am extremely grateful to all the people who have helped and supported me throughout my graduate work: my advisor, Junhyong, who has been an inspiring and compassionate mentor; all of the members of the Kim lab, especially those who helped me get started in the wet lab (Chantal, Derek, Hoa, Jean), managed the computing resources (Jamie, Stephen), and shared computational advice and camaraderie in the dry lab (Hannah, Mugdha, Qin, Youngji); my thesis committee, for their extremely helpful advice and guidance; my collaborators, especially Joe and Kanishka, who helped with developing and testing the machine learning methods in Chapter 3 of this thesis; my fellow GCB students, for being an awesome and fun group of people—I will miss betraying each other during game nights!; and the GCB coordinators, Hannah and Maureen, for organizing events and keeping GCB running smoothly. I also want to give my sincerest thanks to my parents, who have always gone out of their way to support me; and to my husband, Julius, who can make me smile even when everything else seems to be going wrong (a regular occurrence in graduate school!). Finally, I gratefully acknowledge the Department of Energy Computational Science Graduate Fellowship (DE-FG02-97ER25308), which not only supported much of this work financially, but also provided me with amazing opportunities to expand my computational skills and knowledgebase and to interact with inspiring people in computational disciplines outside of my own. iii ABSTRACT STRUCTURE-FUNCTION RELATIONSHIPS OF RNA AND PROTEIN IN SYNAPTIC PLASTICITY Sarah A. Middleton Junhyong Kim Structure is widely acknowledged to be important for the function of ribonucleic acids (RNAs) and proteins. However, due to the relative accessibility of sequence information compared to structure information, most large genomics studies currently use only sequence-based annotation tools to analyze the function of expressed molecules. In this thesis, I introduce two novel computational methods for genome-scale structure- function analysis and demonstrate their application to identifying RNA and protein structures involved in synaptic plasticity and potentiation—important neuronal processes that are thought to form the basis of learning and memory. First, I describe a new method for de novo identification of RNA secondary structure motifs enriched in co-regulated transcripts. I show that this method can accurately identify secondary structure motifs that recur across three or more transcripts in the input set with an average recall of 0.80 and precision of 0.98. Second, I describe a tool for predicting protein structural fold from amino acid sequence, which achieves greater than 96% accuracy on benchmarks and can be used to predict protein function and identify new structural folds. Importantly, both of these tools scale linearly with increasing numbers of input sequences, making them iv feasible to run on thousands of sequences at a time. Finally, I use these tools to investigate RNA localization and local translation in dendrites—two processes that are prerequisites for long-lasting synaptic potentiation. Using soma- and dendrite-specific RNA-sequencing data as a starting point, I define the full set of RNAs localized to the dendrites, identify novel secondary structure motifs enriched in these RNAs that may act as dendritic localization signals, and predict the structure of all proteins that would be produced by these localized RNAs during local translation. The results shed new light on potential regulatory mechanisms of dendritic localization and roles of locally translated proteins at the synapse, and demonstrate the utility of structure-based tools in genomics analysis. v TABLE OF CONTENTS ACKNOWLEDGMENTS .............................................................................................. III LIST OF TABLES ....................................................................................................... VIII LIST OF FIGURES ........................................................................................................ IX CHAPTER 1: INTRODUCTION .................................................................................... 1 1.1. RNA structure .................................................................................................................................. 1 1.1.1. Overview ................................................................................................................................... 2 1.1.2. RNA structure prediction........................................................................................................... 3 1.1.3. Structure-function relationships ................................................................................................ 6 1.2. Protein structure .............................................................................................................................. 8 1.2.1. Overview ..................................................................................................................................