Computational Modeling of RNA-Small Molecule and RNA-Protein Interactions
The Texas Medical Center Library
DigitalCommons@TMC

The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences Dissertations and Theses (Open Access)
The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences

8-2015

COMPUTATIONAL MODELING OF RNA-SMALL MOLECULE AND RNA-PROTEIN INTERACTIONS

Lu Chen

Part of the Bioinformatics Commons, Biophysics Commons, Medicinal-Pharmaceutical Chemistry Commons, Pharmaceutics and Drug Design Commons, Statistical Models Commons, and the Structural Biology Commons

Recommended Citation
Chen, Lu, "COMPUTATIONAL MODELING OF RNA-SMALL MOLECULE AND RNA-PROTEIN INTERACTIONS" (2015). The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences Dissertations and Theses (Open Access). 626. https://digitalcommons.library.tmc.edu/utgsbs_dissertations/626

This Dissertation (PhD) is brought to you for free and open access by The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences at DigitalCommons@TMC. It has been accepted for inclusion in The University of Texas MD Anderson Cancer Center UTHealth Graduate School of Biomedical Sciences Dissertations and Theses (Open Access) by an authorized administrator of DigitalCommons@TMC.

Title page

COMPUTATIONAL MODELING OF RNA-SMALL MOLECULE AND RNA-PROTEIN INTERACTIONS

A DISSERTATION
Presented to the Faculty of The University of Texas Health Science Center at Houston and The University of Texas M.D. Anderson Cancer Center Graduate School of Biomedical Sciences
In Partial Fulfillment of the Requirements for the Degree of
DOCTOR OF PHILOSOPHY
by Lu Chen, B.S.
Houston, Texas
August 2015

Dedication

To my darling wife, Xiaofei Xiong, and son, Ryan X. Chen, who have loved, inspired, encouraged, motivated and supported me in whatever decision I have made. To my parents, Jianjun Chen and Guoyun Zhang, who have always supported my scientific career. To Dr. Shuxing Zhang, the greatest mentor in my life.

Acknowledgements

I acknowledge all the hardworking scientists in Dr. Shuxing Zhang's lab, including John Morrow, Micheal Cato, Zhi Tan, Srinivas Alla, Hoang Tran, Lei Du-cuny, Longzhang Tian, Sharangdhar Phatak, Nathan Ihle and Ryan Watkins. I also thank Matri and Paloma in George Calin's lab, and the great researchers in Dr. Edward Nikonowicz's lab, for their dedicated contributions to experimental validation. I owe great thanks to my advisory committee, Shuxing Zhang, George Calin, Xiongbin Lu, Edward Nikonowicz, Wenyi Wang and John Ladbury, for their valuable input and insights into this thesis. I could not have done it without their support. Special thanks to University of Texas M.D. Anderson and University of Austin for providing state-of-the-art HPC resources. I thank TACC for the free academic license for Gaussian09. I thank Dr. Jinbo Xu for providing the source code and assisting me in deploying RaptorX for RNA-protein interface threading.

Abstract

COMPUTATIONAL MODELING OF RNA-SMALL MOLECULE AND RNA-PROTEIN INTERACTIONS
By Lu Chen, B.S.
Advisor: Shuxing Zhang, Ph.D.

The past decade has been an era of RNA biology. Despite considerable discoveries, challenges remain in screening for RNA-interacting small molecules or RNA-interacting proteins. These challenges call for cost-efficient yet predictive computational tools capable of generating insightful hypotheses for discovering novel RNA-interacting small molecules and proteins. In this dissertation, we therefore implemented novel computational models to predict RNA-ligand interactions (Chapter 1) and RNA-protein interactions (Chapter 2).
Targeting RNA has not garnered as much interest as targeting proteins, and it is restricted by a lack of computational tools for structure-based drug design. To test whether molecular docking tools designed for proteins can be carried over to RNA-ligand docking and virtual screening, we benchmarked 5 docking programs and 11 scoring functions, assessing their performance in pose reproduction, pose ranking, score-RMSD correlation and virtual screening. From this benchmark, we proposed a three-step docking pipeline optimized for virtual screening against RNAs with different flexibility properties. Using this pipeline, we successfully identified a compound that binds selectively to the GA:UU motif. Both NMR and the subsequent MD simulation confirmed its selective binding to the GA:UU motif flanked by two tandem flexible base pairs next to GA. Consistent with the 3D model, SAR analysis revealed that any R-group substitution would abolish the binding.

Current computational methods for RNA-protein interaction prediction, whether sequence-based or structure-based, lack either interpretability or robustness. Aware of these pitfalls, we implemented RNA-Protein interaction prediction through Interface Threading (RPIT), which identifies a known RNA-protein interface to use as a template, infers from it the region where the interaction occurs, and predicts the interaction propensity from the interface profiles. To estimate the propensity more accurately, we implemented five statistical scoring functions based on our unique non-redundant database of protein-RNA interactions. Our benchmark, using leave-protein-out cross validation and two external validation sets, gave RPIT an overall accuracy of 70%-80%. Compared with other methods, RPIT offers an inexpensive but robust method for in silico prediction of RNA-protein interaction networks, and for prioritizing putative RNA-protein pairs using virtual screening.
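The abstract names leave-protein-out cross validation without describing the procedure. The sketch below illustrates one common reading of that scheme, in which every pair involving a given protein is held out together so that no protein appears in both the training and test folds. The pair format, the identifiers (P1, R1, ...), the function names and the majority-class baseline are all hypothetical stand-ins for illustration; none of them are taken from RPIT or from the dissertation.

```python
from collections import defaultdict


def leave_protein_out_splits(pairs):
    """Yield (held_out_protein, train, test) once per distinct protein.

    `pairs` is a list of (protein_id, rna_id, label) tuples (assumed format).
    All pairs involving the held-out protein form the test fold.
    """
    by_protein = defaultdict(list)
    for pair in pairs:
        by_protein[pair[0]].append(pair)
    for protein in sorted(by_protein):
        test = by_protein[protein]
        train = [p for p in pairs if p[0] != protein]
        yield protein, train, test


def cross_validated_accuracy(pairs, fit, predict):
    """Fraction of pairs predicted correctly while their protein is held out."""
    correct = 0
    for _, train, test in leave_protein_out_splits(pairs):
        model = fit(train)
        correct += sum(
            predict(model, protein, rna) == label for protein, rna, label in test
        )
    return correct / len(pairs)


# Hypothetical toy data: label 1 = interacting pair, 0 = non-interacting pair.
TOY_PAIRS = [
    ("P1", "R1", 1), ("P1", "R2", 0),
    ("P2", "R1", 1), ("P2", "R3", 1),
    ("P3", "R2", 0),
]


def fit_majority(train):
    """Placeholder model: remember the majority label of the training fold."""
    positives = sum(label for _, _, label in train)
    return 1 if 2 * positives >= len(train) else 0


def predict_majority(model, protein_id, rna_id):
    """Placeholder predictor: always return the remembered majority label."""
    return model


if __name__ == "__main__":
    for protein, train, test in leave_protein_out_splits(TOY_PAIRS):
        print(protein, "train:", len(train), "test:", len(test))
    print(cross_validated_accuracy(TOY_PAIRS, fit_majority, predict_majority))
```

Grouping by protein rather than splitting pairs at random keeps a protein seen during training from inflating the test accuracy, which is presumably the motivation for this validation scheme.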
Table of Contents

Approval page
Title page
Dedication
Acknowledgements
Abstract
Table of Contents
List of Illustrations
List of Tables
Abbreviations
Chapter 1: Introduction
  1.1 Targeting RNA with small molecules
    1.1.1 RNA as therapeutic target
    1.1.2 Hit identification via molecular docking
    1.1.3 Current in silico methods of targeting RNA
  1.2 Discovering novel RNA-protein interaction
    1.2.1 Emerging RNA-protein interactions (RPI)
    1.2.2 RNA-protein interface
    1.2.3 Current in silico methods of predicting RPI
Chapter 2: Computational modeling of RNA-small molecule interaction
  2.1 Introduction
  2.2 Materials and Methods: Benchmarking, Development and Application
    2.2.1 Benchmark datasets
    2.2.2 Molecular docking and decoy generation
    2.2.3 Evaluation of pose reproduction
    2.2.4 Evaluation of pose ranking
    2.2.5 Evaluation of virtual screening
    2.2.6 Evaluation of docking score-binding affinity correlation
    2.2.7 RNA-specific scoring function optimization
    2.2.8 MD simulations of GA:UU RNA-inhibitor complex
    2.2.9 Preparation of RNA samples
    2.2.10 Nuclear magnetic resonance (NMR)
  2.3 Results: Benchmarking and optimizing docking method for RNA target