A Novel Approach to De Novo Protein Structure Prediction

A NOVEL APPROACH TO DE NOVO PROTEIN STRUCTURE PREDICTION USING KNOWLEDGE BASED ENERGY FUNCTIONS AND EXPERIMENTAL RESTRAINTS By Nils Wötzel Dissertation Submitted to the Faculty of the Graduate School of Vanderbilt University In partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY in Chemistry December, 2011 Nashville, Tennessee Approved: Professor Jens Meiler Professor B. Andes Hess Jr. Professor Clare M. McCabe Professor Phoebe L. Stewart To Juliane, my parents and my sister ii ACKNOWLEDGEMENTS I thank my dissertation advisor for his scientific and personal support throughout my high school and graduate career. I see him not only as my scientific mentor, but also as a great friend who always had a word of advice. Although I have a great friendship with him, his wife Claudia and his son Jonas, we always worked productively in a professional student- mentor relationship. He influenced my decision to study chemistry, encouraged me to visit the USA for an internship and finally to join his lab for PhD career. He prepared me to take the next step in my life and become an independent scholar. The six years in graduate school taught me many things. Besides the scientific knowledge and the skills that were required to accomplish the tasks that were set, I could acquire great experiences from the work with colleagues. Rene Staritzbichler started the project with us and went through many iterations of improving aspects of the work. Mert Karakaş has been a valuable peer who took on the challange with me to develop the BCL::Fold method, going through many ups and downs. Nathan Alexander, Ralf Mueller, Brian Weiner and Edward Lowe have been involved in many scientific discussions that increased the quality of the work where we can publish results and can be proud of the tools that can and are used by the scientific community. My thesis committe was of great help and engaged in constructive scientific discussion. Dr. Clare McCabe and Dr. Andes Hess Jr. helped me see my work from a differnt perspective. Dr. Pheobe Stewart was not only a helpful advisor but also an exceptional collaborator contributing to the efforts to bring many projects to a success. iii Besides many funding form Vanderbilt, National Institue of Healt and the National Science foundation for the projects that I worked on, I would like to thank the Department of Chemistry for their support through the Warren research fellowship in the year 2010. Since I came to Nashville, I was fortunate to have had roomates that accepted my flaws and were always there when I needed them. Ralf Mueller, Andrew Morin and Nathan Alexander were not only roomates, but we became friends beyond peership. I also want to pay my gratitude to my parents, that supported me in my decision to leave Germany to take a new step in my career in a different country. It is not easy to let the last child leave home. We had to sacrifice many holidays and occasions that we used to celebrate together. My sister always supported me, and besides all the problems there are with older sisters, I love her and look forward to pentding more time with her again. Lastly, I want to thank my better half, the most valuable person in my life. We met just 2.5 years before I left Germany and ever since we were living apart. Juliane, my fiance, sacrificed 6 years waiting for me, and I hope we can spent many more days together than we were separated. There is no way to ever compensate for that time, but I will try to be the person, that she was waiting for all this time. iv TABLE OF CONTENTS ACKNOWLEDGEMENTS ............................................................................................... iii LIST OF TABLES .............................................................................................................. x LIST OF FIGURES ........................................................................................................... xi LIST OF ABBREVIATIONS ........................................................................................... xii SUMMARY ..................................................................................................................... xiii I. INTRODUCTION .........................................................................................................1 Central Dogma of Molecular Biology .....................................................................1 Protein Structure and Function ................................................................................1 Protein Structure Elucidation ...................................................................................2 In Silico Protein Structure Prediction ......................................................................3 Protein Structure Comparison Methods ...................................................................5 Template Based Protein Structure Prediction ..........................................................6 De novo Protein Structure Prediction ......................................................................7 Protein Structure Prediction using Low Resolution/Sparse Experimental Restraints..................................................................................................................8 BCL::Score ..............................................................................................................9 BCL::Fold ..............................................................................................................11 Use of Cryo Electron Microscopy as sparse experimental restraint ......................12 Rigid body fitting ...................................................................................................13 BCL::EM-Fit ..........................................................................................................13 BioChemistry Library ............................................................................................14 II. BCL::EM-Fit: Rigid body fitting of atomic structures into density maps using geometric hashing and real space refinement. .............................................................17 Introduction ............................................................................................................17 Results ....................................................................................................................20 An efficient two-stage low and high resolution fitting protocol ......................20 Protein fitting procedure is highly reliable for resolutions of 10 Å or better ..26 BCL::EM-Fit identifies the correct density for a given atomic resolution structure, homolog, or comparative model ......................................................31 Adenovirus capsid proteins are docked with high confidence into cryoEM density ..............................................................................................................34 4 copies of 1OELG are docked into the chaperonin GroEL density map at 5.4 Å resolution ................................................................................................40 v Correct handedness of a density maps can be verified by the CCC of the initial fit ............................................................................................................42 Discussion ..............................................................................................................43 Docking works best when secondary structural elements are resolved within the density map .....................................................................................43 BCL::EM-Fit correctly identifies and places homologous structures and comparative models .........................................................................................44 BCL::EM-Fit is applicable to fitting of large adenovirus capsid proteins .......45 BCL::EM-Fit can fit subunits within a larger assembly ..................................45 BCL::EM-Fit and flexible docking ..................................................................46 Advantages and disadvantages of Geometric Hashing compared to Fourier/Real Space fitting ................................................................................46 Methods..................................................................................................................48 Geometric hashing re-casted for searching density maps with protein structures ..........................................................................................................48 Extraction of feature cloud from density map intensities (Figure 2a) .............49 Selection of triangular bases for coordinate transformations (Figure 2b) .......52 The maximal distance of features from the coordinate base is limited by a feature radius (Figure 2b) .................................................................................53 Quantization of features accounts for finite number of transformations, low resolution of the density map, and experimental noise (Figure 2b-c).......54 Hash map architecture (Figure 2c) ...................................................................57 Atoms within secondary structure elements are used as features to represent the protein (Figure 2d)......................................................................57 Initial fits are determined that superimpose the maximum number of features (Figure 2e-g) .......................................................................................58 Filtering fits by translational and rotational distance .......................................60

A Novel Approach to De Novo Protein Structure Prediction

Estimation of Uncertainties in the Global Distance Test (GDT TS) for CASP Models

Methods for the Refinement of Protein Structure 3D Models

Designing and Evaluating the MULTICOM Protein Local and Global

Computational Methods for Evaluation of Protein Structural Models

A Glance Into the Evolution of Template-Free Protein Structure Prediction Methodologies Arxiv:2002.06616V2 [Q-Bio.QM] 24 Apr 2

Single-Sequence Protein Structure Prediction Using Language Models from Deep Learning

Are Attention and Symmetries All You Need?

Highly Accurate Protein Structure Prediction with Alphafold

Template-Based Protein Modeling Using the Raptorx Web Server

Protein Shape Sampled by Ion Mobility Mass Spectrometry Consistently Improves Protein Structure Prediction

And Contact-Based Protein Structure Prediction

View a Copy of This Licence, Visit Creativeco Mmons .Org/Licen Ses/By/4.0/