NRC Publications Archive

GeNMR: a web server for rapid NMR-based protein structure determination
Berjanskii, M.; Tang, P.; Liang, J.; Cruz, J. A.; Zhou, J.; Zhou, Y.; Bassett, E.; Macdonell, C.; Lu, P.; Lin, G.; Wishart, D. S.

Publisher's version: https://doi.org/10.1093/nar/gkp280
Nucleic Acids Research, 37, Suppl. 2, pp. W670-W677, 2009-07-01
NRC Publications Record: https://nrc-publications.canada.ca/eng/view/object/?id=8c24f92f-5a63-4bc9-9510-868e720b942e
W670–W677 Nucleic Acids Research, 2009, Vol. 37, Web Server issue
Published online 30 April 2009
doi:10.1093/nar/gkp280

GeNMR: a web server for rapid NMR-based protein structure determination

Mark Berjanskii¹, Peter Tang¹, Jack Liang¹, Joseph A. Cruz¹, Jianjun Zhou¹, You Zhou¹, Edward Bassett¹, Cam MacDonell¹, Paul Lu¹, Guohui Lin¹ and David S. Wishart¹,²,³,*

¹Department of Computing Science, ²Department of Biological Sciences, University of Alberta and ³National Research Council, National Institute for Nanotechnology (NINT), Edmonton, AB, Canada T6G 2E8

Received February 9, 2009; Revised April 2, 2009; Accepted April 14, 2009

*To whom correspondence should be addressed. Tel: +780 492 0383; Fax: +780 492 5305; Email: [email protected]

© 2009 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

ABSTRACT

GeNMR (GEnerate NMR structures) is a web server for rapidly generating accurate 3D protein structures using sequence data, NOE-based distance restraints and/or NMR chemical shifts as input. GeNMR accepts distance restraints in XPLOR or CYANA format as well as chemical shift files in either SHIFTY or BMRB formats. The web server produces an ensemble of PDB coordinates for the protein within 15–25 min, depending on model complexity and completeness of experimental restraints. GeNMR uses a pipeline of several pre-existing programs and servers to calculate the actual protein structure. In particular, GeNMR combines genetic algorithms for structure optimization along with homology modeling, chemical shift threading, torsion angle and distance predictions from chemical shifts/NOEs as well as ROSETTA-based structure generation and simulated annealing with XPLOR-NIH to generate and/or refine protein coordinates. GeNMR greatly simplifies the task of protein structure determination as users do not have to install or become familiar with complex stand-alone programs or obscure format conversion utilities. Tests conducted on a sample of 90 proteins from the BioMagResBank indicate that GeNMR produces high-quality models for all protein queries, regardless of the type of NMR input data. GeNMR was developed to facilitate rapid, user-friendly structure determination of protein structures via NMR spectroscopy. GeNMR is accessible at http://www.genmr.ca.

INTRODUCTION

Nuclear magnetic resonance (NMR) spectroscopy, along with X-ray crystallography, is one of only two methods that can be used to determine the 3D structures of proteins to atomic resolution. To date, nearly 8000 peptide and protein structures have been determined by NMR and deposited into the PDB (1). The standard route to determine protein structures by NMR involves three basic steps: (i) determining the chemical shift assignments of the target protein; (ii) measuring the inter- and intra-residue ¹H NOEs (nuclear Overhauser enhancements) to generate distance constraints; and (iii) using the NOE-derived constraints to perform simulated annealing or distance geometry to calculate the 3D structure of the protein (2). The last step in this process is very computationally demanding and requires very specialized software and significant computational resources. Over the past 20 years, a number of stand-alone programs have been specifically developed to facilitate these calculations, including DYANA (3), CYANA (4), XPLOR-NIH (5), CNS (6) and ARIA (7). All of these programs are excellent at what they do, and all have been extensively tested. However, they are also somewhat unwieldy, platform-specific software packages that are difficult to learn or use. Furthermore, these programs work almost exclusively with NOE data and most of them do not take advantage of chemical shift information in their structure refinement process. Likewise, they do not incorporate a number of widely used structural informatics techniques such as threading, fragment-based assembly or homology modeling to accelerate the structure determination process. Consequently, these programs often require many CPU hours of dedicated calculation and refinement to complete their tasks.

While most NMR-based structure determination packages use NOE data almost exclusively, it has recently been shown that it is possible to determine protein structures using only chemical shift data (i.e. without NOEs). Indeed, several stand-alone programs and web servers have been developed for this purpose, including Cheshire (8), CS-Rosetta (9) and CS23D (10). However, chemical shift-based structure determination is not as robust or as rapid as NOE-based structure determination, especially for novel protein folds (10) or large structures (8,9). Indeed, some of these programs take hundreds or even thousands of CPU hours to complete their calculations.
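As a rough illustration of step (ii) of the standard route outlined in the Introduction, the sketch below converts NOE cross-peak intensities into approximate upper-bound distance restraints using the isolated spin-pair approximation, in which the NOE intensity scales as r^-6. This is a toy example and not a description of how GeNMR or any of the cited programs calibrate NOEs: the function names, the reference intensity and distance, the 0.5 Å tolerance and the 6 Å cap are all assumptions chosen for illustration.

```python
# Toy illustration only: isolated spin-pair approximation (NOE intensity ~ r^-6).
# All names and numeric values are hypothetical and are not taken from GeNMR.


def noe_to_distance(intensity: float, ref_intensity: float, ref_distance: float) -> float:
    """Estimate an inter-proton distance (angstroms) from an NOE intensity.

    The estimate is calibrated against a reference proton pair whose
    distance and NOE intensity are assumed to be known.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("NOE intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def upper_bound(distance: float, tolerance: float = 0.5, cap: float = 6.0) -> float:
    """Convert an estimated distance into a loose upper-bound restraint.

    The tolerance and cap are assumed values; real protocols choose their own.
    """
    return min(distance + tolerance, cap)


if __name__ == "__main__":
    # Hypothetical reference pair: 1.8 angstroms apart with intensity 100.0.
    estimated = noe_to_distance(intensity=12.5, ref_intensity=100.0, ref_distance=1.8)
    print(f"estimated distance: {estimated:.2f} A, upper bound: {upper_bound(estimated):.2f} A")
```

Because of the sixth-root dependence, large errors in measured intensity translate into small errors in distance, which is one reason NOE-derived restraints are usually expressed as loose upper bounds rather than exact distances.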
Furthermore, none of these shift-based structure determination techniques takes advantage of the additional information contained in NOE constraint data.

Ideally, what is needed for NMR-based structure determination is a tool that is: (i) easy to use (a ‘single click’ solution); (ii) platform independent (a web server); (iii) fast (generating structures in minutes, not hours); (iv) capable of handling both NOE and chemical shift data; and (v) smart enough to perform the NMR equivalent of ‘molecular replacement’ by taking advantage of well-known structural bioinformatics techniques such as threading, homology modeling and fragment-based assembly. Here, we describe a simple web server, called GeNMR, that addresses these needs. In particular, GeNMR is a next-generation, highly generalizable system for rapidly generating protein structures by NMR using a combination of genetic algorithms and simulated annealing. It allows the rapid (15 min) and accurate determination of protein structures using any combination of chemical shifts and/or NOEs as input. GeNMR builds on nearly 15 years of research in our lab related to using chemical shifts and other NMR data to identify protein secondary structures (11), to predict torsion angles (12), to identify protein folds (13), to predict protein flexibility (14) and to calculate protein structures (10). GeNMR […]

Figure 1. A flow chart outlining the general structure of the GeNMR web server and the programs that it calls to generate protein structures from chemical shift and/or NOE data. The specific function of each of the named programs is explained in the text.

[…] the (common) situation where there are missing NOE or chemical shift assignments. Without the complete protein sequence, GeNMR would typically generate concatenated (i.e. incorrect) structures in the regions where NOE or chemical shift data was missing.

The sequence, chemical shift and/or NOE files may be either pasted or typed into the text box or uploaded through a file browse