
Towards Structure Determination of Disulfide-Rich Peptides Using Chemical Shift-Based Methods

Conan K. Wang1,*, David J. Craik1

1Institute for Molecular Bioscience, The University of Queensland, Brisbane, Queensland, 4072, Australia.

*Address correspondence to: Dr Conan Wang ([email protected])

ABSTRACT: Disulfide-rich peptides are a class of molecules for which NMR spectroscopy has been the primary tool for structural characterization. Here, we explore whether the process can be achieved by using structural information encoded in chemical shifts. We examine: i) a representative set of five cyclic disulfide-rich peptides that have high-resolution NMR and X-ray structures, and ii) a larger set of 100 disulfide-rich peptides from the PDB. Accuracy of the calculated structures was dependent on the methods used for searching through conformational space and for identifying native conformations. Although Hα chemical shifts could be predicted reasonably well using SHIFTX, agreement between predicted and experimental chemical shifts was sufficient for identifying native conformations for only some peptides in the representative set. Combining chemical shift data with secondary structure information and potential energy calculations improved the ability to identify native conformations. Additional use of sparse distance restraints or homology information to restrict the search space also improved resolution of calculated structures. This study demonstrates that abbreviated methods have potential for elucidation of peptide structures to high resolution, and further optimization of these methods, e.g. improvement in chemical shift prediction accuracy, will likely help transition these methods into the mainstream of disulfide-rich peptide structural biology.

INTRODUCTION

Peptides are found throughout nature and have a wide range of functions essential for life, including metabolic regulation, signaling and host-defense.
There is currently widespread interest in peptides with covalent constraints such as disulfide bonds because constrained peptides typically exhibit increased stability and conformational rigidity over linear peptides, and also adopt well-defined structural elements more typical of proteins.1 These characteristics underpin their potential as potent and selective inhibitors of protein-protein interactions,2 as illustrated by several constrained peptides either with potent preclinical activity,3 or reaching late-stage clinical trials and/or regulatory approval.4 It is expected that advances in technologies for high-throughput design and discovery of bioactive peptides will continue to push peptides into the mainstream of biotechnological applications.

Knowledge of the structures of natural and designed peptides, which can be obtained through X-ray crystallography and/or NMR spectroscopy, is essential to harness their potential as bioactive modalities. Of the two techniques, NMR is preferred because it does not require crystallization, which is a major bottleneck in the X-ray approach, notwithstanding that recent advances in racemic crystallography5-8 address this issue for disulfide-rich peptides.9,10 However, a concern with the use of NMR spectroscopy is that full structure determination is resource-intensive, requiring analysis of multiple experiments to obtain structural restraints and iterative rounds of data evaluation during structure refinement.
Approaches to streamline NMR determination have been proposed, including automated data analysis11-14 and NMR chemical shift-based methods.15-21 It is promising that in some cases, structures of proteins or nucleic acids have been determined solely from chemical shift data, circumventing the need to gather other restraints, such as distance restraints provided by the nuclear Overhauser effect and dihedral angle restraints by coupling constants.15-18 However, recent studies suggest that high-resolution structures of proteins are more reliably obtained when chemical shift data are supplemented with some additional restraints.21-26 Nevertheless, more research is needed to understand how broadly chemical shift-based methods can be applied, particularly for peptides, to develop more efficient approaches for NMR structure determination.

Here we examine de novo structure determination of disulfide-rich peptides solely from chemical shift data, as well as the effect of including additional sparse restraints and of using homology modeling. We developed computational methods to extract structural information from chemical shifts, using them to examine two sets of peptides: i) a representative set of peptides constrained by a macrocyclic backbone and disulfide bonds and ii) a larger non-overlapping set of 100 disulfide-rich peptides from the PDB.

EXPERIMENTAL SECTION

Peptide synthesis and purification

The assembly, synthesis, purification, cyclization and oxidation of ribifolin, SFTI-1, cVc1.1, BTD-2, and kB1 have been described previously.27,28 More details are provided in the Supporting Information.

NMR spectroscopy

Peptides were dissolved in H2O/D2O (9:1, v/v) at a concentration of ~1 mM. NMR spectra were recorded on a Bruker Avance-600 MHz NMR spectrometer at 298 K. One- and two-dimensional NMR spectra (1H, 1H TOCSY, NOESY, and 1H, 13C HSQC, and 1H, 15N HSQC) were acquired. Spectra were processed using Topspin 1.2 (Bruker) and analyzed using CCPNMR 2.2.2.
Spectra were internally referenced to 2,2-dimethyl-2-silapentane-5-sulfonic acid (DSS) at 0.00 ppm.

A protocol for structure determination using chemical shifts

The Supporting Information provides details of the algorithm used for structure determination, describing the representation chosen for any candidate solution (each solution, known as an individual, has a set of genes), the method for initialization of the population of individuals, the method for calculation of scores (to assign fitness), and routines for generating new individuals (called genetic operators). We note that a genetic algorithm has previously been shown to be effective in calculation of protein structures.29

Molecular dynamics simulations

Molecular dynamics simulations were performed as previously described.30 More details are provided in the Supporting Information.

Homology modeling

Peptides with sequence similarity to the query peptide (i.e. the peptide to be modeled) were identified manually or by a BLAST search of the PDB. Matching peptides that were identical to the query peptide were rejected to prevent bias in the results. A sequence alignment of the query peptide to the remaining matching peptides was generated by optimizing amino acid and chemical shift similarity. The alignments, along with coordinates of the subject peptides, were used to construct homology models using MODELLER v9.16.31

RESULTS

Accuracy of chemical shift predictions for disulfide-rich peptides

Prediction accuracies of three popular programs that calculate chemical shifts, i.e. SHIFTX,32 SHIFTX+,33 and SPARTA+,34 were examined. These programs have been used successfully for de novo determination of protein structures15-18 and are reported to be fast and accurate. They were tested on a set of 100 disulfide-rich structures (i.e. containing >1 disulfide bond) that exhibited good MolProbity scores35 (used as an indication of structural quality, Table S1 and Figure S1).
Prediction accuracies of the programs were comparable, although SPARTA+ performed slightly better than SHIFTX and SHIFTX+, having standard deviations of 0.30, 0.56, 3.04, and 1.37 ppm compared to 0.35, 0.67, 3.34, and 1.54 ppm for SHIFTX and 0.35, 0.65, 3.30, and 1.52 ppm for SHIFTX+ for the set of nuclei tested (Figure 1a). Nevertheless, the overall trend in accuracy for all three programs with respect to nuclei type was the same, with predictions of the backbone Hα chemical shift having the lowest deviations from experimental values, followed by HN then Cα and finally N.

To examine prediction accuracy in a more detailed and standardized manner, a smaller representative set of disulfide-rich peptides was generated. Specifically, we re-measured chemical shifts for five well-characterized backbone-cyclic peptides that have demonstrated therapeutic potential: ribifolin, SFTI-1, cVc1.1, BTD-2, and kB1 (Figure 1b). Aside from being structurally diverse (i.e. varying in size, topology, disulfide content and connectivity), these peptides have high-resolution X-ray structures (1.25–1.85 Å)9,10,36 that agree well with their NMR structures (Figure 1b). As shown in Figure 1c and Figure S2, predictions generally tracked well with the residue-to-residue variation in experimental chemical shifts. In some cases, there were discrepancies between predicted and experimental Hα chemical shifts, as observed for ribifolin, cVc1.1 and kB1, but predictions for SFTI-1 and BTD-2 were very accurate by comparison (Figure 1c; using SHIFTX). Overall, the analysis of chemical shift predictions for disulfide-rich peptides suggests that current chemical shift calculation programs might have the potential to guide the de novo determination of structures of disulfide-rich peptides to high resolution.
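As an illustration of the comparison described above, the following minimal Python sketch computes a per-nucleus deviation between predicted and experimental chemical shifts. It is not the authors' implementation: the function name `shift_rmsd_by_nucleus`, the dictionary layout keyed by (residue number, nucleus), and the shift values are all hypothetical, and treating the reported deviation as a root-mean-square deviation is an assumption.

```python
import math
from collections import defaultdict


def shift_rmsd_by_nucleus(predicted, experimental):
    """Return {nucleus: RMSD in ppm} over atoms present in both inputs.

    Both arguments map (residue_number, nucleus) -> chemical shift in ppm.
    This data layout is an assumption made for illustration.
    """
    squared_errors = defaultdict(list)
    for key, exp_shift in experimental.items():
        if key not in predicted:
            continue  # skip atoms that have no predicted shift
        _, nucleus = key
        squared_errors[nucleus].append((predicted[key] - exp_shift) ** 2)
    return {n: math.sqrt(sum(v) / len(v)) for n, v in squared_errors.items()}


# Hypothetical shifts for illustration only (not data from this study).
experimental = {(1, "HA"): 4.50, (2, "HA"): 4.10, (1, "HN"): 8.30, (2, "HN"): 8.00}
predicted = {(1, "HA"): 4.40, (2, "HA"): 4.30, (1, "HN"): 8.60, (2, "HN"): 7.60}

rmsd = shift_rmsd_by_nucleus(predicted, experimental)
# List nuclei from most to least accurately predicted.
for nucleus in sorted(rmsd, key=rmsd.get):
    print(f"{nucleus}: {rmsd[nucleus]:.2f} ppm")
```

In practice the predicted values would come from a program such as SHIFTX or SPARTA+ run on a candidate structure. Because the typical shift range differs greatly between nuclei, deviations are best compared within a nucleus type rather than across types.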
Scoring of structures by chemical shift-based parameters

We examined whether chemical shifts, alone or in combination with other structural parameters, could be used to distinguish between different conformations of disulfide-rich peptides (Figure 2 and Figure S3). In a preliminary analysis, a range of parameters was tested to build a weighted chemical shift-based score (Supporting Information Methods and Figure S3). These parameters included examples
