
Synthetic beta-solenoid proteins with the fragment-free computational design of a beta-hairpin extension

James T. MacDonald^a,b, Burak V. Kabasakal^c, David Godding^c,d, Sebastian Kraatz^c,e, Louie Henderson^c, James Barber^c, Paul S. Freemont^a,b, and James W. Murray^c,1

^a Centre for Synthetic Biology and Innovation, Imperial College London, London SW7 2AZ, United Kingdom; ^b Department of Medicine, Imperial College London, London SW7 2AZ, United Kingdom; ^c Department of Life Sciences, Imperial College London, London SW7 2AZ, United Kingdom; ^d Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom; and ^e Laboratory of Biomolecular Research, Paul Scherrer Institut, CH-5232 Villigen PSI, Switzerland

Edited by David Baker, University of Washington, Seattle, WA, and approved July 26, 2016 (received for review December 22, 2015)

BIOPHYSICS AND COMPUTATIONAL BIOLOGY

The ability to design and construct structures with atomic level precision is one of the key goals of nanotechnology. Proteins offer an attractive target for atomic design because they can be synthesized chemically or biologically and can self-assemble. However, the generalized protein folding and design problem is unsolved. One approach to simplifying the problem is to use a repetitive protein as a scaffold. Repeat proteins are intrinsically modular, and their folding and structures are better understood than large globular domains. Here, we have developed a class of synthetic repeat proteins based on the pentapeptide repeat family of beta-solenoid proteins. We have constructed length variants of the basic scaffold and computationally designed de novo loops projecting from the scaffold core. The experimentally solved 3.56-Å resolution crystal structure of one designed loop matches closely the designed hairpin structure, showing the computational design of a backbone extension onto a synthetic protein core without the use of backbone fragments from known structures. Two other loop designs were not clearly resolved in the crystal structures, and one loop appeared to be in an incorrect conformation. We have also shown that the repeat unit can accommodate whole-domain insertions by inserting a domain into one of the designed loops.

computational protein design | synthetic repeat proteins | de novo backbone design | coarse-grained model

Significance

The development of algorithms to design new proteins with backbone plasticity is a key challenge in computational protein design. In this paper, we describe a class of extensible synthetic repeat protein scaffolds with computationally designed variable loops projecting from the central core. We have developed methods to sample backbone conformations computationally using a coarse-grained potential energy function without using backbone fragments from known protein structures. This procedure was combined with existing methods for sequence design to successfully design a loop at atomic level precision. Given the inherent modular and composable nature of repeat proteins, this approach allows the iterative atomic-resolution design of complex structures with potential applications in novel nanomaterials and molecular recognition.

During the course of evolution, natural proteins may be recruited to new unrelated functions conferring a selective advantage to the organism (1, 2). This accretion of new features and functions is likely to have left behind complex interlocking amino acid dependencies that can make reengineering natural proteins difficult and unpredictable (3). For this reason, we and others hypothesize that it is more desirable to design de novo proteins because these proteins provide a biologically neutral platform onto which functional elements can be grafted (4). Artificial proteins have been designed by decoding simple residue patterning rules that govern the packing of secondary structural elements, and this technique has been particularly successful for α-helical bundle proteins (5–7). An alternative approach is to assemble de novo folds from backbone fragments of known structures or idealized secondary structural elements and use computational protein design methods to design the sequence (4, 8–10). Both the computational and simpler rules-based design approaches have concentrated on designing proteins consisting of canonical secondary structure linked with loops of minimal length.

A class of proteins that has attracted considerable interest is artificial proteins based on repeating structural motifs due to their intrinsic modularity and designability (11). Repeat proteins have applications that include their use as novel nanomaterials (12–14) and as scaffolds for molecular recognition (15, 16). These proteins may be designed using sequence consensus-based rules (17) or computational protein design methods (18, 19). There are a number of families of beta-helical repeat proteins (20), from which we chose the pentapeptide repeat family, forming the repeat five residues (RFR)-fold, which has a square cross-sectional profile, as the basis for the design of a class of synthetic repeat proteins (21) (Fig. 1 A and B).

The RFR-fold has a number of properties that make it attractive as a substrate for design. The structure is unusually regular, but is able to tolerate a wide range of residues on the outside of the solenoid barrel. The solenoids in natural RFR-fold proteins are nearly straight in contrast to several other forms of repeat proteins, such as the leucine-rich repeat proteins, which are highly curved. There are examples of natural RFR-fold proteins with loop extensions projecting from the barrel, making this class of proteins particularly suitable for functionalization. The protein is similar in diameter to DNA, and some RFR-fold proteins are thought to play a role as DNA mimics (22). Here, we have designed and solved the structures of a number of artificial RFR-fold proteins of different lengths.

Previously, computationally designed enzymes have reused backbone scaffolds from known natural proteins (23–25), although artificial helical bundle proteins have been functionalized using an intuitive manual design process (26–28). As the field of enzyme design becomes more ambitious, it is likely that consideration of backbone plasticity will become increasingly important (29). Backbone conformations from solved protein structures are guaranteed to be designable because there is at [...]

Author contributions: J.T.M., J.B., P.S.F., and J.W.M. designed research; J.T.M., B.V.K., D.G., S.K., L.H., and J.W.M. performed research; J.T.M. contributed new reagents/analytic tools; J.T.M., B.V.K., D.G., S.K., and J.W.M. analyzed data; and J.T.M., P.S.F., and J.W.M. wrote the paper.

The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 4YC5, 4YDT, 4YCQ, 4YEI, 4YFO, 5DZB, 5DRA, 5DN0, 5DNS, 5DQA, and 5DI5).

^1 To whom correspondence should be addressed. Email: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1525308113/-/DCSupplemental.

www.pnas.org/cgi/doi/10.1073/pnas.1525308113 (PNAS Early Edition)

[...] using standard techniques, and were found to crystallize in a variety of different crystal forms (Table 1 and SI Appendix, Table S1). A fourth variant protein, SynRFR24.2, was constructed with two amino acid changes (D196S and R198H). This protein crystallized in a new crystal form not observed for the SynRFR24.1 protein, probably because the large arginine 198 side chain blocked a crystal contact. All SynRFR proteins formed dimers in the crystal lattice (Fig. 1D) and in solution (SI Appendix, Fig. S2). Thermal stability measurements of these variable-length proteins using a thermofluor assay showed melting temperatures of between 65 °C and 73 °C that did not appear to be correlated with repeat length (SI Appendix, Fig. S5).

Computational Design of de Novo Loops. Given the inherent modularity and ease of expression, we decided to test whether these proteins could serve as an extended scaffold base for the design of de novo backbone embellishments as a step toward functionalization. Taking the 1.8-Å resolution SynRFR24.1 crystal structure [Protein Data Bank (PDB) ID code 4YC5] as the base scaffold, an eight-residue insertion was created approximately midway along the stochastic repeat region of the protein (between residues 108 and 109). This loop length was chosen on the basis of the accuracy of previous loop structure prediction results (35). Four thousand backbone sequence-independent loop conformations were sampled using PD2_loop_model software with no externally imposed restraints on secondary structure or any other feature (Fig. 2A). Briefly, the method samples plausible backbone loop conformations from a sequence-independent, coarse-grained Cα potential energy function and then reconstructs other backbone atoms using a structural [...]

Fig. 1. RFR beta-solenoid proteins. (A) Single superhelical turn composed of four repeats with a square cross-sectional profile. Each five-residue repeat forms one face of the square, and 20 residues form a helical turn with an ∼5-Å rise. (B) View down the beta-solenoid, showing leucine residues from position 3 in the repeat motif forming the hydrophobic core. A logo plot of residue frequencies used to produce the stochastic sequence region of the [...] Panel labels in (D): SynRFR20.1, SynRFR24.1, SynRFR28.1.
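The repeat geometry quoted in the Fig. 1 legend (five residues per face, four faces per superhelical turn, and a rise of roughly 5 Å per turn) can be turned into a rough length estimate. The following Python sketch is illustrative arithmetic only and is not part of the published method. The function name `solenoid_geometry` is hypothetical, and treating the number in a construct name such as SynRFR24 as a repeat count is an assumption that the text above does not state. The estimate also ignores any terminal capping regions and inserted loops.

```python
# Illustrative arithmetic from the Fig. 1 legend; not the authors' code.
RESIDUES_PER_REPEAT = 5        # one pentapeptide repeat forms one face of the square
REPEATS_PER_TURN = 4           # four faces make one superhelical turn (20 residues)
RISE_PER_TURN_ANGSTROM = 5.0   # approximate rise per turn quoted in the legend


def solenoid_geometry(n_repeats: int) -> dict:
    """Estimate residue count, turn count, and axial length of an ideal solenoid.

    n_repeats is a hypothetical input: the number of five-residue repeats.
    """
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    residues = n_repeats * RESIDUES_PER_REPEAT
    turns = n_repeats / REPEATS_PER_TURN
    length_angstrom = turns * RISE_PER_TURN_ANGSTROM
    return {
        "residues": residues,
        "turns": turns,
        "length_angstrom": length_angstrom,
    }


# Assumed example: a scaffold with 24 repeats.
print(solenoid_geometry(24))
# {'residues': 120, 'turns': 6.0, 'length_angstrom': 30.0}
```

Under these assumptions, adding four repeats adds one full turn and about 5 Å of length, which is one way to read the "length variants" described above.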