Computational Design of a Leucine-Rich Repeat Protein with a Predefined Geometry

Computational design of a leucine-rich repeat protein with a predefined geometry Sebastian Rämischa, Ulrich Weiningerb, Jonas Martinssona, Mikael Akkeb, and Ingemar Andréa,1 Departments of aBiochemistry and Structural Biology and bBiophysical Chemistry, Center for Molecular Protein Science, Lund University, SE-221 00 Lund, Sweden Edited by David Baker, University of Washington, Seattle, WA, and approved October 30, 2014 (received for review July 17, 2014) Structure-based protein design offers a possibility of optimizing multiple functional sites in a single protein, binding site targeting, or the overall shape of engineered binding scaffolds to match their specific recognition of protein oligomers. targets better. We developed a computational approach for the Leucine-rich repeat (LRR) proteins display a significant varia- structure-based design of repeat proteins that allows for adjustment tion in shape (13, 14). The repeats in this protein class are typically of geometrical features like length, curvature, and helical twist. By composed of 20–30 residues, and they form helically twisted, combining sequence optimization of existing repeats and de novo solenoid-like structures with a continuous parallel β-sheet on design of capping structures, we designed leucine-rich repeats (LRRs) the concave side; they can be elongated or highly curved (2). It has from the ribonuclease inhibitor (RI) family that assemble into been shown that N- and C-terminal capping structures are crucial structures with a predefined geometry. The repeat proteins were for folding and stability of LRR proteins (15, 16). The combination built from self-compatible LRRs that are designed to interact to form of a stable core of framework positions and variability in overall highly curved and planar assemblies. We validated the geometrical assembly structure makes LRRs ideal building blocks for the de- design approach by engineering a ring structure constructed from 10 velopment of repeat proteins with rationally designed shapes. The self-compatible repeats. Protein designcanalsobeusedtoincrease feasibility of engineering LRR proteins has been demonstrated with our structural understanding of repeat proteins. We use our design consensus design applied to repeats from the ribonuclease inhibitor constructs to demonstrate that buried Cys play a central role for (RI) family (17), the nucleotide-binding oligomerization domain stability and folding cooperativity in RI-type LRR proteins. The family (18), and the variable lymphocyte receptors (VLRs). The computational procedure presented here may be used to develop latter yielded a protein-binding scaffold named “repebodies” (19). BIOPHYSICS AND repeat proteins with various geometrical shapes for applications In this work, we developed a computational approach for the COMPUTATIONAL BIOLOGY where greater control of the interface geometry is desired. structure-based design of repeat proteins that allows for adjustment of geometrical features like length, curvature, and helical binding scaffold | Rosetta | buried cysteines | twist. We used the method to design a self-compatible LRR that computational protein design | geometrical design assembles into highly curved and planar repeat proteins. The repeats were designed to allow assembly into a closed-ring ngineered protein-binding scaffolds are increasingly used as structure. Variants with five double repeats and added capping Etherapeutics, diagnostic probes, intracellular reporter mole- structures produced stable proteins. In the absence of caps, the cules, or fusion domains in protein crystallization (1). Nature same repeat variants can dimerize to form complete circles (full- provides a large variety of protein recognition scaffolds from ring structures), thereby verifying the correctly designed geom- which engineered systems could be built. Repeat proteins are etry. The results demonstrate that stable proteins with pre- used in a wide range of biological processes, including the im- defined shapes can be built with limited sequence redesign if the mune response and regulatory cascades (2–5). They consist of conformation of the self-compatible repeat is selected carefully. simple, structurally similar building blocks, called repeats, that Additionally, the results highlight the stabilizing effect of buried assemble into elongated tandem arrays (6). Their extended shapes result in proteins with extraordinarily large binding sur- Significance faces, which makes them ideal scaffolds for protein binding. Analogous to antibodies, repeat proteins can be divided into Repeat proteins are used in nature to bind to proteins and pep- framework residues, which encode stability and structure, and tides. The shape of their binding surfaces can vary substantially, variable positions, which are responsible for protein recognition even for proteins within the same family. This variability likely (7). A striking difference from antibodies is that the global arose because they evolved to match the proteins they interact structure can vary considerably between repeat proteins, even with geometrically. Repeat proteins are often engineered to de- within a family. This structural variability suggests that not only velop binders specific to new target proteins. It would be highly the directly interacting residues but also the overall shapes of beneficial to design repeat proteins with predefined geometrical these proteins are optimized for binding target molecules. shapes because such a method would enable development of Engineered repeat proteins have typically been developed by engineered repeat proteins that are shape-optimized to their consensus sequence design, a method where highly conserved targets. Here, we demonstrate that repeat proteins with a pre- sequence positions are identified and scaffolds are built from defined shape can be designed using a computational design identical repeats containing the most common residues at those method. The approach is exemplified by the design of a protein positions (8). Consensus design has been successfully applied to that forms a ring structure not seen in nature. create stable scaffolds from several repeat protein classes (9–12). However, this approach does not enable the design of binders with Author contributions: S.R. and I.A. designed research; S.R., U.W., J.M., M.A., and I.A. performed predefined shapes. Because the geometrical shape of an assembly is research; S.R., U.W., M.A., and I.A. analyzed data; and S.R., U.W., M.A., and I.A. wrote the paper. encoded by subtle structural differences between repeats and The authors declare no conflict of interest. interrepeat interfaces, a structure-based design approach is required This article is a PNAS Direct Submission. to design the assembly shape rationally. Optimizing the shape Freely available online through the PNAS open access option. complementarity to a target molecule would enable development of 1To whom correspondence should be addressed. Email: [email protected]. scaffolds that are custom-made for their target proteins and could This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. yield enhanced binding properties, such as simultaneous binding to 1073/pnas.1413638111/-/DCSupplemental. www.pnas.org/cgi/doi/10.1073/pnas.1413638111 PNAS Early Edition | 1of6 Downloaded by guest on September 24, 2021 Cys in the core of RI-type LRR proteins and address the role of AB capping motifs in the folding of repeat proteins. Results To develop an approach for the design of repeat proteins with a defined geometry, we started from the following consid- erations: Repeat proteins can be described as arrays of repeating structural units. However, there are subtle but important differences between the conformations of individual repeats that encode the compatibility with neighboring repeats as well as the overall shape of the protein. Features like buried hydrogen bonds (e.g., in Asn and Cys ladders, in structural water molecules, in contacts formed between interacting loop segments) serve as specificity elements that define the relative orientation of neighboring repeats. Thus, the central question when designing shape-optimized repeat proteins is how to engineer these specificity elements accurately. Because nature has already developed a variety of specific interaction geometries, we attempt to adopt these detailed features from repeats with a known structure. In structures of known LRR proteins, the curvature- defining angles between neighboring repeats range from −0.5° to 37.2° and helical twist angles are between −11.2° and 10.5°. The space of available conformations for repeats within a certain family is quite limited and can be sampled from the structures of intact repeat proteins. Because each protein contains many repeats, a large number of possible backbone conformations can be assembled from a small set of protein structures. We developed a computational design method where repeat proteins with predefined shapes are assembled from structurally compatible building blocks taken from crystal structures. Subsequently, sequence redesign is used to optimize self- compatibility. A summary of the procedure is shown in Fig. 1A; it consists of the following steps: i) The desired geometry of the protein is defined. ii) A library of structures of individual repeats is compiled from crystal structures of selected repeat proteins. Fig. 1. Overview of the design procedure. (A) Illustration of the general iii workflow for designing repeat

Computational Design of a Leucine-Rich Repeat Protein with a Predefined Geometry

Ankrd9 Is a Metabolically-Controlled Regulator of Impdh2 Abundance and Macro-Assembly

Repeat Proteins Challenge the Concept of Structural Domains

A Structural Guide to Proteins of the NF-Kb Signaling Module

Supp. Table S2: Domains and Protein Families with a Putative Role in Host-Symbiont Interactions

Appendix 2. Significantly Differentially Regulated Genes in Term Compared with Second Trimester Amniotic Fluid Supernatant

WSB1: from Homeostasis to Hypoxia Moinul Haque1,2,3, Joseph Keith Kendal1,2,3, Ryan Matthew Macisaac1,2,3 and Douglas James Demetrick1,2,3,4*

Structural and Biochemical Consequences of Disease-Causing Mutations in the Ankyrin Repeat Domain of the Human TRPV4 Channel

Dissecting and Reprogramming the Folding and Assembly of Tandem-Repeat Proteins

Take Home Lessons from Studies of Related Proteins

(TRP) Cation Channels Could Explain Smell, Taste, And/Or Chemesthesis Disorders

The Contribution of the Ankyrin Repeat Domain of TRPV1 As a Thermal

Use of Protein Engineering Techniques to Elucidate Protein Folding Pathways