Toxin Structures As Evolutionary Tools: Using Conserved 3D Folds to Study the Evolution of Rapidly-Evolving Peptides
Eivind A. B. Undheim1*, Mehdi Mobli2 and Glenn F. King1

1Institute for Molecular Bioscience and 2Centre for Advanced Imaging, The University of Queensland, St Lucia, Queensland 4072, Australia

*Corresponding author: [email protected]

Keywords: Protein evolution; toxin; structural adaptation; inhibitor cystine knot; knottin

Abbreviations:
AMP – Antimicrobial peptide
CaV – Voltage-gated calcium channel
DDH – Disulfide-directed β-hairpin
ICK – Inhibitor cystine knot
KV – Voltage-gated potassium channel
NaV – Voltage-gated sodium channel
PDB – Protein Data Bank (http://www.rcsb.org/pdb)
RyR1 – Ryanodine receptor 1
SVWC – Single von-Willebrand factor type C

Summary

Three-dimensional (3D) structures have been used to explore the evolution of proteins for decades, yet they have rarely been utilised to study the molecular evolution of peptides. Here we highlight areas in which 3D structures can be particularly useful for studying the molecular evolution of peptide toxins. Although we focus our discussion on animal toxins, including one of the most widespread disulfide-rich peptide folds known, the inhibitor cystine knot, our conclusions should be widely applicable to studies of the evolution of disulfide-constrained peptides. We show that conserved 3D folds can be used to identify evolutionary links and test hypotheses regarding the evolutionary origin of peptides with extremely low sequence identity; construct accurate multiple sequence alignments; and better understand the evolutionary forces that drive the molecular evolution of peptides.

Introduction

“Where do you come from?” is one of the most frequently asked questions when people first meet. Regardless of the “social unit” in question (continent, country, state, city, family), origin stories tend to interest most of us, as they help place our perceptions of people into context.
Similarly, identifying distantly related peptides provides clues as to their common origin and informs us about the properties that underlie their current structure and function. Peptide toxins from animal venoms (henceforth just “toxins”) are particularly interesting in this regard. Toxins are “recruited” into venom from housekeeping roles, where they subsequently undergo diversification and often neofunctionalisation to produce bio-weapons with a diverse range of functions [1]. Toxins generally have high potency and selectivity for particular receptor and ion channel subtypes, which has made them useful pharmacological tools [2] and attractive leads for development of peptide-based drugs and bioinsecticides [3,4]. In addition to exceptional potency, toxins have to be soluble at high concentrations in venom and sufficiently stable to exert their function in the foreign environment into which they are injected. Elucidating the evolutionary connections between toxins and their ancestral non-toxin precursors therefore provides insight into the molecular traits that are important for attaining the required levels of potency, selectivity, solubility and stability [5]. This knowledge not only aids categorisation and annotation of the rapidly growing mountains of sequence data, but can also provide important structure-function relationship information to guide bioengineering efforts. Unfortunately, however, reconstructing the evolutionary history of peptide toxins can be extremely challenging [5,6].

One of the major problems encountered when attempting to reconstruct the evolutionary histories of peptides is loss of phylogenetic signal. Regardless of one’s subjective size-distinction between a peptide and a protein, peptides contain, by definition, a relatively small number of amino acid residues that can be used to infer common ancestry. This problem becomes particularly pronounced when looking at toxins.
Peptide toxins are typically rich in conserved disulfide bonds that provide them with exceptional chemical, thermal and biological stability [3]. These disulfide frameworks generally constrain the peptide fold to such an extent that most of the non-cysteine residues can be mutated without damaging the toxin’s structural integrity [7]. This is a luxury not afforded to most globular proteins, where the integrity of the hydrophobic core is largely dependent on non-covalent interactions between buried residues, and most mutations in such residues lead to impaired stability and/or folding [8-10]. In contrast, the interior of cysteine-rich peptides is largely composed of covalent disulfide bonds [5,11-14]. Due to their smaller size, cysteine-rich peptides also have a greater surface-to-volume ratio, resulting in a greater proportion of exposed residues that can be mutated without perturbing the folding or structure [10]. Evolution of disulfide-rich peptide toxins is therefore generally characterised by the accumulation of a large number of mutations that result in increased potency, altered selectivity, or even acquisition of entirely new functions [15,16], all of which make animal toxins of interest to a wide cross-section of the scientific community. However, this high degree of molecular plasticity often leaves very few conserved residues available for deep evolutionary analyses.

Despite the striking accumulation of mutations during peptide-toxin evolution, structurally important residues such as cysteines tend to be highly conserved [7]. This is because a mutation that disrupts peptide folding wastes biochemical energy on the production of non-functional peptides, and it can have additional detrimental effects due to the formation of aggregates and fibrils [17]. The cysteine framework is therefore commonly used to group toxins into classes and structural families [18-20].
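As an illustration of grouping by cysteine framework, the sketch below derives a framework string from a mature peptide sequence. The notation (adjacent cysteines written together, separated cysteines joined by a hyphen, as in "C-C-CC-C-C") is one common convention and is assumed here rather than taken from this text; the function name and the toy sequences are hypothetical.

```python
def cysteine_framework(seq: str) -> str:
    """Return a framework string such as 'C-C-CC-C-C' for a peptide sequence.

    Adjacent cysteines are written together ('CC'); cysteines separated by
    one or more other residues are joined with a hyphen ('C-C').
    Assumption: `seq` is a mature peptide in one-letter amino acid code.
    """
    positions = [i for i, aa in enumerate(seq.upper()) if aa == "C"]
    if not positions:
        return ""
    parts = ["C"]
    for prev, cur in zip(positions, positions[1:]):
        # A hyphen marks at least one intervening non-cysteine residue.
        parts.append("C" if cur == prev + 1 else "-C")
    return "".join(parts)


# Hypothetical toy sequences, not real toxins.
parent = "GCAAACAAACCAAACAAAC"
variant = "GCAAACAAACGCAAACAAAC"  # one residue inserted between the adjacent cysteines

print(cysteine_framework(parent))
print(cysteine_framework(variant))
```

In this toy case a single-residue insertion is enough to change the framework string, even though the two sequences are otherwise identical, which is why a framework label on its own can be a fragile basis for grouping.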
However, deletions and insertions between cysteines can result in “new” cysteine patterns [5], as can the acquisition or loss of cysteine pairs (see below). Such alterations in cysteine framework can obscure evolutionary links and lead to misalignment of cysteines in sequence alignments, with resulting errors in downstream evolutionary analyses.

A solution to many of the problems of studying rapidly evolving peptides such as toxins may lie in the fact that while even highly conserved residues may be mutated, the core 3D fold nearly always remains intact [21,22]. Thus, 3D structures provide a means of recognizing distant evolutionary links and serve as a guide for alignment of sequences with variations in otherwise conserved residues. Moreover, when integrated with more traditional evolutionary analyses, 3D structures can be used to identify remarkable cases of structural convergence in evolution and provide insight into the phenotypic change of an evolving gene family [5].

Here we highlight what we believe is still a largely unrecognized source of information in molecular evolutionary biology. We focus on the inhibitor cystine knot (ICK), one of the most widespread disulfide-rich peptide folds known, and one of the best characterised disulfide-rich folds across higher taxa. This fold includes examples where structures have played a pivotal role in recognizing evolutionary links and provided a deeper understanding of toxin evolution.

Beware the inhibitor cystine knot: old, weaponised, and enigmatic

The ICK motif is defined as an antiparallel β sheet stabilised by a cystine knot. The cystine knot itself is composed of a closed ring formed by two disulfide bonds and the intervening sections of the peptide backbone, with a third disulfide piercing the ring to create a pseudo-knot [23] (Fig. 1).
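The knot definition can be turned into a simple connectivity check. The sketch below assumes the pairing usually quoted in the wider ICK literature (CysI–CysIV and CysII–CysV forming the ring, CysIII–CysVI threading it); that pairing is not stated in this section and is brought in as an assumption. The function names and residue numbers are hypothetical, and the check covers disulfide connectivity only: it does not test for the β sheet or for the threading geometry, so a match is necessary but not sufficient for an ICK fold.

```python
# Assumed canonical ICK pairing, expressed as cysteine ranks (1 = first Cys).
ICK_CONNECTIVITY = {(1, 4), (2, 5), (3, 6)}


def disulfide_connectivity(cys_positions, bonds):
    """Convert disulfide bonds given as residue numbers into rank-based pairs.

    cys_positions: residue numbers of all cysteines in the peptide.
    bonds: iterable of (residue_a, residue_b) disulfide pairs.
    """
    rank = {pos: i + 1 for i, pos in enumerate(sorted(cys_positions))}
    return {tuple(sorted((rank[a], rank[b]))) for a, b in bonds}


def has_ick_connectivity(cys_positions, bonds):
    """True if a six-cysteine peptide shows the assumed I-IV, II-V, III-VI pairing."""
    return disulfide_connectivity(cys_positions, bonds) == ICK_CONNECTIVITY


# Hypothetical residue numbers for a six-cysteine peptide.
cysteines = [2, 9, 15, 16, 21, 28]
knotted = [(2, 16), (9, 21), (15, 28)]   # I-IV, II-V, III-VI
laddered = [(2, 9), (15, 16), (21, 28)]  # I-II, III-IV, V-VI

print(has_ick_connectivity(cysteines, knotted))
print(has_ick_connectivity(cysteines, laddered))
```

Working with cysteine ranks rather than residue numbers keeps the comparison independent of loop lengths, which is the property that lets the same connectivity be recognised across peptides with different inter-cysteine spacing.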
With the exception of cyclic ICK peptides, cystine knots are not true knots in a mathematical sense, as they can be untied by a non-bond-breaking geometrical transformation, but they nevertheless provide ICK peptides with remarkable stability and resistance to proteases [24]. Thus, it is perhaps not surprising that the ICK motif may be the most ubiquitous cysteine-rich peptide fold in nature. A survey of all unique 1–10 kDa peptides in UniProtKB [25] with at least one disulfide annotation and a PDB entry [26] reveals that 18.5% of these peptides assume an ICK fold, second only to the cystine-stabilised α/β fold, which accounts for 19%. In comparison, the third-most abundant disulfide-constrained fold, the three-finger toxin motif, represents 10% of all entries. ICK peptides also appear to constitute the most abundant fold in the “dark proteome” (i.e., proteins without a determined 3D structure that are inaccessible to homology modelling [27]), indicating that it may well be the most abundant cysteine-rich fold known.

In addition to its ubiquity, the ICK fold has an extremely widespread taxonomic distribution, with experimentally confirmed examples found in viruses, fungi, plants, and animals, where they appear to function primarily as pathogen-defence molecules (“defensins”) (Fig. 2) [28-51].

ICK toxins are also probably the most widely recruited peptide fold in animal venoms. This frequent “weaponisation” is probably attributable to both the stability and evolutionary plasticity of the ICK fold. This plasticity is most evident in spider venoms, where ICK toxins appear to constitute the majority of the molecular and structural diversity [52]. However, in addition to spiders, ICK toxins are also found in the venoms of scorpions (e.g. PDB 1IE6; [44]), assassin bugs (e.g. PDB 1LMR; [42]), and cone snails (e.g. PDB 1RMK; [37]), as well as tick saliva [53]. Moreover, putative ICK toxins have been identified in the venoms of ants