Structural Venomics Reveals Evolution of a Complex Venom by Duplication and Diversification of an Ancient Peptide-Encoding Gene
Sandy S. Pineda (a,b,1,2,3,4), Yanni K.-Y. Chin (a,c,1), Eivind A. B. Undheim (c,d,e), Sebastian Senff (a), Mehdi Mobli (c), Claire Dauly (f), Pierre Escoubas (g), Graham M. Nicholson (h), Quentin Kaas (a), Shaodong Guo (a), Volker Herzig (a,5), John S. Mattick (b,6), and Glenn F. King (a,2)

(a) Institute for Molecular Bioscience, The University of Queensland, St Lucia, QLD 4072, Australia; (b) Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Darlinghurst, Sydney, NSW 2010, Australia; (c) Centre for Advanced Imaging, The University of Queensland, St Lucia, QLD 4072, Australia; (d) Centre for Biodiversity Dynamics, Department of Biology, Norwegian University of Science and Technology, 7491 Trondheim, Norway; (e) Centre for Ecological & Evolutionary Synthesis, Department of Biosciences, University of Oslo, 0316 Oslo, Norway; (f) Thermo Fisher Scientific, 91941 Courtaboeuf Cedex, France; (g) University of Nice Sophia Antipolis, 06000 Nice, France; and (h) School of Life Sciences, University of Technology Sydney, Broadway, NSW 2007, Australia

Edited by Adriaan Bax, National Institutes of Health, Bethesda, MD, and approved March 18, 2020 (received for review August 21, 2019)

Spiders are one of the most successful venomous animals, with more than 48,000 described species. Most spider venoms are dominated by cysteine-rich peptides with a diverse range of pharmacological activities. Some spider venoms contain thousands of unique peptides, but little is known about the mechanisms used to generate such complex chemical arsenals. We used an integrated transcriptomic, proteomic, and structural biology approach to demonstrate that the lethal Australian funnel-web spider produces 33 superfamilies of venom peptides and proteins. Twenty-six of the 33 superfamilies are disulfide-rich peptides, and we show that 15 of these are knottins that contribute >90% of the venom proteome. NMR analyses revealed that most of these disulfide-rich peptides are structurally related and range in complexity from simple to highly elaborated knottin domains, as well as double-knot toxins, that likely evolved from a single ancestral toxin gene.

spider venom | venom evolution | structural venomics | transcriptomics | proteomics

Significance

The venom of the Australian funnel-web spider is one of the most complex chemical arsenals in the natural world, comprising thousands of peptide toxins. These toxins have a diverse range of pharmacological activities and vary in size from short (3 to 4 kDa) to long (8 to 9 kDa). It is unclear how spiders evolved such complex venoms and whether there is an evolutionary relationship between short and long toxins. Here, we introduce a “structural venomics” approach to show that the venom of Australian funnel-web spiders evolved primarily by duplication and elaboration of a single ancestral knottin gene; short toxins are simple knottins whereas most long toxins are either highly elaborated single-domain knottins or double-knot toxins created by intragene duplications.

Author contributions: S.S.P., E.A.B.U., J.S.M., and G.F.K. designed research; S.S.P., Y.K.-Y.C., E.A.B.U., S.S., M.M., C.D., P.E., G.M.N., Q.K., S.G., and V.H. performed research; S.S.P., Y.K.-Y.C., E.A.B.U., S.S., M.M., C.D., P.E., G.M.N., Q.K., S.G., V.H., and G.F.K. analyzed data; and S.S.P., Y.K.-Y.C., E.A.B.U., and G.F.K. wrote the paper.

Competing interest statement: C.D. is affiliated with Thermo Fisher Scientific.

This article is a PNAS Direct Submission.

Published under the PNAS license.

Data deposition: Atomic coordinates for protein structures determined in this study were deposited in the Protein Data Bank under accession codes 2N6N, 2N6R, 6BA3, and 2N8K while corresponding NMR chemical shifts were deposited in BioMagResBank under accessions BMRB25774, BMRB25778, BMRB25853, and BMRB30352. Metadata and annotated nucleotide sequences were deposited in the European Nucleotide Archive under project accessions PRJEB6062 (ERA298588) and PRJEB35693. Mass spectrometry data has been deposited to ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD016886.

(1) S.S.P. and Y.K.-Y.C. contributed equally to the work.
(2) To whom correspondence may be addressed. Email: [email protected] or glenn. [email protected].
(3) Present address: Brain and Mind Centre, University of Sydney, Camperdown, NSW 2050, Australia.
(4) Present address: St Vincent’s Clinical School, University of New South Wales, Sydney, NSW 2010, Australia.
(5) Present address: School of Science & Engineering, University of the Sunshine Coast, Sippy Downs, QLD 4556, Australia.
(6) Present address: School of Biotechnology & Biomolecular Sciences, University of New South Wales, Sydney, NSW 2010, Australia.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1914536117/-/DCSupplemental.

First published May 12, 2020.

www.pnas.org/cgi/doi/10.1073/pnas.1914536117
PNAS | May 26, 2020 | vol. 117 | no. 21 | 11399–11408

Spiders evolved from an arachnid ancestor in the Late Ordovician around 450 million years ago (1), and they have since become one of the most successful animal lineages on the planet, with >100,000 extant species (2). A key contributor to their evolutionary success is the use of venom to capture prey and defend against predators. The constant selection pressure on venoms over hundreds of millions of years has enabled them to evolve into complex mixtures of bioactive compounds with a diverse range of pharmacological activities.

Spider venoms are a heterogeneous mixture of salts, low molecular weight organic compounds (<1 kDa), linear and disulfide-rich peptides (DRPs) (typically, 3 to 9 kDa with three to six disulfide bonds), and proteins (10 to 120 kDa) (3–5). However, peptides are the major components of most spider venoms, with some containing >1,000 peptides (6). The majority of these peptides are “short” DRPs (2.5 to 5 kDa), but there is also a significant proportion of “long” DRPs (6.5 to 8.5 kDa) (5). As the primary function of spider venom is to rapidly immobilize prey, it is perhaps not surprising that most spider-venom DRPs that have been functionally characterized target neuronal ion channels and receptors (5, 7).

Although spider-venom DRPs have been shown to adopt a variety of three-dimensional (3D) structures, including the Kunitz (8), prokineticin/colipase (9), disulfide-directed β-hairpin (DDH) (10), and inhibitor cystine knot (ICK) fold (11), the majority of spider-venom DRP structures solved to date conform to the ICK motif. The ICK motif is defined as an antiparallel β-sheet stabilized by a cystine knot (11). In spider toxins, the β-sheet typically comprises only two β-strands although a third N-terminal strand is sometimes present (12). The cystine knot comprises a “ring” formed by two disulfide bonds and the intervening sections of the peptide backbone, with a third disulfide piercing the ring to create a pseudoknot (11). This knot provides ICK peptides (also known as knottins) (13) with exceptional resistance to chemicals, heat, and proteases (14, 15), which has made them of interest as drug and insecticide leads (5, 14, 16). Some spider toxins show minor (17) or more significant (18) elaborations of the basic ICK fold involving an additional stabilizing disulfide bond. More recently, “double-knot” spider toxins have been reported in which two structurally independent ICK domains are joined by a short linker (19, 20).

Like other folds with stabilizing disulfide bridges, knottins display a remarkable diversity of biological functions, including modulation of many different types of ligand- and voltage-gated ion channels (5). Despite strong conservation of the knottin scaffold across a taxonomically diverse range of spiders, several factors have hampered analysis of their evolutionary history (21). First, it is not until recently that a large number of knottin precursor sequences have become available from venom-gland transcriptomes. Second, the disulfide framework in small DRPs generally constrains the peptide fold to such an extent that most noncysteine residues can be mutated without damaging the peptide’s structural integrity, a luxury not afforded to most globular proteins (21). Thus, evolution of DRPs is typically characterized by the accumulation of many mutations, leaving very few conserved residues available for deep evolutionary analyses (21). Third, very few structures have been solved for DRPs larger than 5

The distribution of peptide masses in H. infensa venom is bimodal. Most peptides fall in the mass range 2.5 to 5.5 kDa, but there is a significant cohort of larger peptides with mass 6.5 to 8.5 kDa (Fig. 1 C and D). This bimodal distribution matches that previously described for venom from related funnel-web spiders (6, 25), various tarantulas (26), and the spitting spider Scytodes thoracica (27), and it is also reflected in the mass profile generated for all spider toxins reported to date (5). As reported previously for Australian funnel-web spiders (6, 25), the 3D venom landscape (Fig. 1F) revealed no correlation between peptide mass and peptide hydrophobicity, as judged by reversed-phase (RP) high-pressure liquid chromatography (HPLC) retention time.

Transcriptomics Uncovers the Biochemical Diversity of H. infensa Venom. Consistent with MS analysis of secreted venom, sequencing of a venom-gland transcriptome from H. infensa revealed a biochemically diverse venom, with at least 33 toxin superfamilies (Fig. 2). In light of their likely toxic function, each superfamily of toxins was named, as suggested previously (28), after gods/deities of death, destruction, or the underworld. Expressed sequence tags were sequenced using the 454 platform and assembled using MIRA v3.2 (29). This produced a total of 26,980 contigs and 7,194 singlets, with an average contig length of 496 base pairs (bp) (maximum length 3,159 bp, N50 674 bp).
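For readers unfamiliar with the N50 statistic quoted above, the following is a minimal Python sketch of how it is conventionally computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not the authors' assembly data or pipeline. Under the usual definition, N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths (in bp).

    Hypothetical helper for illustration; not part of MIRA or of the
    authors' workflow.
    """
    # Sort contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig length")

    # Half of the total number of assembled bases.
    half_total = sum(ordered) / 2

    # Walk down the sorted list until the cumulative length reaches
    # half of the total; the contig reached at that point defines N50.
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length

    # Not reachable for non-empty input, kept as a safe fallback.
    return ordered[-1]


# Toy example with made-up contig lengths (assumption, not real data).
toy_lengths = [100, 200, 300, 400, 500]
mean_length = sum(toy_lengths) / len(toy_lengths)
print("mean contig length:", mean_length)
print("N50:", n50(toy_lengths))
```

Because N50 weights each contig by the bases it contributes, it tends to exceed the mean contig length when an assembly contains many short contigs, which is consistent with the reported N50 of 674 bp against an average contig length of 496 bp.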