Toward De Novo Design of Immune Silent Protein and Peptide Therapeutics Lance Stewart, Ph.D

Total Page:16

File Type:pdf, Size:1020Kb

Toward De Novo Design of Immune Silent Protein and Peptide Therapeutics Lance Stewart, Ph.D Toward de novo design of immune silent protein and peptide therapeutics Lance Stewart, Ph.D. MBA Chief Strategy and Operations Officer, UW IPD FDA – CERSI Collaborative Workshop “Predictive Immunogenicity for Better Clinical Outcomes” October 3 and 4, 2018 10-4-18 1 Design a new world of synthetic proteins • Founded in 2012 by Dr. David Baker • Organized within the School of Medicine, Biochemistry • ~140 Person Umbrella Organization • Faculty PIs (Baker, DiMaio, King, Gu, Bradley) • WRF Innovation Fellows (21) • Translational Investigators (1) • Research Staff (17) • Postdocs (27) • Graduate Students (36) • Admin (6) • Undergrads (27) • High School Student (1) • Using 10% of all UW Internet Traffic 10-4-18 2 PROTEIN STRUCTURE PREDICTION Amino Acid Sequence Protein Tertiary Structure PROTEIN DESIGN 3 The folded states of proteins are likely global energy minima for their sequences Unfolded Protein Structure Prediction: find lowest energy structure for fixed sequence Energy Protein Design: find a sequence for which desired structure has lowest energy Sample structures and sequences, and evaluate Native state energies using Rosetta molecular modeling suite 4 15 Years Ago (2003): First De Novo Design • First computational de novo design of a novel protein fold (Top 7) with atomic level accuracy. Model X-ray Structure Kuhlman et al, Science 2003 10-4-18 5 Today: The Coming of Age of De Novo Protein Design • Designed completely from scratch • Sequence unique from existing proteins in nature • Experimentally verified by high-resolution structure Top7 Kuhlman, Baker 1988 1997 2003 2011 2012 2013 2014 2015 2016 FSD-1 Mayo Coiled Coils: Degrado, Reagan, Kim, Harbury, Eisenberg, Alber, Woolfson Huang PS*, Boyken SE*, and Baker D. Nature 2016 10-4-18 6 De novo protein design Native sequences Neanderthal De novo protein design protein design Number of 100 residue amino sequences: 20100 = 1.3*10130 Number of naturally occurring proteins: ~1015 10-4-18 7 HISTORIC MOMENT IN PROTEIN DESIGN • We’ve learned how to design proteins from scratch. • There is finally enough computing power to do it. • Genomics enables building and testing designs in the lab. 10-4-18 8 De novo protein design method Design a Low Energy Define Design Strain-Free Select Sequences that Blueprint Backbones Sequences for Fold into Designed Backbones Structure Rosetta energy Rosetta RMSD Size and arrangement Backbones sampled VERY LARGE number of Selection of designed of secondary using fragments of possible amino acid sequences with lowest structures natural proteins sequences energies close to design model 10-4-18 9 De novo protein design method Gene Library Generate Yeast Surface FACS and Next-Gen DNA Select Individual Synthesis Display Libraries Sequencing Designs for Verification FACS: Fluorescein-Anti-Myc Ab to probe expression & R-Phycoerythrin Target of Limited Interest to probe binding activity Functional Myc-tag Protease FITC DESIGNED Binder PROTEIN Target Myc-tag Aga2 Aga1 DESIGNED Limited PROTEIN YEAST Aga2 Heat Aga1 CELL ofbinding Level YEAST Expression CELL Level of expression ~100,000 genes Transform yeast with Individual clones encoding mini- plasmids encoding Identify gene sequences expressing designed proteins ~60 aa minibinder design library, encoding functional minibinders are used to designed minibinders and treat with limited verify function protease and / or heat 10-4-18 10 Protein Design Takes Us Beyond Traditional Small Molecules and Antibodies • Computational design enables bottom up creation oF totally new Functional designer peptides, proteins, and nanomaterials. Traditional Computationally Traditional Computationally Small Molecule Designed Antibodies or Designed Drugs Mini-Proteins (<100 aa) Smaller Ab Smart Nanomaterials <1 KDa. Macrocycles (7-16 aa) Fragments 50 KDa. – 3 MDa. 1-12 KDa. 13-160 KDa. 10-4-18 11 Large Scale Design of Hyperstable Mini-Proteins Rocklin et al Science 2017 10-4-18 12 Design of Disulfide Stapled and Cyclic Mini-Proteins with Precise Control of Shape and Size αββ (0.99Å) βαβ (0.7Å) ββα (0.86Å) Mixed-chirality peptides stapled c(α α ) (0.79Å) - R L peptides Disulfide c(αα) (1.03Å) c(ααα) (1.06Å) c(ββ) (1.26Å) Design Models / NMR structure Cyclic peptides Bhardwaj, G*., Mulligan V.*, Bahl. C* et al., Nature (2016) 13 Rosetta can Design Peptide Macrocycles with Near Atomic Level Accuracy 7mer (rmsd: 0.8 Å) 7mer (rmsd: 1.1 Å) 8mer (rmsd: 0.3 Å) 9mer (rmsd: 1.2 Å) 10mer (rmsd: 0.8 Å) 10mer (rmsd: 0.5 Å) Hosseinzadeh, P. *, Bhardwaj, G*., Mulligan V.* et al., Science (2017) L-AA/D-AA computational model / NMR ensemble 14 Designed proteins show high thermal stability and resistance to chemical denaturation NC_HEE_D1 Bhardwaj, G*., Mulligan V.*, Bahl. C* et al., Nature (2016) 15 Immunogenicity ? 10-4-18 16 Causes of Immune Responses to Proteins 1. Multivalency = Aggregate Large Size Instability 2. MHC-II T-cell Epitopes 3. TCR - MHC-II T-cell Epitopes / Ag Primed B-Cell Synapse Sauerborn M, Brinks V, Jiskoot W, Schellekens H. Immunological mechanism underlying the immune response to recombinant human protein therapeutics. Trends Pharmacol Sci. 2010 Feb;31(2):53-9. 10-4-18 17 Features of Immunogenic Substances vs. De Novo Designed Mini-Binders ImmunogenicImmunogenic Protein Designed MiniDesigned-Binder / Macrocycle Mini-Binder / Protein Macrocycle Large size (> 10 KDa.) Small (< 10 KDa.) • Large size (> 10 KDa.) • Small (< 10 KDa.) Multivalent = B-cell receptor crosslinking Monomeric • Multivalent = B-cell receptor • Monomeric crosslinking • Hyper-Stable Poor• Poor stability stability = Denaturation = Denaturation = = Hyper-Stable• Not (>80 Self ˚C, Protease resistant) Aggregation = Multivalent Aggregation = Multivalent • Hard to digest or D-handed un- Not• SelfNot Self Not Self natural amino acids make it hard T•-cellT -epitopescell epitopes (MHC -(MHCII) require-II) require processing Hard to digestto or process. D-handed un-natural amino and presentation.processing and presentation. acids make• itRe hard-Designable to process. • Depends on route of entry Re• -designDepends (deimmunize) on route hardof entry Re-design (immune silence) easier delivery (e.g. mucosal, delivery (e.g. mucosal, Longsubcutaneous, T1/2 resident time intravenous) (Weeks) Short T1/2 (minutessubcutaneous, to hours) intravenous) • Short serum half-life Delivery• Adjuvants is often presentI.V. or S.C. to (systemic) trigger Delivery options, I.V., S.C., Aerosol (localized) • Simple formulations (PBS) Excipientinnate formulations immune function Simple formulations (PBS) 10-4-18 18 Designed Influenza Therapeutic Mini-Binder • Low cost potent, inhalable, long-lived, broadly neutralizing anti-viral therapeutic. • 40 amino acids (synthetic or recombinant) Heat Stable Aerosol • 2 disulfide bonds, Tm > 95˚C, Kd > ~5 nM. • In vitro Neutralization EC50 < 0.003 ug/ml. • Not immunogenic in mice. Flu Binder, 0.03 mg/kg All Mice Survive 100 80 Control 0.03mg/kg HB1.6928.2.3 60 40 Survival(%) Survival Control 20 0 02468101214 Time (d) Days After Flu Image by IPD and Cognition Studio Aaron Chevalier, Daniel Silva, Gabe Rocklin, David Baker et al., Nature 2017 10-4-18 19 Potent Anti-Flu Mini-binder is Hyperstable Pharmacokinetics Trypsin Resistant Heat Stable T1/2 = ~20 min. 10000 CS15134IV IV CS15134SQ SQ 1000 100 Plasma Conc (ng/mL) 10 1 0.0 4.0 8.0 12.0 16.0 20.0 24.0 Time (hrs) Aaron Chevalier, Daniel Silva, Gabe Rocklin, David Baker et al. , Nature 2017 10-4-18 20 Designed Mini-Binders Elicit Little or No Antibodies in Mice Dosing Protocols IgG Responses in ELISA (1:500 serum) Intravenous Intranasal Dosing 3mg/kg i.v. 1 2 3 hIgG Weeks 0 3 6 9 Blood Draws 1 2 3 BSA Dosing 3mg/kg i.n. 1 2 3 Weeks 0 2 4 6 Blood Draws 1 2 3 Designs are much less immunogenic than hIgG or BSA in mice ! Aaron Chevalier, Daniel Silva, Gabe Rocklin, David Baker et al., Nature 2017 10-4-18 21 Repeat Dosing of Mini-Binders Does Not Alter Prophylactic Efficacy in Mice Minibinder Dosing Flu virus 3mg/kg i.v. or i.n 2x MLD50 of H1N1 CA09 1 2 3 4 14 Days Trial 0 3 6 12 13.5 15.5 Weeks Minibinder 0.3 mg/kg i.n. Complete prophylactic protection after 4 repeated doses from 8 weeks to 1 week prior to lethal flu virus challenge ! Aaron Chevalier, Daniel Silva, Gabe Rocklin, David Baker with Deb Fuller’s Lab et al. , Nature 2017 10-4-18 22 ~Congruence Between Computational and Experimental Saturation Site Mutagenesis (SSM) AA Substitution AA Substitution Position in Designed Sequence Position in Designed Sequence Can Re-Design Sequence to Reduce T-Cell Epitope Liability if Needed ! Change Effect Enhance Function Neutral Destroy Function Aaron Chevalier, Daniel Silva, Gabe Rocklin, David Baker et al. , Nature 2017 10-4-18 23 Structure and Function By Design • Through design, we can maintain structural features through design of an astronomical number of different amino acid sequences. • As such, numerous desired target product profile features are achievable through de novo design. – Size – pI – Stability – H-bonding networks = hydration sphere – Others. • By definition de novo designed protein sequences do not exist in nature and could be recognized as foreign! • How can we design immune silence ? 10-4-18 24 MHC-II displays peptides on the surface of cells for T-cell receptors • Peptide binding cleft between 2 domains • binds 15-24mers, 9mer core • P1,P4,P6,P9 “pocket” positions 10-4-18 25 Reducing the Liability of T-cell Epitopes by Design • Through Rosetta design, we can maintain structural features and alter the amino acid sequence to “silence” predicted or known offending T-cell epitopes. 1. Host Genome Sequences 2. Known Epitope Sequences Machine Learning Predictions 3. King C, Garza EN, Mazor R, Linehan JL, Pastan I, Pepper M, Baker D. Removing T-cell epitopes with computational protein design. Proc Natl Acad Sci U S A. 2014 Jun 10;111(23):8577-82. 10-4-18 26 Current Immunogenicity Testing Paradigm for De Novo Designed Proteins • Step 1.
Recommended publications
  • Recent Advances in Automated Protein Design and Its Future
    EXPERT OPINION ON DRUG DISCOVERY 2018, VOL. 13, NO. 7, 587–604 https://doi.org/10.1080/17460441.2018.1465922 REVIEW Recent advances in automated protein design and its future challenges Dani Setiawana, Jeffrey Brenderb and Yang Zhanga,c aDepartment of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA; bRadiation Biology Branch, Center for Cancer Research, National Cancer Institute – NIH, Bethesda, MD, USA; cDepartment of Biological Chemistry, University of Michigan, Ann Arbor, MI, USA ABSTRACT ARTICLE HISTORY Introduction: Protein function is determined by protein structure which is in turn determined by the Received 25 October 2017 corresponding protein sequence. If the rules that cause a protein to adopt a particular structure are Accepted 13 April 2018 understood, it should be possible to refine or even redefine the function of a protein by working KEYWORDS backwards from the desired structure to the sequence. Automated protein design attempts to calculate Protein design; automated the effects of mutations computationally with the goal of more radical or complex transformations than protein design; ab initio are accessible by experimental techniques. design; scoring function; Areas covered: The authors give a brief overview of the recent methodological advances in computer- protein folding; aided protein design, showing how methodological choices affect final design and how automated conformational search protein design can be used to address problems considered beyond traditional protein engineering, including the creation of novel protein scaffolds for drug development. Also, the authors address specifically the future challenges in the development of automated protein design. Expert opinion: Automated protein design holds potential as a protein engineering technique, parti- cularly in cases where screening by combinatorial mutagenesis is problematic.
    [Show full text]
  • Structure-Function Relationship of Archaeal Rhodopsin Proteins Analyzed by Continuum Electrostatics
    Structure-Function Relationship of Archaeal Rhodopsin Proteins analyzed by Continuum Electrostatics DISSERTATION submitted to the Faculty of Biology, Chemistry and Geoscience of the University of Bayreuth, Germany for obtaining the degree of Doctor of Natural Sciences presented by Edda Kloppmann born in Luneburg,¨ Germany Bayreuth 2010 Vollstandiger¨ Abdruck der von der Fakultat¨ fur¨ Biologie, Chemie und Geowissenschaften der Universitat¨ Bayreuth genehmigten Disserta- tion zur Erlangung des akademischen Grades Doktor der Naturwis- ” senschaften (Dr. rer. nat.)“. Die vorliegende Arbeit wurde im Zeitraum von November 2002 bis November 2003 am IWR Heidelberg und von Dezember 2003 bis April 2007 am Lehrstuhl fur¨ Biopolymere an der Universitat¨ Bayreuth unter der Leitung von Professor G. Matthias Ullmann erstellt. Hiermit erklare¨ ich, dass ich die vorliegende Arbeit selbststandig¨ ver- fasst und keine anderen als die von mir angegebenen Quellen und Hilfsmittel verwendet habe. Ferner erklare¨ ich, dass ich nicht bereits anderweitig mit oder ohne Er- folg versucht habe, eine Dissertation einzureichen oder mich der Dok- torprufung¨ zu unterziehen. Bayreuth, den 13.01.2010 Prufungsausschuss:¨ 1. Gutachter: Prof. Dr. G. Matthias Ullmann 2. Gutachter: Prof. Dr. Franz X. Schmid Prufungsvorsitz:¨ Prof. Dr. Thomas Hellweg Prof. Dr. Benedikt Westermann Einreichung des Promotionsgesuchs: 13.01.2010 Kolloqium: 04.05.2010 Structure-Function Relationship of Archaeal Rhodopsin Proteins analyzed by Continuum Electrostatics ABSTRACT Rhodopsin proteins perform two cellular key functions: signaling of external stimuli and ion transport. Examples of both functional types are found in the family of archaeal rho- dopsins, namely the proton pump bacteriorhodopsin, the chloride pump halorhodopsin and the photoreceptor sensory rhodopsin II. For these three membrane proteins, high- resolution X-ray structures are available, allowing a theoretical investigation in atomic detail.
    [Show full text]
  • Advances in Rosetta Protein Structure Prediction on Massively Parallel Systems
    UC San Diego UC San Diego Previously Published Works Title Advances in Rosetta protein structure prediction on massively parallel systems Permalink https://escholarship.org/uc/item/87g6q6bw Journal IBM Journal of Research and Development, 52(1) ISSN 0018-8646 Authors Raman, S. Baker, D. Qian, B. et al. Publication Date 2008 Peer reviewed eScholarship.org Powered by the California Digital Library University of California Advances in Rosetta protein S. Raman B. Qian structure prediction on D. Baker massively parallel systems R. C. Walker One of the key challenges in computational biology is prediction of three-dimensional protein structures from amino-acid sequences. For most proteins, the ‘‘native state’’ lies at the bottom of a free- energy landscape. Protein structure prediction involves varying the degrees of freedom of the protein in a constrained manner until it approaches its native state. In the Rosetta protein structure prediction protocols, a large number of independent folding trajectories are simulated, and several lowest-energy results are likely to be close to the native state. The availability of hundred-teraflop, and shortly, petaflop, computing resources is revolutionizing the approaches available for protein structure prediction. Here, we discuss issues involved in utilizing such machines efficiently with the Rosetta code, including an overview of recent results of the Critical Assessment of Techniques for Protein Structure Prediction 7 (CASP7) in which the computationally demanding structure-refinement process was run on 16 racks of the IBM Blue Gene/Le system at the IBM T. J. Watson Research Center. We highlight recent advances in high-performance computing and discuss future development paths that make use of the next-generation petascale (.1012 floating-point operations per second) machines.
    [Show full text]
  • Precise Assembly of Complex Beta Sheet Topologies from De Novo 2 Designed Building Blocks
    1 Precise Assembly of Complex Beta Sheet Topologies from de novo 2 Designed Building Blocks. 3 Indigo Chris King*†, James Gleixner†, Lindsey Doyle§, Alexandre Kuzin‡, John F. Hunt‡, Rong Xiaox, 4 Gaetano T. Montelionex, Barry L. Stoddard§, Frank Dimaio†, David Baker† 5 † Institute for Protein Design, University of Washington, Seattle, Washington, United States 6 ‡ Biological Sciences, Northeast Structural Genomics Consortium, Columbia University, New York, New York, United 7 States 8 x Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Northeast Struc- 9 tural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, New Jersey, United States 10 § Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States 11 12 ABSTRACT 13 Design of complex alpha-beta protein topologies poses a challenge because of the large number of alternative packing 14 arrangements. A similar challenge presumably limited the emergence of large and complex protein topologies in evo- 15 lution. Here we demonstrate that protein topologies with six and seven-stranded beta sheets can be designed by inser- 16 tion of one de novo designed beta sheet containing protein into another such that the two beta sheets are merged to 17 form a single extended sheet, followed by amino acid sequence optimization at the newly formed strand-strand, strand- 18 helix, and helix-helix interfaces. Crystal structures of two such designs closely match the computational design models. 19 Searches for similar structures in the SCOP protein domain database yield only weak matches with different beta sheet 20 connectivities. A similar beta sheet fusion mechanism may have contributed to the emergence of complex beta sheets 21 during natural protein evolution.
    [Show full text]
  • I S C B N E W S L E T T
    ISCB NEWSLETTER FOCUS ISSUE {contents} President’s Letter 2 Member Involvement Encouraged Register for ISMB 2002 3 Registration and Tutorial Update Host ISMB 2004 or 2005 3 David Baker 4 2002 Overton Prize Recipient Overton Endowment 4 ISMB 2002 Committees 4 ISMB 2002 Opportunities 5 Sponsor and Exhibitor Benefits Best Paper Award by SGI 5 ISMB 2002 SIGs 6 New Program for 2002 ISMB Goes Down Under 7 Planning Underway for 2003 Hot Jobs! Top Companies! 8 ISMB 2002 Job Fair ISCB Board Nominations 8 Bioinformatics Pioneers 9 ISMB 2002 Keynote Speakers Invited Editorial 10 Anna Tramontano: Bioinformatics in Europe Software Recommendations11 ISCB Software Statement volume 5. issue 2. summer 2002 Community Development 12 ISCB’s Regional Affiliates Program ISCB Staff Introduction 12 Fellowship Recipients 13 Awardees at RECOMB 2002 Events and Opportunities 14 Bioinformatics events world wide INTERNATIONAL SOCIETY FOR COMPUTATIONAL BIOLOGY A NOTE FROM ISCB PRESIDENT This newsletter is packed with information on development and dissemination of bioinfor- the ISMB2002 conference. With over 200 matics. Issues arise from recommendations paper submissions and over 500 poster submis- made by the Society’s committees, Board of sions, the conference promises to be a scientific Directors, and membership at large. Important feast. On behalf of the ISCB’s Directors, staff, issues are defined as motions and are discussed EXECUTIVE COMMITTEE and membership, I would like to thank the by the Board of Directors on a bi-monthly Philip E. Bourne, Ph.D., President organizing committee, local organizing com- teleconference. Motions that pass are enacted Michael Gribskov, Ph.D., mittee, and program committee for their hard by the Executive Committee which also serves Vice President work preparing for the conference.
    [Show full text]
  • Bioengineering Professor Trey Ideker Wins 2009 Overton Prize
    Bioengineering Professor Trey Ideker Wins 2009 Overton Prize March 13, 2009 Daniel Kane University of California, San Diego bioengineering professor Trey Ideker-a network and systems biology pioneer-has won the International Society for Computational Biology's Overton Prize. The Overton prize is awarded each year to an early-to-mid-career scientist who has already made a significant contribution to the field of computational biology. Trey Ideker is an Associate Professor of Bioengineering at UC San Diego's Jacobs School of Engineering, Adjunct Professor of Computer Science, and member of the Moores UCSD Cancer Center. He is a pioneer in using genome-scale measurements to construct network models of cellular processes and disease. His recent research activities include development of software and algorithms for protein network analysis, network-level comparison of pathogens, and genome-scale models of the response to DNA-damaging agents. "Receiving this award is a wonderful honor and helps to confirm that the work we have been doing for the past several years has been useful to people," said Ideker. "This award also provides great recognition to UC San Diego which has fantastic bioinformatics programs both at the undergraduate and graduate level. I could never have done it without the help of some really first-rate bioinformatics and bioengineering graduate students," said Ideker. Ideker is on the faculty of the Jacobs School of Engineering's Department of Bioengineering, which ranks 2nd in the nationfor biomedical engineering, according to the latest US News rankings. The bioengineering department has ranked among the top five programs in the nation every year for the past decade.
    [Show full text]
  • Precise Assembly of Complex Beta Sheet Topologies from De Novo Designed Building Blocks
    SHORT REPORT Precise assembly of complex beta sheet topologies from de novo designed building blocks Indigo Chris King1*†, James Gleixner1, Lindsey Doyle2, Alexandre Kuzin3, John F Hunt3, Rong Xiao4, Gaetano T Montelione4, Barry L Stoddard2, Frank DiMaio1, David Baker1 1Institute for Protein Design, University of Washington, Seattle, United States; 2Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, United States; 3Biological Sciences, Northeast Structural Genomics Consortium, Columbia University, New York, United States; 4Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, United States Abstract Design of complex alpha-beta protein topologies poses a challenge because of the large number of alternative packing arrangements. A similar challenge presumably limited the emergence of large and complex protein topologies in evolution. Here, we demonstrate that protein topologies with six and seven-stranded beta sheets can be designed by insertion of one de novo designed beta sheet containing protein into another such that the two beta sheets are merged to form a single extended sheet, followed by amino acid sequence optimization at the newly formed strand-strand, strand-helix, and helix-helix interfaces. Crystal structures of two such *For correspondence: chrisk1@ uw.edu designs closely match the computational design models. Searches for similar structures in the SCOP protein domain database yield only weak matches with different beta sheet connectivities. A † Present address: Molecular similar beta sheet fusion mechanism may have contributed to the emergence of complex beta Engineering and Sciences, sheets during natural protein evolution. University of Washington, DOI: 10.7554/eLife.11012.001 Seattle, United States Competing interests: The authors declare that no competing interests exist.
    [Show full text]
  • A Glance Into the Evolution of Template-Free Protein Structure Prediction Methodologies Arxiv:2002.06616V2 [Q-Bio.QM] 24 Apr 2
    A glance into the evolution of template-free protein structure prediction methodologies Surbhi Dhingra1, Ramanathan Sowdhamini2, Frédéric Cadet3,4,5, and Bernard Offmann∗1 1Université de Nantes, CNRS, UFIP, UMR6286, F-44000 Nantes, France 2Computational Approaches to Protein Science (CAPS), National Centre for Biological Sciences (NCBS), Tata Institute for Fundamental Research (TIFR), Bangalore 560-065, India 3University of Paris, BIGR—Biologie Intégrée du Globule Rouge, Inserm, UMR_S1134, Paris F-75015, France 4Laboratory of Excellence GR-Ex, Boulevard du Montparnasse, Paris F-75015, France 5DSIMB, UMR_S1134, BIGR, Inserm, Faculty of Sciences and Technology, University of La Reunion, Saint-Denis F-97715, France Abstract Prediction of protein structures using computational approaches has been explored for over two decades, paving a way for more focused research and development of algorithms in com- parative modelling, ab intio modelling and structure refinement protocols. A tremendous suc- cess has been witnessed in template-based modelling protocols, whereas strategies that involve template-free modelling still lag behind, specifically for larger proteins (> 150 a.a.). Various improvements have been observed in ab initio protein structure prediction methodologies over- time, with recent ones attributed to the usage of deep learning approaches to construct protein backbone structure from its amino acid sequence. This review highlights the major strategies undertaken for template-free modelling of protein structures while discussing few tools devel- arXiv:2002.06616v2 [q-bio.QM] 24 Apr 2020 oped under each strategy. It will also briefly comment on the progress observed in the field of ab initio modelling of proteins over the course of time as seen through the evolution of CASP platform.
    [Show full text]
  • Deep Learning-Based Advances in Protein Structure Prediction
    International Journal of Molecular Sciences Review Deep Learning-Based Advances in Protein Structure Prediction Subash C. Pakhrin 1,†, Bikash Shrestha 2,†, Badri Adhikari 2,* and Dukka B. KC 1,* 1 Department of Electrical Engineering and Computer Science, Wichita State University, Wichita, KS 67260, USA; [email protected] 2 Department of Computer Science, University of Missouri-St. Louis, St. Louis, MO 63121, USA; [email protected] * Correspondence: [email protected] (B.A.); [email protected] (D.B.K.) † Both the authors should be considered equal first authors. Abstract: Obtaining an accurate description of protein structure is a fundamental step toward under- standing the underpinning of biology. Although recent advances in experimental approaches have greatly enhanced our capabilities to experimentally determine protein structures, the gap between the number of protein sequences and known protein structures is ever increasing. Computational protein structure prediction is one of the ways to fill this gap. Recently, the protein structure prediction field has witnessed a lot of advances due to Deep Learning (DL)-based approaches as evidenced by the suc- cess of AlphaFold2 in the most recent Critical Assessment of protein Structure Prediction (CASP14). In this article, we highlight important milestones and progresses in the field of protein structure prediction due to DL-based methods as observed in CASP experiments. We describe advances in various steps of protein structure prediction pipeline viz. protein contact map prediction, protein distogram prediction, protein real-valued distance prediction, and Quality Assessment/refinement. We also highlight some end-to-end DL-based approaches for protein structure prediction approaches. Additionally, as there have been some recent DL-based advances in protein structure determination Citation: Pakhrin, S.C.; Shrestha, B.; using Cryo-Electron (Cryo-EM) microscopy based, we also highlight some of the important progress Adhikari, B.; KC, D.B.
    [Show full text]
  • IG-VAE: Generative Modeling of Immunoglobulin Proteins by Direct
    bioRxiv preprint doi: https://doi.org/10.1101/2020.08.07.242347; this version posted August 10, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. IG-VAE: GENERATIVE MODELING OF IMMUNOGLOBULIN PROTEINS BY DIRECT 3D COORDINATE GENERATION Raphael R. Eguchi Namrata Anand Christian A. Choe Department of Biochemistry Department of Bioengineering Department of Bioengineering Stanford University Stanford University Stanford University [email protected] [email protected] [email protected] Po-Ssu Huang Department of Bioengineering Stanford University [email protected] ABSTRACT While deep learning models have seen increasing applications in protein science, few have been implemented for protein backbone generation—an important task in structure-based problems such as active site and interface design. We present a new approach to building class-specific backbones, using a variational auto-encoder to directly generate the 3D coordinates of immunoglobulins. Our model is torsion- and distance-aware, learns a high-resolution embedding of the dataset, and generates novel, high-quality structures compatible with existing design tools. We show that the Ig-VAE can be used to create a computational model of a SARS-CoV2-RBD binder via latent space sampling. We further demonstrate that the model’s generative prior is a powerful tool for guiding computational protein design, motivating a new paradigm under which backbone design is solved as constrained optimization problem in the latent space of a generative model. 1 Introduction Over the past two decades, the field of protein design has made steady progress, with new computational methods providing novel solutions to challenging problems such as enzyme catalysis, viral inhibition, de novo structure generation and more[1, 2, 3, 4, 5, 6, 7, 8, 9].
    [Show full text]
  • THE FUTURE POSTPONED 2.0 Why Declining Investment in Basic Research Threatens a U.S
    THE FUTURE POSTPONED 2.0 Why Declining Investment in Basic Research Threatens a U.S. Innovation Deficit MIT Washington Office 820 First St. NE, Suite 610 Washington, DC 20002-8031 Tel: 202-789-1828 futurepostponed.org THE FUTURE POSTPONED 2.0 Why Declining Investment in Basic Research Threatens a U.S. Innovation Deficit The Future Postponed 2.0. October 2016. Cambridge, Massachusetts. This material may be freely reproduced. Cover image: Leigh Prather/Shutterstock Contents UNLEASHING THE POWER OF SYNTHETIC PROTEINS 4 Potential Breakthroughs in Medicine, Energy, and Technology DAVID BAKER, Professor of Biochemistry, University of Washington Investigator, Howard Hughes Medical Institute PREDICTING THE FUTURE OF EARTH’S FORESTS 9 Forests absorb carbon dioxide and are thus an important buffer against climate change, for now. Understanding forest dynamics would enable both better management of forests and the ability to assess how they are changing. STUART J. DAVIES, Frank H. Levinson Chair in Global Forest Science; Director, Forest Global Earth Observatory-Center for Tropical Forest Science, Smithsonian Tropical Research Institute RESETTING THE CLOCK OF LIFE 13 We know that the circadian clock keeps time in every living cell, controlling biological processes such as metabolism, cell division, and DNA repair, but we don’t understand how. Gaining such knowledge would not only offer fundamental insights into cellular biochemistry, but could also yield practical results in areas from agriculture to medicine to human aging. NING ZHENG, Howard Hughes Medical Institute, Department of Pharmacology University of Washington; MICHELE PAGANO, M.D., Howard Hughes Medical Institute, Department of Pathology and NYU Cancer Institute, New York University School of Medicine CREATING A CENSUS OF HUMAN CELLS 16 For the first time, new techniques make possible a systematic description of the myriad types of cells in the human body that underlie both health and disease.
    [Show full text]
  • Alzheimer's Disease Ask Document
    UW MEDICINE INSTITUTE FOR PROTEIN DESIGN TACKLING THE FOUNDATION OF ALZHEIMER’S DISEASE The University of Washington Institute for Protein Design LZHEIMER’S DISEASE IS A DEVASTATING CONDITION. This relentlessly Aprogressive neurodegenerative disorder has no cure, it can’t be prevented, and the mean life expectancy of a person who has been diagnosed is a short seven years. The disease imposes heavy social, psychological, physical and economic burdens on caregivers, and it is costly for society: the U.S. now spends an estimated $100 billion on it every year. And costs are steadily rising with the aging of the population. Unfortunately, despite investing billions of dollars and many years into more than 400 potential drug treatments, there are, as yet, no solutions for Alzheimer’s disease. Given that the number of people affected worldwide could quadruple to more than 100 million by 2050, we think it is time for a totally new approach: protein design. Scientists at the University of Washington’s Institute for Protein Design have already taken the first steps to change the paradigm for Alzheimer’s disease research and treatment. We invite you to join us. Normal nerve cells are Mis-folding Proteins and Alzheimer’s Disease shown on the top. Accumulation of amyloid Proteins are the body’s workhorses: the major functional components of living cells. plaques and formation However, over a person’s lifetime, otherwise normal proteins may malfunction — and may of neurofibrillary tangles even cause toxic activity that harms the cell. Many diseases, including Alzheimer’s disease, (shown on bottom) are hallmarks of result from a protein malfunction known as mis-folding.
    [Show full text]