Supersecondary Structure Motifs and De Novo Protein Structure Prediction
Recommended publications
-
Protein Structure: Data Bases and Classification
Protein Structure: Data Bases and Classification. Ingo Ruczinski, Department of Biostatistics, Johns Hopkins University. Reference: Bourne and Weissig, Structural Bioinformatics, Wiley, 2003. More References. Structural Proteins; Membrane Proteins; Globular Proteins.
Terminology: • Primary Structure • Secondary Structure • Tertiary Structure • Quaternary Structure • Supersecondary Structure • Domain • Fold
Hierarchy of Protein Structure
Helices: • α-helix: 3.6 amino acids/turn, frequency ~97%, H-bonding i, i+4 • 3₁₀-helix: 3.0 amino acids/turn, frequency ~3%, H-bonding i, i+3 • π-helix: 4.4 amino acids/turn, rare, H-bonding i, i+5
α-helices have handedness. α-helices have a dipole. β-sheets have a right-handed twist. β-sheets can form higher-level structures. Super Secondary Structure Motifs
What is a Domain? Richardson (1981): Within a single subunit [polypeptide chain], contiguous portions of the polypeptide chain frequently fold into compact, local semi-independent units called domains.
More About Domains • Independent folding units. • Lots of within contacts, few outside. • Domains create their own hydrophobic core. • Regions usually conserved during recombination. • Different domains of the same protein can have different functions. • Domains of the same protein may or may not interact.
Why Look for Domains? Domains are the currency of protein function!
Domain Size • Domains can be between 25 and 500 residues long. • Most are less than 200 residues. • Domains can be smaller than 50 residues, but these need to be stabilized. Examples are the zinc finger and a scorpion toxin.
Two Very Small Domains. A Humdinger of a Domain. What’s the Domain? (Part 1) What’s the Domain? (Part 2)
Homology and Analogy • Homology: Similarity in characteristics resulting from shared ancestry. -
Water in Protein Structure Prediction
Water in protein structure prediction. Garegin A. Papoian†‡, Johan Ulander†‡§, Michael P. Eastwood†¶, Zaida Luthey-Schulten‖, and Peter G. Wolynes†,†† †Department of Chemistry and Biochemistry and Center for Theoretical Biological Physics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0371; and ‖Department of Chemistry, University of Illinois at Urbana–Champaign, Urbana, IL 61801. Contributed by Peter G. Wolynes, December 4, 2003.
Proteins have evolved to use water to help guide folding. A physically motivated, nonpairwise-additive model of water-mediated interactions added to a protein structure prediction Hamiltonian yields marked improvement in the quality of structure prediction for larger proteins. Free energy profile analysis suggests that long-range water-mediated potentials guide folding and smooth the underlying folding funnel. Analyzing simulation trajectories gives direct evidence that water-mediated interactions facilitate native-like packing of supersecondary structural elements. Long-range pairing of hydrophilic groups is an integral part of protein architecture. Specific water-mediated interactions are a universal feature of biomolecular recognition landscapes in both [...]
[...] interactions play an important role not only in binding interfaces but in folding of monomeric proteins. We use the associative memory (AM) Hamiltonian molecular dynamics model as a starting point (14–16). This Hamiltonian has two principal components: general polymer physics-based terms that are sequence independent, collectively called "backbone," and sequence-dependent knowledge-based distance-dependent additive potentials, collectively denoted as AM/C (AM/contact). The AM part describes interactions between all pairs of residues that are separated in sequence between 3 and 12 residues. It uses a set of nonhomologous memory proteins to build a funneled energy landscape by matching fragments. -
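The sequence-separation window mentioned in the excerpt above (residue pairs separated by 3 to 12 positions) can be illustrated with a minimal sketch. It only enumerates index pairs; it is not the AM Hamiltonian itself. The function name `am_pairs`, the zero-based indexing, and the inclusive treatment of both bounds are assumptions made for illustration.

```python
def am_pairs(n_residues, min_sep=3, max_sep=12):
    """Enumerate residue index pairs (i, j) with min_sep <= j - i <= max_sep.

    Hypothetical helper: indices are zero-based and both bounds are
    treated as inclusive, which is an assumption about the excerpt.
    """
    pairs = []
    for i in range(n_residues):
        last_j = min(i + max_sep, n_residues - 1)
        for j in range(i + min_sep, last_j + 1):
            pairs.append((i, j))
    return pairs


# Example: pairs that fall inside the window for a 20-residue chain.
pairs_20 = am_pairs(20)
print(len(pairs_20))
```

For a chain much longer than the window, the number of such pairs grows linearly with chain length, since each residue has at most ten partners ahead of it in sequence.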
A Deep Learning Approach for Protein-Ligand Interaction Prediction
bioRxiv preprint doi: https://doi.org/10.1101/2019.12.20.884841; this version posted September 20, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
SSnet: A Deep Learning Approach for Protein-Ligand Interaction Prediction. Niraj Verma,†,§ Xingming Qu,‡,§ Francesco Trozzi,†,§ Mohamed Elsaied,¶,§ Nischal Karki,†,§ Yunwen Tao,†,§ Brian Zoltowski,†,§ Eric C. Larson,‡,§ and Elfi Kraka∗,†,§ †Department of Chemistry, Southern Methodist University, Dallas TX USA; ‡Department of Computer Science, Southern Methodist University, Dallas TX USA; ¶Department of Engineering Management and Information System, Southern Methodist University, Dallas TX USA; §Southern Methodist University. E-mail: [email protected]
Abstract: Computational prediction of Protein-Ligand Interaction (PLI) is an important step in the modern drug discovery pipeline as it mitigates the cost, time, and resources required to screen novel therapeutics. Deep Neural Networks (DNN) have recently shown excellent performance in PLI prediction. However, the performance is highly dependent on protein and ligand features utilized for the DNN model. Moreover, in current models, the deciphering of how protein features determine the underlying principles that govern PLI is not trivial. In this work, we developed a DNN framework named SSnet that utilizes secondary structure information of proteins extracted as the curvature and torsion of the protein backbone to predict PLI. We demonstrate the performance of SSnet by comparing against a variety of currently popular machine and non-machine learning models using various metrics. -
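To give a feel for what "curvature and torsion of the protein backbone" can mean on a discrete chain of points, here is a rough sketch. It is not SSnet's feature extraction, which the excerpt does not describe. It assumes the backbone is given as a list of 3D points (for example Cα positions), uses the Menger curvature of three consecutive points as the discrete curvature, and uses the dihedral angle of four consecutive points as a simple stand-in for torsion. All function names are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def menger_curvature(p0, p1, p2):
    """Curvature of the circle through three points (1 / circumradius)."""
    a = _norm(_sub(p1, p0))
    b = _norm(_sub(p2, p1))
    c = _norm(_sub(p2, p0))
    twice_area = _norm(_cross(_sub(p1, p0), _sub(p2, p0)))
    # Menger curvature is 4 * area / (a * b * c).
    return 2.0 * twice_area / (a * b * c)


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (radians) defined by four consecutive points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    b1_len = _norm(b1)
    b1_hat = (b1[0] / b1_len, b1[1] / b1_len, b1[2] / b1_len)
    m1 = _cross(n1, b1_hat)
    return math.atan2(_dot(m1, n2), _dot(n1, n2))


def backbone_profile(points):
    """Per-position curvature and dihedral lists along a chain of points."""
    curv = [menger_curvature(*points[i:i + 3]) for i in range(len(points) - 2)]
    tors = [dihedral(*points[i:i + 4]) for i in range(len(points) - 3)]
    return curv, tors
```

A chain of N points yields N-2 curvature values and N-3 dihedral values, so any downstream model would need to pad or align the two profiles; how SSnet handles this is not stated in the excerpt.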
Predicting Protein Secondary and Supersecondary Structure
29 Predicting Protein Secondary and Supersecondary Structure. Mona Singh, Princeton University.
29.1 Introduction: Background • Difficulty of general protein structure prediction • A bottom-up approach
29.2 Secondary structure: Early approaches • Incorporating local dependencies • Exploiting evolutionary information • Recent developments and conclusions
29.3 Tight turns
29.4 Beta hairpins
29.5 Coiled coils: Early approaches • Incorporating local dependencies • Predicting oligomerization • Structure-based predictions • Predicting coiled-coil protein interactions • Promising future directions
29.6 Conclusions
29.1 Introduction. Proteins play a key role in almost all biological processes. They take part in, for example, maintaining the structural integrity of the cell, transport and storage of small molecules, catalysis, regulation, signaling and the immune system. Linear protein molecules fold up into specific three-dimensional structures, and their functional properties depend intricately upon their structures. As a result, there has been much effort, both experimental and computational, in determining protein structures. Protein structures are determined experimentally using either x-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. While -
Supersecondary Structures (Structural Motifs)
Supersecondary Structures (structural motifs). Various Sources. Lecture 2, Biochemistry 4000.
Supersecondary Structures (Motifs): Combinations of secondary structures in specific geometric arrangements ● Simple supersecondary structures consisting of 3 or fewer secondary structures considered here ● Large supersecondary structures (i.e. Greek Key and Jelly-Roll) consisting of more than 3 secondary structures will be considered along with tertiary structures and folds ● Why? Large supersecondary structures can be domains. Simple supersecondary structures are typically composed of two secondary structures (i.e. strands or helices) and a turn (or loop). Helix-turn-helix DNA binding motif.
Simple supersecondary structures are the 'building blocks' of: a) Complex Supersecondary Structures b) Tertiary Structure / Domain folds. Simple supersecondary structures are relatively stable components of the 'Molten Globule' protein folding intermediate.
[Figure: Hierarchy of Protein Structure. α, β; αα, βαβ, ββ; ββββ (Greek Key), αααα (4 helix bundle), βαβαβ (Rossmann fold), βββ (β-meander)]
β-hairpin: Simple and common supersecondary structure. β-hairpin: two antiparallel β-strands joined by a turn or loop ● Very small supersecondary structure (typically less than 10 residues) -
Super Secondary Structures of Proteins with Post-Translational Modifications in Colon Cancer
Molecules. Article. Super Secondary Structures of Proteins with Post-Translational Modifications in Colon Cancer. Dmitry Tikhonov 1,2, Liudmila Kulikova 1, Arthur Kopylov 3, Kristina Malsagova 3, Alexander Stepanov 3, Vladimir Rudnev 2 and Anna Kaysheva 3,* 1 Institute of Mathematical Problems of Biology RAS, the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia; [email protected] (D.T.); [email protected] (L.K.) 2 Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia; [email protected] 3 V.N. Orekhovich Institute of Biomedical Chemistry, 119121 Moscow, Russia; [email protected] (A.K.); [email protected] (K.M.); [email protected] (A.S.) * Correspondence: [email protected]; Tel.: +79-199-175-017. Academic Editors: Susy Piovesana and Rudy J. Richardson. Received: 1 June 2020; Accepted: 6 July 2020; Published: 9 July 2020.
Abstract: New advances in protein post-translational modifications (PTMs) have revealed a complex layer of regulatory mechanisms through which PTMs control cell signaling and metabolic pathways, contributing to the diverse metabolic phenotypes found in cancer. Using conformational templates and investigation of the three-dimensional (3D) environment of proteins in patients with colorectal cancer, it was demonstrated that most PTMs (phosphorylation, acetylation, and ubiquitination) are localized in the supersecondary structures (helical pairs). We showed that such helical pairs are represented on the outer surface of protein molecules and characterized by a large area accessible to the surrounding solvent. The most promising and meaningful modifications were observed on the surface of vitamin D-binding protein (VDBP), complement C4-A (CO4A), X-ray repair cross-complementing protein 6 (XRCC6), plasma protease C1 inhibitor (IC1), and albumin (ALBU), which are related to the development of colorectal cancer. -
Protein Structure
Proteins: an introduction. Life is the result of the chemical activity of proteins (with nucleic acids essentially confined to encoding and producing them).
A cross-section through Escherichia coli shows the diversity and density of macromolecules forming a cell. Labels: flagella, pili. (Howard Berg, David Goodsell)
A cross-section through a human cell, from the plasma membrane to the nucleus (David Goodsell). The successive enlargements show:
• cell-surface glycoproteins, the sub-membranous cytoskeleton (primarily actin filaments), a microtubule, ribosomes (in pink; they are everywhere except the nucleus); relative size of E. coli shown for comparison
• the endoplasmic reticulum ('rough ER'), Golgi stacks, and a clathrin cage assembling for vesicle budding
• a mitochondrion
• a nuclear pore, the perinuclear and intranuclear cytoskeleton, chromatin, spliceosomes and other RNP particles
Proteins are extremely diverse (David Goodsell).
What is a protein? • protein - 1844, from Fr. -
Structural Bioinformatics
Course plan, 2015-2016 Academic Year. Qualification: Master's Degree. Structural Bioinformatics.
Descriptive details concerning the subject: · Name of the subject: Structural Bioinformatics · Code: 30169 · Type of subject: Elective · Module: SBI · Credits: 5 ECTS · Total hours: 125.0 · Year: 1st · Term: 2nd · Coordination: Baldo Oliva · Department: Ciències Experimentals i de la Salut · Building: Campus Mar
Teaching staff: · Group: 1 · Language: English · Lecturer: Baldo Oliva · Department: Ciències Experimentals i de la Salut
Subject presentation: · Presentation: The main goal of the course is to learn the basic concepts of the structure of macromolecules, more specifically of proteins. The principles of crystallography are also given in order to gain a deeper understanding of three-dimensional data. The main features relating sequence and protein fold will be used in the prediction of the secondary and tertiary structure of proteins and in its evaluation. Finally, the relationship between sequence/structure and function of proteins will be analysed.
Prerequisites in order to follow the itinerary:
1) Knowledge in Bioinformatics: alignment algorithms (Needleman-Wunsch and Smith-Waterman) and substitution matrices (PAM and BLOSUM); Hidden Markov Models and Neural Networks; Linux and Windows operating systems; Perl programming.
2) Knowledge in Mathematics: derivatives and integrals in n dimensions; Taylor polynomials; linear algebra and trigonometry; main concepts of statistics and probability: combinatorial analysis, permutations, Bayes' theorem, Gaussian and Poisson distributions; numerical analysis, optimization and fitting of functions.
3) Knowledge in Physics: Newtonian classical mechanics; thermodynamics, Gibbs energy, entropy, enthalpy and basic statistical mechanics.
4) Knowledge in Chemistry: chemical structure, conformation, configuration and stereoisomers; reactivity, transition state, electrophilic and nucleophilic attacks; electronic density, molecular orbitals, quantum numbers, Schrödinger equation and wave function; steady state and equilibrium. -
Research Article Prediction of Four Kinds of Simple Supersecondary Structures in Protein by Using Chemical Shifts
Hindawi Publishing Corporation, The Scientific World Journal, Volume 2014, Article ID 978503, 5 pages, http://dx.doi.org/10.1155/2014/978503. Research Article. Prediction of Four Kinds of Simple Supersecondary Structures in Protein by Using Chemical Shifts. Feng Yonge, College of Science, Inner Mongolia Agriculture University, Hohhot 010018, China. Correspondence should be addressed to Feng Yonge; [email protected]. Received 7 May 2014; Revised 3 June 2014; Accepted 4 June 2014; Published 18 June 2014. Academic Editor: Hao Lin. Copyright © 2014 Feng Yonge. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Knowledge of supersecondary structures can provide important information about the spatial structure of a protein. Some approaches have been developed for the prediction of protein supersecondary structure. However, the features used by these approaches are primarily based on amino acid sequences. In this study, a novel model is presented to predict protein supersecondary structure by use of chemical shift (CS) information derived from nuclear magnetic resonance (NMR) spectroscopy. Using these CSs as inputs to quadratic discriminant (QD) analysis, we achieve an overall prediction accuracy of 77.3%, which is competitive with the same method for predicting supersecondary structures from amino acid compositions in threefold cross-validation. Moreover, our finding suggests that the combined use of different chemical shifts will influence the accuracy of prediction.
1. Introduction. The prediction of protein structure is always one of the most important research topics in the field of bioinformatics. [...] support vector machine [10]. And the overall accuracy of 78% was achieved. The features of these studies were mainly derived from the amino acid compositions or dipeptide -
Secondary Structure Prediction for RNA Binding Domain in RNP Proteins Identifies βαβ as the Main Structural Motif
Volume 257, number 2, 373-376, FEB 07796, November 1989. Secondary structure prediction for RNA binding domain in RNP proteins identifies βαβ as the main structural motif. A. Ghetti, C. Padovani, G. Di Cesare and C. Morandi. Istituto di Scienze Biologiche, Università di Verona, Strada le Grazie, 37334 Verona, Italy. Received 23 August 1989.
In eukaryotic cells transcript processing is strictly dependent upon binding of specific proteins. Nuclear RNA binding proteins share a common domain, which is involved in RNA binding. In order to characterize RNP-RNA interactions we have performed a secondary structure prediction based both on statistical algorithms and comparative analysis of different proteins. A high conservation for secondary structure propensity between different RNPs was observed. Ribonucleic protein; Structure prediction.
1. INTRODUCTION. In recent years, it has been demonstrated that eukaryotic mRNAs are always associated in the cytoplasm as well as in the nucleoplasm with specific proteins which play an important role in transcript metabolism [1,2]. After transcription by Pol II, the synthesized heterogeneous nuclear RNA (hnRNA) tightly associates with a set of proteins called 'heterogeneous nuclear ribonucleic proteins' (hnRNP) to form a RNA/protein complex in a 'nucleosome-like' structure, [...]
[...] polypeptide) confers with the protein the affinity for single-stranded nucleic acids. Further analysis of UP1 sequence domain revealed that it contains a tandem duplicate of 90 residues. Later works identified the UP1's subdomain in other nuclear proteins from different sources: polyadenylate binding proteins [10-12], C1 hnRNP [13], nucleolin [14], proteins bound to small nuclear RNAs (snRNP) [15-17] and others. A common characteristic of these proteins is their ability to bind single-stranded nucleic acids in vitro and RNA in vivo. -
Local Protein Structures Bernard Offmann, Manoj Tyagi, Alexandre De Brevern
Local Protein Structures. Bernard Offmann, Manoj Tyagi, Alexandre de Brevern. To cite this version: Bernard Offmann, Manoj Tyagi, Alexandre de Brevern. Local Protein Structures. Current Bioinformatics, Bentham Science, 2007, 2, pp.165-202. inserm-00175058. HAL Id: inserm-00175058, https://www.hal.inserm.fr/inserm-00175058. Submitted on 11 May 2010. HAL author manuscript inserm-00175058, version 1. Preprint for Current Bioinformatics 2007.
Offmann B. 1, Tyagi M. 1+ & de Brevern A.G. 2* 1 Laboratoire de Biochimie et Génétique Moléculaire, Université de La Réunion, 15, avenue René Cassin, BP7151, 97715 Saint Denis Messag Cedex 09, La Réunion, France 2 Equipe de Bioinformatique Génomique et Moléculaire (EBGM), INSERM UMR-S 726, Université Paris Diderot, case 7113, 2, place Jussieu, 75251 Paris, France * Corresponding author: mailing address: Dr. de Brevern A.G., Equipe de Bioinformatique Génomique et Moléculaire (EBGM), INSERM UMR-S 726, Université Paris Diderot, case 7113, 2, place Jussieu, 75251 Paris, France. E-mail: [email protected] Tel: (33) 1 44 27 77 31 Fax: (33) 1 43 26 38 30
Key words: secondary structure, protein folds, structure-sequence relationship, structural alphabet, protein blocks, molecular modeling, ab initio. -
Amino Acid Distribution Rules Predict Protein Fold: Protein Grammar for Beta-Strand Sandwich-Like Structures
Biomolecules 2015, 5, 41-59; doi:10.3390/biom5010041. Open Access. ISSN 2218-273X. www.mdpi.com/journal/biomolecules/ Article. Amino Acid Distribution Rules Predict Protein Fold: Protein Grammar for Beta-Strand Sandwich-Like Structures. Alexander Kister, Department of Mathematics, Rutgers University, Piscataway, NJ 08854, USA; E-Mail: [email protected]. Received: 13 October 2014 / Accepted: 31 December 2014 / Published: 23 January 2015.
Abstract: We present an alternative approach to protein 3D folding prediction based on determination of rules that specify the distribution of “favorable” residues, which are mainly responsible for the formation of a given fold, and “unfavorable” residues, which are incompatible with that fold, in polypeptide sequences. The process of determining favorable and unfavorable residues is iterative. The starting assumptions are based on the general principles of protein structure formation as well as structural features peculiar to the protein fold under investigation. The initial assumptions are tested one by one for a set of all known proteins with a given structure. An assumption is accepted as a “rule of amino acid distribution” for the protein fold if it holds true for all, or nearly all, structures. If the assumption is not accepted as a rule, it can be modified to better fit the data and then tested again in the next step of the iterative search algorithm, or rejected. We determined the set of amino acid distribution rules for a large group of beta sandwich-like proteins characterized by a specific arrangement of strands in two beta sheets. It was shown that this set of rules is highly sensitive (~90%) and very specific (~99%) for identifying sequences of proteins with the specified beta sandwich fold structure.
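The sensitivity and specificity figures quoted in the abstract above follow the usual confusion-matrix definitions. A minimal sketch is given below; the function name and the counts in the example are made up purely for illustration and are not data from the paper.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true fold members recovered.
    specificity = TN / (TN + FP): fraction of non-members correctly rejected.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Illustrative counts only (hypothetical, not taken from the article):
# 90 of 100 fold members found, 990 of 1000 non-members rejected.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=990, fp=10)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With a rule set of this kind, a high specificity matters most when scanning large sequence databases, because even a small false-positive rate applied to many non-member sequences can outnumber the true hits.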