Functional Analysis of Short Linear Motifs in Intrinsically Disordered Regions

by Mitchell Li Cheong Man

A thesis submitted in conformity with the requirements for the degree of Master of Science
Department of Cell and Systems Biology
University of Toronto

© Copyright by Mitchell Li Cheong Man 2017

Abstract

Short linear motifs (SLiMs) are regulatory binding sites that are involved in signalling and protein regulation. SLiMs are often found in intrinsically disordered regions (IDRs), which are rapidly evolving and lack stable tertiary conformations. Despite their prevalence throughout the proteome, many SLiMs remain unknown, and understanding how they cooperate and evolved is an ongoing endeavour. The goal of this thesis is to address the properties of SLiMs and how they evolved, using bioinformatics methods and evolutionary models. To further examine the role of SLiMs in signal transduction, I show that deletion of predicted SLiMs (pSLiMs) has a broad range of quantitative effects on signalling pathway output. Next, to explore what properties are important in substrate recognition, I show that the combination and order of motifs can predict target specificity. Lastly, using a comparative phylogenetic approach to investigate the evolution of motifs, I provide evidence that phosphorylation and docking sites coevolved.

Acknowledgements

I would first and foremost like to thank Alan for giving me this opportunity to join his lab. Thank you for pushing me to pursue bioinformatics and for always supporting me, whether with advice on my project or otherwise; you have helped me to gain a much deeper understanding of and appreciation for science.

I would also like to thank my committee supervisors for their support and guidance throughout my project. A great thank you to Belinda Chang for her advice to pursue covariation of motifs, and Julie Forman-Kay for her expertise in understanding the field of intrinsically disordered regions. Additionally, I would like to thank Nick Provart for agreeing to evaluate this thesis.

Thank you to the Moses lab, both past and present (Alex L, Alex N, Bob, Caressa, Gavin, Ian, Liz, Muluye, Nirvana, Purnima, Selma and Taraneh). Your shared knowledge, acumen, thoughtfulness, sincerity and understanding for science and for your peers were great to be a part of, and something I hope to emulate in my future endeavours. A special thanks to Bob for allowing me to join you on the pSLiM journey.

Finally, a big thanks to my family and friends for their continued support outside the lab. Adrian, thanks for your constant and unwavering counsel. Haeri, thank you for showing me the way forward, motivating me, and being my scientific sounding board when I needed it most. You inspired me to pursue this thesis, and then helped me every step of the way to achieve it.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Organization of thesis
1. Introduction
  1.1 Intrinsically disordered regions
  1.2 Short linear motifs
    1.2.1 Post-translational modifications
    1.2.2 Docking motifs
    1.2.3 Degradation motifs
    1.2.4 Localization signals
  1.3 Computational analysis of SLiMs
  1.4 Cooperativity of multiple SLiMs
  1.5 SLiM evolution
  1.6 Research Objectives
2. Results
  2.1 Quantifying the effect of predicted SLiM deletions
    2.1.1 Effect of mutations in known SLiMs on HOG signaling
    2.1.2 Effect of mutations in pSLiMs on HOG signaling
  2.2 Testing how recognition of SLiM combinations generalize across the proteome for target specificity
    2.2.1 Enrichment analysis of SLiM combinations to discriminate Clb5 specific targets
    2.2.2 Evolutionary analysis of SLiM combinations to discriminate Clb5 specific targets
    2.2.3 Enrichment analysis of SLiM combinations to discriminate Clb5 specific targets using a Hidden Markov Model framework
  2.3 Detecting coevolution of SLiMs
    2.3.1 Coevolution of phosphorylation and docking sites in Cbk1 targets
3. Discussion
  3.1 Quantifying the effect of pSLiM deletions
  3.2 Recognition of SLiM combinations for target specificity
  3.3 Detecting coevolution of SLiMs
4. Future Directions
  4.1 Exploring pSLiM deletions in other signalling pathways
  4.2 Confirming Clb5 specific target predictions
  4.3 Extending the coevolution analysis to another model
5. Conclusions
6. Materials and Methods
  6.1 pSLiM deletion analysis
    6.1.1 Yeast strains and mutagenesis
    6.1.2 Pathway induction and flow cytometry assay
    6.1.3 Histogram data analysis
    6.1.4 Calculating fraction of “on” cells and change in mean GFP intensity of “on” cells
    6.1.5 Statistical analysis of mutant strains
  6.2 Clb5-Cdk1 motif combination analysis
    6.2.1 Defining Cdk1 (Clb5 specific and non-Clb5 specific) and non-Cdk1 protein datasets
    6.2.2 Identifying occurrences of matches to combinations of Cks1-Cdk1-Clb5 motifs
    6.2.3 Ortholog assignment and multiple sequence alignments of related yeast species
    6.2.4 Constructing an HMM model of the Cks1-Cdk1-Clb5 motif ordering
    6.2.5 Parameter estimation of emission probabilities
    6.2.6 Comparison of substrates that match rules in each subset for enrichment analysis
  6.3 Cbk1 phosphorylation and docking site correlated evolution analysis
    6.3.1 Defining Cbk1 substrate dataset
    6.3.2 Ortholog assignment and multiple sequence alignments of related yeast species for the 10 known/likely substrates
    6.3.3 Constructing Phylogenetic trees