Functional Analysis of Short Linear Motifs in Intrinsically Disordered Regions
by Mitchell Li Cheong Man

A thesis submitted in conformity with the requirements for the degree of Master of Science
Department of Cell and Systems Biology
University of Toronto

© Copyright by Mitchell Li Cheong Man 2017

Abstract

Short linear motifs (SLiMs) are regulatory binding sites involved in signalling and protein regulation. SLiMs are often found in intrinsically disordered regions (IDRs), which evolve rapidly and lack stable tertiary conformations. Despite their prevalence throughout the proteome, many SLiMs remain unknown, and understanding how they cooperate and evolve is an ongoing endeavour. The goal of this thesis is to address the properties of SLiMs and how they evolved, using bioinformatics methods and evolutionary models. To further examine the role of SLiMs in signal transduction, I show that deletion of predicted SLiMs (pSLiMs) has a broad range of quantitative effects on signalling pathway output. Next, to explore which properties are important in substrate recognition, I show that the combination and order of motifs can predict target specificity. Lastly, using a comparative phylogenetic approach to investigate the evolution of motifs, I provide evidence that phosphorylation and docking sites coevolved.

Acknowledgements

I would first and foremost like to thank Alan for giving me the opportunity to join his lab. Thank you for pushing me to pursue bioinformatics and for always supporting me, whether with advice on my project or otherwise; you have helped me gain a much deeper understanding of and appreciation for science. I would also like to thank my committee supervisors for their support and guidance throughout my project.
A great thank you to Belinda Chang for her advice to pursue covariation of motifs, and to Julie Forman-Kay for her expertise in the field of intrinsically disordered regions. Additionally, I would like to thank Nick Provart for agreeing to evaluate this thesis.

Thank you to the Moses lab, both past and present (Alex L, Alex N, Bob, Caressa, Gavin, Ian, Liz, Muluye, Nirvana, Purnima, Selma and Taraneh). Your shared knowledge, acumen, thoughtfulness, sincerity and understanding for science and for your peers were great to be a part of, and something I hope to emulate in my future endeavours. A special thanks to Bob for allowing me to join you on the pSLiM journey.

Finally, a big thanks to my family and friends for their continued support outside the lab. Adrian, thanks for your constant and unwavering counsel. Haeri, thank you for showing me the way forward, motivating me, and being my scientific sounding board when I needed it most. You inspired me to pursue this thesis, and then helped me every step of the way to achieve it.

Table of Contents

Abstract
Acknowledgements
Table of Contents
List of Tables
List of Figures
List of Abbreviations
Organization of thesis

1. Introduction
  1.1 Intrinsically disordered regions
  1.2 Short linear motifs
    1.2.1 Post-translational modifications
    1.2.2 Docking motifs
    1.2.3 Degradation motifs
    1.2.4 Localization signals
  1.3 Computational analysis of SLiMs
  1.4 Cooperativity of multiple SLiMs
  1.5 SLiM evolution
  1.6 Research Objectives

2. Results
  2.1 Quantifying the effect of predicted SLiM deletions
    2.1.1 Effect of mutations in known SLiMs on HOG signaling
    2.1.2 Effect of mutations in pSLiMs on HOG signaling
  2.2 Testing how recognition of SLiM combinations generalize across the proteome for target specificity
    2.2.1 Enrichment analysis of SLiM combinations to discriminate Clb5 specific targets
    2.2.2 Evolutionary analysis of SLiM combinations to discriminate Clb5 specific targets
    2.2.3 Enrichment analysis of SLiM combinations to discriminate Clb5 specific targets using a Hidden Markov Model framework
  2.3 Detecting coevolution of SLiMs
    2.3.1 Coevolution of phosphorylation and docking sites in Cbk1 targets

3. Discussion
  3.1 Quantifying the effect of pSLiM deletions
  3.2 Recognition of SLiM combinations for target specificity
  3.3 Detecting coevolution of SLiMs

4. Future Directions
  4.1 Exploring pSLiM deletions in other signalling pathways
  4.2 Confirming Clb5 specific target predictions
  4.3 Extending the coevolution analysis to another model

5. Conclusions

6. Materials and Methods
  6.1 pSLiM deletion analysis
    6.1.1 Yeast strains and mutagenesis
    6.1.2 Pathway induction and flow cytometry assay
    6.1.3 Histogram data analysis
    6.1.4 Calculating fraction of "on" cells and change in mean GFP intensity of "on" cells
    6.1.5 Statistical analysis of mutant strains
  6.2 Clb5-Cdk1 motif combination analysis
    6.2.1 Defining Cdk1 (Clb5 specific and non-Clb5 specific) and non-Cdk1 protein datasets
    6.2.2 Identifying occurrences of matches to combinations of Cks1-Cdk1-Clb5 motifs
    6.2.3 Ortholog assignment and multiple sequence alignments of related yeast species
    6.2.4 Constructing an HMM model of the Cks1-Cdk1-Clb5 motif ordering
    6.2.5 Parameter estimation of emission probabilities
    6.2.6 Comparison of substrates that match rules in each subset for enrichment analysis
  6.3 Cbk1 phosphorylation and docking site correlated evolution analysis
    6.3.1 Defining Cbk1 substrate dataset
    6.3.2 Ortholog assignment and multiple sequence alignments of related yeast species for the 10 known/likely substrates
    6.3.3 Constructing Phylogenetic trees