Integrated Analysis of Proteomics Data to Assess and Improve the Scope of Mass Spectrometry Based Genome Annotation


Michael Mueller
Trinity Hall

A dissertation submitted to the University of Cambridge for the degree of Doctor of Philosophy

European Molecular Biology Laboratory
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge, CB10 1SD
United Kingdom
Email: [email protected]

30 March 2009

To Daniela, Benjamin and Hannah

This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text.

This dissertation is not substantially the same as any I have submitted for a degree, diploma or other qualification at any other university, and no part has already been, or is currently being submitted for any degree, diploma or other qualification.

This dissertation does not exceed the specified length limit of 300 pages as defined by the Biology Degree Committee.

This dissertation has been typeset in 12 pt Palatino using LaTeX 2ε according to the specifications defined by the Board of Graduate Studies and the Biology Degree Committee.

30 March 2009
Michael Mueller

Abstract

The completion of the human genome has shifted attention from deciphering the sequence to the identification and characterisation of the functional components. Availability of the genome sequence has fostered an array of high-throughput technologies to systematically probe gene function on a genome-wide scale at all levels of biological information flow, from the DNA sequence via transcripts to proteins. A powerful approach to study gene function at the protein level is the identification and quantification of proteins in complex mixtures by mass spectrometry.
Despite significant technological and methodological advances, the complexity and the dynamic range of proteomes still pose major challenges for the analysis of biological samples.

Based on an integrative bioinformatics analysis, I examine the composition of mass spectrometry proteomics datasets with respect to the coverage of the particular proteome under study as well as the protein-coding genome as a whole. Using the example of a large-scale collaborative study of protein expression in human brain tissue, I point out characteristics of mass spectrometry proteomics datasets in terms of resolution and functional composition. On the basis of a comprehensive survey of publicly available proteomics data in the context of the genome sequence, I assess to what extent the findings from the analysis of the brain proteome study reflect global trends in mass spectrometry based proteomics. Following on from the results obtained by the analysis of experimental data, I outline and evaluate a strategy to improve the selectivity and sensitivity of target driven proteomics through diversification of peptide populations by combinatorial proteolysis.

The results presented in this dissertation show that mass spectrometry is an indispensable tool for large-scale protein research. However, they also demonstrate profound shortcomings of the technology regarding composition, redundancy and resolution of the generated data, emphasising the need for more targeted and systematic approaches to proteomics. The proposed combinatorial strategy contributes towards this aim by significantly increasing the coverage of the proteome by protein-specific signature peptides that are suitable reporter candidates for targeted proteomics experiments based on single reaction monitoring.

Acknowledgements

This dissertation describes work carried out at the European Bioinformatics Institute (EBI) in Hinxton, UK, between April 2005 and March 2009.
The EBI is an outstation of the European Molecular Biology Laboratory (EMBL), Heidelberg, Germany. My research work at the EBI was funded through the EMBL international PhD programme.

I would like to thank my supervisor Rolf Apweiler for taking me on as a PhD student and for his support and guidance over the last four years. Many thanks also to the members of my thesis advisory committee, Wolfgang Huber, Jyoti Choudhary, Sarah Teichman and Lars Steinmetz, for their critical and constructive assessment of my work. I am equally thankful to Lennart Martens for his continuous interest in and support of my research and the very valuable feedback on my dissertation. I also would like to thank the HUPO Brain Proteome Project (BPP) bioinformatics committee for involving me in the bioinformatics analysis of the HUPO BPP pilot study.

I would like to thank the following people for accompanying me on this exciting journey through time and space at the EBI and in Cambridge: members, ex-members and visitors of the Proteomics Services Team, Lennart Martens, Juan Antonio Vizcaíno, Florian Reising, Matthieu Visser and Joe Foster, the lunch time deipnosophists and companions on the regular and at times extensive walks on and around the campus; Richard Côté, Phil Jones, Antony Quinn, Cathrine Leroy and Samuel Kerrien for their help with everything IT related; and last but not least the team leader Henning Hermjakob.

Lennart, without you I might well never have made it to this stage. Thanks for sharing not only my research interests (or initiating them in the first place) but also a healthy dose of irony and cynicism. Conversations would only be half the fun without them. I am certain your enthusiasm is going to inspire many (PhD) students to come. Slow down a bit though, otherwise you will not only be a professor in five years but also one with grey hair (and probably no beard given your disapproval of beards).
Juanan, you were a great colleague and it was always fun to work with you! Who will finish my chips at lunch time now? I hope you won’t spot that some parts of this dissertation are “justified” and some are not.

Florian, the only team member who really appreciates the saying “a busy life is a wasted life”. I hope one day you will find yourself on the veranda of that beach bungalow in South America enjoying the sun. All it takes is to convince Anna of the aforementioned saying.

Matthieu, thank you for always pointing out both sides of the coin, not only in research but also in life-related matters. Soon, I will be joining you in the quest for the Holy Grail in the haystack of biopolymers. Yet, not bold enough to face the twenty-letter hydra, I will humbly stick with the As, Cs, Gs and Ts for the time being.

Richard, your brilliant linguistic idioms made me chuckle and the team meetings more enjoyable than meetings are generally meant to be.

Phil, until recently the last British survivor in the all too European enclave of the PRIDE team. I don’t know how you managed to stand all the moaning about single glazing and the lack of insulation over the years.

Joe, you will have to hold the British end up from now on. Good luck with your PhD! And remember, never, never hand your private number to German professors.

Antony, I would like to thank you for the good times we had together at the beginning of my PhD at the group retreat, gigs and on various pub nights.

Samuel, thank you for having been the only developer ever bold enough to let a layman touch code.

Cathrine, thank you for extending my knowledge of French music. I still haven’t managed to comb through all the 150 or so Serge Gainsbourg songs you once gave me.

Henning, thank you for giving me a home in the vast spaces of the PANDA group.
I also would like to thank my fellow PhD students at the EBI and my friends in Cambridge for their support and major contribution to making life in Cambridge much more fun than I ever anticipated. In particular, I would like to thank Jörn, first and foremost for being a great friend and for the good times we had together. But also for persuading me that it is worthwhile spending an entire day trying to install an R package and for his helpful advice regarding this weird and wonderful super high-level “programming language”. Jacky, thank you for resurrecting the social calendar after it had died a horrible death after the PhD course. I am now aware of at least a dozen music genres ending in “core” I had never even heard of before. John, I am grateful for your friendship and the countless evenings we spent together, philosophising about life, the universe and everything. And of course for proofreading this dissertation. I owe you one (or two or three). I also would like to thank Daniel, Melanie and Georg for the very enjoyable times we had together, in particular as team mates during the many pub quizzes at the Clarendon. Unfortunately, I couldn’t share the moment of glory with you. Furthermore, I would like to thank my long-lasting friends Tim and Thomas for the, well, long-lasting friendship even across long distances.

My greatest gratitude goes to my wonderful partner Daniela as well as my parents, who always supported me unconditionally all these years. Daniela, thank you for taking me on this thrilling ride that we had over the last ten years with all its ups and downs. Thank you for always sticking with me and for the encouragement and comfort in moments of despair. Thank you for being the loving mother of our children Hannah and Benjamin, who brought so much joy into our lives. Finally, I am very grateful to my parents Else and Hans, who always supported my decisions no matter how implausible they seemed to them. I thank you both for allowing me to go my way.


Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    272 pages
