The Evolution of Function in the Rab Family of Small Gtpases
Total Page:16
File Type:pdf, Size:1020Kb
The Evolution of Function in the Rab Family of small GTPases Yoan Diekmann Dissertation presented to obtain the Ph.D. degree in Computational Biology Instituto de Tecnologia Química e Biológica | Universidade Nova de Lisboa Research work coordinated by: Oeiras, April, 2014 Cover: “The Rab Universe”. A circular version of Figure 2.5. The Evolution of Function in the Rab Family of small GTPases Yoan Diekmann, Computational Genomics Laboratory, Instituto Gulben- kian de Ciˆencia Declara¸c˜ao: Esta disserta¸c˜ao´eo resultado do meu pr´opriodesenvolvido entre Outubro de 2009 e Outubro de 2013 no laborat´oriodo Jos´eB. Pereira-Leal, PhD, Instituto Gulbenkian de Ciˆeciaem Oeiras, Portugal, no ˆambito do Programa de Doutoramento em Biologia Computacional (edi¸c˜ao2008-2009). Apoio financeiro: Apoio financeiro da FCT e do FSE no ˆambito do Quadro Comunit´ariode Apoio, bolsa de doutoramento SFRH/BD/33860/2009 e projectos HMSP-CT/SAU-ICT/0075/2009 e PCDC/EBB-BIO/119006/2010. Acknowledgements “Alors que vous achevez la lecture de mon expos´e,il est n´ecessaire de souligner le fait que ce que j’en ai retir´eest bien plus cons´equent.” Mit diesen Worten, die ich meinem Freund Axel schulde, begann was in der vorliegenden Arbeit seinen vorl¨aufigenH¨ohepunktfindet. So wahr wie damals ist der Satz auch heute, und das verdanke ich einer Reihe von Men- schen welche unerw¨ahnt zu lassen unm¨oglich ist. Obwohl das Schreiben die folgenden Namen notwendigerweise in eine Reihenfolge zw¨angt,so geb¨uhrt doch allen ihr g¨anzlich eigener Dank. Ich lasse der M¨udigkeit freien Lauf und schreibe einfach drauf los. Als erstes m¨ochte ich meinen supervisor Z´enennen. Danke f¨urall die M¨oglichkeiten und Lektionen. Ohne eigene, kann man keine Meinung vertreten. An all meine Arbeitskollegen von CGL und BIU, danke f¨urdie Hilfe und die Arbeitsumgebung die wir zusammen geschaffen haben. An das IGC f¨urdas unvergleichliche ‘Science Sommer Camp’-Gef¨uhl,f¨ur die brillanten Vortr¨ageund die Chancen ganz nah ranzukommen. An Daniel, der den vielleicht gr¨oßtenaber sicherlich wichtigsten Teil meiner Ausbildung ¨ubernommen hat. Ohne die endlosen Gespr¨ache, den ansteckenden Enthusiasmus und dein Vorbild w¨ussteich vielleicht bis heute nicht, dass ich mit science richtig liege. An Jorge und das PDBC, mir die Bewerbungsphotowahl nicht zu sehr angekreidet und die Gelegen- heit gegeben zu haben all das zu erleben. Ich kann mir keine bessere Art vorstellen zum Biologen zu mutieren. An die Menschen denen ich zu danken bisher vers¨aumt habe, ohne welche ich aber niemals hier gelandet w¨are:Jens Stoye, Eric Tannier und Julien Allali. Danke f¨urdas Vertrauen und die Nachsicht. Schlussendlich, danke an meine Freunde, in Portugal und Deutschland, ohne die ich nie die Kraft gehabt h¨atteauch nur das kleinste der Opfer zu stemmen die mein “Beruf” fordert. Und an meine Eltern, f¨urdie bedin- gungslose, andauernde Unterst¨utzung,und mir das Einzige beigebracht zu haben, was wirklich z¨ahlt,nachzudenken. Contents 1 Evolution of Protein Function . 2 1.1 Introduction . 4 1.1.1 What is the function of a protein? . 6 1.1.2 When does protein function evolve? . 7 1.1.3 Studying evolution of protein function . 9 1.2 The biochemistry of evolving protein function . 14 1.2.1 Binding . 15 1.2.2 Catalysis . 26 1.2.3 Biophysical properties . 31 1.3 The genetic basis of evolving protein function . 35 1.3.1 Modification . 35 1.3.2 Duplication . 36 1.3.3 Rearrangements . 37 1.4 Evolution and protein function . 39 1.4.1 Gene sequence evolution . 40 1.4.2 Gene family evolution . 41 1.5 Rab GTPases: the evolution of function in protein switches . 45 1.5.1 The Rab family of small GTPases . 47 1.6 Conclusion . 55 1.6.1 Rabs to study the evolution of function . 57 References . 59 i ii 2 Evolutionary patterns of the Rab family of small GTPases . 83 2.1 Introduction . 85 2.2 Results / Discussion . 88 2.2.1 The Rabifier . 88 2.2.2 Validation of the Rabifier classifications and design . 91 2.2.3 Benchmarking the Rabifier . 95 2.2.4 Availability of the Rabifier and its predictions . 97 2.2.5 New hypothetical subfamilies . 97 2.2.6 Global Dynamics of the Rab sequence space . 99 2.2.7 Dating the origin of Rabs and expanding the LECA . 103 2.2.8 The Rab family in Monosiga brevicollis and the ori- gin of animals . 106 2.2.9 A model for Rab subfamily innovation . 108 2.3 Conclusions . 111 2.4 Materials and Methods . 114 2.4.1 Ethics Statement . 114 2.4.2 The set of human Rabs . 114 2.4.3 The Rabifier . 115 2.4.4 Hypothetical subfamilies . 117 2.4.5 Phylogenetic trees . 117 2.4.6 Rab PCR of mouse organs and cells . 118 2.A Supplementary text / figures . 120 2.A.1 Step 1 . 121 2.A.2 Step 2 . 121 References . 130 3 Phylogeny and the inference of episodic positive selection . 145 3.1 Introduction . 147 iii 3.2 Results / Discussion . 148 3.3 Conclusion . 161 3.4 Materials and Methods . 162 3.A Supplementary tables . 164 References . 166 4 Functional Innovation in the Rab family of small GTPases . 171 4.1 Introduction . 173 4.2 Results / Discussion . 177 4.2.1 Rab25 evolved by neofunctionalisation . 177 4.2.2 Rab25 function evolved by effector switching . 181 4.2.3 Rab25 function evolved even long after duplication . 191 4.3 Conclusion . 192 4.4 Materials and Methods . 195 4.4.1 Alignment and gene tree . 195 4.4.2 Ancestral sequence reconstruction . 196 4.4.3 Past episodic positive selection . 196 4.4.4 Bidirectional Best Hits and orthologs . 197 4.A Supplementary tables . 197 References . 202 5 Conclusion . 213 5.1 Outlook . 218 References . 221 Summary . v Summary . vii Resumo . xiii Author contribution: I reviewed the literature and wrote the chapter. Chapter 1 Evolution of Protein Function “[. ] a general physiology which can describe the essential char- acteristics of matter in the living state is an ideal [. ] we can strive toward [. ] by a study of the vital functions in all their aspects throughout the myriads of organisms.” [1, p. 4] —August Krogh, 1929 Abstract Chapter 1 The question how protein function evolves is a fundamental prob- lem with profound implications for both functional end evolutionary studies on proteins. Here, we review some of the work that has ad- dressed or contributed to this question. We identify and comment on three different levels relevant for the evolution of protein func- tion. First, biochemistry. This is the focus of our discussion, as protein function itself commonly receives least attention in studies on protein evolution. We distinguish three basic ways in which pro- tein function evolves: by altering interactions, by changing product outcome or reaction biochemistry in enzymes, or by modification of protein biophysical properties. Second, genetics. This is the level responsible to generate variation, which acts as the substrate of evo- lution. Third, evolution. Evolutionary forces constantly sieve the pool of random mutations. Of particular interest here are evolution- ary models at gene family level, as these provide a framework to integrate functional aspects. We conclude with two major observa- tions. First, functional evolution is extremely dynamic. Not only do we find frequent transitions between distinct functional classes, but most mutational paths realising them are also short. Second and related, functional evolution is more likely than expected. We find the map between sequence and function to be degenerate, i.e. vari- ous sequences are able to perform the same function, and the same sequence able to carry out different functions. Finally, this review points us to protein switches as a promising model system, and we introduce the Rab family of small GTPases as the object of study in the remainder of this thesis. 4 1.1 Introduction roteins are essential macromolecules involved in virtually all cel- P lular and organismic processes of life. Various biological disciplines including biochemistry, cell biology and physiology study the functions and activities of proteins and how these relate to cellular and organismal traits. Shortly after the first protein sequences became available in the early 1950s, comparing sequences (and later structures) became one of the means to interrogate protein function [2]. Given sequences from the same proteins that evolved in different species (i.e. orthologs), identical regions may be important for a function shared amongst these proteins, for in- stance an enzyme active site. Conversely, if the function differs between species and one seeks to explain these differences, the responsible regions may be expected in varying parts of the protein. Hence, evolution informs about protein function. With the advent of molecular evolution, the comparative study of sim- ilarities and differences amongst proteins was given a formal evolution- ary basis. Like species, related proteins descend from common ancestors. They retain similarities and accumulate differences as a consequence of shared history and in response to various evolutionary forces. The observ- able result of this process is sequence divergence, ultimately caused by mutations whose evolutionary fate is intimately connected to their func- tional consequences. Hence, function informs about protein evolution. In conclusion, protein function and evolution are inextricably linked1. Arising from this, a fundamental question with profound implications for 1The stringent definition of biological function, the modern history view [3] or se- lected effect definition, is actually an evolutionary concept: “the functions of a trait or feature are all and only those effects of its presence for which it was under positive natural selection in the (recent) past and for which it is under [.