Untangling the Evolution of Rab G Proteins: Implications of a Comprehensive Genomic Analysis Tobias H Klöpper1, Nickias Kienle2, Dirk Fasshauer2 and Sean Munro1*
Total Page:16
File Type:pdf, Size:1020Kb
Klöpper et al. BMC Biology 2012, 10:71 http://www.biomedcentral.com/1741-7007/10/71 RESEARCHARTICLE Open Access Untangling the evolution of Rab G proteins: implications of a comprehensive genomic analysis Tobias H Klöpper1, Nickias Kienle2, Dirk Fasshauer2 and Sean Munro1* Abstract Background: Membrane-bound organelles are a defining feature of eukaryotic cells, and play a central role in most of their fundamental processes. The Rab G proteins are the single largest family of proteins that participate in the traffic between organelles, with 66 Rabs encoded in the human genome. Rabs direct the organelle-specific recruitment of vesicle tethering factors, motor proteins, and regulators of membrane traffic. Each organelle or vesicle class is typically associated with one or more Rab, with the Rabs present in a particular cell reflecting that cell’s complement of organelles and trafficking routes. Results: Through iterative use of hidden Markov models and tree building, we classified Rabs across the eukaryotic kingdom to provide the most comprehensive view of Rab evolution obtained to date. A strikingly large repertoire of at least 20 Rabs appears to have been present in the last eukaryotic common ancestor (LECA), consistent with the ‘complexity early’ view of eukaryotic evolution. We were able to place these Rabs into six supergroups, giving a deep view into eukaryotic prehistory. Conclusions: Tracing the fate of the LECA Rabs revealed extensive losses with many extant eukaryotes having fewer Rabs, and none having the full complement. We found that other Rabs have expanded and diversified, including a large expansion at the dawn of metazoans, which could be followed to provide an account of the evolutionary history of all human Rabs. Some Rab changes could be correlated with differences in cellular organization, and the relative lack of variation in other families of membrane-traffic proteins suggests that it is the changes in Rabs that primarily underlies the variation in organelles between species and cell types. Keywords: Organelles, G proteins, humans, last eukaryotic common ancestor Background organelles. The specificity of cargo selection is determined The secretory and endocytic pathways of eukaryotic cells by coat proteins and their adaptors, with the recruitment allow the biosynthesis of lipids and of secreted and mem- of coats being directed by phosphoinositides or by mem- brane proteins to be separated from the barrier function bers of the Arf family of small G proteins [2,3]. However, of the plasma membrane. They also allow the remodeling the movement and arrival of vesicles is directed for the of the cell surface and the uptake of large molecules and most part by small G proteins of the Rab family, with the even of other organisms. Although some prokaryotes have soluble N-ethylmaleimide-sensitive-factor attachment pro- internal organelles, these lack the complexity seen in tein receptors (SNAREs) then driving membrane fusion eukaryotes and also the trafficking routes that connect the [4,5]. organelles into pathways [1]. However, this complexity Originally identified in yeast, the Rab proteins have now and connectivity presents eukaryotic cells with a major emerged as the largest and most diverse family within the organizational challenge, as vesicles and other carriers ‘landmark’ molecules that specify the identity of vesicles must select particular cargo from their site of generation, and organelles [6-10]. They are typically anchored to the and then move toward, and fuse with, specific target bilayer by a long flexible hypervariable domain that ends in a prenylated C terminus. In the GTP-bound form, Rabs bind effectors, but when the GTP is hydrolyzed through * Correspondence: [email protected] the action of a GTPase activating protein (GAP), they are 1MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, UK Full list of author information is available at the end of the article extracted from the bilayer by a carrier protein called GDP © 2012 Klöpper et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Klöpper et al. BMC Biology 2012, 10:71 Page 2 of 17 http://www.biomedcentral.com/1741-7007/10/71 dissociation inhibitor (GDI) [11,12]. Thus the activating tree and clustered similar sequences (see Methods for exchange factors (GEFs) and GAPs of Rabs act in concert details). We extracted a conserved core Rab motif from to establish a restricted subcellular distribution for each the constructed alignment, partitioned it into sub-align- particular Rab-GTP form, and this GTP form then serves ments according to the different initial sequence groups as a landmark for the recruitment of those proteins that and constructed an HMM for each one. Finally, we used need to act in that location. Rab-GTP effectors include lin- these models to search genome and expressed sequence kers to motor proteins and tethering factors that attach tag (EST) databases for further Rab sequences. With the vesicles to the correct organelle before fusion. expanded dataset we repeated the analysis and refined Although Rabs are members of the Ras superfamily of the set of specific HMMs. This process was repeated small G proteins, they share features that allow them to be until no further improvement in the classification was clustered within a monophyletic branch of this family, achieved. Finally, we supplemented our dataset with with their closest relatives being the Ran family which data from further species, obtaining more than 7600 dif- directs nuclear transport [13-15]. The genomes of all ferent Rab sequences from more than 600 species repre- eukaryotes encode multiple members of the Rab family, senting all major eukaryotic phyla. For 384 of these with humans having 66 and even the simplest eukaryotes species we found genome projects listed in the Genomes having more than 10 Rabs. [16-18]. Studies of Rab conser- On Line Database (GOLD) [21]. The analysis contained vation have shown that their numbers and presence varies 248 metazoans (of which 131 were included in GOLD: considerably between different phyla, and even between 37 genomes were complete, 3 were draft, and 91 were different species within phyla [13,16]. Thus, studies on incomplete), 166 fungi (25, 4, and 90 respectively), 81 Rabs have great potential to shed light on the evolution of plants (16, 1, and 30), 19 apicomplexans (8, 0, and 7), eukaryotic endomembrane systems, and have already pro- 20 heterokonts (7, 1, and 5), and 11 kinetoplastids (4, 0, vided useful insights into this issue [17,19,20]. However, and 4). the complex history of Rabs, including independent losses, duplications and diversifications, means that providing a Identifying the Rabs of the last eukaryotic common full catalogue of Rab diversity and evolutionary history is a ancestor major challenge. Our classification analysis identified a set of 20 basic In this study, we used a classification and phylogeny- Rab types (Figure 1). Deducing which of these Rab types based approach to define subfamilies of the Rab proteins, were present in the LECA was complicated by the fact and then iteratively built and refined hidden Markov mod- that the placement of the root in the eukaryotic tree of els (HMMs) for these subfamilies to use for searching life is still under debate [22-24]. What is perhaps the sequence databases. Finally, we built evolutionary trees most widely accepted hypothesis places the eukaryotic using maximum likelihood and distance-based methods. root between the bikonts and unikonts [23,25,26]. Other This provided the most comprehensive view of Rab evolu- possible scenarios reported to date include rooting the tion obtained to date, and we have established a Rab Data- tree within or close to Excavata [27,28], and studies of base web server to make the classifiers and full analysis rare genomic changes have led to the proposal of a root available. Examining the patterns of Rab evolution demon- between Archaeplastidia and all other eukaryotes [29]. strates the striking degree of Rab complexity in the last We inspected the effect of these different hypotheses on eukaryotic common ancestor (LECA), and provides the set of likely LECA Rabs, and found that the presence insights into the evolution of membranes prior to the of the same 20 Rabs in the LECA was supported by two LECA. In addition, it is clear that many Rabs have proven of the three trees (Figure 2). The exception was the tree dispensable during evolution whereas a small set have been based on the Archaeplastida outgroup, in which the well conserved, or have even expanded in particular number of LECA Rabs reduced to 14 of these 20. How- lineages, and in some cases these changes can be correlated ever, it should be noted that in this model the position with changes in cellular properties, or can even be used to of the excavates was uncertain, with the authors stres- predict function. sing the need for ‘extreme caution’ [29], and if the exca- vates were placed with the Archaeplastida, then the Results number of LECA Rabs would again be 20. A hidden Markov model-based classification of Rab It is also formally possible that Rabs have moved protein sequences between kingdoms by horizontal gene transfer, but the We started our classification by collecting known Rab trees for the 20 Rabs generally fit well with species evolu- protein sequences from 21 widely diverse species. After tion. In addition, horizontal gene transfer after endosym- aligning the sequences and removing those of low qual- biosis would probably have involved a non-unikont ity, we obtained a first set of approximately 500 species as the donor and so would not have increased the sequences from which we reconstructed a phylogenetic number of Rab families in the bikonts.