Phylogenetic Analysis of Rhogap Domain-Containing Proteins
Total Page:16
File Type:pdf, Size:1020Kb
Letter Phylogenetic Analysis of RhoGAP Domain-Containing Proteins Marcel0 M. Brand&, Karina L. Silva-Brandh2, Fernando F. Costal, and Sara T.O. Saadl* Hematology and Hemotherapy Center, State University of Campinas, 13083-970 Campinas, SP, Brazil; Department of Zoology, Institute of Biology, State University of Campinas, 13083-970 Campinas, SP, Brazil. Proteins containing an Rho GTPase-activating protein (RhoGAP) domain work as molecular switches involved in the regulation of diverse cellular functions. The abil- ity of these GTPases to regulate a wide number of cellular processes comes from their interactions with multiple effectors and inhibitors, including the RhoGAP family, which stimulates their intrinsic GTPase activity. Here, a phylogenetic approach was applied to study the evolutionary relationship among 59 RhoGAP domain-containing proteins. The sequences were aligned by their RhoGAP do- mains and the phylogenetic hypotheses were generated using Maximum Parsimony and Bayesian analyses. The character tracing of two traits, GTPase activity and presence of other domains, indicated a significant phylogenetic signal for both of them. words: Bayesian analysis, character tracing, parsimony, phylogenomics, protein domain, RhoGAP Introduction The Rho GTPase-activating proteins (RhoGAPs) are of GTP hydrolysis. defined by the presence of a 150-amino-acid homolog In the early analyses of the human genome se- region that is designated as the RhoGAP domain. quence, 77 different genes containing the RhoGAP This domain is necessary and sufficient for GAP domain were found. Further studies have demon- activity and shares at least 20% sequence identity strated that many of these genes are simple gene se- among its family members (1,2). Proteins contain- quence variations or single nucleotide polymorphisms ing an RhoGAP domain act as molecular switches in- (6,7). The structural data available for RhoGAP volved in the regulation of diverse cellular functions, domain-containing proteins showing their complex- including actin cytoskeleton rearrangements, regula- ity with Rho GTPases (Cdc42 and RhoA) demon- tion of gene transcriptions, cell cycle regulation, con- strated the 3D workflow for RhoGAP-mediated GTP- trol of apoptosis, and membrane trafficking (2-5). hydrolysis, and highlighted the importance of a well- Rho GTPases cycle between active and inactive GTP- conserved arginine residue present in the active site bound states. The control of these states is regu- that acts as a conformation stabilizer needed for hy- lated by three main classes of proteins: guanine nu- drolysis (8-1 0). cleotide exchange factors, guanine nucleotide disso- Recently, a novel member of the RhoGAP fam- ciation inhibitors, and GAPS. To date, at least 21 ily, ARHGAP2l (Rho GTPase-activating protein 21, Rho GTPases have been defined, among which only alias ARHGAPlO), was cloned and characterized in three (RhoA, Cdc42, and Racl) are well character- our laboratory (11). In addition to the RhoGAP ized. Therefore, most studies have been focusing on domain, ARHGAP2l presents a PH domain and a these three proteins. P-loop-containing PDZ domain. This gene is widely The ability of Rho GTPases to regulate a wide expressed at'high levels in muscle and brain, and is number of cellular processes comes from their inter- up-regulated during myeloid and erythroid matura- actions with multiple effectors or inhibitors. One class tion, suggesting a potential role for this RhoGAP in of these inhibitors is the RhoGAP family, which stim- regulating cell differentiation (11 ). ulates GTPase activity by enhancing the intrinsic rate The aim of this study is to infer the evolution of the RhoGAP superfamily using a phylogenetic ap- *Corresponding author. proach, to determine the roles of other domains and E-mail: saraOunicamp.br their main GTPase activities in this evolutionary his- 182 Geno. Prot. Bioinfo. Vol. 4 No. 3 2006 Brandh et al. tory, and to provide a tool that could render some minal clades, we can presume that other AHRGAPs insights regarding subfamily protein functions using within this group can show the same GTPase activity RhoGAP-containing proteins as a model. toward Cdc42. On the other hand, the clade joining P115 + sr- Results GAP3 + srGAPl + srGAP2 shares the presence of the FCH (FER/CIP4homology) domain, however, The full dataset contains 267 amino acids, of which P115 has GTPase activity over Racl, while the other 161 are parsimony-informative. Parsimony searches three proteins have activity toward Cdc42. of the equally weighed dataset resulted in 12 equally parsimonious trees with 3,213 steps (CI=0.449; RI=0.525). The strict consensus tree was mostly not Discussion resolved at internal nodes (data not shown), but al- Phylogenetic reconstruction and bioinformatics anal- most all terminal branches showed strong bootstrap yses that integrate evolutionary considerations are be- support (Figure 1). coming increasingly important tools for applied fields. A Bayesian tree recovered mostly the same termi- Numerous gene sequences were generated in the ge- nal relationships by Maximum Parsimony (MP) anal- nomics age with little or no accompanying experimen- yses with strong values of Posterior Probability (PP). However, internal branches have from low to moder- tal determination of functional information or evolu- ate PP values (Figure 1). tionary relationships. Previous works from Peck et al The two characters investigated (domains and (12) and Moon and Zheng (2) also present a phyloge- GTPase activity) showed a significant phylogenetic netic approach on the RhoGAP family; however, the signal (P=0.003), which suggests that the distribu- authors did not indicate the methodology applied qei- tion of these traits among the proteins can be ex- ther did they present any support analyses for their plained by their phylogenetic relationships (Figure 2). cladograms. The optimization of the other domains over our phy- In this work, bioinformatics and phylogenomics logenetic hypothesis suggests that the ancestral state tools were used to present a phylogenetic relation- of these proteins involves solely the presence of the ship of 59 members of the RhoGAP superfamily. All RhoGAP domain and the activity toward Racl. amino acid alignments and subsequent phylogenetic The character tracing of these traits suggests an tree constructions were based on the RhoGAP do- overall pattern on which proteins sharing equal do- main sequence. We demonstrated that these RhoGAP mains also share equal GTPase activity. The clade domain-containing proteins, with the conservative ar- joining KIAA0672 + 3BP1 + RICHl + Nadrin was genine residue, form a monophyletic group, that is, recovered both by MP and Bayesian analyses with all of them share a common protein ancestor in their strong support. All proteins in this clade share the evolutionary history. presence of the BAR (Bin-Amphiphysin-Rvs) domain The tracing for GAP activity toward the most and GTPase activity toward Racl, solely or in addi- studied RhoGTPases (RhoA, Racl, and Cdc42) (Fig- tion to other GTPases. The same rule can be applied ure 2) indicates that this trait presents a strong phy- to the clade joining STARTdom + GT650 + DLCl logenetic signal (P=0.003), contrasting with previous + AHRGAP7. All of them share the presence of findings of Peck et al (12). the START (steroidogenic acute regulatory protein- The analysis of the resulting phylogenetic tree has related lipid transfer) domain and all, but START- suggested that the ancestral state for GTPase activ- dom, have GTPase activity over RhoA. GTPase ac- ity is the affinity to Racl. It is still difficult to de- tivity is unknown for STARTdom. The clade joining termine the gap activity by only analyzing the pro- GMIP + HA1 + PARG is composed by proteins con- tein sequence; the GAP assay (13,14) is the most taining the C1 domain in addition to their GTPase reliable way to determine activity. The phylogenetic activity toward RhoA. approach may give a clue, once it is capable of clus- The clade joining AHRGAPllA + AHRGAP20 + tering together different proteins that share common AHRGAPl + AHRGAP8 has in common the absence substrates as can be seen on the clades of KIAA0672 of other domains that are not RhoGAP. The GTPase + 3BP1 + RICHl + Nadrin and srGAP3 + srGAPl activity of this group is known only for AHRGAPl, + srGAP2. Speculations regarding protein specific which is active toward Cdc42. Considering other ter- functions (only using the GTPase activity character) Geno. Prot. Bioinfo. Vol. 4 No. 3 2006 183 Phylogeny of RhoGAP Domain-Containing Proteins chdorf I .w ARH(iAP24 100 ARIIGAP25b 0.95 AKHCiAPl IA 1.00 ARI-IGAPZO 1 I .oo ARHGAPI 98 ARHGAPR 0.98 38Pl 1.00 RICH1 100 [ nadrin II I ARHCrAP19 I 0.99 ARHGAP6 -0.77 I .oo KlhA1314 1.w GAPDro KIA A I204 I 0.79 I 19 N chimhom 1 .on myosinlM In 0.96 95 rnyosidXB II L p85 Ueta MgGhCOAP 0.851 1.00 RALBPI 0.55 100 Ri.lP76 1.00 ARM1 L-lI 96 n.99 AN2 74 ARAPB -0.63 100 P1WB 1.00 p115 0.54 srGAP2 - I .00 Alpha chi 100 Beta2 15 PO2569 2 GMIP 0.62 - HA 1 PAR01 oligophre 0.67 1.00 -' - '.no( GRAF 100 1.00 H GRAF 98 1.00 GrafZ 10oC PSciAP m 100 AHR 100 BCR 0.m - ARHOAPlZ -0.I Fig. 1 The Bayesian tree based on amino acid sequences of the RhoGAP domain from RhoGAP domain-containing proteins. Values above the branches indicate the Posterior Probability (PP) and the bold numbers indicate the bootstrap values from 500 replications (where it exceeds 50%). 184 Geno. Prot. Bioinfo. Vol. 4 No. 3 2006 BrandBo et al. Fig. 2 Domain presence (left side) and GTPase activity (right side) in the RhoGAP domain-containing proteins traced along the proposed phylogeny. Geno. Prot. Bioinfo. Vol. 4 No. 3 2006 185 Phylogeny of RhoGAP Domain-Containing Proteins may be avoided for now, because the affinity for the ogy Information (NCBI), as well as from the Swiss- same GTPase does not imply in the same function, Prot/TrEMBL database (http://expasy.org/sprot/) since each GTPase may present contrasting functions at the Swiss Institute for Bioinformatics and at the in different pathways ( 15).