<<

Eukaryotic-like protein kinases in the and the myxobacterial kinome

J. Pe´ rez†, A. Castan˜ eda-Garcıa†, H. Jenke-Kodama‡,R.Mu¨ ller§, and J. Mun˜ oz-Dorado†¶

†Departamento de Microbiologı´a,Instituto de Biotecnologı´a,Facultad de Ciencias, Universidad de Granada, Avda. Fuentenueva s/n, E-18071 Granada, Spain; ‡Department of Molecular Ecology, Institute of Biology, 10115 Berlin, Germany; and §Department of Pharmaceutical Biotechnology, Saarland University, D-66041 Saarbru¨cken, Germany

Communicated by A. Dale Kaiser, Stanford University School of Medicine, Stanford, CA, July 18, 2008 (received for review March 13, 2008) Ser/Thr/Tyr kinases, which together comprise a major class of regu- mediated by a family of protein kinases exhibiting high sequence latory proteins in eukaryotes, were not believed to play an important similarities to their eukaryotic counterparts. The first eukaryotic- role in prokaryotes until recently. However, our analysis of 626 like protein kinase (ELK) reported in any was found in prokaryotic reveals that eukaryotic-like protein kinases the myxobacterium xanthus,a␦-Proteobacterium (6). (ELKs) are found in nearly two-thirds of the sequenced strains. We Publication of the M. xanthus DK1622 revealed that the have identified 2697 ELKs, most of which are encoded by multicellular strain contains a total of 99 ELKs (7), but this number is dwarfed strains of the phyla (Myxococcales), Actinobacteria, by that of the myxobacterium So ce56 (8), Cyanobacteria, and Chloroflexi, and 2 Acidobacteria and 1 Plancto- which boasts 317, the largest numbers of ELKs reported so far mycetes. Astonishingly, 7 myxobacterial strains together encode 892 among the prokaryotes. ELKs, with 4 of the strains exhibiting a genomic ELK density similar to The M. xanthus genome also encodes a substantial number of that observed in eukaryotes. Most myxobacterial ELKs show a mod- typical bacterial 2-component systems (TCSs) (9). The need for a ular organization in which the kinase domain is located at the N complex regulatory system in this bacterium is evident from con- terminus. The C-terminal portion of the ELKs is highly diverse and sideration of its lifestyle. During growth, M. xanthus cells form often contains sequences with similarity to characterized domains, mobile, predatory communities known as ‘‘swarms,’’ which feed most of them involved in signaling mechanisms or in protein–protein cooperatively. Under starvation conditions, the cells initiate a interactions. However, many of these architectures are unique to the multicellular developmental program that culminates in the for- , an observation that suggests that this group exploits mation of macroscopic fruiting bodies filled with heat-resistant sophisticated and novel signal transduction systems. Phylogenetic myxospores. The spores remain dormant until nutrients are avail- reconstruction using the kinase domains revealed many orthologous able again, at which point they germinate, initiating a new vegeta- sequence pairs and a huge number of gene duplications that probably tive cycle (10, 11). Thus, the M. xanthus lifestyle requires the cells occurred after speciation. Furthermore, studies of the microsynteny in not only to monitor changes in the environment but also to commu- the ELK-encoding regions reveal only low levels of synteny among nicate with each other to coordinate movement and behavior. , Plesiocystis pacifica, and Sorangium cellulosum. Developmental sporulation also depends on the DKxanthene However, extensive similarities between M. xanthus, family of natural products (12). The other metabolites produced by aurantiaca, and 3 Anaeromyxobacter strains were observed, indicat- the cells (7) are assumed to provide it with a competitive advantage ing that they share regulatory pathways involving various ELKs. in the soil environment. Therefore, survival of M. xanthus also is intimately linked to the regulation of secondary metabolism. Anaeromyxobacter ͉ Myxococcus ͉ Plesiocystis ͉ Sorangium ͉ Stigmatella Growing evidence that ELKs play a role in development, virulence, metabolism, or stress adaptation in many other bac- terial genera (13, 14, 15, 16, 17) has prompted efforts to ignal transduction pathways in the eukaryotes typically are characterize the prokaryotic kinome (18, 19). However, the first mediated by protein kinases, which transfer the ␥-phosphate of S analysis in 2005 (18) was limited by the availability of only 125 ATP to a hydroxyl moiety (Ser, Thr, and/or Tyr) on a protein complete genome sequences, none of which belonged to the substrate. This kinase activity resides in a sequence of 250–300 myxobacteria, and the second analysis (19) focused on met- residues known as the catalytic domain, which consists of 12 agenomic sequences from a global ocean sampling and so was subdomains differentiated by their respective consensus motifs (1). unlikely to be representative of all prokaryotes. In this study, we The solved structure of a prototypical kinase domain (2) revealed aimed to assess whether typical kinases are widely represented that it consists of 2 lobes joined by a hinge segment. Catalysis occurs in the prokaryotes and to determine their exact number in each at the interface of the 2 lobes, where the catalytic residues interact prokaryotic genome. For this study we carried out a compre- with both ATP and the substrate. The key amino acids for the hensive bioinformatic analysis of the typical ELKs that constitute phosphotransfer reaction are the Lys of subdomain II and the Asp the kinome of the 7 myxobacterial strains sequenced. We also residues of subdomains VIb (D1) and VII (D2). Mutations in any surveyed the remaining 619 prokaryotic genomes deposited in of these residues are predicted to inactivate the proteins. the database of the National Center for Biotechnology Infor- For the kinases to function properly, their catalytic activity must mation (NCBI; http://www.ncbi.nlm.nih.gov/genomes/ be tightly controlled. Indeed, several different regulatory mecha- lproks.cgi) before January 15, 2008. This analysis revealed the nisms have been described (3). For example, 1 type of kinases presence of ELKs in approximately two-thirds of all prokaryotes. catalyzes an autophosphorylation reaction within the catalytic Comparison of the largest prokaryotic kinomes with those of domain, targeting a region termed the ‘‘activation segment’’ that runs from subdomain VII to VIII. Phosphorylation within this segment changes the conformation of the polypeptide and activates Author contributions: J.P. and J.M.-D. designed research; J.P. performed research; J.P., the (4). Proteins regulated by this mechanism require an A.C.-G., H.J.-K., R.M., and J.M.-D. analyzed data; and J.P., R.M., and J.M.-D. wrote the paper. Arg just upstream of Asp D1 and so are designated ‘‘RD kinases.’’ The authors declare no conflict of interest. Correspondingly, proteins that lack this Arg are known as ‘‘non-RD ¶To whom correspondence should be addressed. E-mail: [email protected]. kinases’’ (5). This article contains supporting information online at www.pnas.org/cgi/content/full/ Signal transduction pathways dependent on phosphorylation at 0806851105/DCSupplemental. Ser/Thr/Tyr also occur in the prokaryotes. This reaction also is © 2008 by The National Academy of Sciences of the USA

15950–15955 ͉ PNAS ͉ October 14, 2008 ͉ vol. 105 ͉ no. 41 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0806851105 Downloaded by guest on September 25, 2021 several eukaryotes supports a link between ELKs and multicellularity. Results and Discussion ELKs in the Myxobacteria. The genomes of M. xanthus DK1622 (9.14 Mb containing 99 ELKs) and S. cellulosum So ce 56 (13.03 Mb containing 317 ELKs) encode the largest numbers of ELKs re- ported to date in the prokaryotes (7, 8). Five additional myxobac- terial genomes have become available recently, of which 2 are complete [Anaeromyxobacter sp. Fw109–5 (5.3 Mb) and Anaero- myxobacter dehalogenans 2CP-C (5.01 Mb)], and 3 are incomplete [Plesiocystis pacifica SIR-1 (10.59 Mb), DW4/ 3–1 (10.27 Mb), and Anaeromyxobacter sp. K (5 Mb)]. Consistent Fig. 1. Proportion of genome devoted to encoding ELKs in the myxobacteria as with the expectation of a high ELK content, we have found that a function of genome size. The equation of the exponential curve and the Anaeromyxobacter sp. K encodes 14 ELKs, A. dehalogenans en- coefficient of determination are indicated. codes 18, Anaeromyxobacter sp. Fw109–5 encodes 20, S. aurantiaca encodes 194, and P. pacifica encodes 230. Moreover, these numbers adaptability of the cells. Thus, of these complex may increase when the incomplete genomes are completed. These may have selected cells with more ELKs in preference to using data confirm that the presence of a large number of genes encoding more TCSs. ELKs is a general feature of the myxobacteria. Together, the 7 myxobacteria encode at least 892 ELKs. Conservation of Catalytic and Regulatory Residues in the Myxobac- It presently is unclear why the myxobacterial kinome is so terial ELKs. Because kinase activity depends on the presence of 3 extensive. However, the level of regulation required to control their active site residues, we have analyzed all myxobacterial ELKs for the complex social lifestyles provides an attractive explanation. Four presence of these amino acids. This analysis revealed that the 3 types have been identified in the life cycle of M. xanthus (10, 11). residues are conserved in all ELKs of Anaeromyxobacter, but some During growth, predatory rods must coordinate their efforts to ELKs of the remaining myxobacteria lack 1, 2, or even all 3 catalytic detect prey and secrete inhibitors of microbial growth while adapt- amino acids. These noncanonical ELKs, which represent 3.5% of ing to resist the action of biocides produced by the prey. Under the total in P. pacifica,5.3%inS. cellulosum, 11.1% in M. xanthus, conditions of nutrient depletion, the community enters into a and 16% in S. aurantiaca (Fig. S2A), are assumed to be inactive. developmental cycle in which cells choose 1 of 3 possible fates: (i) We also have classified the myxobacterial ELKs as RD or programmed cell death (20); (ii) aggregation to form fruiting bodies non-RD. The largest percentages of non-RD kinases were found in that then differentiate into myxospores; and (iii) remaining as the 3 Anaeromyxobacter strains, in which 15% to 29% of proteins peripheral rods that surround the fruiting bodies. It is hoped that lack the regulatory Arg. Non-RD kinases comprise 9% to 13% of in the future the role of the kinome in governing these cellular ELKs in the remaining myxobacteria (Fig. S2B). In these cases, transitions will be confirmed by a global analysis of the M. xanthus regulation appears to be achieved through an additional domain phosphoproteome, coupled with a transcriptomic analysis of the that has been added to the C-terminal portion of the ELKs. MICROBIOLOGY ELKs. Encouragingly, initial analysis of the phosphoproteome of S. cellulosum revealed that Ϸ40% of the proteins are phosphorylated Domain Architecture of Myxobacterial ELKs. Most myxobacterial on Ser, Thr, or Tyr residues (8). ELKs (92.5%) contain an additional region at the C-terminal portion of the protein. This peptide might regulate the kinase Genome Expansion in the Myxobacteria. Four of the 7 myxobacterial activity or function as an input/output domain of the ELK signal strains sequenced possess the largest genomes described for any transduction pathway. This extra region does not exhibit similarities prokaryote, making them excellent models for the study of genome to any defined domain in Pfam in the majority of the ELKs. expansion. It recently has been suggested that metabolic genes are However, the additional portion of some ELKs comprises 1 or acquired by lateral gene transfer, whereas signaling genes arise by several well defined domains. We have classified the ELKs into 3 duplication and divergence (21). In addition, the number of TCSs different groups, according to the predicted function of the C- in myxobacteria increases linearly with genome size (9), but the terminal region (Table 1 and Table S1). Interestingly, many of these number of genes encoding ELKs increases exponentially. The curve domains are involved in the synthesis or recognition of the second obtained when representing the number of ELKs (y) versus the messengers cAMP, cGMP, and cyclic diguanylate. These unusual ϫ genome size (x) is described by the equation y ϭ 2.271e0.408 with domains suggest that various signal transduction pathways in myx- a coefficient of determination (R2) of 0.967. The fit to an expo- obacteria are mediated by cyclic nucleotides, although the ability of nential function improves when the number of kilobases in each these ELKs to produce or respond to any type of cyclic nucleotide strain devoted to ELKs is plotted against genome size (Fig. 1). The has not yet been demonstrated directly. Also of note, several ELKs improved fit derives from the increase in the size of the ELKs that contain a fork-head associated (FHA) domain that is known to corresponds to the increase in genome size [supporting information interact with phosphothreonine (22). Together, these varying ar- (SI) Fig. S1]. For example, in the Anaeromyxobacter spp., which have chitectures suggest that signal transduction occurs through the Ͼ the smallest genomes, no ELK contains 1500 aa. The number of integration of several different signals, as suggested by Goldman, et ELKs 1500–2000 residues in size increases between M. xanthus and al. (7). It is remarkable that many of these frameworks are unique P. pacifica in accordance with increasing genome size. Only S. to the myxobacteria, indicating that myxobacterial ELK signal cellulosum, with the largest genome, encodes a substantial number transduction pathways may be more complex than those of other Ͼ of ELKs containing 2000 aa. These data indicate that ELKs have organisms. been important contributors to genome expansion in this group of ␦-Proteobacteria. Phylogeny of Myxobacterial ELKs. To visualize evolutionary rela- An intriguing question is why myxobacteria with large genome tionships among the 892 myxobacterial ELKs and to establish sizes have increased the number of ELKs more than the number of connections between the ELKs from the 7 strains (which represent TCSs (9). In fact, S. cellulosum encodes more ELKs than TCSs (9). the 3 suborders of the Myxococcales), we have reconstructed the A plausible explanation is that the architecture of ELK pathways phylogeny of the kinase domains. For this determination, we have allows them to be regulated at multiple levels, increasing the overall used maximum parsimony, neighbor joining, and Bayesian estima-

Pe´rez et al. PNAS ͉ October 14, 2008 ͉ vol. 105 ͉ no. 41 ͉ 15951 Downloaded by guest on September 25, 2021 Table 1. Domain architecture of myxobacterial ELKs nomes (Table S2). In the case of M. xanthus and S. aurantiaca,65 Domain Sc Pp Sa Mx Ad AF AK pairs of sequences originated from a speciation event without further duplications. In the case of A. dehalogenans, we found 14 ELKs with 2-component and second messenger signaling domains kinase domains having orthologous pairs in the other Anaeromyx- Pkinase-GAF-HisKA 0 0 4 0 0 0 0 obacter species. However, Anaeromyxobacter Fw109–5 and Anaero- Pkinase-GAF-HisKA- 634 10 0 0 myxobacter sp. K appear to have lost 2 and 4 of these genes after HATPase࿝c speciation, respectively. Pkinase-GAF-PAS-GAF- 500 00 0 0 Most of the sequences group according to the species of origin HisKA-HATPase࿝c (Fig. S3). Approximately 160 kinase domains from S. cellulosum Pkinase-GAF-GAF-HisKA- 100 00 0 0 form groups connected by common ancestors specific to this HATPase࿝c myxobacterium. Similarly, some 55% of the sequences from P. Pkinase-GAF-PAS-HisKA- 2120 0 0 0 0 pacifica are positioned in essentially specific clusters, with the HATpase࿝c largest cluster containing Ͼ100 members. Strikingly, kinase do- Pkinase-GAF-HisKA- 210 00 0 0 mains of M. xanthus, S. aurantiaca, and Anaeromyxobacter often are HATPase࿝c-Response࿝reg combined in the same phylogenetic cluster, demonstrating that they Pkinase-GAF-PAS-HWE࿝HK 100 00 0 0 share many common domains that existed before the speciation Pkinase-GAF-STAS 800 00 0 0 process. These 3 genera belong to the same suborder of the Pkinase-PAS-PAS 0 1 0 0 0 0 0 Myxococcales,theCystobacterineae, whereas P. pacifica and S. Pkinase-Response࿝reg 1 1 0 0 0 0 0 cellulosum are classified in the other 2 suborders, Nannocystineae Pkinase-GSPII࿝E࿝N- 002 00 0 0 and Sorangineae, respectively. In general, the strain-specific group- Response࿝reg ing of sequences is a clear indicator of extensive gene duplications. Pkinase-Response࿝reg- 303 10 0 0 In some groups, the duplicated genes still are located next to each Guanylate࿝cyc other in the genome (for example, genes sce5443, sce5444, sce5445, Pkinase-GGDEF 100 00 0 0 and sce5446 from S. cellulosum). The duplication processes seen in Pkinase-Guanylate࿝cyc 6 0 2 1 0 0 0 the phylogenetic relationships also mirror to some extent the Pkinase-cNMP࿝binding 0 3 0 0 0 0 0 functional environment of the respective domains. For example, the Pkinase-PilZ 0 0 1 0 0 0 0 kinase domain within the gene MXAN࿝4053 (with the domain Pkinase-PilZ-DNAJ 000 00 1 0 structure Pkinase-GAF-HisKA-HATPase࿝c, defined in Table S1) Pkinase-PilZ-TPR 000 10 0 0 is located in a cluster that exclusively contains ELKs from various Pkinase-TPR-GAF- 400 00 0 0 strains with similar modular architectures. Sigma54࿝activat-HTH࿝8 Local Synteny in the ELK Regions. ELKS with Pkinase duplications and protein-protein interaction domains Our phylogenetic analysis raised the question: does phylogeny correlate with synteny in the ELK Pkinase-Pkinase 4 1 0 0 0 0 0 regions? To investigate this issue we used the M. xanthus ELKs as Pkinase-Pkinase-WD40 0 1 0 0 0 0 0 the reference, because these kinases are by far the best character- Pkinase-Pkinase-TPR 0 1 0 0 0 0 0 ized. We detected synteny among all 7 myxobacteria in only 1 ELK Pkinase-TPR 83 52 13 12 3 3 1 region, corresponding to the well characterized Pkn4 from M. Pkinase-WD40 3 6 1 0 1 1 1 xanthus (23). In M. xanthus, Pkn4 has been demonstrated to Pkinase-WD40-DUF323 0 1 0 2 0 0 0 phosphorylate 6-phosphofructokinase (Pfk), which is located di- Pkinase-FHA 2 2 0 0 0 0 0 rectly adjacent to the kinase. Surprisingly, Pfk is not encoded in FHA-Pkinase 0 0 0 0 0 1 0 proximity to an ELK in any of the other strains. Furthermore, Pkinase-DNAJ-TPR 001 00 0 0 because this common syntenic block covers a maximum of 3 genes ELKs with other domains (an ELK, a protein with a TPR domain, and an enhancer-binding Pkinase-DUF323 11 0 4 1 0 0 0 protein, all of which are abundant gene types in myxobacteria), the Pkinase-PEGA 2 2 4 6 2 2 1 biological significance of this conservation is not clear. Pkinase-Macro 100 00 0 0 As noted previously (8), despite the absence of global synteny Pkinase-Globin- 100 00 0 0 between M. xanthus and S. cellulosum, the 2 strains exhibit a high Abhydrolase࿝1 degree of local synteny. However, the ELK regions do not seem to Pkinase-CCP࿝MauG 100 00 0 0 contribute significantly to this microsynteny. Eighteen of the 25 Pkinase-SEFIR 010 00 0 0 syntenic blocks detected between these 2 species harbor only 2 Pkinase-Fer2-Globin 010 00 0 0 genes. In the remaining 7 blocks, the number of orthologs ranges Pkinase-RHH࿝1 001 00 0 0 from3to7(Table S3). A lack of extensive synteny also is evident Pkinase-HEAT࿝PBS 0 0 0 0 1 1 0 when M. xanthus and P. pacifica are compared. This observation Pkinase-Usp 0 0 0 0 1 1 0 agrees well with the distant phylogeny among the kinase domains Pkinase-Drf࿝FH1 100 00 0 0 of these 3 species and may be correlated with the classification of the species into different suborders. The family numbers in Pfam of the domains listed in this table are indicated in Table S1. Novel organizations are shown in bold. Sc, S. cellulosum; Pp, P. In contrast, local synteny is evident between M. xanthus and the pacifica; Sa, S. aurantiaca; Mx, M. xanthus; Ad, A. dehalogenans; AF, Anaero- other 4 species of the suborder (Table S3). We myxobacter sp. Fw109–5; AK, Anaeromyxobacter sp. K. have detected 64 syntenic regions between M. xanthus and S. aurantiaca, 49 of which contain a variable number of orthologs (ranging from 3 to 37). Interestingly, 54 of the conserved syntenic tion; the tree resulting from the first method is shown in Fig. S3. The blocks include ELK orthologous pairs discussed in the previous 3 reconstruction approaches gave a similar clustering for most section. The synteny between M. xanthus and the Anaeromyxobacter sequences, and therefore we restricted the analysis to phylogenetic strains also is broad. The major conservation is found with A. patterns that were reproduced by all methods. dehalogenans, which covers 14 blocks (7 contain 3–9 orthologs). The This analysis identified a number of orthologous sequence pairs synteny diminishes to only 6 and 5 blocks with the strains Fw109–5 in the genomes of M. xanthus and S. aurantiaca and in the genome and K, respectively, although all cover Ͼ3 orthologs. Many of these of A. dehalogenans relative to the 2 other Anaeromyxobacter ge- syntenic blocks also contain the ELK orthologous pairs detected

15952 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0806851105 Pe´rez et al. Downloaded by guest on September 25, 2021 ment in myxobacteria (8, 24). The synteny between M. xanthus and S. aurantiaca spans 19 genes (10 shown in Fig. 2). The presence in this region of 2 additional genes encoding proteins with FHA domains surrounding pkn3 in the 2 strains is remarkable. In A. dehalogenans and Anaeromyxobacter sp. K, the synteny in this region is limited to 6 genes, and a protein of unknown function containing a DUF77 domain is observed in place of the ELK gene pkn3. More dramatic differences are observed when M. xanthus is compared with Anaeromyxobacter sp. Fw109–5, because the synteny is re- stricted to 4 orthologs. In this strain, the 2 genes that encode proteins with FHA domains are lacking also, but the 1 with DUF77 Fig. 2. Synteny in the ELK region pkn3-relA. Genes shown in the same color in domain is present. These examples support the idea that ELKs have the different strains encode proteins with high homology. The numbers over the contributed to the expansion of the M. xanthus and S. aurantiaca first gene in each strain indicate the location of the genes in the genome. Mx, M. genomes relative to those of the Anaeromyxobacter strains. xanthus; Sa, S. aurantiaca; Ad, A. dehalogenans; AK, Anaeromyxobacter sp. K; AF, Anaeromyxobacter sp. Fw109–5; ϩ and Ϫ indicate that the genes are encoded in ELKs in the Prokaryotes. We next investigated whether other pro- the plus or the minus strand, respectively. karyotes also exploit ELKs in signaling. For this investigation, we analyzed (including the myxobacteria) 626 genomes, of which 577 belonged to the domain Bacteria and 49 to the Archaea. In total we between the Anaeromyxobacter strains, as described earlier. These analyzed Ϸ 2250 Mb, encoding Ͼ2 million proteins and identified analyses demonstrate that a good concordance exists between 2697 ELKs. A notable result is that the distribution of ELKs among phylogeny, synteny, and . strains is uneven (Table 2 and Table S4), because 892 ELKs At least 15 additional M. xanthus regions show significant synteny (one-third of the total) are present in the myxobacteria. Moreover, with the Anaeromyxobacter strains, although, surprisingly, the cor- 222 strains (35%) do not encode any ELKs. The number of kinases responding ELKs are not present in the 3 species. A notable region in ELK-containing genomes (excluding the myxobacteria) ranges in M. xanthus (Fig. 2) is that spanning the pkn3 and relA genes. from 1 to 75. Interestingly, the strains that encode the most ELKs Synthesis of ppGpp(p) by RelA is the signal that triggers develop- belong to only 6 phyla within the domain Bacteria (Table 2). This

Table 2. ELKs in sequenced prokaryotic genomes Genomes with

Domain Phylum Order Total Genomes 0 ELKs Ͼ15 ELKs

Bacteria 577 194 25 Acidobacteria 2 0 2 MICROBIOLOGY Acidobacteriales 101 Solibacterales 101 Actinobacteria 47 0 9 Actinomycetales 44 0 9 Aquificae 1 0 0 Bacteroidetes 12 7 0 Chlamydiae 11 0 0 Chlorobi 5 4 0 Chloroflexi 7 3 3 Choroflexales 302 Herpetosiphonales 431 Cyanobacteria 29 13 4 Nostocales 202 Oscillatoriales 101 unclassified 1 0 1 Deinococcus- 400 Thermus Firmicutes 130 7 0 Fusobacteria 1 1 0 1 0 1 Proteobacteria (␣)78500 Proteobacteria (␤)49110 Proteobacteria (␥) 144 67 0 Proteobacteria (␦)2166 Myxococcales 706 Proteobacteria (␧)19160 Spirochaetes 9 5 0 Thermotogae 6 6 0 unclassified 1 1 0 Archaea 49 28 0

Orders have been included only when there are several genomes sequenced in the phylum and at least 1 of them encodes more than 15 ELKs.

Pe´rez et al. PNAS ͉ October 14, 2008 ͉ vol. 105 ͉ no. 41 ͉ 15953 Downloaded by guest on September 25, 2021 filamentous Chloroflexi (26), and trichomal Cyanobacteria (27), whose life cycles often encompass differentiation events and several cell fates (Fig. 3). Although this observation also could be biased by the number of sequenced strains, analysis of the incomplete ge- nomes of other multicellular bacteria of these phyla reinforces this view (data not shown). It therefore is tempting to speculate that these multicellular bacteria have evolved to use signal transduction systems involving ELKs to coordinate population behavior instead of the widespread quorum-sensing system used by other communal bacteria (28). Four strains with large kinomes—2 Acidobacteria, the Cya- nobacteria Acaryochloris, and the Planctomycetes Rhodopirel- Fig. 3. ELKs in the prokaryotes as a function of the genome size. All of the lula—do not show multicellular behavior. This observation indi- complete genomes of prokaryotes deposited at the database of the NCBI have cates that other bacterial lifestyles also may involve the participation been included, except for the myxobacteria. Genomes encoding Ͻ15 ELKs are Ͼ of many ELKs. Rhodopirellula is a particularly interesting example, represented in dark gray. Those containing 15 ELKs are color-coded according because the cells exhibit internal compartmentalization that resem- to the phylum to which they belong: green, Actinobacteria; blue, Cyanobacteria; yellow, Acidobacteria; pink, Planctomycetes; orange, Chloroflexi. Strains with bles that of eukaryotes (29). What benefits the ELKs offer to multicellular behavior are represented by circles. unicellular bacteria is an open question.

Kinases in Prokaryotes and Eukaryotes. Kinases are present in all observation indicates that some phylogenetically related strains eukaryote genomes, where they regulate intercellular communica- among different groups have become enriched in signal transduc- tion and coordinate complex functions (30). Because the number of tion pathways thought to be typical for eukaryotes. Although these kinases in eukaryotes increases from unicellular to pluricellular conclusions may have been biased by the number of sequenced organisms (Table S4), the study of the differences between eu- strains in each taxon, we believe they are relevant for phyla in which karyote and prokaryote kinomes is interesting. This comparison at least 10 genomes have been sequenced (Table 2). reveals that the kinome of various myxobacteria is larger than that It is reasonable to expect that various ELKs are involved in of the unicellular eukaryotes, Encephalitozoon and fission and similar and/or even unrelated processes, as reported for other budding yeasts. Moreover, S. cellulosum encodes more kinases than prokaryotes (13, 14, 15, 16, 17). However, it is intriguing to consider the pluricellular eukaryotes Drosophila and Dictyostelium (Table why ELKs are abundant only in specific groups of prokaryotes. If S4). However, this comparison does not take into account relative we define a large kinome as a kinome encompassing Ͼ15 ELKs, genome sizes. To take relative genome sizes into account, we only 25 strains fall into this category. Because of the diversity of calculated the density of kinases (Table S4) in all prokaryotes and bacteria and the dearth of available genotypic and phenotypic several eukaryotes and expressed the density as the number of information about many sequenced strains (in many cases data are kinases per 1000 coding sequences (kCDS). With the exception of available only on the Web pages of the genome projects for each 3 myxobacteria (Fig. 4), eukaryotes exhibit a higher kinase density strain), it is difficult to discern common features among the bacteria than the prokaryotes with large kinomes. Two observations seem to exhibiting the largest kinomes. However, our analysis of the 626 contradict the hypothesis that a large and dense kinome is corre- sequenced strains points to multicellular behavior as the main lated with a multicellular lifestyle, but both contradictions can be evolutionary driver for an extensive kinome (Fig. 3). With only 4 resolved easily. The first observation is that unicellular and pluri- exceptions (1 of which is Anaeromyxobacter sp. K), the strains that cellular eukaryotes exhibit similar kinase densities. However, be- are known to exhibit classical multicellular behavior encode Ͼ15 cause eukaryotic signal transduction pathways are based mainly on ELKs. These strains include the ␦-Proteobacteria of the Myxococ- kinases, it is expected that most of them will be involved in cell cales (10, 11), hyphal and mycelial Actinobacteria (25), gliding functions not related to pluricellularity. Simultaneously, the ge-

Fig. 4. Density of kinases in prokaryotes that encode Ͼ15 ELKs and several representative eukaryotes, expressed as the number of kinases per 1000 CDS (kCDs). Only 1 strain per genus has been included in the graph. White, Eukarya; red, ␦-Proteobacteria (myxobacteria); green, Actinobacteria; blue, Cyanobacteria; yellow, Acidobacteria; pink, Planctomycetes; orange, Chloroflexi.

15954 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0806851105 Pe´rez et al. Downloaded by guest on September 25, 2021 nome size of pluricellular eukaryotes is larger also. As the result of Materials and Methods the increment in the number of kinases and genome size, pluricel- Identification of ELKs. Genes encoding ELKs in the prokaryotes were identified lular and unicellular eukaryotes will possess similar kinase densities. by BLASTP analysis of all complete genome sequences deposited in the database The second contradictory observation is that most multicellular of the NCBI, and of 3 incomplete genomes belonging to the myxobacteria (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi) using the Pfam matrix for prokaryotes encode fewer kinases than unicellular eukaryotes. protein kinases (Pkinase; PF00069) (31). All sequences obtained with an E-value Յ However, it should be recalled that TCSs, and not ELKs, are the 1 were back-searched against Pfam and Prosite (http://expasy.org/prosite/) to main mechanism of signal transduction in bacteria. Therefore, most verify unequivocally which proteins matched to the kinase domain. kinases of prokaryotes with large kinomes probably will be devoted Characterization of ELKs. Myxobacterial ELKs have been aligned with the Pfam to the regulation of the sophisticated mechanisms required for a matrix for protein kinases to identify the catalytic residues Lys and Asp (D1 and primitive multicellularity. D2) of subdomains II, Vib, and VII, respectively, and the regulatory Arg of sub- domain VIb. When the residues were not identified in the alignment, the ELK Conclusions and Future Perspectives sequence was analyzed manually to confirm their absence. The domain architec- ture of the ELKs was determined using Pfam. The analysis of ELKs in prokaryotic genomes has provided insight into signal transduction in the prokaryotes. Signal transduction Phylogeny of Myxobacterial ELKs. Amino acid sequences of all myxobacterial kinase domains were aligned using ClustalX (32) and edited manually. Large pathways based on ELKs are not present in one-third of the deletions and insertions and sequence characters that could be not aligned with sequenced genomes, whereas bacteria encoding the largest kinomes confidence were excluded. For maximum parsimony analysis, the heuristic search exhibit multicellular behavior. Although bacterial multicellularity option of the software PAUP*4.0b10 was used with branch swapping by tree- clearly relies on other factors in addition to signal transduction, our bisection-reconnection and alignment gaps treated as missing data. From all best data provide a strong argument that large kinomes have evolved trees a strict consensus tree was calculated. Reconstruction with the neighbor joining method was performed by means of the Seqboot, Protdist, Neighbor, and preferentially to support this lifestyle. In fact, the observation that Consens modules of the PHYLIP package version 3.65 (33). The JTT model for the small number of 7 myxobacteria encodes 892 ELKs may be amino acid replacement was used in combination with a gamma distribution explained by the extraordinarily complex lifestyle of this bacterial having 4 categories. Bayesian estimation was performed using the MrBayes group. Finally, ELKs encoded by strains of various phyla exhibit a software version 3 (34). The calculations also used the JTT model and a gamma complex modular organization (see Table 1; many domain archi- function with 4 categories. tectures also are present in prokaryotes other than the myxobac- Synteny Determination in the ELK Regions. Initially, 2 upstream and 2 down- teria). This finding indicates a high degree of complexity and stream predicted proteins from each M. xanthus ELK were aligned with the other Ͻ interconnectedness within the signaling pathways in these microbes. predicted myxobacterial proteomes using the BLASTP program (E-value 1e-4). We then manually analyzed the gene organization of positive matches in the 7 In the future, experimental elucidation of the elements of each genomes at NCBI database. In cases of conservation, we extended the BLASTP pathway and the determination of the activating factors will be of searches to other genes within the same region. A ‘‘syntenic block’’ was defined significant interest and will be important to deepen our under- as containing at least 2 consecutive genes exhibiting similarity, 1 of which en- standing of the occurrence and function of numerous kinases in coded an ELK. multicellular bacteria. Furthermore, it will be intriguing to explore ACKNOWLEDGMENTS. We thank K. Weissman for critical reading of this manu- the accompanying processes that have evolved to use signal trans- script and all investigators who have made genome sequences available before duction pathways based on multiple ELKs in unicellular pro- publication. The work in the laboratory of J.M.D. has been funded by Ministerio MICROBIOLOGY karyotes. Certainly, elucidating the detailed function of ELKs will de Educacio´n y Ciencia Grants BFU2003–02038 and BFU2006–00972/BMC and a Ministerio de Educacio´n y Ciencia fellowship (to A.C.G.) and Junta de Andalucı´a be a major challenge in understanding bacterial multicellular Grant CVI1377. Research in the laboratory of R.M. was funded by the Bundesmin- behavior and adaptation. isterium fu¨r Bildung und Forschung and the Deutsche Forschungsgemeinschaft.

1. Hanks SK, Hunter T (1995) The eukaryotic protein kinase superfamily: Kinase (catalytic) 16. Petøı´cˇkova´ K, Petøı´cˇek M (2003) Eukaryotic-type protein kinases in Streptomyces coeli- domain structure and classification. FASEB J 9:576–596. color: Variations on a common theme Microbiology 149:1609–1621. 2. Akamine PH, et al.(2003 Dynamic features of cAMP-dependent protein kinase revealed by 17. Wehenkel A, et al. (2008) Mycobacterial Ser/Thr protein kinases and phosphatases: Phys- apoenzyme crystal structure. J Mol Biol 327:159–171. iological roles and therapeutic potential. Biochim Biophys Acta 1784:193–202. 3. Johnson LN, Noble MEM, Owen DJ (1996) Active and inactive protein kinases: Structural 18. Krupa A, Srinivasan N (2005) Diversity in domain architectures of Ser/Thr kinases and their basis for regulation. Cell 85:149–158. homologues in prokaryotes. BMC Genomics 6:129. 4. Kornev AP, Haste NM, Taylor SS, Ten Eyck LF (2006) Surface comparison of active and 19. Kannan N, Taylor SS, Zhai Y, Venter C, Manning G (2007) Structural and functional diversity inactive protein kinases identifies a conserved activation mechanism. Proc Natl Acad Sci of the microbial kinome. PLoS Biol 5:467–478. USA 103:17783–17788. 20. Nariya H, Inouye M (2008) MazF, an mRNA interferase, mediates programmed cell death 5. Johnson LN, Lewis RJ (2001) Structural basis for control by phosphorylation. Chem Rev during multicellular Myxococcus development. Cell 132:55–66. 101:2209–2242. 21. Goldman B, Bhat S, Shimkets L (2007) Genome evolution and the emergence of fruiting 6. Mun˜oz-Dorado J, Inouye S, Inouye M (1991) A gene encoding a protein serine/threonine body development in Myxococcus xanthus. PLoS ONE 12:e1329. kinase is required for normal development of M. xanthus, a gram-negative bacterium. Cell 22. Pallen M, Chaudhuri R, Khan A (2002) Bacterial FHA domains: Neglected players in the 67:995–1006. phospho-threonine signalling game? Trends Microbiol 10:556–563. 7. Goldman BS, et al. (2006) Evolution of sensory complexity recorded in a myxobacterial 23. Nariya H, Inouye S (2002) Activation of 6-phosphofructokinase via phosphorylation by genome. Proc Natl Acad Sci USA 103:15200–15205. Pkn4, a protein Ser/Thr kinase of Myxococcus xanthus. Mol Microbiol 46:1353–1366. 8. Schneiker S, et al. (2007) Complete genome sequence of the myxobacterium Sorangium 24. Knauber T, et al. (2008) Mutation in the rel gene of Sorangium cellulosum affects cellulosum. Nat Biotechnol 25:1281–1289. morphological and physiological differentiation. Mol Microbiol 69:254–266. 9. Whitworth DE, Cock PJA (2008) Two-component signal transduction systems of the 25. Ventura M, et al. (2007) Genomics of Actinobacteria: Tracing the evolutionary history of myxobacteria. Myxobacteria, Multicellularity and Differentiation, ed Whithworth DE an ancient phylum. Microbiol Mol Biol Rev 71:495–548. (ASM Press, Washington DC), pp 169–189. 26. Klatt CG, Bryant DA, Ward DM (2007) Comparative genomics provides evidence for the 10. Diodati ME, Gill RE, Plamann L, Singer M (2008) Initiation and early development events. Myxobacteria, Multicellularity and Differentiation, ed Whithworth DE ASM Press, Wash- 3-hydroxypropionate autotrophic pathway in filamentous anoxygenic phototrophic bac- ington DC), pp 43–76. teria and in hot spring microbial mats. Environ Microbiol 9:2067–2078. 11. Søgaard-Andersen L (2008) Contact-dependent signaling in Myxococcus xanthus: The 27. Zhang CC, Laurent S, Sakr S, Peng L, Be´du S (2006) Heterocyst differentiation and pattern function of the C-signal in fruiting-body morphogenesis. Myxobacteria, Multicellularity formation in cyanobacteria: A chorus of signals. Mol Microbiol 59:367–375. and Differentiation, ed. Whithworth DE (ASM Press, Washington DC), pp 77–91. 28. Kolter R, Greenberg EP (2006) The superficial life of microbes. Nature 444:300–302. 12. Meiser P, Bode HB, Mu¨ller R (2006) The unique DKxanthene family 29. Fuerst JA (2005) Intracellular compartmentation in Planctomycetes. Annu Rev Microbiol from the myxobacterium. Myxococcus xanthus is required for developmental sporulation. 59:299–328. Proc Natl Acad Sci USA 103:19128–19133. 30. Manning G, Plowman GD, Hunter T, Sudarsanam S (2002) Evolution of protein kinase 13. Inouye S, Nariya H, Mun˜oz-Dorado J (2008) Protein Ser/Thr kinases and phosphatases in signaling from yeast to man. Trends Biochem Sci 27:514–520. Myxococcus xanthus. Myxobacteria, Multicellularity and Differentiation, ed Whithworth 31. Finn RD, et al. (2006) Pfam: Web tools and services. Nucleic Acids Res 34:D247–D251. DE (ASM, Washington DC), pp 191–210. 32. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997) The Clustal࿝X 14. Zhang CC, Gonzalez L, Phalip V (1998) Survey, analysis and genetic organization of genes windows interface: Flexible strategies for multiple sequence alignment aided by quality encoding eukaryotic-like signaling proteins on a cyanobacterial genome. Nucleic Acids Res analysis tools. Nucleic Acids Res 25:4876–4882. 26:3619–3625. 33. Felsenstein J (1989) Phylogeny inference package version 3.6. Cladistics 5:164–166. 15. Av-Gay Y, Everett M (2000) The eukaryotic-like Ser/Thr protein kinases of Mycobacterium 34. Huelsenbeck JP, Ronquist F (2001) MRBAYES: Bayesian inference of phylogenetic trees. tuberculosis. Trends Microbiol 8:238–244. Bioinformatics 17:754–755.

Pe´rez et al. PNAS ͉ October 14, 2008 ͉ vol. 105 ͉ no. 41 ͉ 15955 Downloaded by guest on September 25, 2021